Setting a unique abbreviation for each column in Python

Posted 2024-04-27 17:56:46


I have data like this in a CSV file:

Ad Group
Annuity Calculator
Tax Deferred Annuity
Annuity Tables
annuities calculator
annuity formula
Annuities Explained
Deferred Annuies Calculator
Current Annuity Rates
Forbes.com
Annuity Definition
fixed income
Immediate fixed Annuities
Deferred Variable Annuities
401k Rollover
Deferred Annuity Rates
Deferred Annuities
Immediate Annuities Definition
Immediate Variable Annuities
Variable Annuity
Aig Annuities
Retirement Income
retirment system
Online Financial Planner
Certified Financial Planner

I want to set a unique abbreviation for each column. For example:

  • Annuity Calculator = annca
  • annuity calculator

Could you help me find the best way to do this in Python?

Thanks


1 answer
User
#1 · posted 2024-04-27 17:56:46

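Your question isn't very clear, but it looked interesting, so I gave it a try. I wrote a function that takes a list of phrases and returns a dictionary whose keys are the abbreviations. It starts with the first two letters of each word and joins them into a candidate abbreviation. If that candidate is already in use, it takes progressively more letters from the start of each word, one word at a time, until the abbreviation is unique. I then tested it on your sample data. You will almost certainly want to modify it, but it should give you some ideas: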

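def makeAbbreviations(headers):
    """Return a dict mapping each unique abbreviation to its lower-cased header."""
    abbreviations = {}
    for header in headers:
        header = header.lower()
        words = header.split()
        if not words:
            continue  # skip blank headers; max() below would fail on them
        n = max(len(w) for w in words)  # longest word bounds the prefix length
        i = 2
        # First candidate: the first two letters of every word.
        starts = [w[:i] for w in words]
        abbrev = ''.join(starts)

        # On a collision, lengthen the prefixes one word at a time until the
        # candidate is unique or no word has any letters left to add.
        while abbrev in abbreviations and i <= n:
            i += 1
            for j, w in enumerate(words):
                starts[j] = w[:i]
                abbrev = ''.join(starts)
                if abbrev not in abbreviations:
                    break
        # If the prefixes ran out without finding a unique candidate
        # (e.g. a repeated header), this overwrites the earlier entry.
        abbreviations[abbrev] = header
    return abbreviations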

myHeaders = ['Ad Group', 'Annuity Calculator', 'Tax Deferred Annuity',
             'Annuity Tables', 'annuities calculator', 'annuity formula',
             'Annuities Explained', 'Deferred Annuies Calculator',
             'Current Annuity Rates', 'Forbes.com', 'Annuity Definition',
             'fixed income', 'Immediate fixed Annuities',
             'Deferred Variable Annuities', '401k Rollover',
             'Deferred Annuity Rates', 'Deferred Annuities',
             'Immediate Annuities Definition', 'Immediate Variable Annuities',
             'Variable Annuity', 'Aig Annuities', 'Retirement Income', 'retirment system',
             'Online Financial Planner', 'Certified Financial Planner']

d = makeAbbreviations(myHeaders)
for k, v in d.items():
    print(k, v, sep=" = ")

Output (the line order depends on the Python version; on 3.7+ it follows the input order):

imande = immediate annuities definition
adgr = ad group
fiin = fixed income
40ro = 401k rollover
resy = retirment system
vaan = variable annuity
devaan = deferred variable annuities
rein = retirement income
imvaan = immediate variable annuities
fo = forbes.com
imfian = immediate fixed annuities
dean = deferred annuities
anca = annuity calculator
cuanra = current annuity rates
annca = annuities calculator
onfipl = online financial planner
aian = aig annuities
ande = annuity definition
anfo = annuity formula
cefipl = certified financial planner
tadean = tax deferred annuity
deanca = deferred annuies calculator
anex = annuities explained
anta = annuity tables
deanra = deferred annuity rates
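
If the phrases really live in a CSV file, something like the following might be a starting point. It is only a sketch under a few assumptions: the file has one phrase per row in its first column (as in the question), `unique_abbreviations` is a hypothetical variant of `makeAbbreviations` that keys the result by the original phrase, and the numeric suffix used when the prefixes run out is my own addition, not something the question asked for. The `io.StringIO` object stands in for a real file so that the example runs on its own.

```python
import csv
import io


def unique_abbreviations(phrases):
    """Return {phrase: abbreviation}, keeping the original phrase spelling.

    Uses the same growing-prefix idea as makeAbbreviations, plus a numeric
    suffix as a last resort (an assumption of this sketch).
    """
    taken = set()
    result = {}
    for phrase in phrases:
        if phrase in result:
            continue  # exact repeat: keep the abbreviation already assigned
        words = phrase.lower().split()
        if not words:
            continue  # nothing to abbreviate
        longest = max(len(w) for w in words)
        size = 2
        starts = [w[:size] for w in words]
        abbrev = "".join(starts)
        # Lengthen the prefixes one word at a time while the candidate collides.
        while abbrev in taken and size <= longest:
            size += 1
            for j, w in enumerate(words):
                starts[j] = w[:size]
                abbrev = "".join(starts)
                if abbrev not in taken:
                    break
        # Prefixes exhausted and still colliding: append 2, 3, ...
        base, counter = abbrev, 2
        while abbrev in taken:
            abbrev = f"{base}{counter}"
            counter += 1
        taken.add(abbrev)
        result[phrase] = abbrev
    return result


# Stand-in for open("your_file.csv", newline=""); that file name is hypothetical.
sample = io.StringIO(
    "Ad Group\nAnnuity Calculator\nannuities calculator\nAnnuity Calculator\n"
)
phrases = [row[0] for row in csv.reader(sample) if row]
mapping = unique_abbreviations(phrases)
for phrase, abbrev in mapping.items():
    print(phrase, abbrev, sep=" = ")
```

Keying the dictionary by phrase rather than by abbreviation makes it easier to look up the abbreviation for a given CSV value, and tracking the used abbreviations in a separate set means two different phrases can never end up sharing one.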
