How do I generate a list of combinations in Python 3?

Published 2024-06-07 09:15:48


I need to write a function that generates a list from a text:

text = '^to[by, from] all ^appearances[appearance]'

list = ['to all appearances', 'to all appearance', 'by all appearances', 
        'by all appearance', 'from all appearances', 'from all appearance']

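In other words, each value inside the brackets should replace the preceding word, the one that comes right after the ^. I would like a function with five parameters, as you can see below.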

My code (it doesn't work):

import re

def addSubstitution(buf, substitutions, val1='[', val2=']', dsym=',', start_p="^"):
    for i in range(1, len(buf), 2):
        buff = []
        buff.extend(buf)
        if re.search('''[^{2}]+[{0}][^{1}{0}]+?[{1}]'''.format(val1, val2, start_p), buff[i]):
            substrs = re.split('['+val1+']'+'|'+'['+val2+']'+'|'+dsym, buff[i])
            for substr in substrs:
                if substr:
                    buff[i] = substr
                    addSubstitution(buff, substitutions, val1, val2, dsym, start_p)
        return
    substitutions.add(''.join(buf))
    pass

def getSubstitution(text, val1='[', val2=']', dsym=',', start_p="^"):
    pattern = '''[^{2}]+[{0}][^{1}{0}]+?[{1}]'''.format(val1, val2, start_p)
    texts = re.split(pattern,text)
    opttexts = re.findall(pattern,text)
    buff = []
    p = iter(texts)
    t = iter(opttexts)
    buf = []
    while True:
        try:
            buf.append(next(p))
            buf.append(next(t))
        except StopIteration:
            break
    substitutions = set()
    addSubstitution(buf, substitutions, val1, val2, dsym, start_p)
    substitutions = list(substitutions)
    substitutions.sort(key=len)
    return substitutions

1 answer
User
#1 · Posted 2024-06-07 09:15:48

One way to do it is as follows (I have skipped the string-manipulation code):

text = '^to[by, from] all ^appearances[appearance]'

Step 1: Tokenize text, as follows:

tokenizedText = ['^to[by, from]', 'all', '^appearances[appearance]']
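The answer skips the tokenization code. A minimal sketch of one way to do it is shown here, assuming single-character delimiters and that the alternatives never contain nested brackets. The helper name `tokenize` and its parameter names are hypothetical (borrowed from the question's signature), not part of the original answer.

```python
import re

def tokenize(text, val1='[', val2=']', start_p='^'):
    """Split text on whitespace, keeping each '^word[alt, ...]' group whole."""
    # Escape the delimiters so they are treated literally inside the regex.
    group = r'{s}[^\s{o}{c}]+{o}[^{o}{c}]*{c}'.format(
        s=re.escape(start_p), o=re.escape(val1), c=re.escape(val2))
    # Try the marked group first; otherwise fall back to a plain word.
    return re.findall(group + r'|\S+', text)

tokenizedText = tokenize('^to[by, from] all ^appearances[appearance]')
print(tokenizedText)
```

A plain `text.split()` would not work here because the space inside `[by, from]` would break that group into two tokens.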

Step 2: Build a list of all the words we need the Cartesian product of (the words that start with ^).

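combinationList = []
for word in tokenizedText:
    if word[0] == '^':
        # Split the word into a list and add it to `combinationList`.
        # One possible way to do the split (the answer leaves this step out):
        head, _, rest = word[1:].partition('[')
        alternatives = [alt.strip() for alt in rest.rstrip(']').split(',')]
        combinationList.append([head] + alternatives)

# combinationList is now [['to', 'by', 'from'], ['appearances', 'appearance']]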

Step 3: Use itertools.product(...) to compute the Cartesian product:

import itertools

for substitution in itertools.product(*combinationList):
    counter = 0
    sentence = []
    for word in tokenizedText:
        if word[0] == '^':
            sentence.append(substitution[counter])
            counter += 1
        else:
            sentence.append(word)
    print(' '.join(sentence))    # Or append this to a list if you want to return all substitutions.
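Putting the three steps together, here is a rough sketch of a single function with the five parameters the question asks for. The name `get_substitutions` is hypothetical, and the sketch assumes single-character delimiters with no nested brackets. It is one possible arrangement of the approach above, not the only one.

```python
import itertools
import re

def get_substitutions(text, val1='[', val2=']', dsym=',', start_p='^'):
    """Return every sentence obtained by picking one alternative per marked word."""
    o, c, s = re.escape(val1), re.escape(val2), re.escape(start_p)
    # A marked group looks like: start_p + word + val1 + alternatives + val2
    group = re.compile(r'{s}([^\s{o}{c}]+){o}([^{o}{c}]*){c}'.format(s=s, o=o, c=c))

    fixed_parts = []  # literal text between the marked groups
    choices = []      # one list of alternatives per marked group
    pos = 0
    for m in group.finditer(text):
        fixed_parts.append(text[pos:m.start()])
        alternatives = [a.strip() for a in m.group(2).split(dsym) if a.strip()]
        choices.append([m.group(1)] + alternatives)
        pos = m.end()
    fixed_parts.append(text[pos:])

    results = []
    for combo in itertools.product(*choices):
        # Interleave the fixed text with the chosen words.
        pieces = [fixed_parts[0]]
        for word, tail in zip(combo, fixed_parts[1:]):
            pieces.append(word)
            pieces.append(tail)
        results.append(''.join(pieces))
    return results

for sentence in get_substitutions('^to[by, from] all ^appearances[appearance]'):
    print(sentence)
```

Because `itertools.product` varies the last list fastest, the results come out in the same order as the expected list in the question. A text with no marked words is returned unchanged as a one-element list.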
