How do I extract multiple substrings from a string without spaces, given a list of words?

Posted 2024-04-23 08:38:48


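I want to "decode" an encoded message. Each letter is encoded with a unique "word" that is 5, 6, 7, or 8 characters long. I have a dictionary of these codes ('a': 'qwert', …). I tried to decode the message like this: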

#example dictionary
d={'a': '00101', 'b': '10001011', 'c': '01100', 'd': '1111110', 'e': '01001', 'f': '010000', 'g': '1100100', 'h': '00010010', 'i': '0000000', 'j': '1101010', 'k': '110001', 'l': '101010', 'm': '10010', 'n': '100001', 'o': '111101', 'p': '11100111', 'q': '011110', 'r': '010001', 's': '1110010', 't': '1110011', 'u': '111000', 'v': '11100', 'w': '00110101', 'x': '011111', 'y': '0111100', 'z': '0111000', ' ': '11101011', '!': '00111101', ',': '11111', '-': '000100', '.': '0110111', ':': '11010', '?': '10110110', ';': '00000101', '0': '10001', '1': '000101', '2': '101011', '3': '11011001', '4': '10010111', '5': '1011000', '6': '0100000', '7': '000001', '8': '10111010', '9': '001110'}
#coded word
coded = '0001001001001101010101010111101' #coded and d is the input, 'hello' is the expected output


def get_key(v):
    for key, value in d.items():
         if v == value:
            return key
 
    return None
 
def decode(text):
    l = []
    while len(text)> 0:
        for i in range(5,9):
            if get_key(text[0:i]) != None:
                l.append(get_key(text[0:i]))
                text = text[i:]
        
    return ''.join(str(i) for i in l)

# second attempt: look codes up in an inverted dictionary
inverse = {v:k for k,v in d.items()}

def decode(text):
    l = []
    while len(text)> 0:
        for i in range(5,9):
            if text[0:i] == inverse.keys():
                l.append(inverse.value())
                text = text[i:]
        
    return ''.join(str(i) for i in l)

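But my code doesn't work. It takes quite a long time to run, and I think it also returns the wrong letters. I don't know how to correct it. Can you help me fix the code?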


1 Answer
User
#1 · Posted 2024-04-23 08:38:48

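I approached this differently than you did, and there is plenty of room for optimization here. The key is this function, which returns a list of the indexes at which a substring occurs: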

# function to get all the indexes of a word in a string of words
def all_word_indexes(string, substring):
    x = [len(i) for i in string.split(substring)[:-1]]
    return [sum(x[:i+1]) + i*len(substring) for i in range(len(x))]
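
As a quick check of what the function returns, here is a small sketch that calls it on a short string. The `sample` string is made up for illustration and is not part of the original answer; the function is repeated so the snippet runs on its own.

```python
# function to get all the indexes of a word in a string of words
def all_word_indexes(string, substring):
    x = [len(i) for i in string.split(substring)[:-1]]
    return [sum(x[:i+1]) + i*len(substring) for i in range(len(x))]

# hypothetical sample string built from three of the code words
sample = 'monthbananabonusmonth'

print(all_word_indexes(sample, 'month'))   # [0, 16]
print(all_word_indexes(sample, 'banana'))  # [5]
print(all_word_indexes(sample, 'exam'))    # [] (word does not occur)
```

Because the function relies on `str.split`, it reports non-overlapping occurrences only.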

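The rest is fairly simple:

  • Get the list of words from the word dictionary
  • Loop through the list, finding the indexes of every occurrence of each word
  • Iterate over the "found" list, building a list of tuples that hold the word's index and the word itself
  • Sort these tuples by index, because the matches are collected in dictionary order rather than in the order they appear in the string; sorting gives the letters in the correct order
  • Iterate over the sorted list, looking each word up in the reversed (value -> key) dictionary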

Here it is:

secret_dict = {
    'a':'nation',
    'b':'variation',
    'c':'investment',
    'd':'exam',
    'e':'patience',
    'f':'inspection',
    'g':'significance',
    'h':'recipe',
    'i':'consequence',
    'j':'speaker',
    'k':'historian',
    'l':'leadership',
    'm':'meaning',
    'n':'marriage',
    'o':'month',
    'p':'loss',
    'q':'volume',
    'r':'environment',
    's':'cheek',
    't':'database',
    'u':'country',
    'v':'teacher',
    'w':'bonus',
    'x':'football',
    'y':'grocery',
    'z':'income',
    ' ':'banana'
}

# create words list from the values of the dictionary
my_words_list = list(secret_dict.values())

# switch the values to keys and vice versa
secret_dict_reversed = dict((v,k) for k,v in secret_dict.items()) # reverse the dictionary to be value -> key

# coded phrase is hello world
coded_word = ['recipe', 'patience', 'leadership', 'leadership', 'month', 'banana', 'bonus', 'month', 'environment', 'leadership', 'exam'] # list of words, if you have a string then split it with mystring.split()
decoded_word = ''

coded_string = 'recipepatienceleadershipleadershipmonthbananabonusmonthenvironmentleadershipexam'

results = []

# function to get all the indexes of a word in a string of words
def all_word_indexes(string, substring):
    x = [len(i) for i in string.split(substring)[:-1]]
    return [sum(x[:i+1]) + i*len(substring) for i in range(len(x))]

# iterate through each word in the list
for word in my_words_list:
    word_indexes = all_word_indexes(coded_string, word)
    if len(word_indexes) > 0:
        for idx in word_indexes:
            # returns a list of tuples with word and their index or indexes
            results.append((idx, word))

# sort these results
sorted_results = sorted(results, key=lambda elem: elem[0])
sorted_word_results = [x[1] for x in sorted_results]

# traverse the words to get the letters
for word in sorted_word_results:
    letter = secret_dict_reversed[word]
    decoded_word = decoded_word + letter

print(decoded_word)
# hello world
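
One caveat: the substring search above works when no code word appears inside another one. In the binary table from the question that does not hold (for example, the code for 'v', '11100', is a prefix of the code for 'p', '11100111'), so a plain search could report false hits. A possible alternative for that case, sketched here as an assumption rather than as part of the original answer, is to walk the string from left to right and backtrack whenever a choice leads to a dead end. The names `decode_all`, `codes`, and `parses` are hypothetical, and `codes` is only a small subset of the question's table.

```python
from functools import lru_cache

# subset of the question's table, enough to show the idea
codes = {'h': '00010010', 'e': '01001', 'l': '101010',
         'o': '111101', 'm': '10010', '-': '000100'}
coded = '0001001001001101010101010111101'

def decode_all(text, table):
    """Return every way to split text into code words from table."""
    inverse = {code: symbol for symbol, code in table.items()}
    lengths = sorted({len(code) for code in inverse})

    @lru_cache(maxsize=None)
    def parses(start):
        # reaching the end exactly means the split so far is valid
        if start == len(text):
            return ('',)
        found = []
        for n in lengths:
            chunk = text[start:start + n]
            # skip chunks cut short by the end of the string
            if len(chunk) == n and chunk in inverse:
                for rest in parses(start + n):
                    found.append(inverse[chunk] + rest)
        return tuple(found)

    return list(parses(0))

print(decode_all(coded, codes))
# ['hello']
```

A greedy decoder that always takes the shortest match would start with '000100' ('-') here and then get stuck, which is one way a decoder can end up returning wrong letters. Returning every valid split also makes it visible when a message is genuinely ambiguous under the given table.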
