Python: split a string on multiple delimiters and return a dictionary with the delimiters as keys and the remaining pieces as values

Posted 2024-05-29 00:06:38


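I'm wondering whether there is something that takes a string and splits it on multiple delimiters, but instead of returning a list returns a dictionary that maps each delimiter used for the split to the text that follows it, up to the next delimiter. For example, consider this list: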

Food to make:

1. Cake
    a. eggs
    b. flour
    c. milk
    d. etc
2. Salad
    a. lettuce
    b. spinach
    c. cheese
    d. ham
    e. etc

Here is the unformatted list:

GroceryList = "1. Cake a. eggs b. flour c. milk d. etc 2. Salad a. lettuce b. spinach c. cheese d. ham e. etc"

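When I run the script, I need to split this on the alphanumeric markers (including the period) and return the result as a dictionary. Ideally, I'd like to be able to set the delimiters with a list: with my_str = "123test123", a call like my_str.split(["1", "3"]) would split the string on "1" and "3" and return the dict {"1#1": "2", "3#1": "test", "1#2": "2", "3#2": ""}. I know that duplicate keys would be overwritten in a dictionary, so each key needs a unique ID attached to it, like this: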

{"#1": "Food to make:",
"1.#1": "Cake",
"a.#1": "eggs",
"b.#1": "flour",
"c.#1": "milk",
"d.#1": "etc",
"2.#2": "Salad",
"a.#2": "lettuce",
"b.#2": "spinach",
"c.#2": "cheese",
"d.#2": "ham",
"e.#2": "etc"}

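The "123test123" example above could be sketched with re.split and a capturing group, which keeps the matched delimiters in the result. This is only a minimal sketch, not a built-in feature: the helper name split_to_dict is hypothetical, it assumes at least one non-empty delimiter, and it numbers each delimiter by its own occurrence count (the scheme used in the "123test123" example, which differs from the group numbering in the grocery list output).

```python
import re
from collections import Counter


def split_to_dict(text, delimiters):
    """Split text on any of the delimiters; return {"<delimiter>#<n>": content}.

    Hypothetical helper; assumes at least one non-empty delimiter.
    """
    # Longest delimiters first so that e.g. "ab" wins over "a" at the same position.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(d) for d in ordered) + ")"
    # With a capturing group, re.split keeps the delimiters:
    # [text_before_first_delimiter, delimiter, content, delimiter, content, ...]
    parts = re.split(pattern, text)

    result = {}
    seen = Counter()
    if parts[0]:
        # Text before the first delimiter is stored under an empty delimiter name.
        result["#1"] = parts[0]
    for delim, content in zip(parts[1::2], parts[2::2]):
        seen[delim] += 1
        result[f"{delim}#{seen[delim]}"] = content
    return result


print(split_to_dict("123test123", ["1", "3"]))
# {'1#1': '2', '3#1': 'test', '1#2': '2', '3#2': ''}
```

Because the keys carry a counter, repeated delimiters do not overwrite each other, and dictionaries preserve insertion order in Python 3.7+, so the original order of the pieces is kept.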
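I don't think there is a built-in function that does this, but since I'm not very familiar with Python (I'm running Python 3.8), I figured I'd ask.

I've looked at map and lambda functions as alternative ways to achieve this, but I don't even know where to start on a problem like this, so if some built-in functionality can do it, that would be best.

Thanks, everyone!

---Edit---

Here is a sample of the input I will actually be processing: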

M 10 315
L 110 215
A 30 50 0 0 1 162.55 162.45
L 172.55 152.45
A 30 50 -45 0 1 215.1 109.9
L 315 10
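For input shaped like the sample above, one possible approach is to treat the single letters M, L and A as the delimiters and collect the numbers that follow each one. The sketch below is an assumption about how that input might be handled: the function name parse_commands and the default letter set "MLA" are made up for illustration, and it returns a list of pairs rather than a dictionary so that repeated letters need no unique ID.

```python
import re

path_data = """M 10 315
L 110 215
A 30 50 0 0 1 162.55 162.45
L 172.55 152.45
A 30 50 -45 0 1 215.1 109.9
L 315 10"""


def parse_commands(text, letters="MLA"):
    """Return a list of (letter, [float arguments]) in input order.

    Hypothetical helper; assumes the letters appear only as delimiters.
    """
    # One delimiter letter, then everything up to the next delimiter letter.
    pattern = re.compile(rf"([{letters}])([^{letters}]*)")
    commands = []
    for match in pattern.finditer(text):
        letter, raw_args = match.groups()
        commands.append((letter, [float(x) for x in raw_args.split()]))
    return commands


commands = parse_commands(path_data)
for letter, args in commands:
    print(letter, args)
```

If a dictionary is still required, the pairs can be turned into "letter#n" keys with a counter in the same way as for any other delimiter.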

2 Answers

Usage:

  1. Save the class to a file. I named mine StringSplitter.py

  2. import StringSplitter as SS

  3. ss = SS.StringSplitter("123testing123", ["1", "3"])

  4. ss.split()

  5. ss.getSplit() returns the result; ss.toFile() writes it to a file named "split.txt"

Returns: [{'delimiter': '1', 'start': 0, 'end': 1, 'content': '2'}, {'delimiter': '3', 'start': 2, 'end': 3, 'content': 'testing'}, {'delimiter': '1', 'start': 10, 'end': 11, 'content': '2'}, {'delimiter': '3', 'start': 12, 'end': 13, 'content': ''}]

Rebuilding the string by concatenating delimiter + content for each entry yields: 123testing123

class StringSplitter:
    def __init__(self, string=None, delimeter=None, caseSensitive=True):
        self.string = string
        self.splitted = []
        self.delimeter = delimeter
        self.caseSensitive = caseSensitive

    def getSplit(self):
        return self.splitted

    def toFile(self):
        with open("./split.txt", "w") as file:
            file.writelines(str(self.splitted))

    def split(self):
        i = 0
        delCount = len(self.delimeter)
        strLen = len(self.string)
        split = []

        #loop through all chars in string
        while i < strLen:
            j = 0
            #loop over all possible delimiters
            while j < delCount:
                #get the delimiters
                searchitem = self.delimeter[j]
                compChar = self.string[i]
                if self.caseSensitive != True:
                    searchitem = searchitem.lower()
                    compChar = compChar.lower()
                #if the delimiter at its char 0 is the same as the string at i
                if searchitem[0] == compChar:
                    compItem = self.string[i:i + len(searchitem)]
                    if self.caseSensitive != True:
                        compItem = compItem.lower()
                    #check to see if the whole delimiter is matched at the rest of the string starting at i
                    if compItem == searchitem:
                        searchitem = self.string[i:i + len(searchitem)]
                        #then if there wasn't a match at the first character when a match was found,
                        #take the stuff up to the first match and make a dict out of it
                        #example: "string", ["i"] => [{"": "str"},{"i": "ng"}]
                        #for the purpose of this project, this is probably unnecessary
                        if len(split) == 0 and i > 0:
                            split.append({"delimiter": "", "start": 0, "end": i, "content": self.string[0: i]})
                            split.append({"delimiter": searchitem, "start": i, "end": i + len(searchitem), "content": ""})
                        else:
                            #fill in the previous entry's content, then add the new delimiter with its start and end positions
                            if len(split) > 0:
                                split[-1]["content"] = self.string[split[-1]["end"]: i]
                            split.append({"delimiter": searchitem, "start": i, "end": i + len(searchitem), "content": ""})
                        #break the loop
                        j = delCount + 1
                        #if len(split) > 1:
                        #    split[-2]["content"] = self.string[int(split[-2]["end"]):int(split[-1]["start"])]
                    else:
                        #keep searching
                        j += 1
                else:
                    #keep searching
                    j += 1
            #keep searching
            i += 1

        #fill in the content after the last delimiter; skip this when no
        #delimiter matched at all (split is empty), which used to raise IndexError
        if split:
            split[-1]["content"] = self.string[split[-1]["end"]:]
        self.splitted = split
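The list of records that getSplit() returns can be turned back into the original string, or into the "delimiter#n" dictionary the question asks for. The following is a small sketch that assumes the records shown in the "Returns:" line of the usage notes; the names rebuilt and as_dict are only illustrative, and the per-delimiter counter is one possible numbering scheme.

```python
from collections import Counter

# Records copied from the "Returns:" output in the usage notes.
records = [
    {"delimiter": "1", "start": 0, "end": 1, "content": "2"},
    {"delimiter": "3", "start": 2, "end": 3, "content": "testing"},
    {"delimiter": "1", "start": 10, "end": 11, "content": "2"},
    {"delimiter": "3", "start": 12, "end": 13, "content": ""},
]

# Rebuild the original string from the delimiter + content pairs.
rebuilt = "".join(r["delimiter"] + r["content"] for r in records)

# Convert to {"<delimiter>#<n>": content}, numbering each delimiter
# by how many times it has been seen so far.
seen = Counter()
as_dict = {}
for r in records:
    d = r["delimiter"]
    seen[d] += 1
    as_dict[f"{d}#{seen[d]}"] = r["content"]

print(rebuilt)   # 123testing123
print(as_dict)   # {'1#1': '2', '3#1': 'testing', '1#2': '2', '3#2': ''}
```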

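If anyone wants this, I have updated it further but haven't posted the full code here. Contact me and we can work out a way to share the code. It includes some other methods for manipulating strings.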

Try this:

import re
import string

alp = ' '+string.ascii_lowercase

#split by digits and then split by words
items = [re.split(r'\w\.', i) for i in re.split(r'\d\.', GroceryList)][1:]

#iterate over list of lists while keeping track of the index with enumerate
#then for the inner index return, return corresponding alphabet
#finally apply dict transformation
result = dict([(alp[l]+'#'+str(i),m.strip()) for i,j in enumerate(items,1) for l,m in enumerate(j)])
result
{' #1': 'Cake',
 'a#1': 'eggs',
 'b#1': 'flour',
 'c#1': 'milk',
 'd#1': 'etc',
 ' #2': 'Salad',
 'a#2': 'lettuce',
 'b#2': 'spinach',
 'c#2': 'cheese',
 'd#2': 'ham',
 'e#2': 'etc'}
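The keys above use a blank and a bare letter rather than the markers themselves. If the keys should match the question's desired output ("1.#1", "a.#1", and so on), one variation is to capture the markers in re.split and bump a group counter on every numbered marker. This is a sketch under assumptions: markers are either digits followed by a period or a single lowercase letter followed by a period, and any text before the first marker is ignored.

```python
import re

GroceryList = "1. Cake a. eggs b. flour c. milk d. etc 2. Salad a. lettuce b. spinach c. cheese d. ham e. etc"

# Capture each marker ("1.", "a.", ...) so re.split keeps it in the result.
# \b keeps the pattern from matching the tail of a word that ends in a period.
tokens = re.split(r"\b(\d+\.|[a-z]\.)\s*", GroceryList)

result = {}
group = 0
# tokens[0] is the text before the first marker; it is ignored here.
for marker, content in zip(tokens[1::2], tokens[2::2]):
    if marker[0].isdigit():
        group += 1  # a numbered marker starts a new group
    result[f"{marker}#{group}"] = content.strip()

print(result)
```

With the sample string this yields the keys "1.#1", "a.#1" through "d.#1", then "2.#2" and "a.#2" through "e.#2", in that order.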
