A Python way to parse a formatted text file

game (
    name "Chess (1981)(M.C. Rakaska, S.W. Huggins) [Strategy, Chess].zip"
    file ( name Chess.bas size 19129 date 2007/01/31 19:46:20 crc 50577473 )
    file ( name Chess.exe size 46464 date 1998/12/25 19:46:00 crc 826d1c0d )
    file ( name file_id.diz size 198 date 2014/11/23 07:53:32 crc 72399680 )
)

2 answers

User

#1 · Edited 2024-06-06 03:32:36

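In the end, after reading here and there and a lot of testing, I arrived at this.

It is an extract from a larger loop in which I also parse other kinds of data from the file. Coming from other languages (C++, Pascal, PHP), I find the way Python 3.x handles strings a little odd. I discovered that when reading from inside a ZIP file, the lines I get are binary (bytes), so I have to use the "b" prefix in many places, but that is probably material for another question.

Anyway, this works for me.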

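import re
import zipfile

# Matches one "file ( name ... size ... date ... crc ... )" line.
gdregex = re.compile(r"file\s*\(\s*name\s+(.+)\s+size\s+(\d+)\s+date\s+(.+)\s+crc\s+([0-9a-fA-F]{8})\s*\)", re.IGNORECASE)

is_new_entry = False  # must exist before the first line is tested
gamename = None

with zipfile.ZipFile(dat_file) as datz:  # dat_file: path to the ZIP archive
    with datz.open('filetoprocess.txt') as datf:
        for line in datf:
            # Lines read from a ZIP member are bytes, hence the b"..." literals.
            line = line.strip().lower()
            if line.startswith(b"game ("):
                # New entry
                is_new_entry = True
                gamename = None
                continue
            if is_new_entry:
                if line.startswith(b"name"):
                    # decode() rather than str(), which would yield "b'...'";
                    # UTF-8 is an assumption about the file's encoding.
                    gamename = line[len(b"name"):].decode("utf-8").strip().strip('"')
                elif line.startswith(b"file"):
                    match = gdregex.search(line.decode("utf-8"))
                    if match:
                        # Game name, then file name, size and date (group 4 holds the CRC).
                        entry = [gamename, *match.groups()[:3]]
                        print(entry)
                elif line == b')':
                    is_new_entry = False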
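
On the bytes issue mentioned above: one possible way to avoid the b"..." literals is to wrap the ZIP member in io.TextIOWrapper, which decodes each line as it is read. This is only a minimal sketch, assuming the data is UTF-8 encoded (the real encoding is not stated here). The archive is built in memory so the example runs on its own, and the member name filetoprocess.txt is reused from the code above purely for illustration.

```python
import io
import zipfile

# Build a small ZIP archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("filetoprocess.txt", 'game (\n    name "Chess.zip"\n)\n')

buffer.seek(0)
with zipfile.ZipFile(buffer) as datz:
    # Reading the member directly yields bytes objects.
    with datz.open("filetoprocess.txt") as raw:
        first_raw_line = raw.readline()

    # TextIOWrapper decodes on the fly, so each line is a normal str.
    # UTF-8 is an assumption; use the encoding the file really has.
    with datz.open("filetoprocess.txt") as raw:
        with io.TextIOWrapper(raw, encoding="utf-8") as text:
            lines = [line.strip() for line in text]

print(type(first_raw_line).__name__)
print(lines)
```

With the lines decoded this way, startswith("game (") and the regex can work on plain strings and no str() or decode() calls are needed inside the loop.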

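User

#2 · Edited 2024-06-06 03:32:36

Basically, it depends on how irregular the data is. I can see two complications here:

The game name is enclosed in double quotes. But what if the name itself contains double quotes?
What if a file name contains spaces?

It seems both problems can be handled adequately with regular expressions from the built-in re module. So, keeping the Zen of Python in mind, there is no need in this case to make things more complicated and use a full parser.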
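
A rough sketch of how the two complications might be handled with re alone. The record below is hypothetical (a quoted word inside the game name and a space inside a file name were added for illustration), and the names name_rx and file_rx are not from the question. It assumes the game name sits on its own line and that a file name never itself contains the text " size <digits> date".

```python
import re

# Hypothetical record with both complications: quotes inside the game name
# and a space inside a file name.
record = '''game (
    name "Chess "Deluxe" (1981).zip"
    file ( name Chess Game.bas size 19129 date 2007/01/31 19:46:20 crc 50577473 )
    file ( name file_id.diz size 198 date 2014/11/23 07:53:32 crc 72399680 )
)'''

# Greedy match: everything between the first and the last quote on the line.
name_rx = re.compile(r'^\s*name\s+"(.*)"\s*$', re.MULTILINE)

# Non-greedy file name, anchored by the "size <digits> date" that follows it.
file_rx = re.compile(
    r'file\s*\(\s*name\s+(.+?)\s+size\s+(\d+)'
    r'\s+date\s+(\S+\s+\S+)\s+crc\s+([0-9a-fA-F]{8})\s*\)'
)

game_name = name_rx.search(record).group(1)
files = [m.groups() for m in file_rx.finditer(record)]

print(game_name)
for entry in files:
    print(entry)
```

The greedy (.*) keeps inner quotes because it backtracks only to the last quote on the line, while the non-greedy (.+?) lets the file name absorb spaces until the size field begins.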
