Trying to remove null bytes, but all the text gets deleted

Posted 2024-04-18 12:59:37

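I am trying to remove null bytes from some text. I wrote three functions that I believe all do the same thing. Each of them ends up leaving me with blank files, wiping out all of the input.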

Here is some sample input containing a null byte:

T:  14/01/2015 22:27:05**\00**||||END_OF_RECORD <- ** so you can see it (I can see it in my ubuntu text editor)
T:  14/01/2015 22:27:05 ||||END_OF_RECORD <- what my IDE shows is a box there
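
Before rewriting anything, it can help to confirm which files really contain null bytes. The following is a minimal sketch, assuming the files are small enough to read into memory; the helper name `count_null_bytes` is hypothetical and not part of the original code.

```python
from pathlib import Path


def count_null_bytes(path: Path) -> int:
    """Return how many NUL (0x00) bytes the file at `path` contains."""
    # Read in binary mode so no text decoding touches the bytes.
    return path.read_bytes().count(b"\x00")
```

Calling it on each path from `workspace.glob('*.csv')` would show which files need cleaning and which are already fine.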

Below is the code I wrote to try to fix these files, but they all end up empty:

from pathlib import Path

# Removes null bytes from the txt files
def removeNULLBytes():
    for p in workspace.glob('*.csv'):
        new = Path(workspace, p.name)
        new = new.with_suffix('.csv')
        with p.open() as infile, new.open('wb') as outfile:
            fileName = infile.name
            with open(fileName, 'rb') as in_file:
                data = in_file.read()
                # data = str(data, encoding='utf8', errors='ignore')
                data = (data.replace(b'\x00', b''))
                outfile.write(data)


def removeNULLs():
    for p in workspace.glob('*.csv'):
        new = Path(workspace, p.name)
        new = new.with_suffix('.csv')
        with p.open() as infile, new.open('w') as outfile:
            fileName = infile.name
            with open(fileName, 'r') as in_file:
                data = in_file.read()
                # data = str(data, encoding='utf8', errors='ignore')
                data = (data.replace(u"\u0000", ""))
                outfile.write(data)

def removeNull():
    for p in workspace.glob('*.csv'):
        new = Path(workspace, p.name)
        new = new.with_suffix('.csv')
        with p.open() as infile, new.open('w') as outfile:
            for line in infile.read():
                newline = ''.join([i if not u"\u0000" else "" for i in line])
                data = (line.replace(line, newline))
                outfile.writelines(data)

if __name__ == '__main__':
    workspace = Path('/home/')
    # removeNULLBytes()
    removeNull()
    # removeNULLs()

Any suggestions would be greatly appreciated. Thanks.
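
A likely cause, offered as a sketch rather than a confirmed diagnosis: in all three functions above, `new` resolves to the same path as `p` (same directory, same `.csv` suffix), so `new.open('w')` or `new.open('wb')` truncates the file before anything is read from it. Separately, in `removeNull()` the expression `i if not u"\u0000" else ""` always yields `""`, because a non-empty string is truthy and `not u"\u0000"` is therefore always `False`. The version below reads each file completely before writing it back. The function name `strip_null_bytes` and its return value are hypothetical, and it assumes the files fit in memory.

```python
from pathlib import Path


def strip_null_bytes(workspace: Path, pattern: str = "*.csv") -> int:
    """Remove NUL bytes in place from matching files.

    Returns the number of files that were actually modified.
    """
    changed = 0
    for path in workspace.glob(pattern):
        # Read the whole file BEFORE opening it for writing:
        # opening in 'w' or 'wb' mode truncates it immediately.
        data = path.read_bytes()
        cleaned = data.replace(b"\x00", b"")
        if cleaned != data:
            # Only rewrite files that contained null bytes.
            path.write_bytes(cleaned)
            changed += 1
    return changed
```

With the workspace from the question, this could be called as `strip_null_bytes(Path('/home/'))`. Working in binary mode avoids any dependence on the files' text encoding; writing to a separate output directory instead of in place would be a safer variation while testing.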

