Reading a file to check that every line has the same number of delimiters

def isValid(fileName):
    with open(fileName,'rb') as infile:
        for lineNumber,line in enumerate(infile,1):
            count = line.count(',')
            if lineNumber > 1 and prevCount != count:
                # this line does not contain the same number of delimiters
                return False
            prevCount = count
    return True

3 Answers

User

Answer 1 · edited 2024-05-15 12:46:13

I just noticed that, if you want to stick with the simple logic, the original code can be condensed a little:

def isValid(fileName):
    with open(fileName,'r') as infile:
        count = infile.readline().count(',')
        for line in infile:
            if line.count(',') != count:
                return False
    return True

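There is no need to keep the previous line's count, because a single mismatch is enough to settle the result. So only the delimiter count of the first line is kept.
Next, the file needs to be opened as a text file ('r') rather than as a binary file.
Finally, by prefetching the first line before the loop, we can drop the call to enumerate.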
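
To illustrate why the open mode matters, here is a small sketch. It assumes Python 3, which the thread does not state; there, a file opened with 'rb' yields bytes, and counting a str delimiter in bytes raises TypeError. The variable names are made up for the example.

```python
# Assumption: Python 3, where text lines are str and binary lines are bytes.
line_text = "a,b,c\n"     # what a file opened with 'r' yields
line_bytes = b"a,b,c\n"   # what a file opened with 'rb' yields

text_count = line_text.count(",")      # str delimiter in str: works
bytes_count = line_bytes.count(b",")   # bytes delimiter in bytes: works

try:
    line_bytes.count(",")              # str delimiter in bytes
    mixed_ok = True
except TypeError:
    mixed_ok = False                   # Python 3 rejects the mixed types
```

So either open the file in text mode and count ',' or keep binary mode and count b','.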

User

Answer 2 · edited 2024-05-15 12:46:13

You can use all with a generator expression instead:

with open(file_name) as your_file:
    start = your_file.readline().count(',')  # delimiter count of the first line
    print(all(i.count(',') == start for i in your_file))
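
As a rough sketch, the same idea can be wrapped in a function that returns the result instead of printing it. The function name, the delimiter parameter, and the temporary-file helper below are hypothetical and only there to make the example self-contained.

```python
import os
import tempfile


def has_uniform_delimiters(file_name, delimiter=","):
    """Return True if every line has as many delimiters as the first line."""
    with open(file_name) as infile:
        expected = infile.readline().count(delimiter)
        # all() stops at the first line whose count differs.
        return all(line.count(delimiter) == expected for line in infile)


def _write_temp(text):
    """Hypothetical helper: write text to a temporary file, return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


good_path = _write_temp("a,b,c\n1,2,3\n4,5,6\n")
bad_path = _write_temp("a,b,c\n1,2\n4,5,6\n")
good_result = has_uniform_delimiters(good_path)
bad_result = has_uniform_delimiters(bad_path)
os.remove(good_path)
os.remove(bad_path)
```

Because all() short-circuits, the rest of the file is not read once a mismatching line is found.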

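User

Answer 3 · edited 2024-05-15 12:46:13

I propose a different approach (no code):
1. Read the file as binary, in chunks of 64 KB
2. Count the number of end-of-line (EOL) markers in the chunk
3. Count the number of delimiters in the chunk, but only up to the position of the last EOL marker
4. If the delimiter count is not evenly divisible by the EOL count, stop and return False
5. At EOF, return True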

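Because you have to handle the "overlap" between the last EOL marker and the end of the chunk, the logic is somewhat more complicated than the "brute-force" approach. But when processing gigabytes of data, it may pay off.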
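
The steps above could be sketched roughly as follows. This is one possible reading, not the answerer's code: the function name and parameters are hypothetical, and instead of the plain divisibility test in step 4 it uses a slightly stricter comparison (the delimiter total must equal the number of lines times the first line's count). Either way, a total-based check is only a necessary condition: a line with one delimiter too many and another with one too few in the same chunk cancel out and go unnoticed.

```python
import os
import tempfile


def is_valid_chunked(file_name, delimiter=b",", eol=b"\n", chunk_size=64 * 1024):
    """Chunked check; hypothetical sketch of the approach described above."""
    expected = None  # delimiters per line, learned from the first complete line
    carry = b""      # bytes after the last EOL of the previous chunk
    with open(file_name, "rb") as infile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            data = carry + chunk
            last_eol = data.rfind(eol)
            if last_eol == -1:
                carry = data  # no complete line yet, keep accumulating
                continue
            end = last_eol + len(eol)
            complete, carry = data[:end], data[end:]
            if expected is None:
                expected = complete[:complete.find(eol)].count(delimiter)
            # Compare totals over all complete lines in this chunk.
            if complete.count(delimiter) != expected * complete.count(eol):
                return False
    # A final line without a trailing EOL is checked on its own.
    if carry and expected is not None and carry.count(delimiter) != expected:
        return False
    return True


def _write_temp(data):
    """Hypothetical helper: write bytes to a temporary file, return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


good_path = _write_temp(b"a,b,c\n1,2,3\n4,5,6\n")
bad_path = _write_temp(b"a,b,c\n1,2\n4,5,6\n")
# A tiny chunk size forces lines to straddle chunk boundaries.
good_small = is_valid_chunked(good_path, chunk_size=4)
bad_small = is_valid_chunked(bad_path, chunk_size=4)
bad_default = is_valid_chunked(bad_path)
os.remove(good_path)
os.remove(bad_path)
```

The carry buffer is what handles the "overlap": whatever follows the last EOL in a chunk is prepended to the next chunk before counting.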
