Delete lines that contain a tab character

0 absinth Bohemian-style absinth Bohemian-style or Czech-style absinth (also called anise-free absinthe, or just “absinth” without the “e”) is an ersatz version of the traditional spirit absinthe, though is more accurately described as a kind of wormwood bitters. It is produced mainly in the Czech Republic, from which it gets its designations as “Bohemian” or “Czech,” although not all absinthe from the Czech Republic is Bohemian-style. 1 acidophilus milk Sweet acidophilus milk is consumed by individuals who suffer from lactose intolerance or maldigestion, which occurs when enzymes (lactase) cannot break down lactose (milk sugar) in the intestine. To aid digestion in those with lactose intolerance, milk with added bacterial cultures such as "Lactobacillus acidophilus" ("acidophilus milk") and bifidobacteria ("a/B milk") is available in some areas. High Activity of Lactobacillus Acidophilus Milk 2 adobo Adobo Adobo (Spanish: marinade, sauce, or seasoning) is the immersion of raw food in a stock (or sauce) composed variously of paprika, oregano, salt, garlic, and vinegar to preserve and enhance its flavor. In the Philippines, the name "adobo" was given by the Spanish colonists to an indigenous cooking method that also uses vinegar, which although superficially similar had developed independent of Spanish influence.

Bohemian-style absinth Bohemian-style or Czech-style absinth (also called anise-free absinthe, or just “absinth” without the “e”) is an ersatz version of the traditional spirit absinthe, though is more accurately described as a kind of wormwood bitters. It is produced mainly in the Czech Republic, from which it gets its designations as “Bohemian” or “Czech,” although not all absinthe from the Czech Republic is Bohemian-style. Sweet acidophilus milk is consumed by individuals who suffer from lactose intolerance or maldigestion, which occurs when enzymes (lactase) cannot break down lactose (milk sugar) in the intestine. To aid digestion in those with lactose intolerance, milk with added bacterial cultures such as "Lactobacillus acidophilus" ("acidophilus milk") and bifidobacteria ("a/B milk") is available in some areas. High Activity of Lactobacillus Acidophilus Milk Adobo Adobo (Spanish: marinade, sauce, or seasoning) is the immersion of raw food in a stock (or sauce) composed variously of paprika, oregano, salt, garlic, and vinegar to preserve and enhance its flavor. In the Philippines, the name "adobo" was given by the Spanish colonists to an indigenous cooking method that also uses vinegar, which although superficially similar had developed independent of Spanish influence.

3 answers

User

Answer 1 · edited 2024-04-29 14:42:12

grep -v $'\t' file   # $'\t' (bash/zsh/ksh quoting) passes a real tab; grep itself does not expand \t


User

Answer 2 · edited 2024-04-29 14:42:12

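Your code is fine. You could try to optimize it by looking for the tab only at the start of the string:

if '\t' not in l[:5]: fout.write(l)

Here the length of the slice depends on the largest record number. This could make a difference for long lines that do not match, who knows...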
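The prefix check above could be sketched as follows. This is only a minimal sketch: it assumes that every tabbed line has its first tab within the first `prefix_len` characters (for example right after a short record number). The function name `drop_tab_lines`, the sample lines, and the default prefix are illustrative and not taken from the question.

```python
def drop_tab_lines(lines, prefix_len=None):
    """Yield only the lines that contain no tab.

    If prefix_len is given, only the first prefix_len characters are
    inspected. That is correct only when every tabbed line has a tab
    inside that prefix (an assumption about the data).
    """
    for line in lines:
        head = line if prefix_len is None else line[:prefix_len]
        if '\t' not in head:
            yield line


# Hypothetical sample: record lines start with "<number>\t".
sample = [
    "0\tabsinth\tBohemian-style absinth\n",
    "Bohemian-style or Czech-style absinth ...\n",
    "1\tacidophilus milk\n",
    "Sweet acidophilus milk ...\n",
]

kept = list(drop_tab_lines(sample, prefix_len=5))
print(kept)
```

Note that with a prefix, a line whose only tab sits beyond `prefix_len` is kept, so the prefix has to be chosen from what is known about the record numbers.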

In addition, you may want to test mawk, grep, etc., as in

# Edit : the following won't work. it strips also blank lines
# mawk -F"\t" "NF==1"  original > stripped
grep -vF $'\t'       original > stripped
sed -e "/\t/d"       original > stripped

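to see whether they are faster than the Python solution.

Tests

On my system, using a file made by repeatedly duplicating your sample (its size is 1418973184 bytes), I got approximately the following times: grep 1.6 s, sed 6.4 s, python 4.6 s.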
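For anyone who wants to repeat this kind of measurement for the Python variant alone, here is a rough sketch. It builds a small synthetic file instead of the real one, so the absolute numbers will not match those quoted above. The helper name `filter_file`, the file contents, and the sizes are all assumptions made for illustration. Binary mode is used to skip text decoding, which may or may not help on a given system.

```python
import os
import tempfile
import time


def filter_file(src, dst):
    """Copy src to dst, skipping every line that contains a tab byte."""
    kept = 0
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        for line in fin:
            if b'\t' not in line:
                fout.write(line)
                kept += 1
    return kept


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'original')
    dst = os.path.join(tmp, 'stripped')
    # Synthetic stand-in for the real file: one tabbed line per four plain ones.
    with open(src, 'wb') as f:
        for i in range(20000):
            f.write(b'%d\tkey\ttitle\n' % i)
            f.write(b'plain text line\n' * 4)
    start = time.perf_counter()
    kept = filter_file(src, dst)
    elapsed = time.perf_counter() - start
    print('kept %d lines in %.3f s' % (kept, elapsed))
```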

Addendum

I tested Jidder's awk solution (given in a comment) with mawk, and my approximate time is 3.2 s. The winner is grep -vF.

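Test transcript

Run times vary by about 0.1 s between executions, and here I report only one run per command. For results that are close to each other, no clear-cut decision can be made.

On the other hand, the different tools give results that differ by far more than the experimental error, so I think some conclusions can be drawn.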

% ls -l original 
-rw-r--r-- 1 boffi boffi 1418973184 Dec  8 21:33 original
% cat doit.py
from sys import stdout
with open('original', 'r') as fin:
  for line in fin:
    if '\t' in line: continue
    else: stdout.write(line)
% time wc -l original 
15731133 original

real    0m0.407s
user    0m0.184s
sys     0m0.220s
% time python doit.py | wc -l
12584034

real    0m5.334s
user    0m4.880s
sys     0m1.428s
% time grep -vF "	"  original | wc -l
12584035

real    0m1.527s
user    0m1.112s
sys     0m1.400s
% time grep -v "	"  original | wc -l
12584035

real    0m1.556s
user    0m1.120s
sys     0m1.436s
% time sed -e "/\t/d"  original | wc -l
12584034

real    0m6.481s
user    0m6.104s
sys     0m1.404s
% time mawk '!/\t/'  original | wc -l
12584035

real    0m3.059s
user    0m2.608s
sys     0m1.488s
% time gawk '!/\t/'  original | wc -l
12584035

real    0m9.148s
user    0m8.680s
sys     0m1.468s
%

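My sample file has a truncated last line, which is why the line counts differ by one between python and sed on one side and all the other tools on the other.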
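The off-by-one can be illustrated with a small sketch. It relies on two assumptions that the transcript does not show directly: that `wc -l` counts newline characters rather than logical lines, and that grep and awk terminate every output line with a newline while the Python script and sed write the unterminated last line back as it is. The miniature input and the name `count_newlines` are hypothetical.

```python
def count_newlines(data):
    """Mimic `wc -l`: count newline bytes, not logical lines."""
    return data.count(b'\n')


# Hypothetical miniature input whose last line lacks a trailing newline.
original = b'0\tabsinth\nplain line\nlast line, truncated'

# Keep only the lines without a tab, preserving their line endings.
kept = [l for l in original.splitlines(keepends=True) if b'\t' not in l]

# Lines written back verbatim, as stdout.write(line) does in doit.py.
verbatim = b''.join(kept)

# Every output line terminated with a newline.
terminated = b''.join(l if l.endswith(b'\n') else l + b'\n' for l in kept)

print(count_newlines(verbatim), count_newlines(terminated))  # prints: 1 2
```

Both outputs contain the same two lines; only the count of newline characters differs.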

User

Answer 3 · edited 2024-04-29 14:42:12

You can do this with sed:

sed '/\t/d' 'my_file'

This looks for "\t" and deletes every line that contains it.
