Print the rows of a CSV file that contain specified keywords
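I'm just starting to learn Python, but I want to do some data analysis on a few CSV files. I want to print only the rows of a CSV file that contain certain keywords. The first code block below prints all of the valid rows. Next, I want to print, from those rows, the ones that contain a keyword. Thanks for your help.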
import csv
import sys

csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for f in ['1.csv']:
    reader = csv.reader(open(f, 'rU'), delimiter='|', quotechar='\\')
    for row in reader:
        try:
            print row[2]
            valids += 1
        except:
            invalids += 1
print 'parsed %s records. ignored %s' % (valids, invalids)
The keywords:
for w in ['ford', 'hyundai','honda', 'jeep', 'maserati','audi','jaguar', 'volkswagen','chevrolet','chrysler']:
I think I need an if statement to filter my earlier code, but I've been struggling with this for hours and still can't get it to work.
1 Answer
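Your guess is right. You just need an if statement that picks out the matching rows by checking whether each field matches a keyword. Here is how you could do it (I also made a few improvements to your code and explained them in the comments):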
import csv
import sys

# First, create a set of the keywords. Sets are faster than lists for
# checking whether they contain an element. The curly brackets create a set.
keywords = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi', 'jaguar',
            'volkswagen', 'chevrolet', 'chrysler'}

csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for filename in ['1.csv']:
    # The with statement makes sure that your file is closed properly
    # (automatically), even when an error occurs. This is a common idiom.
    # In addition, in Python 2, CSV files should be opened in 'rb' mode.
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter='|', quotechar='\\')
        for row in reader:
            try:
                print row[2]
                valids += 1
            # Don't use bare except clauses. They will catch
            # exceptions you don't want or intend to catch.
            except IndexError:
                invalids += 1

            # The filtering is done here.
            for field in row:
                if field in keywords:
                    print row
                    break

# Prefer the str.format() method over the old style string formatting.
print 'parsed {0} records. ignored {1}'.format(valids, invalids)
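For readers on Python 3, the same filter could be sketched roughly as below. This is not part of the answer above: the function name `rows_with_keywords` and its parameters are hypothetical, and it assumes the pipe-delimited layout from the question. In Python 3, `print` is a function, and the `csv` module expects the file to be opened in text mode with `newline=''` rather than `'rb'`.

```python
import csv

KEYWORDS = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi', 'jaguar',
            'volkswagen', 'chevrolet', 'chrysler'}


def rows_with_keywords(path, keywords=KEYWORDS, delimiter='|'):
    """Return the rows that have at least one field equal to a keyword.

    Hypothetical helper for illustration. Matching is exact and
    case-sensitive, like the `field in keywords` test above.
    """
    matches = []
    # Python 3: open in text mode with newline='' so that the csv module
    # handles line endings itself.
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=delimiter, quotechar='\\')
        for row in reader:
            # any() stops at the first matching field, like the break above.
            if any(field in keywords for field in row):
                matches.append(row)
    return matches
```

Calling `rows_with_keywords('1.csv')` and printing each returned row with `print(row)` should give the same filtered output. If the keywords can appear inside longer fields or in mixed case, the test could be loosened to something like `any(k in field.lower() for k in keywords)`, at the cost of also matching unrelated words that happen to contain a keyword.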