Python: Parse CSV data between two dates and print it in ascending order

Posted 2024-04-24 10:02:53


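I'm fairly new to Python, so please forgive me if this turns out to be a simple fix or mistake. In the code below I'm trying to parse data from a CSV file. Specifically, I want to find the users created between two dates and print them in ascending order of creation date. My date column, row[1], holds Unix timestamps. There is also a word column, row[8], that should be printed. The goal is that, once the rows are sorted by date in ascending order, the words in row[8] spell out a specific phrase. The problem is that when I run the current code in PyCharm, I get IndexError: list index out of range on line 15. I know pandas handles CSV files better, but I'd like to avoid learning pandas for this task.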

import csv
from datetime import datetime, date
import sys

start_date = date(2014, 6, 22)
end_date = date(2014, 7, 22)

# Read csv data into memory filtering rows by the date in column 2 (row[1]).
csv_data = []
with open('sample.csv', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    header = next(reader)
    csv_data.append(header)
    for row in reader:
        creation_date = date.fromtimestamp(int(row[1]))
        if start_date <= creation_date <= end_date:
            csv_data.append(row)

if csv_data:  # Anything found?
    # Print the results in ascending date order.
    print(" ".join(csv_data[0]))
    # Converting the timestamp to int may not be necessary (but doesn't hurt)
    for row in sorted(csv_data[1:], key=lambda r: int(r[1])):
        print(" ".join(row))

2 Answers

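It looks like you are trying to access a value that does not exist in one of the data rows (because that row contains only a single value). You can wrap the failing code in try/except and see which rows fail: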

for row in reader: 
    try:
        creation_date = date.fromtimestamp(int(row[1]))
    except IndexError:
        print("Cannot get value for row: {}".format(row))
        continue

    if start_date <= creation_date <= end_date:
        csv_data.append(row)

This should give you a first idea of why it crashes here (perhaps your data is not tab-delimited?).
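If you are not sure which delimiter the file uses, the standard library's csv.Sniffer can guess it from a sample of the text. The following is only a minimal sketch: the helper name detect_delimiter, the candidate delimiters, and the sample text are assumptions made for illustration, not part of the original question.

```python
import csv
import io


def detect_delimiter(sample_text, candidates=",\t;"):
    """Guess the delimiter of a CSV sample (hypothetical helper).

    Falls back to a comma if the sniffer cannot decide.
    """
    try:
        return csv.Sniffer().sniff(sample_text, delimiters=candidates).delimiter
    except csv.Error:
        return ","


# Made-up sample that only resembles the data described in the question.
sample = "id,created,word\n1,1403697600,hello\n2,1404345600,world\n"

delimiter = detect_delimiter(sample)
rows = list(csv.reader(io.StringIO(sample), delimiter=delimiter))

# With the right delimiter every row splits into several columns,
# so row[1] exists and no IndexError is raised.
print(repr(delimiter), len(rows[0]))
```

With a real file you would pass the first few kilobytes, for example f.read(4096), and then call f.seek(0) before creating the reader.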

The CSV you shared is comma-delimited. So when you write

  reader = csv.reader(f, delimiter='\t')  # returns a single column

you should change it to

reader = csv.reader(f, delimiter=',')

The full code:

import csv
from datetime import datetime, date
import sys

start_date = date(2014, 6, 22)
end_date = date(2014, 7, 22)

# Read csv data into memory filtering rows by the date in column 2 (row[1]).
csv_data = []
with open('sample_data.csv', 'r', newline='') as f:
    # The file is comma-delimited, so split on ',' rather than '\t'.
    reader = csv.reader(f, delimiter=',')
    header = next(reader)
    csv_data.append(header)
    for row in reader:
        creation_date = date.fromtimestamp(int(row[1]))
        if start_date <= creation_date <= end_date:
            csv_data.append(row)

if csv_data:  # Anything found?
    # Print the results in ascending date order.
    print(" ".join(csv_data[0]))
    # Converting the timestamp to int may not be necessary (but doesn't hurt)
    for row in sorted(csv_data[1:], key=lambda r: int(r[1])):
        print(" ".join(row))
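The filter-then-sort step can also be sketched as a small function that runs on in-memory data, which makes it easy to check the result without a file. Everything specific here is an assumption made for illustration: the function name rows_between, the three-column sample (the word sits in column 2 rather than row[8]), and the choice to interpret timestamps as UTC instead of local time so that the result does not depend on the machine's time zone.

```python
import csv
import io
from datetime import date, datetime, timezone


def rows_between(lines, start, end, date_col=1, delimiter=","):
    """Return (header, rows) for rows whose Unix timestamp in date_col
    falls within [start, end], sorted by timestamp in ascending order.

    Rows too short to contain date_col are skipped instead of raising
    IndexError. Timestamps are interpreted as UTC (an assumption).
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    selected = []
    for row in reader:
        if len(row) <= date_col:
            continue  # malformed or empty row
        created = datetime.fromtimestamp(int(row[date_col]), tz=timezone.utc).date()
        if start <= created <= end:
            selected.append(row)
    selected.sort(key=lambda r: int(r[date_col]))
    return header, selected


# Made-up sample data: id, creation timestamp, word.
sample = (
    "id,created,word\n"
    "1,1404345600,world\n"   # 2014-07-03, inside the range
    "2,1400000000,skip\n"    # 2014-05-13, before the range
    "3,1403697600,hello\n"   # 2014-06-25, inside the range
    "broken\n"               # too short, ignored
)

header, rows = rows_between(io.StringIO(sample), date(2014, 6, 22), date(2014, 7, 22))
phrase = " ".join(row[2] for row in rows)
print(phrase)
```

For the file from the question you would pass the open file object instead of io.StringIO and build the phrase from row[8].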
