When I run the following code, I get the error: ValueError: invalid literal for int() with base 10: "(1, 0, 'Friday')"

import calendar
import csv
import datetime

import matplotlib.pyplot as plt

infile_csv = 'C:/pythonscripts/NYC-2016-Summary.csv'


def read_from_csv(input_csvfile, duration=False, month=False, hour=False, day_of_week=False):
    # assign columns name
    if duration==True:
        col_name='duration'
    elif month==True:
        col_name='month'
    elif hour==True:
        col_name='hour'
    elif day_of_week==True:
        col_name='day_of_week'

    # init lists for output
    n_ridership4column = []
    n_ridership_sub = []
    n_ridership_cust = []

    with open(infile_csv, 'r') as f_in:
        filereader = csv.DictReader(f_in)
        for row in filereader:
            n_ridership4column.append(row[col_name])
            if row['user_type'] == 'Subscriber':
                n_ridership_sub.append(row[col_name])
            else:
                n_ridership_cust.append(row[col_name])

    return n_ridership4column, n_ridership_sub, n_ridership_cust


# using the function above to get monthly ridership
monthwise = list(map(int, read_from_csv(infile_csv, month=True)[0]))
monthwise_sub = list(map(int, read_from_csv(infile_csv, month=True)[1]))
monthwise_cust = list(map(int, read_from_csv(infile_csv, month=True)[2]))

fig, ax = plt.subplots()
bins = [i for i in range(1,14)]  # upper bound is 14 to accommodate bin for december

#### Plotting monthly total along with customers and subscribers stacked
ax.hist(monthwise, bins=bins, edgecolor='k', align='left', label='Total Ridership', stacked= True)
ax.hist(monthwise_sub, bins=bins, edgecolor='k', align='left', label='Subscribers', stacked=True)
ax.hist(monthwise_cust, bins=bins, edgecolor='k', align='left', label='Customer', stacked=True)

ax.set_xticks(bins[:-1])
ax.set_xticklabels(list(calendar.month_abbr[i] for i in bins[:-1]))

plt.title('Monthly Ridership in NYC', fontsize=16)
plt.xlabel('Monthly', fontsize=14)
plt.ylabel('Rides', fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.legend()
plt.show()

duration  month  hour  day_of_week  user_type
13.98333333  (1, 0, 'Friday')  (1, 0, 'Friday')  Customer
11.43333333  (1, 0, 'Friday')  (1, 0, 'Friday')  Subscriber
5.25  (1, 0, 'Friday')  (1, 0, 'Friday')  Subscriber
12.31666667  (1, 0, 'Friday')  (1, 0, 'Friday')  Subscriber
20.88333333  (1, 0, 'Friday')  (1, 0, 'Friday')  Customer
8.75  (1, 0, 'Friday')  (1, 0, 'Friday')  Subscriber
10.98333333  (1, 0, 'Friday')  (1, 0, 'Friday')  Subscriber
7.733333333  (1, 1, 'Friday')  (1, 1, 'Friday')  Subscriber
3.433333333  (1, 1, 'Friday')  (1, 1, 'Friday')  Subscriber
7.083333333  (1, 1, 'Friday')  (1, 1, 'Friday')  Customer
13.3  (1, 2, 'Friday')  (1, 2, 'Friday')  Subscriber
9.733333333  (1, 2, 'Friday')  (1, 2, 'Friday')  Subscriber
8.416666667  (1, 2, 'Friday')  (1, 2, 'Friday')  Subscriber

1 answer

User

#1 · Posted 2024-06-16 10:33:54

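The error message means you are trying to parse a non-numeric value as an integer: int() was handed the string "(1, 0, 'Friday')". When you ask Python to do something it cannot do (divide a number by zero, refer to an undeclared variable, and so on), it raises an error. The error messages are usually quite clear, though while you are still learning Python you may sometimes need to search the web for them.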
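To make the failure concrete, here is a minimal sketch that reproduces it outside the CSV code. The helper name to_int_or_none is made up for illustration and is not part of the question's code.

```python
raw = "(1, 0, 'Friday')"  # the text found in the month column of the sample


def to_int_or_none(text):
    """Return int(text), or None when text is not a plain integer literal."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for anything that is not an integer literal
        return None


print(to_int_or_none("7"))  # a plain integer string parses fine
print(to_int_or_none(raw))  # the tuple-like string does not
```

Catching the exception like this is only useful for diagnosing which values are bad; it does not recover the month hidden inside the string.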

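To some extent, the fault lies with whichever program wrote this broken pseudo-CSV, and that program should be fixed or replaced. For a CSV file to be useful, it needs to be normalized to one datum per field, though you will occasionally see this principle violated. Writing a compound field in a Python-specific format is wrong at the very least, and in this case quite possibly a bug.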
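As an illustration of "one datum per field", the following sketch writes a normalized file with the standard csv module. The records in trips are hypothetical stand-ins for whatever the upstream program holds in memory, and an in-memory buffer replaces the real output file.

```python
import csv
import io

# Hypothetical trip records, as the upstream program might hold them.
trips = [
    {'duration': 13.98333333, 'month': 1, 'hour': 0,
     'day_of_week': 'Friday', 'user_type': 'Customer'},
    {'duration': 11.43333333, 'month': 1, 'hour': 0,
     'day_of_week': 'Friday', 'user_type': 'Subscriber'},
]

fieldnames = ['duration', 'month', 'hour', 'day_of_week', 'user_type']

# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(trips)  # one value per column, no tuples

normalized_text = buffer.getvalue()
print(normalized_text)
```

With a file in this shape, the month column contains plain integers and int() works on it directly.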

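Also, your sample data has one column fewer than the sample header suggests. On the other hand, columns 2 and 3 always seem to be identical, and they appear to consist of the values one would expect in columns 2, 3, and 4 of the header (month, hour, and day_of_week).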
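A quick way to confirm this kind of mismatch is to compare the number of header names with the number of fields in a data row. The sketch below uses two lines shaped like the posted sample; the tab separator is an assumption about the real file.

```python
import csv
import io

# Two lines in the shape of the posted sample; tab separation is assumed.
sample = (
    "duration\tmonth\thour\tday_of_week\tuser_type\n"
    "13.98333333\t(1, 0, 'Friday')\t(1, 0, 'Friday')\tCustomer\n"
)

reader = csv.reader(io.StringIO(sample), delimiter='\t')
header = next(reader)
first_row = next(reader)

# Five header names, but fewer fields in the data row
print(len(header), len(first_row))

# The second and third fields carry the same compound value
same_tuple = first_row[1] == first_row[2]
print(same_tuple)
```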

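Your code is odd in that it reads the file again every time you want to extract a column. If your input file were too large to fit in memory at once, that might make some sense; but nothing in your question or in your code comments suggests such a problem, so I would recommend reading all the columns into memory once. This should also make your program considerably faster, since the file is parsed once instead of three times.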

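DictReader already collects each input row into a dict keyed by the header names (an OrderedDict on Python 3.6 and 3.7), so the append calls in your loop largely duplicate work the library has already done for you.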
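A minimal sketch of that idea, assuming a well-formed file with one value per field: read every row once with DictReader, keep the rows, and derive each column from them afterwards. The data and the helper name column are hypothetical.

```python
import csv
import io

# A small, well-formed stand-in for the real file (hypothetical data).
normalized = (
    "duration,month,hour,day_of_week,user_type\n"
    "13.98333333,1,0,Friday,Customer\n"
    "11.43333333,1,0,Friday,Subscriber\n"
    "5.25,2,9,Monday,Subscriber\n"
)

# Read every row once; each row is a dict keyed by the header names.
records = list(csv.DictReader(io.StringIO(normalized)))


def column(records, name, user_type=None):
    """Return one column as ints, optionally restricted to one user type."""
    return [int(r[name]) for r in records
            if user_type is None or r['user_type'] == user_type]


months_all = column(records, 'month')
months_sub = column(records, 'month', 'Subscriber')
months_cust = column(records, 'month', 'Customer')
print(months_all, months_sub, months_cust)
```

The same records list can then serve the hour and day_of_week plots without touching the file again.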

If you are stuck with this broken CSV, perhaps something like this will do what you need.

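import csv


def parse_broken_csv(filename):
    """Parse the malformed file into a list of dicts, one per ride."""
    rows = []
    # newline='' is what the csv module documentation recommends for files
    with open(filename, 'r', newline='') as fh:
        # Assumes the fields are separated by tabs
        reader = csv.reader(fh, delimiter='\t')
        next(reader)  # skip the header line
        # Each data row: duration, compound tuple, duplicate tuple, user type
        for duration, tpl, _, cust in reader:
            # "(1, 0, 'Friday')" -> '1', '0', "'Friday'"
            month, hour, dow = tpl.strip('()').split(', ')
            rows.append({
                'duration': float(duration),
                'month': int(month),
                'hour': int(hour),
                'day_of_week': dow.strip("'"),
                'cust': cust})
    return rows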

rows = parse_broken_csv(r'NYC-2016-Summary.csv')

monthwise = [row['month'] for row in rows]
monthwise_sub = [row['month'] for row in rows if row['cust'] == 'Subscriber']
monthwise_cust = [row['month'] for row in rows if row['cust'] == 'Customer']

For the first seven rows of the sample CSV you posted, the value of rows is

[
 {'duration': 13.98333333, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Customer'},
 {'duration': 11.43333333, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Subscriber'},
 {'duration': 5.25, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Subscriber'},
 {'duration': 12.31666667, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Subscriber'},
 {'duration': 20.88333333, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Customer'},
 {'duration': 8.75, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Subscriber'},
 {'duration': 10.98333333, 'month': 1, 'hour': 0, 'day_of_week': 'Friday', 'cust': 'Subscriber'}
]

and the value of monthwise is

[1, 1, 1, 1, 1, 1, 1]
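An alternative to stripping and splitting the compound field by hand is ast.literal_eval from the standard library, which parses a string containing a Python literal. This is only a sketch, and it assumes every compound field really has the three-element tuple shape seen in the sample.

```python
import ast

field = "(1, 0, 'Friday')"  # one compound field from the sample

# literal_eval accepts only Python literals, so it does not run arbitrary code
month, hour, day_of_week = ast.literal_eval(field)

print(month, hour, day_of_week)
```

The values come back already typed (two ints and a str), so no further int() or quote stripping is needed; a malformed field raises an exception instead of silently producing wrong values.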
