在CSV文件中追加单元格后找到五个不同类别的最小值和最大值
大家好,我有个问题。我刚开始学习最小值和最大值。
我在找每个类别的五列数据的最小值和最大值时遇到了困难。
这是我现在的情况:
我把一个包含26列数据的csv文件中的5列数据移动到了一个txt文件里。
比如说,csv文件里的数据是这样的:
state car motorcycle van airplane bike
Maine 35.5 8.1 5.7 21.0% 33.2%
Michigan 47.9 9.1 5.5 20.40% 25.2%
Washington 52.5 1.2 4.6 3.50% 24.7%
Denver 21.8 20.5 5.3 2.90% 30.9%
我想要把最小值和最大值整理成这样:
min max
car Denver: 21.8 Washington: 52.5
motor Washington: 1.2 Denver: 20.5
van Washington 4.6 Maine: 5.7
airplane Denver 2.90% Maine 21.0%
bike Washington 24.7% Maine 33.2% -
这是我现在的结果:
import csv
import string, re
import operator
output = []
data = []
csv_string = []
data_file = []
try:
with open('data.csv', 'r') as csv_string:
for line in csv_string:
cells = line.split(",")
output.append((cells[0], cells[1], cells[5], cells[7], cells[11], cells[13]))
for lines in output:
#state = cells[0]
zmin = cells[1] #car = cells[1]
ymin = cells[1]
xmin = cells[5] #motor = cells[5]
wmin = cells[5]
vmin = cells[7] #van = cells[7]
zmax = cells[7]
ymax = cells[11] #airplane = cells[11]
xmax = cells[11]
wmax = cells[13] #bike = cells[13]
vmax = cells[13]
if cells[1] < xmin:
zmin = cells[1]
if cells[1] > xmax:
zmax = cells[1]
if cells[5] < ymin:
ymin = cells[5]
if cells[5] > ymax:
ymax = cells[5]
if cells[7] < zmin:
xmin = cells[7]
if cells[7] > zmax:
xmax = cells[7]
if cells[11] < zmin:
wmin = cells[11]
if cells[11] > zmax:
wmax = cells[11]
if cells[13] < zmin:
vmin = cells[13]
if cells[13] > zmax:
vmax = cells[13]
outstring = ' '
for item in output:
for cell in item:
outstring += "{0:<35}".format(cell) #Width/Distance of each row
outstring += "\n"
print(outstring)
print('Min: ',zmin,ymin,xmin,wmin,vmin)
print('Max: ',state,zmax,ymax,xmax,wmax,vmax)
finally:
f.close()
try:
f_write = open('output.txt', 'w') #creates the file
try:
f_write.writelines(outstring)
finally:
f.close()
我不太确定我哪里做错了。我一直在阅读关于最小值和最大值的内容,但我不明白在处理csv文件并添加5列数据时,这些知识是怎么应用的。
如果有人能给我一些指导,非常感谢你的帮助。
程序输出的数字是错误的。
print('Min: ',zmin,ymin,xmin,wmin,vmin)
47.9, 8.1, 5.5, 20.40%, 25.2%
print('Max: ',state,zmax,ymax,xmax,wmax,vmax)
21.8, 9.1, 4.6, 20.40%, 30.9%
2 个回答
4
使用pandas
这个库,它专门用来处理数据,这样的任务就简单多了:
import pandas as pd
c = lambda x: float(x.strip('%'))
df = pd.read_csv(f,sep='\s+', converters = {'bike':c, 'airplane':c})
vehicles = df.columns[1:] #['car', 'motorcycle', 'van', 'airplane', 'bike']
max_v = zip(df['state'][df[vehicles].idxmax().values],
df[vehicles].max().values.astype('|S4'))
min_v = zip(df['state'][df[vehicles].idxmin().values],
df[vehicles].min().values.astype('|S4'))
max_i = [': '.join(tup) for tup in max_v]
min_i = [': '.join(tup) for tup in min_v]
print pd.DataFrame({'min':min_i, 'max':max_i}, index=vehicles)
输出:
max min
car Washington: 52.5 Denver: 21.8
motorcycle Denver: 20.5 Washington: 1.2
van Maine: 5.7 Washington: 4.6
airplane Maine: 21.0 Denver: 2.9
bike Maine: 33.2 Washington: 24.7
2
你可以通过使用Python自带的csv
模块来完成很多需要的操作。下面是如何找到数据中某些字段(或者说列)的最小值和最大值的方法。示例中的data.csv
文件只包含了我们关心的字段,但实际上它可以包含所有26列的数据,而代码只会处理FIELDS
列表中列出的字段。
import csv
ID = 'state'
FIELDS = ['car', 'motorcycle', 'van', 'airplane', 'bike']
MIN_ID, MIN, MAX_ID, MAX = 0, 1, 2, 3 # indices of data in min_maxes records
with open('data.csv', 'rb') as csv_file:
csv_dict_reader = csv.DictReader(csv_file, delimiter=',')
# initialize min and max values from first row of csv file
row = csv_dict_reader.next()
min_maxes = {field: [row[ID], float(row[field])]*2 for field in FIELDS}
# update min and max values with data from remaining rows of csv file
for row in csv_dict_reader:
for id, value, min_max_rec in (
(row[ID], float(row[field]), min_maxes[field]) for field in FIELDS):
if value < min_max_rec[MIN]:
min_max_rec[MIN_ID] = id
min_max_rec[MIN] = value
if value > min_max_rec[MAX]:
min_max_rec[MAX_ID] = id
min_max_rec[MAX] = value
print ' min max'
for field in FIELDS:
min_max_rec = min_maxes[field]
print '{:10} {:12}{:4.1f} {:12}{:4.1f}'.format(field,
min_max_rec[MIN_ID]+':', min_max_rec[MIN],
min_max_rec[MAX_ID]+':', min_max_rec[MAX])
输入(简化版的data.csv
文件):
state,car,motorcycle,van,airplane,bike
Maine,35.5,8.1,5.7,21.0,33.2
Michigan,47.9,9.1,5.5,20.40,25.2
Washington,52.5,1.2,4.6,3.,24.7
Denver,21.8,20.5,5.3,2.90,30.9
输出:
min max
car Denver: 21.8 Washington: 52.5
motorcycle Washington: 1.2 Denver: 20.5
van Washington: 4.6 Maine: 5.7
airplane Denver: 2.9 Maine: 21.0
bike Washington: 24.7 Maine: 33.2