使用Python pandas向特定CSV行添加列
我想通过用一个给定的字典来匹配ID,来合并CSV文件中的行。
我有一个字典: l= {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}
我有一个CSV文件。
A B C D
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572
2000-01-04 0.83935 0.15993 0.95911 -1.12959
2000-01-05 2.80215 -0.10858 -1.62114 -0.20170
2000-01-06 0.71670 -0.26707 1.36029 1.74254
2000-01-07 -0.45749 0.22750 0.46291 -0.58431
2000-01-10 -0.78702 0.44006 -0.36881 -0.13884
2000-01-11 0.79577 -0.09198 0.14119 0.02668
2000-01-12 -0.32297 0.62332 1.93595 0.78024
2000-01-13 1.74683 -1.57738 -0.02134 0.11596
输出应该是:
A B C D E F
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572 0
2000-01-04 0.83935 0.15993 0.95911 -1.12959 0
2000-01-05 2.80215 -0.10858 -1.62114 -0.20170 376570
2000-01-06 0.71670 -0.26707 1.36029 1.74254 0
2000-01-07 -0.45749 0.22750 0.46291 -0.58431 0
2000-01-10 -0.78702 0.44006 -0.36881 -0.13884 0
2000-01-11 0.79577 -0.09198 0.14119 0.02668 378053
2000-01-12 -0.32297 0.62332 1.93595 0.78024 0
2000-01-13 1.74683 -1.57738 -0.02134 0.11596 0
我尝试这样做:
import pandas
data = panda.read_csv("thisfiel.csv")
data["F"] = data["B"].apply(lambda x: l[x])
但是我没有得到想要的结果。
2 个回答
2
如果你是一个数据框(DataFrame),你可以进行一个叫做合并(merge)的操作:
In [11]: l_df = pd.DataFrame.from_dict(l, orient='index')
In [12]: l_df.columns = ['F']
In [13]: l_df
Out[13]:
F
2.80215 376570
0.79577 378053
22667183.00000 269499
在A列和l_df
的索引上进行合并:
In [14]: merged = df.merge(l_df, left_on='A', right_index=True, how='left')
In [15]: merged
Out[15]:
A B C D F
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572 NaN
2000-01-04 0.83935 0.15993 0.95911 -1.12959 NaN
2000-01-05 2.80215 -0.10858 -1.62114 -0.20170 376570
2000-01-06 0.71670 -0.26707 1.36029 1.74254 NaN
2000-01-07 -0.45749 0.22750 0.46291 -0.58431 NaN
2000-01-10 -0.78702 0.44006 -0.36881 -0.13884 NaN
2000-01-11 0.79577 -0.09198 0.14119 0.02668 378053
2000-01-12 -0.32297 0.62332 1.93595 0.78024 NaN
2000-01-13 1.74683 -1.57738 -0.02134 0.11596 NaN
注意:这里的NaN表示缺失值,你可以用fillna来填补这些缺失值:
In [16]: merged['F'].fillna(0, inplace=True)
1
这样做:
def getVal(x):
try:
return l[x][0]
except KeyError:
return 0
data['F'] = data['B'].map(getVal)