Pandas: MemoryError when merging DataFrames

Posted 2024-04-25 21:17:49

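I am merging two large DataFrames and getting a MemoryError. I would like to know whether the error comes from poor coding or from the DataFrames being too large. The DataFrames are dfy: 34.7 MB and df: 2.2 MB.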

import pandas as pd

dfy = pd.read_csv('Thesis/CRSP/CampaignFin14/pacs14.txt', header=None,
                  names=['cycle', '2', '3', 'cid', 'amount', 'date', 'catcode', 'type', 'di', 'feccandid'],
                  usecols=['cycle', 'cid', 'amount', 'date', 'catcode', 'type', 'di', 'feccandid'])

dfy.head()



   cycle        cid  amount        date  catcode  type  di  feccandid
0   2014  N00029285    1000  05/15/2014    E1600   24K   D  H8TX22107
1   2014  N00026722    5000  10/22/2013    G4600   24K   D  H4TX28046
2   2014  N00030676       4  03/26/2014    C2100   24Z   D  H0MO07113
3   2014  N00032088    1000  05/06/2014    F1100   24K   D  H0OH06189

df = pd.read_csv('Thesis/MapLight_data/mpl_data114.csv', header=None,
                 names=['session', 'prefix', 'number', 'organization_id', 'name', 'disposition', 'catcode'],
                 usecols=['session', 'prefix', 'number', 'disposition', 'catcode'])

df.head()

   session  prefix  number  disposition  catcode
0      114       H     131      support    J6200
1      114       H     138       oppose    L1100
2      114       H     140      support      NaN

df_merge = pd.merge(dfy, df, on='catcode')
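
One thing worth checking before blaming the file sizes: catcode looks like a category code that may repeat many times in both frames, and an inner merge on a non-unique key emits every left/right pairing for each shared value. The sketch below estimates how many rows such a merge would produce without building the result. The helper name estimate_merge_rows and the toy frames are made up for illustration and are not part of the original post; whether catcode really repeats heavily in the real files is an assumption that this check would confirm or rule out.

```python
import numpy as np
import pandas as pd


def estimate_merge_rows(left, right, key):
    """Estimate the row count of an inner merge of left and right on key.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Count occurrences of each non-null key value on both sides.
    left_counts = left[key].value_counts()
    right_counts = right[key].value_counts()
    # Each shared value contributes left_count * right_count rows.
    # Values present on one side only are filled with 0 and contribute nothing.
    matched = left_counts.mul(right_counts, fill_value=0).sum()
    # pandas matches null keys with each other in a merge, so rows with a
    # missing key on both sides also multiply.
    null_rows = int(left[key].isna().sum()) * int(right[key].isna().sum())
    return int(matched) + null_rows


# Toy data standing in for dfy and df (assumed shapes, not the real files).
toy_left = pd.DataFrame({"catcode": ["A", "A", "B", np.nan]})
toy_right = pd.DataFrame({"catcode": ["A", "A", "A", "C", np.nan, np.nan]})

print("estimated rows:", estimate_merge_rows(toy_left, toy_right, "catcode"))
print("actual rows:   ", len(pd.merge(toy_left, toy_right, on="catcode")))
```

Calling estimate_merge_rows(dfy, df, 'catcode') on the real frames would show whether the result is far larger than either input. If it is, the size of the merge output rather than the input files is the likelier cause of the MemoryError, and dropping rows with a missing catcode or aggregating one side per catcode before merging may help.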
