按条件拆分数据帧

2024-05-29 11:23:38 发布

您现在位置:Python中文网/ 问答频道 /正文

我有以下数据帧:

               0       1       2       3              4
0          1.JPG     NaN     NaN     NaN            NaN
1           2883  2957.0  3412.0  3340.0  miscellaneous
2           3517  3007.0  4062.0  3371.0  miscellaneous
3           5678  3158.0  6299.0  3423.0  miscellaneous
4           1627  3287.0  2149.0  3694.0  miscellaneous
5           2894  3272.0  3421.0  3664.0  miscellaneous
6           3525  3271.0  4064.0  3672.0  miscellaneous
7           4759  3337.0  5321.0  3640.0  miscellaneous
8           6141  3289.0  6664.0  3654.0  miscellaneous
9           1017  3598.0  1539.0  3979.0  miscellaneous
10          1624  3586.0  2155.0  3993.0  miscellaneous
11          2252  3612.0  2777.0  3967.0  miscellaneous
12          3211  3548.0  3735.0  3944.0  miscellaneous
13          6052  3616.0  6572.0  3983.0  miscellaneous
14           691  3911.0  1204.0  4223.0  miscellaneous
15         2.JPG     NaN     NaN     NaN            NaN
16         3.JPG     NaN     NaN     NaN            NaN
17          5384  2841.0  5963.0  3095.0  miscellaneous
18          5985  2797.0  6611.0  3080.0  miscellaneous
19          3512  3012.0  4025.0  3366.0  miscellaneous
20          5085  2974.0  5587.0  3367.0  miscellaneous
21          2593  3224.0  3148.0  3469.0  miscellaneous
22          1044  3630.0  1511.0  3928.0  miscellaneous
23          4764  3619.0  5283.0  3971.0  miscellaneous
24          5103  3613.0  5635.0  3928.0  miscellaneous

我想把这个数据帧分成多个csv,这样:第一个csv应该命名为1.csv,并且所有数据都低于1.jpg,依此类推。 例如,导出的CSV应该是:

1.csv文件

^{pr2}$

2.csv(此csv应为空)

3.csv公司

5384    2841    5963    3095    miscellaneous
5985    2797    6611    3080    miscellaneous
3512    3012    4025    3366    miscellaneous
5085    2974    5587    3367    miscellaneous
2593    3224    3148    3469    miscellaneous
1044    3630    1511    3928    miscellaneous
4764    3619    5283    3971    miscellaneous
5103    3613    5635    3928    miscellaneous

如何使用python和pandas来实现这一点?在


Tags: 文件csv数据pandas公司nan命名miscellaneous
2条回答
# program splits one big csv file into individiual image csv 's 
import pandas as pd
import numpy as np
df = pd.read_csv('results.csv', header=None)
#df1 = df.replace(np.nan, '1', regex=True)

print(df)

for n,g in df.assign(grouper = df[0].where(df[1].isnull())
                                      .ffill().astype('category'))\
             .dropna().groupby('grouper'):
    g.drop('grouper', axis=1).to_csv(n+'.csv',float_format="%.0f", header=None, index=False )

这将产生所需的结果

您可以使用:

for n,g in df.assign(grouper = df['0'].where(df['1'].isnull())
                                      .ffill().astype('category'))\
             .dropna().groupby('grouper'):
    g.drop('grouper', axis=1).to_csv(n+'.csv', header=None, index=False)

注意:使用astype('category')拾取没有记录的组

输出!dir *.JPG.csv

^{pr2}$

列出1的内容。日文.csv在

2883,2957.0,3412.0,3340.0,miscellaneous
3517,3007.0,4062.0,3371.0,miscellaneous
5678,3158.0,6299.0,3423.0,miscellaneous
1627,3287.0,2149.0,3694.0,miscellaneous
2894,3272.0,3421.0,3664.0,miscellaneous
3525,3271.0,4064.0,3672.0,miscellaneous
4759,3337.0,5321.0,3640.0,miscellaneous
6141,3289.0,6664.0,3654.0,miscellaneous
1017,3598.0,1539.0,3979.0,miscellaneous
1624,3586.0,2155.0,3993.0,miscellaneous
2252,3612.0,2777.0,3967.0,miscellaneous
3211,3548.0,3735.0,3944.0,miscellaneous
6052,3616.0,6572.0,3983.0,miscellaneous
691,3911.0,1204.0,4223.0,miscellaneous

相关问题 更多 >

    热门问题