How to convert a Series to a DataFrame in Pandas

Posted 2024-05-26 07:47:27


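I have two CSV files that I need to compare on a single column. Matching rows should go into one CSV and non-matching rows into another. So I set that column as the index of the second CSV and looped over the first.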

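import pandas as pd

df1 = pd.read_csv(file1, nrows=100)
df2 = pd.read_csv(file2, nrows=100)
# Index the second file by 'crc' so lookups by key are fast
df2.set_index('crc', inplace=True)
matched_list = []
non_matched_list = []
for _, row in df1.iterrows():
    try:
        # Look up the row of df2 whose index equals this row's 'crc'
        x = df2.loc[row['crc']]
        matched_list.append(x)
    except KeyError:
        # No such key in df2: keep the df1 row as unmatched
        non_matched_list.append(row)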

Here x is a Series in the following format:

[Series output not preserved in the source]

My output CSV should be in the following format

policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
114455,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1

for all matched and unmatched Series. How can I do this? I cannot drop the index on the second CSV, because performance matters.

Here are the contents of the two CSV files. File 1:

policyID,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
114455,FL,CLAY COUNTY,589658,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
206893,FL,CLAY COUNTY,745689.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,Residential,Wood,1
333743,FL,CLAY COUNTY,0,12563.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,Residential,Wood,3
172534,FL,CLAY COUNTY,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,Residential,Wood,1
785275,FL,CLAY COUNTY,0,515035.62,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,Residential,Masonry,3
995932,FL,CLAY COUNTY,0,19260000,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,Commercial,Reinforced Concrete,1
223488,FL,CLAY COUNTY,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,Residential,Wood,1
433512,FL,CLAY COUNTY,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,Residential,Wood,1
142071,FL,CLAY COUNTY,705600,705600,705600,705600,705600,1010842.56,14112,35280,0,0,30.100628,-81.703751,Residential,Masonry,1

File 2:

policyID,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
119736,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
206893,FL,CLAY COUNTY,190724.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,Residential,Wood,1
333743,FL,CLAY COUNTY,0,79520.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,Residential,Wood,3
172534,FL,CLAY COUNTY,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,Residential,Wood,1
785275,FL,CLAY COUNTY,0,51564.9,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,Residential,Masonry,3
995932,FL,CLAY COUNTY,0,457962,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,Commercial,Reinforced Concrete,1
223488,FL,CLAY COUNTY,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,Residential,Wood,1
433512,FL,CLAY COUNTY,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,Residential,Wood,1
142071,FL,CLAY COUNTY,705600,705600,705600,705600,705600,1010842.56,14112,35280,0,0,30.100628,-81.703751,Residential,Masonry,1
253816,FL,CLAY COUNTY,831498.3,831498.3,831498.3,831498.3,831498.3,1117791.48,0,0,0,0,30.10216,-81.719444,Residential,Masonry,1
894922,FL,CLAY COUNTY,0,24059.09,0,0,24059.09,33952.19,0,0,0,0,30.095957,-81.695099,Residential,Wood,1

Edit: added the sample CSVs


1 answer
User
Answer 1 · posted 2024-05-26 07:47:27

I think you can do it this way:

df1.loc[df1.crc.isin(df2.index)].to_csv('/path/to/matched.csv', index=False)
df1.loc[~df1.crc.isin(df2.index)].to_csv('/path/to/unmatched.csv', index=False)

instead of looping...
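The two lines above can be tried end to end with a small self-contained sketch. It is only an illustration: the in-memory `file1` and `file2` buffers are hypothetical stand-ins holding a few rows and four columns taken from the sample files, and the results are kept as DataFrames and CSV strings rather than written to a path.

```python
import io
import pandas as pd

# Hypothetical, abridged stand-ins for file 1 and file 2 (four columns only).
file1 = io.StringIO(
    "policyID,statecode,crc,line\n"
    "114455,FL,589658,Residential\n"
    "448094,FL,1322376.3,Residential\n"
    "206893,FL,745689.4,Residential\n"
    "333743,FL,0,Residential\n"
)
file2 = io.StringIO(
    "policyID,statecode,crc,line\n"
    "119736,FL,498960,Residential\n"
    "448094,FL,1322376.3,Residential\n"
    "206893,FL,190724.4,Residential\n"
    "333743,FL,0,Residential\n"
)

df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2).set_index("crc")  # keep the index on 'crc'

# Boolean mask: True where df1's crc value exists in df2's index.
mask = df1["crc"].isin(df2.index)

matched = df1.loc[mask]     # policyID 448094 and 333743
unmatched = df1.loc[~mask]  # policyID 114455 and 206893

# to_csv() without a path returns the CSV text instead of writing a file.
matched_csv = matched.to_csv(index=False)
unmatched_csv = unmatched.to_csv(index=False)
print(matched_csv)
print(unmatched_csv)
```

Computing the mask once and reusing it with `~mask` avoids running `isin` twice; passing a real path to `to_csv` writes the files as in the answer.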

Demo:

[demo code not preserved in the source]

Result:

In matched.csv:

policyID,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0.0,0,0,30.063935999999998,-81.70766400000001,Residential,Masonry,3
333743,FL,CLAY COUNTY,0.0,12563.76,0.0,0.0,79520.76,86854.48,0,0.0,0,0,30.063236,-81.70770300000001,Residential,Wood,3
172534,FL,CLAY COUNTY,0.0,254281.5,0.0,254281.5,254281.5,246144.49,0,0.0,0,0,30.060614,-81.702675,Residential,Wood,1
785275,FL,CLAY COUNTY,0.0,515035.62,0.0,0.0,515035.62,884419.17,0,0.0,0,0,30.063236,-81.70770300000001,Residential,Masonry,3
995932,FL,CLAY COUNTY,0.0,19260000.0,0.0,0.0,19260000.0,20610000.0,0,0.0,0,0,30.102226,-81.713882,Commercial,Reinforced Concrete,1
223488,FL,CLAY COUNTY,328500.0,328500.0,328500.0,328500.0,328500.0,348374.25,0,16425.0,0,0,30.102217,-81.707146,Residential,Wood,1
433512,FL,CLAY COUNTY,315000.0,315000.0,315000.0,315000.0,315000.0,265821.57,0,15750.0,0,0,30.118774,-81.704613,Residential,Wood,1
142071,FL,CLAY COUNTY,705600.0,705600.0,705600.0,705600.0,705600.0,1010842.56,14112,35280.0,0,0,30.100628000000004,-81.703751,Residential,Masonry,1

In unmatched.csv:

policyID,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
114455,FL,CLAY COUNTY,589658.0,498960.0,498960.0,498960.0,498960.0,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
206893,FL,CLAY COUNTY,745689.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0.0,0,0,30.089578999999997,-81.700455,Residential,Wood,1
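If the loop from the question is kept, the collected Series can still be turned into DataFrames, which is what the title asks about. The sketch below is one possible way, not necessarily what the answerer had in mind. It assumes each `crc` value is unique in file 2: for a duplicated key, `df2.loc[key]` returns a DataFrame rather than a Series, and the sample file 2 does repeat the value 0. The in-memory `file1` and `file2` buffers are hypothetical, abridged stand-ins for the real files.

```python
import io
import pandas as pd

# Hypothetical, abridged stand-ins for file 1 and file 2 (four columns only).
file1 = io.StringIO(
    "policyID,statecode,crc,line\n"
    "114455,FL,589658,Residential\n"
    "448094,FL,1322376.3,Residential\n"
)
file2 = io.StringIO(
    "policyID,statecode,crc,line\n"
    "119736,FL,498960,Residential\n"
    "448094,FL,1322376.3,Residential\n"
)

df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2).set_index("crc")

matched_list, non_matched_list = [], []
for _, row in df1.iterrows():
    try:
        # With unique keys this is a Series whose name is the crc key.
        matched_list.append(df2.loc[row["crc"]])
    except KeyError:
        non_matched_list.append(row)

# A list of Series becomes a DataFrame with one row per Series;
# the Series names end up as the row index.
matched = (
    pd.DataFrame(matched_list)
    .rename_axis("crc")             # the index holds the crc keys
    .reset_index()                  # turn it back into a regular column
    .reindex(columns=df1.columns)   # restore the original column order
)
non_matched = pd.DataFrame(non_matched_list)

# to_csv() without a path returns the CSV text instead of writing a file.
matched_csv = matched.to_csv(index=False)
non_matched_csv = non_matched.to_csv(index=False)
```

For a single Series `x`, `x.to_frame().T` gives the equivalent one-row DataFrame. Here the matched rows come from file 2 and the unmatched rows from file 1, as in the question's loop, whereas the `isin` approach takes both from file 1.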
