Copy certain rows from an existing pandas DataFrame into a new one

Posted 2024-05-16 22:11:48


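Only rows whose 'CITY' value starts with 'BH' should be copied. The index of the copied DataFrame should be the same as the original index. For example: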

              STATE            CITY
315           KA               BLR
423           WB               CCU
554           KA               BHU
557           TN               BHY

# state_df is the new DataFrame, df is the existing one
state_df = pd.DataFrame(columns=['STATE', 'CITY'])
for index, row in df.iterrows():
    city = row['CITY']

    if city.startswith('BH'):
        # pseudocode: append this row from df to state_df
        pass

As a newcomer to pandas and Python, I need help turning the pseudocode into the most efficient working solution.


3 Answers

A solution using str.startswith and boolean indexing:

print (df['CITY'].str.startswith('BH'))
315    False
423    False
554     True
557     True

state_df = df[df['CITY'].str.startswith('BH')]
print (state_df)
    STATE CITY
554    KA  BHU
557    TN  BHY
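
The filter above can be tried end to end with the sample data from the question. This is only a minimal sketch: the DataFrame is rebuilt by hand, and the na=False argument is an addition that is not in the original answer, assumed here so that missing CITY values count as non-matching.

```python
import pandas as pd

# Sample data from the question, with the original index labels
df = pd.DataFrame(
    {"STATE": ["KA", "WB", "KA", "TN"], "CITY": ["BLR", "CCU", "BHU", "BHY"]},
    index=[315, 423, 554, 557],
)

# Boolean mask: True where CITY starts with "BH".
# na=False is an assumption (not in the original answer): it maps
# missing values to False so the mask can always be used for indexing.
mask = df["CITY"].str.startswith("BH", na=False)

# Boolean indexing keeps the matching rows together with their index labels
state_df = df[mask]
print(state_df)
```

Because the mask is aligned on the index of df, the selected rows keep the labels 554 and 557 without any extra work.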

If you only need to copy some of the columns, add loc:

state_df = df.loc[df['CITY'].str.startswith('BH'), ['STATE']]
print (state_df)
    STATE
554    KA
557    TN

Timings

#len (df) = 400k
df = pd.concat([df]*100000).reset_index(drop=True)


In [111]: %timeit (df.CITY.str.startswith('BH'))
10 loops, best of 3: 151 ms per loop

In [112]: %timeit (df.CITY.str.contains('^BH'))
1 loop, best of 3: 254 ms per loop
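
Outside IPython, a similar comparison could be run with the standard timeit module. The sketch below is an assumption-laden stand-in for the session above: it uses a much smaller frame (4,000 rows rather than 400k) so that it finishes quickly, and the helper names with_startswith and with_contains are made up for illustration. Absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Sample data from the question
base = pd.DataFrame(
    {"STATE": ["KA", "WB", "KA", "TN"], "CITY": ["BLR", "CCU", "BHU", "BHY"]},
    index=[315, 423, 554, 557],
)

# Repeat the rows to get a larger frame; 1000 repeats is an arbitrary
# choice for this sketch, far smaller than the 400k rows timed above
big = pd.concat([base] * 1000).reset_index(drop=True)


def with_startswith():
    # Plain prefix test
    return big["CITY"].str.startswith("BH")


def with_contains():
    # Regular expression anchored at the start of the string
    return big["CITY"].str.contains("^BH")


# Both approaches should produce the same boolean mask
same = with_startswith().equals(with_contains())

# Total time in seconds for 10 calls of each function
t_startswith = timeit.timeit(with_startswith, number=10)
t_contains = timeit.timeit(with_contains, number=10)
print(f"same mask: {same}")
print(f"startswith: {t_startswith:.4f}s, contains: {t_contains:.4f}s")
```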

I removed the for loop and ended up writing: state_df = df.loc[df['CTYNAME'].str.startswith('Washington'), cols_to_copy]

A for loop is probably slower, but that would need to be measured.
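
Before comparing speed, it may help to confirm that a loop and the vectorized filter select the same rows. The following is a rough sketch using the sample data from the question; the loop shown is one possible way to write it, not necessarily what the asker had in mind, and the names matching, loop_df and vec_df are illustrative.

```python
import pandas as pd

# Sample data from the question, with the original index labels
df = pd.DataFrame(
    {"STATE": ["KA", "WB", "KA", "TN"], "CITY": ["BLR", "CCU", "BHU", "BHY"]},
    index=[315, 423, 554, 557],
)

# Loop version: collect the index labels of matching rows,
# then select all of them from df in a single step
matching = []
for index, row in df.iterrows():
    if row["CITY"].startswith("BH"):
        matching.append(index)
loop_df = df.loc[matching]

# Vectorized version, as in the answers
vec_df = df[df["CITY"].str.startswith("BH")]

# True when both frames hold the same values, labels and dtypes
print(loop_df.equals(vec_df))
```

Collecting labels and selecting once avoids growing a DataFrame row by row inside the loop, which is usually the slowest part of the loop-based approach.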

Try this:

In [4]: new = df[df['CITY'].str.contains(r'^BH')].copy()

In [5]: new
Out[5]:
    STATE CITY
554    KA  BHU
557    TN  BHY

What if I need to copy only some columns of the row rather than the entire row?

cols_to_copy = ['STATE']
new = df.loc[df.CITY.str.contains(r'^BH'), cols_to_copy].copy()

In [7]: new
Out[7]:
    STATE
554    KA
557    TN
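
The answer does not say why .copy() is called. One common reason, stated here as an assumption about the intent, is to make the new frame independent of df, so that later assignments to it do not touch the original and do not raise pandas' SettingWithCopyWarning. A small sketch with the sample data (the value "XX" is arbitrary):

```python
import pandas as pd

# Sample data from the question, with the original index labels
df = pd.DataFrame(
    {"STATE": ["KA", "WB", "KA", "TN"], "CITY": ["BLR", "CCU", "BHU", "BHY"]},
    index=[315, 423, 554, 557],
)

cols_to_copy = ["STATE"]
new = df.loc[df["CITY"].str.contains(r"^BH"), cols_to_copy].copy()

# Overwrite a column in the copy; "XX" is an arbitrary placeholder value
new["STATE"] = "XX"

# The original DataFrame still holds its own values
print(df.loc[554, "STATE"])
print(new)
```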
