Comparing multiple columns with NumPy and pandas

Posted 2024-04-29 04:31:42

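Hello, my goal is to find which line (Nom_ci) is the right one, but I can't find the right way to do it. I tried a chain of if/elif statements, but it takes a huge amount of time.

Can you help me find the best way to do this?

Thanks in advance.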

import pandas as pd
import numpy as np
import re



cycling = pd.DataFrame(
    {
        'Comp_ci': [1, 2, 3, 3, 3, 3, 3, 2, 1, 1], 
        'Nom_ci': ['RONCQ_A2_OPTI_SRV_S3', 
                 'RONCQ_A3_SRV_S3, RONCQ_A2_OPTI_SRV_S3', 
                 'RONCQ_A2_TEMP_SRV_S3, RONCQ_A3_SRV_S3, RONCQ_A2_OPTI_SRV_S3', 
                 'RONCQ_A2_SRV_PC_S3, RONCQ_A2_TEMP_SRV_S3, RONCQ_A3_SRV_S3', 
                 'RONCQ_A2_PC_SRV_S3, RONCQ_A2_SRV_S3, RONCQ_A2_TEMP_SRV_S3', 
                 'RONCQ_A2_OPTI_SRV_S3, RONCQ_A2_PC_SRV_S3, RONCQ_A2_SRV_S3', 
                 'RONCQ_A3_SRV_S3, RONCQ_A2_OPTI_SRV_S3, RONCQ_A2_PC_SRV_S3', 
                 'RONCQ_A2_TEMP_SRV_S3, RONCQ_A3_SRV_S3', 
                 'RONCQ_A2_SRV_S3',
                 'RONCQ_A2_PC_SRV_S3'],
        'result hope':['autre','RONCQ_A3_VSR_S3','RONCQ_A3_VSR_S3','RONCQ_A3_VSR_S3','RONCQ_A2_VSR_S3','RONCQ_A2_VSR_S3','RONCQ_A3_VSR_S3','RONCQ_A3_VSR_S3','RONCQ_A2_VSR_S3','autre']
    }
)
print(cycling)

# Attempt: keep Nom_ci only where Comp_ci == 1 and the name contains none of the markers.
# The column names are aligned with the sample DataFrame above (Comp_ci, Nom_ci).
condition = ((cycling['Comp_ci'] == 1) &
             ~cycling['Nom_ci'].str.contains('_OPTI') &
             ~cycling['Nom_ci'].str.contains('_TEMP') &
             ~cycling['Nom_ci'].str.contains('_PC'))


cycling['col3'] = np.where(condition, cycling['Nom_ci'], 'autre')
print(cycling)

1 answer
User
#1 · Posted 2024-04-29 04:31:42

Edit: OK, I think I've understood what you're trying to achieve. Is this it?

temp = cycling['Nom_ci'].str.split(', +')  # split on a comma followed by one or more spaces (regex)
print(temp)
print('-'*50)

temp = temp.explode()  # expand the lists into one Series (note that the original index is kept)
print(temp)
print('-'*50)

temp = temp.to_frame()  # convert the Series to a DataFrame
print(type(temp))
print('-'*50)

temp['match'] = ~temp['Nom_ci'].str.contains('_TEMP|_PC|_OPTI')  # boolean Series (regex): True where none of the patterns occur
print(temp)
print('-'*50)

temp = temp[temp['match']]  # keep the rows that meet the criteria (the index is still unchanged)
print(temp)
print('-'*50)

temp = temp.rename(columns={'Nom_ci': 'col3'})  # rename the column to whatever you want
print(temp)
print('-'*50)

temp = temp.drop(columns='match')  # drop the "match" column, which is no longer needed
print(temp)
print('-'*50)

cycling = cycling.join(temp)  # join the two DataFrames on their index
print(cycling)
print('-'*50)

cycling['col3'] = cycling['col3'].fillna('autre')  # fill the NaN values with 'autre'
print(cycling)
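
The same steps can also be condensed into a single helper. This is only a sketch, not a definitive implementation: the function name `pick_circuit` and its parameters `excluded` and `default` are made up for illustration. It also assumes that, when several names in a row pass the filter, the first one should win (in the sample data each row has at most one).

```python
import pandas as pd


def pick_circuit(names, excluded="_TEMP|_PC|_OPTI", default="autre"):
    """Hypothetical helper: for each comma-separated entry, return the first
    name that contains none of the excluded markers, else `default`."""
    # One row per individual name; the original index is repeated
    parts = names.str.split(r",\s*").explode()
    # Drop every name that contains one of the excluded markers
    kept = parts[~parts.str.contains(excluded)]
    # Assumption: if several names survive for one row, take the first
    first = kept.groupby(level=0).first()
    # Rows with no surviving name come back as NaN and get the default
    return first.reindex(names.index).fillna(default)


cycling = pd.DataFrame(
    {
        "Nom_ci": [
            "RONCQ_A2_OPTI_SRV_S3",
            "RONCQ_A3_SRV_S3, RONCQ_A2_OPTI_SRV_S3",
            "RONCQ_A2_PC_SRV_S3, RONCQ_A2_SRV_S3, RONCQ_A2_TEMP_SRV_S3",
            "RONCQ_A2_PC_SRV_S3",
        ]
    }
)
cycling["col3"] = pick_circuit(cycling["Nom_ci"])
print(cycling)
```

Taking the first match per index label before reindexing keeps exactly one value per original row, so the result can be assigned straight back as a column without the join step.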
