Updating rows in a pandas DataFrame with loc does not work correctly

Posted 2024-06-06 22:56:39


I have a DataFrame named output:

RAW_ENTITY_NAME   ENTITY_TYPE       ENTITY_NAME        IS_MAIN
01-03-2017        TNRMATDT          01 03 2017         1
04-02-2017        TNRSTRTDT         04 02 2017         1
documents         TNRTYPE           SIGHT              1
documents         TNRDOCSBY         NOT FOUND          1
accept            TNRDTL            accept             1 
23                TNRDAYS           23                 1

Output of printing the DataFrame's dtypes: (not preserved in the source)

Note: the combination ENTITY_TYPE = TNRTYPE, ENTITY_NAME = SIGHT and IS_MAIN = 1 occurs only once in the DataFrame.

If ENTITY_TYPE is TNRTYPE, ENTITY_NAME = SIGHT, and IS_MAIN = 1, I want to update some values.

temp = output.loc[(output['IS_MAIN'] == 1) & (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']
temp = temp.reset_index(drop=True)
temp = temp[0]
if (temp == 'SIGHT'):
   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'] == 'TNRDOCSBY'), 'ENTITY_NAME'] = 'PAYMENT'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDTL'])),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
                                   ['ENTITY_NAME']] = '0'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
                                   ['RAW_ENTITY_NAME']] = ''

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

The final output is:

RAW_ENTITY_NAME   ENTITY_TYPE       ENTITY_NAME        IS_MAIN
01-03-2017        TNRMATDT          01 03 2017         1
04-02-2017        TNRSTRTDT         04 02 2017         1
documents         TNRTYPE           SIGHT              1
documents         TNRDOCSBY         PAYMENT            1
NOT APPLICABLE    TNRDTL            NOT APPLICABLE     1
                  TNRDAYS           0                  1

As you can see, everything is updated except the first two rows, i.e. ENTITY_TYPE = TNRMATDT and TNRSTRTDT.

I would like to know why the following code does not give the desired result.

output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
                                       ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

I would be glad if someone could spot the mistake I am making or tell me how to solve the problem.

Thank you very much.


2 answers

I had the same problem. What you need to do is convert the column to a numeric type:

df['IS_MAIN'] = df['IS_MAIN'].astype(int)

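That will make it work. Once the column is numeric, the masks have to compare against the integer 1 rather than the string '1'.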
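To illustrate why the dtype matters, here is a minimal sketch. The two-row frame is made up for the example, and the assumption that IS_MAIN starts out as strings is mine, not something the question confirms. It shows that after `astype(int)` a mask written with `'1'` selects no rows, while a mask written with `1` selects them all, so only the latter lets a `.loc` assignment change anything.

```python
import pandas as pd

# Hypothetical miniature of the question's frame; IS_MAIN is assumed
# to be stored as strings before the conversion.
output = pd.DataFrame({
    "ENTITY_TYPE": ["TNRMATDT", "TNRSTRTDT"],
    "ENTITY_NAME": ["01 03 2017", "04 02 2017"],
    "IS_MAIN": ["1", "1"],
})

# Convert the flag column to integers, as suggested in the answer.
output["IS_MAIN"] = output["IS_MAIN"].astype(int)

# An integer column compared with the string '1' matches no rows,
# so a .loc assignment built on that mask silently does nothing.
matches_str = int((output["IS_MAIN"] == "1").sum())

# The same comparison against the integer 1 matches every row.
matches_int = int((output["IS_MAIN"] == 1).sum())

# With the integer mask the update is applied to the intended row only.
mask = (output["IS_MAIN"] == 1) & (output["ENTITY_TYPE"] == "TNRMATDT")
output.loc[mask, "ENTITY_NAME"] = ""
```

Checking `output["IS_MAIN"].dtype` before writing the masks is a quick way to see which literal to compare against.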

For me your solution works fine. I tried rewriting it for readability, so that the same conditions are not repeated:

temp = output.loc[(output['IS_MAIN'] == '1') & 
                  (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']

#if values in IS_MAIN are integers
#temp = output.loc[(output['IS_MAIN'] == 1) & 
#                  (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']

if (temp.iat[0] == 'SIGHT'):
#more general working if not match condition
#if (next(iter(temp), 'not match') == 'SIGHT'):

    m1 = output['IS_MAIN'] == '1'
    #if values in IS_MAIN are integers
    #m1 = output['IS_MAIN'] == 1
    m2 = output['ENTITY_TYPE'] == 'TNRDOCSBY'
    m3 = output['ENTITY_TYPE'] == 'TNRDTL'
    m4 = output['ENTITY_TYPE'] == 'TNRDAYS'
    m5 = output['ENTITY_TYPE'].isin(['TNRMATDT','TNRSTRTDT'])

    output.loc[m1 & m2, 'ENTITY_NAME'] = 'PAYMENT'

    output.loc[m1 & m3, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'

    output.loc[m1 & m4, ['ENTITY_NAME']] = '0'
    output.loc[m1 & m4, ['RAW_ENTITY_NAME']] = ''

    output.loc[m1 & m5, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
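The commented-out `next(iter(temp), 'not match')` line above guards against the case where no row satisfies the condition. The following sketch shows the difference under the assumption (mine, for illustration) that IS_MAIN holds integers while the mask uses the string `'1'`, so that `temp` ends up empty. The small frame is hypothetical.

```python
import pandas as pd

# Hypothetical frame; IS_MAIN is assumed to hold integers here.
output = pd.DataFrame({
    "ENTITY_TYPE": ["TNRTYPE", "TNRDOCSBY"],
    "ENTITY_NAME": ["SIGHT", "NOT FOUND"],
    "IS_MAIN": [1, 1],
})

# The mask compares integers with the string '1', so it selects no rows
# and temp is an empty Series.
temp = output.loc[(output["IS_MAIN"] == "1") &
                  (output["ENTITY_TYPE"] == "TNRTYPE"), "ENTITY_NAME"]

# Positional access on an empty Series raises IndexError.
try:
    temp.iat[0]
    raised = False
except IndexError:
    raised = True

# next(iter(...), default) returns the default instead of raising.
first = next(iter(temp), "not match")
```

With the guard in place, the `if` test simply evaluates to False when nothing matches, instead of stopping the script with an exception.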

