Mapping information from a dictionary to a DataFrame when there are null values

Published 2024-05-23 07:39:29


This is the first DataFrame:

Umls                                    Snomed
C0027497/Nausea /Sign or Symptom    Nausea (finding)[FN/422587007] 
C0151786 / Muscle/Sign or Symptom   Muscle weakness [(finding) /FN/26544005]
C2127305 /bitter/ Sign or Symptom    ?
NA                                   NA

I created a dictionary with the following code:

^{pr2}$
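
The asker's snippet is not preserved here. As a rough sketch only, one common way to build such a dictionary from the first DataFrame is shown below. The name `df_a` is hypothetical, the "NA" row is assumed to be a real missing value, and `equiv_snomed` is the dictionary name used in the first answer.

```python
import pandas as pd

# Hypothetical reconstruction of the first DataFrame shown above.
df_a = pd.DataFrame({
    "Umls": [
        "C0027497/Nausea /Sign or Symptom",
        "C0151786 / Muscle/Sign or Symptom",
        "C2127305 /bitter/ Sign or Symptom",
        None,  # assumption: the "NA" row is a missing value
    ],
    "Snomed": [
        "Nausea (finding)[FN/422587007]",
        "Muscle weakness [(finding) /FN/26544005]",
        "?",
        None,
    ],
})

# Drop rows without a Umls key, then pair each Umls value with its Snomed value.
pairs = df_a.dropna(subset=["Umls"])
equiv_snomed = dict(zip(pairs["Umls"], pairs["Snomed"]))
```

Dropping the rows with a missing key first keeps a null entry out of the dictionary, so null lookups later simply fall through to the default.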

Now, for DataFrame B:

id     symptom      UMLS                               
1      nausea    C0027497/Nausea /Sign or Symptom
2      muscle     C2127305 /bitter/ Sign or Symptom 
3      headache     
4      pain 
5      bitter     C2127305 /bitter/ Sign or Symptom 

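For any value in the "UMLS" column that is available in the dictionary, I want to create another column, "Snomed", containing the corresponding "Snomed" value from the dictionary. So DataFrame C should look like this: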

  id     symptom      UMLS                                   Snomed                         
    1      nausea    C0027497/Nausea /Sign or Symptom    Nausea (finding)[FN/422] 
    2      muscle    C0151786 / Muscle/Sign or Symptom   Muscle [(fi)/FN/25]
    3      headache        
    4      pain 
    5      bitter     C2127305 /bitter/ Sign or Symptom   ?

Any help? Thanks.


2 Answers

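You can use the apply function on each element of the UMLS column and look up the value in the dictionary equiv_snomed. If the key is not in the dictionary, you can return np.nan.

If your DataFrame B is named df2, then: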

df2['Snomed'] = df2['UMLS'].apply(lambda x: equiv_snomed.get(x, np.nan))
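
To make the null handling concrete, here is a minimal, self-contained sketch of the same line run on data shaped like the question's. The UMLS values are taken from the expected DataFrame C, and the empty UMLS cells are assumed to be `None`; both are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Dictionary assumed to map UMLS strings to Snomed strings.
equiv_snomed = {
    "C0027497/Nausea /Sign or Symptom": "Nausea (finding)[FN/422587007]",
    "C0151786 / Muscle/Sign or Symptom": "Muscle weakness [(finding) /FN/26544005]",
    "C2127305 /bitter/ Sign or Symptom": "?",
}

# DataFrame B, with missing UMLS entries represented as None (assumption).
df2 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "symptom": ["nausea", "muscle", "headache", "pain", "bitter"],
    "UMLS": [
        "C0027497/Nausea /Sign or Symptom",
        "C0151786 / Muscle/Sign or Symptom",
        None,
        None,
        "C2127305 /bitter/ Sign or Symptom",
    ],
})

# Look up each UMLS value; keys that are missing or null fall back to NaN.
df2["Snomed"] = df2["UMLS"].apply(lambda x: equiv_snomed.get(x, np.nan))
```

Rows 3 and 4 have no UMLS value, so `dict.get` finds no key and the new column stays null for them.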

See EdChum's answer to this Stack Overflow question.

For your case, it would look like this:

import pandas as pd

# create dictionary
d = {'umls1': 'snomed1', 'umls2': 'snomed2', 'umls3': 'snomed3'}

# create empty dataframe
columns = ['symptom', 'umls', 'snomed']
df = pd.DataFrame(columns=columns)

# fill it with symptoms and with umls, leaving some umls values null
# (.loc replaces .ix, which was removed in pandas 1.0)
df['symptom'] = ['nausea', 'muscle', 'headache', 'pain', 'bitter']
df.loc[0, 'umls'] = 'umls1'
df.loc[1, 'umls'] = 'umls2'
df.loc[4, 'umls'] = 'umls3'

# add a third column with snomed values from the dictionary;
# rows whose umls is null stay null
df['snomed'] = df['umls'].map(d)

This gives the following output:

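    symptom   umls   snomed
0    nausea  umls1  snomed1
1    muscle  umls2  snomed2
2  headache    NaN      NaN
3      pain    NaN      NaN
4    bitter  umls3  snomed3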
