Python: read strings from a file and split them into column names and values

Posted 2024-04-26 22:09:12


I have a raw data file with multiple rows in the following format:

NAME: Jack Age : 25   skill : c++ designation : Analyst other comments:this 
is basic info

NAME : Kattie Age: 45 skill: python  designation: director Other Comments: name : Jane Kattie 

I want the output to be:

    name    age skill   designation  other_Comments      name_2 
0   Jack    25  c++     analyst      This is basic Info  NA
1   Kattie  45  python  Director      NA                 Jane Kattie

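I tried the code below, but it cannot handle special cases like row 2. I am new to Python, so please suggest a better approach if there is one. The keywords come from a fixed set, but any of them may appear more than once in a row.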

Code:

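import re
import pandas as pd

# Each row of the sheet holds one raw record in a single column
file = pd.read_excel('mydata.xlsx', sheet_name="Sheet1", header=None)
file.columns = ['data']

for i in range(len(file)):
    x = file.loc[i, 'data']
    # re.I ignores case; \s*:\s* tolerates spaces around the colon
    name = re.findall(r'name\s*:\s*(.*?)\s*age', x, flags=re.I)
    age = re.findall(r'age\s*:\s*(.*?)\s*skill', x, flags=re.I)
    skills = re.findall(r'skill\s*:\s*(.*?)\s*designation', x, flags=re.I)
    other_comments = re.findall(r'other comments\s*:\s*(.*)', x, flags=re.I | re.S)
    # findall returns a list, so store the first match (or None if there is none)
    file.loc[i, 'Name'] = name[0] if name else None
    file.loc[i, 'Age'] = age[0] if age else None
    file.loc[i, 'Skill'] = skills[0] if skills else None
    file.loc[i, 'Other_Comments'] = other_comments[0] if other_comments else None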
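
Since the keywords are a fixed set, one possible approach is to locate every "keyword :" marker in a row and treat the text up to the next marker as that keyword's value, adding a numeric suffix when a keyword repeats. The following is only a rough sketch under assumptions: the `KEYWORDS` list is guessed from the two sample rows, `parse_record` is a hypothetical helper name, and the two strings in `rows` stand in for the cells read from the spreadsheet.

```python
import re

# Assumed keyword set, taken from the sample rows in the question
KEYWORDS = ["name", "age", "skill", "designation", "other comments"]

# Match any keyword followed by optional spaces and a colon, ignoring case.
# Longer keywords are listed first so they are tried before shorter ones.
KEY_RE = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True))
    + r")\s*:",
    flags=re.IGNORECASE,
)

def parse_record(text):
    """Split one raw record into a {column: value} dict."""
    matches = list(KEY_RE.finditer(text))
    record = {}
    for m, nxt in zip(matches, matches[1:] + [None]):
        key = m.group(1).lower().replace(" ", "_")
        # The value runs from the end of this marker to the start of the next
        end = nxt.start() if nxt else len(text)
        # Collapse runs of whitespace (including line breaks); empty becomes None
        value = " ".join(text[m.end():end].split()) or None
        # A repeated keyword gets a numeric suffix: name, name_2, name_3, ...
        col, n = key, 1
        while col in record:
            n += 1
            col = f"{key}_{n}"
        record[col] = value
    return record

# Sample rows copied from the question (stand-ins for the spreadsheet cells)
rows = [
    "NAME: Jack Age : 25   skill : c++ designation : Analyst other comments:this \nis basic info",
    "NAME : Kattie Age: 45 skill: python  designation: director Other Comments: name : Jane Kattie ",
]
records = [parse_record(r) for r in rows]
```

Passing the list to `pd.DataFrame(records)` should then give one column per key, with keys missing from a row left as NaN. Note that any keyword followed by a colon inside a comment is treated as a new field, which is what produces `name_2` for row 2.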
