How to take a column from a CSV file, split its text, and save the parts to other columns in Python

Posted 2024-06-11 03:09:40


For example, we have a CSV file:

name       address           age 
vip     bang #@ india     12 
ags     myso ^% india     25 
dhs     bang #@ india     14 
fgn     nyk  @$ bangla    45         

How can I split the address column into its parts (for example, a city and a country) and add them as separate columns?

The code I am using is:

import re
import csv

# Python 3: open in text mode with newline="" as the csv module expects
with open("/home/vipul/Desktop/example.csv", newline="") as f:
    mycsv = csv.reader(f)
    for row in mycsv:
        text = row[0]
        txt = re.findall(r'(\w+[\s\w]*)\b', text)
        print(txt)

1 answer
User
Answer 1 · posted 2024-06-11 03:09:40

This is easy with pandas:

import pandas as pd

# Create dataframe
df = pd.DataFrame({
    "name": ["vip", "ags", "dhs", "fgn"],
    "address": ["bang #@ india", "myso ^% india", "bang #@ india", "nyk @$ bangla"],
    "age": [12, 25, 14, 45]
})

# Split "address" string on spaces, keep first split
# as city, last split as country
df["city"] = df["address"].str.split(" ").str[0]
df["country"] = df["address"].str.split(" ").str[-1]

print(df)

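The result is a DataFrame that keeps the original name, address, and age columns and adds the new city and country columns.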

Edit:

Optionally, keep only certain columns:

df = df[["name", "city", "country", "age"]]

The result is:

  name  city country  age
0  vip  bang   india   12
1  ags  myso   india   25
2  dhs  bang   india   14
3  fgn   nyk  bangla   45

Edit 2:

Instead of building the DataFrame yourself, you can use pandas to read and write the file:

import pandas as pd

# Read the DataFrame from file
df = pd.read_csv("input_file.csv", sep=",")

# Split "address" string on spaces, keep first split
# as city, last split as country
df["city"] = df["address"].str.split(" ").str[0]
df["country"] = df["address"].str.split(" ").str[-1]

# Optionally, keep only certain columns
df = df[["name", "city", "country", "age"]]

# Write altered dataframe to file
df.to_csv("output_file.csv", sep=",", index=False)
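
If pandas is not available, the same read, split, and write steps can be sketched with the standard csv module alone. This is only an illustration: the helper names split_address and convert are made up for this example, the input is assumed to be comma-separated with a name,address,age header, and an in-memory io.StringIO stands in for the real input and output files.

```python
import csv
import io


def split_address(address):
    """Return (city, country): the first and last whitespace-separated tokens."""
    parts = address.split()  # no argument: splits on any run of whitespace
    return parts[0], parts[-1]


def convert(in_file, out_file):
    """Copy rows from in_file to out_file, replacing address with city and country."""
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, fieldnames=["name", "city", "country", "age"])
    writer.writeheader()
    for row in reader:
        city, country = split_address(row["address"])
        writer.writerow({
            "name": row["name"],
            "city": city,
            "country": country,
            "age": row["age"],
        })


# In-memory stand-ins for the input and output files (assumed comma-separated).
source = io.StringIO(
    "name,address,age\n"
    "vip,bang #@ india,12\n"
    "ags,myso ^% india,25\n"
    "dhs,bang #@ india,14\n"
    "fgn,nyk  @$ bangla,45\n"
)
target = io.StringIO()
convert(source, target)
print(target.getvalue())
```

With real files, pass handles opened with open(path, newline="") instead of the StringIO objects. Note that split_address assumes every address contains at least one token; an empty address would raise an IndexError.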

Edit 3:

As pointed out in the comments, splitting twice is unnecessary; you can do this instead:

split = df["address"].str.split(" ")
df["city"] = split.str[0]
df["country"] = split.str[-1]
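
Another option, not part of the original answer, is to pull both parts out in one pass with a regular expression and Series.str.extract. The sketch below assumes every address has the shape "word, separator symbols, word"; the pattern and its group names city and country are choices made for this example, and rows that do not match would get NaN in both new columns.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["vip", "ags", "dhs", "fgn"],
    "address": ["bang #@ india", "myso ^% india", "bang #@ india", "nyk  @$ bangla"],
    "age": [12, 25, 14, 45],
})

# First word -> city, run of non-word characters (spaces and symbols) is skipped,
# second word -> country. The named groups become the new column names.
pattern = r"^\s*(?P<city>\w+)\W+(?P<country>\w+)\s*$"

# str.extract returns a DataFrame with one column per named group;
# join attaches those columns to the original rows by index.
df = df.join(df["address"].str.extract(pattern))

print(df[["name", "city", "country", "age"]])
```

Compared with splitting on a single space, the regular expression does not care how many spaces surround the separator symbols, at the cost of being stricter about the overall shape of the address.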
