Splitting text on commas while ignoring commas inside quoted values

Posted 2024-04-29 16:51:43


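I'm new to Python and am trying to split text in a specific way: commas that appear inside a double-quoted ("") substring should be ignored.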

text='ppr5007801780,https://www.jcpenney.com/p/alfred-dunner-womens-3-4-sleeve-tunic-top/ppr5007801780,JCPenney,58.0,28.99,"https://s7d4.scene7.com/is/image/JCPenney/DP0208201907032983M.tif?wid=350&hei=350&op_usm=.4,.8,0,0&resmode=sharp2",,81730320182,Alfred Dunner Womens 3/4 Sleeve Tunic Top,Closure Type:Pullover Head|Neckline:Collar Neck|Sleeve Length:3/4 Sleeve|Apparel Length:24.5 Inches,alfred dunner,3,5.0,Navy White,"Embroidered, Scalloped",,/d/women,Available,1572644741'

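I have the string above and want to split it on ',' while ignoring the commas inside quotes (that is, len(my_list) should be 19).

I tried my_list = text.split(','), but that gives me 23 items, and I don't know how to do this with a regex or some other method.

Thanks for your help.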


Tags: string, text, https, com, my, length, list, split
2 answers

You can use the csv module. To pass a string to csv.reader, you need to turn it into a file-like object, which can be done with StringIO.

import csv
from io import StringIO
text='ppr5007801780,https://www.jcpenney.com/p/alfred-dunner-womens-3-4-sleeve-tunic-top/ppr5007801780,JCPenney,58.0,28.99,"https://s7d4.scene7.com/is/image/JCPenney/DP0208201907032983M.tif?wid=350&hei=350&op_usm=.4,.8,0,0&resmode=sharp2",,81730320182,Alfred Dunner Womens 3/4 Sleeve Tunic Top,Closure Type:Pullover Head|Neckline:Collar Neck|Sleeve Length:3/4 Sleeve|Apparel Length:24.5 Inches,alfred dunner,3,5.0,Navy White,"Embroidered, Scalloped",,/d/women,Available,1572644741'

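# Wrap the string in a file-like object so csv.reader can iterate over it
f = StringIO(text)

# quotechar='"' keeps commas inside double quotes as part of the field
reader = csv.reader(f, delimiter=',', quotechar='"')
for row in reader:
    print(len(row))  # prints 19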

The csv reader lets you specify a quotechar parameter, which I believe essentially solves your problem.
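If you only ever have one line at a time, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: split_csv_line is a hypothetical name, the sample string is a shortened stand-in for the question's text, and it assumes the input is a single line with no newlines inside quoted fields. It relies on the fact that csv.reader accepts any iterable of strings, so a one-element list works in place of StringIO.

```python
import csv

def split_csv_line(line):
    """Split one CSV-formatted line, keeping quoted commas inside their field."""
    # csv.reader accepts any iterable of strings, so a one-element list is enough
    return next(csv.reader([line], delimiter=',', quotechar='"'))

# Hypothetical shortened sample in the same shape as the question's string
sample = 'ppr5007801780,JCPenney,"Embroidered, Scalloped",,Available'
fields = split_csv_line(sample)
print(len(fields))  # 5
print(fields)       # ['ppr5007801780', 'JCPenney', 'Embroidered, Scalloped', '', 'Available']
```

The surrounding quotes are stripped from the quoted field, and the empty field between the two adjacent commas comes back as an empty string.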

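Alternatively, you can use pandas. Note that this reads the single line as a header row, so the fields come back as column names, and the empty fields show up as placeholder names of the form "Unnamed: N" rather than as empty strings: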

from io import StringIO
import pandas as pd
pd.read_csv(StringIO(text)).columns.tolist()
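If you want the values themselves as plain strings rather than column names, one possible variant is to tell pandas there is no header row. This is only a sketch under assumptions: the sample string is a hypothetical shortened stand-in for the question's text, and the options header=None, dtype=str and keep_default_na=False are my additions, not part of the answer above.

```python
from io import StringIO
import pandas as pd

# Hypothetical shortened sample in the same shape as the question's string
sample = 'ppr5007801780,JCPenney,"Embroidered, Scalloped",,Available'

# header=None: treat the line as data rather than as column names
# dtype=str: keep every field as text instead of inferring numbers
# keep_default_na=False: leave empty fields as '' instead of NaN
df = pd.read_csv(StringIO(sample), header=None, dtype=str, keep_default_na=False)
values = df.iloc[0].tolist()
print(len(values))  # 5
print(values)
```

For a single line the csv module is the lighter choice; pandas makes more sense when you have many such rows to load at once.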
