Equivalent of R's read.table in Python

Posted 2024-04-19 06:44:23


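I'm trying to move some processing work from R to Python. In R, I use read.table() to read very messy CSV files, and it automatically splits the records into the correct format. E.g.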

391788,"HP Deskjet 3050 scanner always seems to break","<p>I'm running a Windows 7 64 blah blah blah........ake this work permanently?</p>

<p>Update: It might have something to do with my computer. It seems to work much better on another computer, windows 7 laptop. Not sure exactly what the deal is, but I'm still looking into it...</p>
","windows-7 printer hp"

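is correctly split into 4 columns. A single record can span multiple lines, and there are commas all over the place. In R, all I need is: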

^{pr2}$

Is there anything in Python that does this equally well?

Thanks!


2 answers

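The pandas module provides many R-like functions and data structures, including read_csv. The advantage here is that the data is read in as a pandas DataFrame, which is somewhat easier to work with than a standard Python list or dict (especially if you are used to R). Here is an example: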

>>> from pandas import read_csv
>>> ugly = read_csv("ugly.csv",header=None)
>>> ugly
        0                                              1  \
0  391788  HP Deskjet 3050 scanner always seems to break   

                                                   2                     3  
0  <p>I'm running a Windows 7 64 blah blah blah.....  windows-7 printer hp  
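To make the session above easier to reproduce without the original file, here is a minimal sketch that feeds read_csv an in-memory string instead. The sample text and the column names (`id`, `title`, `body`, `tags`) are illustrative assumptions, not part of the original data.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the messy file: one record whose third field
# contains commas and line breaks inside double quotes.
raw = (
    '391788,"HP Deskjet 3050 scanner","<p>First, a paragraph.</p>\n'
    '\n'
    '<p>Second, another one.</p>\n'
    '","windows-7 printer hp"\n'
)

# header=None: the data has no header row.
# names=...: illustrative column labels (an assumption, not from the source).
df = pd.read_csv(StringIO(raw), header=None,
                 names=["id", "title", "body", "tags"])

print(df.shape)            # one record, four columns
print(df.loc[0, "tags"])   # the last field of the record
```

Naming the columns is optional; with header=None alone, pandas labels them 0 to 3 as in the session above.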

You can use the csv module.

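from csv import reader

# newline="" lets the csv module handle line breaks inside quoted fields
with open("C:/text.txt", "r", newline="") as f:
    csv_reader = reader(f, quotechar="\"")
    for row in csv_reader:
        print(row)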

['391788', 'HP Deskjet 3050 scanner always seems to break', "<p>I'm running a Windows 7 64 blah blah blah........ake this work permanently?</p>\n\n<p>Update: It might have something to do with my computer. It seems to work much better on another computer, windows 7 laptop. Not sure exactly what the deal is, but I'm still looking into it...</p>\n", 'windows-7 printer hp']

Length of the output row = 4
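The same behaviour can be checked without a file on disk. The sketch below is a minimal example, assuming a made-up two-record sample held in a string; it shows that the csv module counts records rather than physical lines, so the multi-line record still yields four fields.

```python
import csv
from io import StringIO

# Hypothetical sample: the first record spans four physical lines because
# its third field contains line breaks; the second record is a single line.
raw = (
    '391788,"HP Deskjet 3050 scanner","<p>First, a paragraph.</p>\n'
    '\n'
    '<p>Second, another one.</p>\n'
    '","windows-7 printer hp"\n'
    '391789,"Plain title","<p>One line</p>","printer"\n'
)

# newline="" leaves embedded line breaks untouched for the csv module.
rows = list(csv.reader(StringIO(raw, newline="")))

for row in rows:
    # Number of fields, then the first and last field of each record.
    print(len(row), row[0], row[-1])
```

Five physical lines come out as two records of four fields each, which is the splitting the question asks for.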
