A simple interface for getting reader-like objects for Python 3 and Python 2.
get_reader
Get reader-like objects from different data sources.
Works on Python 3.8 through 3.2, 2.7, and 2.6:

    from get_reader import get_reader

    reader = get_reader('myfile.csv')
    for row in reader:
        print(', '.join(row))

Supports explicit file handling:

    from get_reader import get_reader

    with open('myfile.csv', newline='') as csvfile:
        reader = get_reader(csvfile)
        for row in reader:
            print(', '.join(row))

Automatically detects other data sources if the supporting packages are installed:

    import pandas as pd  # only needed for the DataFrame example
    from get_reader import get_reader

    # From an Excel file
    reader = get_reader('myfile.xlsx')  # requires xlrd package

    # From a DataFrame
    df = pd.DataFrame([...])
    reader = get_reader(df)  # requires pandas

    # From a DBF file
    reader = get_reader('myfile.dbf')  # requires dbfread package

Explicit constructors can be called directly to override the auto-detection behavior:

    from get_reader import get_reader

    # From a tab-delimited text file
    reader = get_reader.from_csv('myfile.txt', delimiter='\t')

Install
You can install get_reader using pip, or you can include it directly in your own project:
pip install get_reader
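The get_reader module has no hard dependencies, although xlrd is required for Excel files and dbfread for DBF files; it works on Python 2.6, 2.7, 3.2 through 3.8, PyPy, PyPy3, and Jython; and it is freely available under the Apache License, version 2.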
To install the optional extras, use the following command:
pip install get_reader[excel,dbf]
Reference
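get_reader(obj, *args, **kwds)
Return a reader object which will iterate over records in the given data, like csv.reader().
The type of obj is used to automatically determine the appropriate handler. If obj is a string, it is treated as a file path whose extension determines its content type. Any *args and **kwds are passed to the underlying handler.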
Using auto-detection:

    import pandas  # only needed for the DataFrame example
    from get_reader import get_reader

    # CSV file.
    reader = get_reader('myfile.csv')

    # Excel file.
    reader = get_reader('myfile.xlsx', worksheet='Sheet2')

    # Pandas DataFrame.
    df = pandas.DataFrame([...])
    reader = get_reader(df)

    # DBF file.
    reader = get_reader('myfile.dbf')

If the type of obj cannot be determined automatically, you can call one of the "from_...()" constructor methods listed below.
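
To make the auto-detection rule more concrete, here is a minimal sketch of dispatching on the type of obj and, for strings, on the file extension. It uses only the standard library. The helper name guess_handler and the extension table are hypothetical illustrations, not get_reader's actual implementation, and the real module may recognize different extensions and types.

```python
import os

# Hypothetical mapping from file extension to constructor name.
# The real get_reader may recognize a different set of extensions.
_EXTENSION_HANDLERS = {
    '.csv': 'from_csv',
    '.xlsx': 'from_excel',
    '.xls': 'from_excel',
    '.dbf': 'from_dbf',
}


def guess_handler(obj):
    """Return the name of the constructor that would handle obj.

    Illustrative only: strings are treated as file paths and
    dispatched on their extension; file-like objects and iterators
    are assumed to hold CSV text.
    """
    if isinstance(obj, str):
        ext = os.path.splitext(obj)[1].lower()
        try:
            return _EXTENSION_HANDLERS[ext]
        except KeyError:
            raise TypeError('cannot determine handler for %r' % (obj,))
    if hasattr(obj, 'read') or hasattr(obj, '__next__'):
        return 'from_csv'
    raise TypeError('cannot determine handler for %r' % (obj,))


print(guess_handler('myfile.xlsx'))
print(guess_handler('myfile.csv'))
```

In this sketch a path such as 'myfile.txt' has no table entry and raises TypeError, which is the kind of case where calling an explicit constructor is the way out.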
from_csv(csvfile, encoding='utf-8', **kwds)
Return a reader object which will iterate over lines in the given csvfile. The csvfile can be a string (treated as a file path) or any object which supports the iterator protocol and returns a string each time its __next__() method is called; file objects and list objects are both suitable. If csvfile is a file object, it should be opened with newline=''.

    from get_reader import get_reader

    reader = get_reader.from_csv('myfile.tab', delimiter='\t')

Using explicit file handling:

    from get_reader import get_reader

    with open('myfile.csv', newline='') as csvfile:
        reader = get_reader.from_csv(csvfile)

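
Since get_reader() is described as returning objects like csv.reader(), the kinds of input accepted here can be illustrated with the standard library alone. The sketch below calls csv.reader directly (not get_reader) to show that a list of strings and a file-like object both work as sources, and that a delimiter keyword changes how fields are split. The sample data is made up for illustration.

```python
import csv
import io

# A list of strings satisfies the iterator-protocol requirement.
lines = ['A,B', '1,x', '2,y']
rows_from_list = list(csv.reader(lines))
print(rows_from_list)

# A file-like object works the same way; the delimiter is passed
# as a keyword argument, as in the tab-delimited example.
tab_file = io.StringIO('A\tB\n1\tx\n2\ty\n')
rows_from_file = list(csv.reader(tab_file, delimiter='\t'))
print(rows_from_file)
```

Both sources produce the same rows of strings, so downstream code does not need to know where the data came from.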
from_dicts(records, fieldnames=None)
Return a reader object which will iterate over the given dictionary records. This can be thought of as converting a csv.DictReader() into a plain, non-dictionary csv.reader().

    from get_reader import get_reader

    dictrows = [
        {'A': 1, 'B': 'x'},
        {'A': 2, 'B': 'y'},
    ]
    reader = get_reader.from_dicts(dictrows)

This method assumes that record contents are consistent. If the first record is a dictionary, it is assumed that all following records will be dictionaries with matching keys.
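
The conversion from dictionary records to plain rows can be sketched with a small generator. The helper rows_from_dicts below is hypothetical and is not get_reader's implementation. It assumes that a header row of field names is emitted first (as a plain CSV reader would yield for a file with a header) and that every record has the same keys as the first one.

```python
def rows_from_dicts(records, fieldnames=None):
    """Yield a header row followed by one list of values per record.

    Hypothetical helper for illustration; assumes every record has
    the same keys as the first one.
    """
    iterator = iter(records)
    if fieldnames is None:
        try:
            first = next(iterator)
        except StopIteration:
            return  # no records, so nothing to yield
        # Take the field names from the first record's keys.
        fieldnames = list(first.keys())
        yield fieldnames
        yield [first[key] for key in fieldnames]
    else:
        fieldnames = list(fieldnames)
        yield fieldnames
    for record in iterator:
        yield [record[key] for key in fieldnames]


dictrows = [
    {'A': 1, 'B': 'x'},
    {'A': 2, 'B': 'y'},
]
rows = list(rows_from_dicts(dictrows))
print(rows)
```

Passing fieldnames explicitly fixes the column order, which matters on older Python versions where dictionary key order is not guaranteed.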
from_excel(path, worksheet=0)
Return a reader object which will iterate over lines in the given Excel worksheet. path must point to an XLSX or XLS file and worksheet should specify the index or name of the worksheet to load (defaults to the first worksheet).
Load first worksheet:

    from get_reader import get_reader

    reader = get_reader.from_excel('mydata.xlsx')

Specific worksheets can be loaded by name (a string) or index (an integer):

    reader = get_reader.from_excel('mydata.xlsx', 'Sheet 2')

from_pandas(df, index=True)
Return a reader object which will iterate over records in the pandas.DataFrame df.
from_dbf(filename, encoding=None, **kwds)
Return a reader object which will iterate over lines in the given DBF file (from dBase, FoxPro, etc.).
from_squint(obj, fieldnames=None)
Return a reader object which will iterate over the records returned from a squint Select, Query, or Result. If the fieldnames argument is not provided, this function tries to construct names using the values from the underlying object.
Freely licensed under the Apache License, Version 2.0
(C) Copyright 2018-2019 Shawn Brown.