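<p>This isn't a CSV, and I don't see a convenient way to convince <code>read_csv</code> to do the right thing. Fortunately, there seems to be a simple rule here: everything before the first space is the first column, and everything after it is the second. <code>str.split</code> with a maximum of one split does exactly that.</p>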
<pre><code>import pandas as pd
from pathlib import Path
#in_file = Path("C:/Users/andre/Desktop/bea_api_test/python-bureau-economic-analysis-api-client/testttt/output.txt")
in_file = Path("test.txt")
out_file = in_file.with_name(in_file.stem + "_").with_suffix(".csv")
# test data
open(in_file, "w").write("""\
MT0111500000000 Anniston-Oxford-Jacksonville, AL Metropolitan Statistical Area
MT0112220000000 Auburn-Opelika, AL Metropolitan Statistical Area
MT0113820000000 Birmingham-Hoover, AL Metropolitan Statistical Area""")
# convert to csv
# split each line once, at the first space, so the name stays in one piece
pd.DataFrame([line.strip().split(" ", 1) for line in open(in_file)],
    columns=["COLUMN1", "COLUMN2"]).to_csv(out_file, index=False, header=False)
# visual verification
print(open(out_file).read())
</code></pre>
<p>Output</p>
<pre><code>MT0111500000000,"Anniston-Oxford-Jacksonville, AL Metropolitan Statistical Area"
MT0112220000000,"Auburn-Opelika, AL Metropolitan Statistical Area"
MT0113820000000,"Birmingham-Hoover, AL Metropolitan Statistical Area"
</code></pre>
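<p>To make the splitting rule concrete, here is a minimal sketch of what <code>split(" ", 1)</code> returns for one of the sample lines, compared with a plain <code>split(" ")</code>. The variable names are only illustrative.</p>

```python
line = "MT0112220000000 Auburn-Opelika, AL Metropolitan Statistical Area\n"

# maxsplit=1: split only at the first space, giving exactly two pieces
code, name = line.strip().split(" ", 1)
print(code)  # MT0112220000000
print(name)  # Auburn-Opelika, AL Metropolitan Statistical Area

# without maxsplit the name is broken apart at every space
pieces = line.strip().split(" ")
print(pieces)
```

<p>Because the name contains a comma, the CSV writer quotes it in the output, which is why the second column appears in double quotes.</p>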
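<p>In the pandas version above, I write the CSV immediately, so the DataFrame is never bound to a name and can be freed from memory right away. You could also use the <code>csv</code> module and write one row at a time. That uses less memory because it never has to hold the whole file in memory. And since <code>csv</code> is part of the Python standard library, there is no external dependency such as <code>pandas</code>. Adding a bit of file name handling:</p>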
<pre><code>import csv
from pathlib import Path
#in_file = Path("C:/Users/andre/Desktop/bea_api_test/python-bureau-economic-analysis-api-client/testttt/output.txt")
in_file = Path("test.txt")
out_file = in_file.with_name(in_file.stem + "_").with_suffix(".csv")
# test data
open(in_file, "w").write("""\
MT0111500000000 Anniston-Oxford-Jacksonville, AL Metropolitan Statistical Area
MT0112220000000 Auburn-Opelika, AL Metropolitan Statistical Area
MT0113820000000 Birmingham-Hoover, AL Metropolitan Statistical Area""")
# convert to csv
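# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open(in_file) as infp, open(out_file, "w", newline="") as outfp:
    writer = csv.writer(outfp)
    # the generator feeds the writer one split line at a time
    writer.writerows(line.strip().split(" ", 1) for line in infp)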
# visual verification
print(open(out_file).read())
</code></pre>
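<p>If the real file might contain blank lines or lines with no space at all, the streaming version could be wrapped in a small helper along these lines. This is only a sketch: <code>convert_two_columns</code> is a hypothetical name, it uses <code>str.partition</code> in place of <code>split(" ", 1)</code>, and skipping blank lines and writing an empty second column are assumptions about what you would want, not something the sample data requires.</p>

```python
import csv
from pathlib import Path


def convert_two_columns(in_file: Path, out_file: Path) -> int:
    """Split each line at its first space and write the two parts as a CSV row.

    Returns the number of rows written. Blank lines are skipped, and a line
    without any space gets an empty second column (both are assumptions).
    """
    rows_written = 0
    # newline="" lets the csv module control line endings itself
    with open(in_file) as infp, open(out_file, "w", newline="") as outfp:
        writer = csv.writer(outfp)
        for line in infp:
            line = line.strip()
            if not line:
                continue  # ignore empty lines
            # partition always returns three parts, even if no space is found
            code, _, name = line.partition(" ")
            writer.writerow([code, name])
            rows_written += 1
    return rows_written
```

<p>With the test file from above, <code>convert_two_columns(Path("test.txt"), Path("test_.csv"))</code> should write the same three rows as the earlier examples.</p>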