<p>Before processing, pass a chunk size to the reader so it returns an iterator, and process the file chunk by chunk in a loop.</p>
<p>Reference: <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sas.html" rel="nofollow noreferrer">https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sas.html</a></p>
<p>Alternatively, split the file in SAS itself before exporting it.</p>
<hr/>
<p>I think what you are trying to do is:</p>
<pre class="lang-py prettyprint-override"><code>import pandas as pd

CHUNK = 10  # rows per chunk; use a much larger value in practice
reader = pd.read_sas(path, format='sas7bdat', chunksize=CHUNK)
for i, chunk in enumerate(reader):
    # perform compression on the chunk here, then
    # stream it out of memory onto disk
    chunk.to_csv(new_file,
                 mode='a' if i else 'w',  # append after the first chunk
                 header=(i == 0),         # write the header only once
                 compression='gzip')      # saves disk space, optional
df = pd.read_csv(new_file)
</code></pre>
<p>You should try to compress the data inside the loop, because otherwise it will fail again when you merge the chunks back together:</p>
<ol>
<li>Drop columns you don't need</li>
<li>Downcast numeric types</li>
<li>Use categoricals</li>
<li>Use sparse columns</li>
</ol>
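<p>A minimal sketch of those four techniques applied to one chunk (the DataFrame and its column names here are made up for illustration; adapt them to your own data):</p>
<pre class="lang-py prettyprint-override"><code>import numpy as np
import pandas as pd

# Hypothetical chunk standing in for one piece of the SAS file.
df = pd.DataFrame({
    'id': np.arange(1000, dtype=np.int64),
    'value': np.random.rand(1000),
    'state': np.random.choice(['CA', 'NY', 'TX'], size=1000),
    'unused': 'x',
})

df = df.drop(columns=['unused'])                        # 1. drop columns
df['id'] = pd.to_numeric(df['id'], downcast='integer')  # 2. downcast numerics
df['value'] = pd.to_numeric(df['value'], downcast='float')
df['state'] = df['state'].astype('category')            # 3. categoricals
# 4. sparse columns pay off only when a column is mostly one value, e.g.:
# df['mostly_zero'] = pd.arrays.SparseArray(df['mostly_zero'], fill_value=0)
</code></pre>
<p>Each step shrinks the per-chunk memory footprint, so the concatenated result is far more likely to fit in RAM.</p>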
<p>Reference: <a href="https://pythonspeed.com/articles/pandas-load-less-data/" rel="nofollow noreferrer">https://pythonspeed.com/articles/pandas-load-less-data/</a></p>