Convert a pandas DataFrame to an in-memory file-like object?

for date in required_date_range:
    df = pd.read_sql(sql=query, con=pg_engine, params={'x': date})
    ... do stuff to the columns ...
    df.to_sql('table_name', pg_engine, index=False, if_exists='append', dtype=final_table_dtypes)

def process_file(conn, table_name, file_object):
    fake_conn = cms_dtypes.pg_engine.raw_connection()
    fake_cur = fake_conn.cursor()
    fake_cur.copy_expert(sql=to_sql % table_name, file=file_object)
    fake_conn.commit()
    fake_cur.close()

# after doing stuff to the dataframe
s_buf = io.StringIO()
df.to_csv(s_buf)
process_file(cms_dtypes.pg_engine, 'fact_cms_employee', s_buf)

2 Answers

User

Answer 1 · Edited 2024-05-15 18:00:08

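I ran into a problem when implementing ptrj's solution.

I believe the problem comes from pandas leaving the buffer's position at the end after writing.

See the example below: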

import pandas as pd
from StringIO import StringIO  # Python 2 only
df = pd.DataFrame({"name":['foo','bar'],"id":[1,2]})
s_buf = StringIO()
df.to_csv(s_buf)
s_buf.__dict__

# Output
# {'softspace': 0, 'buflist': ['foo,1\n', 'bar,2\n'], 'pos': 12, 'len': 12, 'closed': False, 'buf': ''}

Note that the position is 12. I had to set pos back to 0 for the subsequent copy_from command to work:

s_buf.pos = 0
cur = conn.cursor()
cur.copy_from(s_buf, tablename, sep=',')
conn.commit()
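
In Python 3, io.StringIO has no pos attribute, but the same behaviour can be observed with tell() and seek(0). The following is a minimal sketch rather than part of the original answer; it reuses the two-row frame from above and assumes index=False and header=False so that only the data rows are written:

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["foo", "bar"], "id": [1, 2]})

s_buf = io.StringIO()
# Assumption: write data rows only, without the index or header line.
df.to_csv(s_buf, index=False, header=False)

# After writing, the cursor sits at the end of the buffer,
# so reading at this point yields an empty string.
end_pos = s_buf.tell()
read_at_end = s_buf.read()

# Rewind before handing the buffer to a reader such as copy_from.
s_buf.seek(0)
content = s_buf.read()
```

Here read_at_end is empty while content holds the CSV rows, which is why a COPY issued without rewinding loads nothing.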

User

Answer 2 · Edited 2024-05-15 18:00:08

Python's io module (docs) provides the necessary tools for working with file-like objects.

import io

# text buffer
s_buf = io.StringIO()

# saving a data frame to a buffer (same as with a regular file):
df.to_csv(s_buf)

Edit: (I forgot this.) To read from the buffer afterwards, its position must be set back to the beginning:

s_buf.seek(0)

I'm not familiar with psycopg2, but according to the docs, both copy_expert and copy_from can be used, for example:

cur.copy_from(s_buf, table, sep=',')  # copy_from expects tab-separated input by default

(For Python 2, see StringIO.)
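
Putting the pieces together, the write, rewind and copy steps might be wrapped as shown below. This is a sketch under stated assumptions, not the answerer's code: the helper names dataframe_to_buffer and copy_dataframe are hypothetical, cur is assumed to be a psycopg2 cursor, the target table is assumed to have columns named like the DataFrame's, and the table and column names are assumed to be trusted identifiers, since they are interpolated into the SQL without quoting.

```python
import io

import pandas as pd


def dataframe_to_buffer(df):
    """Serialize df as CSV into an in-memory text buffer, rewound to the start."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)  # keep the header row, drop the index
    buf.seek(0)                  # rewind so the next reader starts at the top
    return buf


def copy_dataframe(cur, df, table_name):
    """Bulk-load df through a cursor's copy_expert (hypothetical helper).

    Assumes table_name and the column names are trusted identifiers.
    """
    columns = ", ".join(df.columns)
    copy_sql = (
        f"COPY {table_name} ({columns}) "
        "FROM STDIN WITH (FORMAT csv, HEADER true)"
    )
    cur.copy_expert(sql=copy_sql, file=dataframe_to_buffer(df))
```

With the SQLAlchemy engine from the question, the cursor could come from pg_engine.raw_connection().cursor(), followed by a commit() on that raw connection once the copy has finished.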
