PySpark/Hive: appending to an existing table

Posted 2024-05-29 02:46:55


A very basic PySpark/Hive question:

How do I append to an existing table? My attempt is below.

from pyspark import SparkContext, SparkConf
from pyspark.sql import HiveContext

# Set up the Spark context and a Hive-aware SQL context
conf_init = SparkConf().setAppName('pyspark2')
sc = SparkContext(conf=conf_init)
hive_cxt = HiveContext(sc)

import pandas as pd

# Create the table from a small pandas DataFrame
df = pd.DataFrame({'a': [0, 0], 'b': [0, 0]})
sdf = hive_cxt.createDataFrame(df)
sdf.write.mode('overwrite').saveAsTable('database.table')  # this line works

# Try to append three more rows with the same columns
df = pd.DataFrame({'a': [1, 1, 1], 'b': [2, 2, 2]})
sdf = hive_cxt.createDataFrame(df)
sdf.write.mode('append').saveAsTable('database.table')  # this line does not work
# sdf.write.insertInto('database.table', overwrite=False)  # this line does not work either

Thanks! Sam
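
The question does not include the error message, so the cause of the failure is unknown. One thing worth ruling out is column order: `insertInto` matches columns by position rather than by name, so the DataFrame's columns need to be in the same order as the table's. Below is a minimal sketch of an append that aligns the columns first. It assumes the target table already exists and that `spark` is a Hive-enabled session. The helper names `order_like_table` and `append_rows` are made up for illustration and are not part of PySpark.

```python
def order_like_table(df_columns, table_columns):
    """Return the DataFrame's columns in the target table's column order.

    Hypothetical helper: raises ValueError when the two column sets differ,
    since a position-based insert would otherwise misplace or reject the data.
    """
    if sorted(df_columns) != sorted(table_columns):
        raise ValueError(
            "column mismatch: DataFrame has %s, table has %s"
            % (sorted(df_columns), sorted(table_columns))
        )
    return list(table_columns)


def append_rows(spark, sdf, table_name):
    """Append the rows of `sdf` to an existing table (hypothetical helper).

    `spark` is assumed to be a Hive-enabled SparkSession (or a HiveContext)
    and `table_name` an existing table such as 'database.table'.
    """
    # Read the table's column order from the metastore.
    target_columns = spark.table(table_name).columns
    ordered = order_like_table(sdf.columns, target_columns)
    # insertInto is position-based, so select the columns in table order first.
    sdf.select(*ordered).write.insertInto(table_name, overwrite=False)
```

With the objects from the question, the call would be `append_rows(hive_cxt, sdf, 'database.table')`. On Spark 2.0 and later, `HiveContext` is deprecated, and a session created with `SparkSession.builder.enableHiveSupport().getOrCreate()` can be passed in its place.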

