Published 2024-04-25 21:48:07
User
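I have a pandas DataFrame with a date column in the following format:
PublishDate=2018-08-31
I use the pandas to_gbq() function to dump the data into a BigQuery table. Before dumping the data, I make sure the column formats match the table schema. publishedDate is a date-only (DATE) column in the BigQuery table. How can I achieve something like:
df['PublishDate'] = df['PublishDate'].astype('?????')
I tried datetime types, but none of them worked!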
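I couldn't find support for the DATE type in pandas-gbq.
Another option is to insert the rows with the BigQuery client:

    from google.cloud import bigquery

    def chunks(l, chunk_size):
        """Yield successive slices of l with at most chunk_size items."""
        for i in range(0, len(l), chunk_size):
            yield l[i:i + chunk_size]

    CLIENT_ROW_LIMIT = 10000  # rows sent per insert_rows call

    SCHEMA = [
        bigquery.SchemaField('...'),
    ]

    def push_with_date(df):
        client = bigquery.Client(project='...')
        dataset = client.dataset('...')
        table_ref = dataset.table('...')
        # Convert each DataFrame row to a plain list of values.
        rows = [row.tolist() for index, row in df.iterrows()]
        for i, chunk in enumerate(chunks(rows, CLIENT_ROW_LIMIT)):
            print('pushing', i)
            # The schema is passed explicitly because table_ref does not carry one.
            errors = client.insert_rows(table_ref, chunk, SCHEMA)
            if errors:
                # Handle
                raise Exception(errors)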
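Since the DataFrame holds the dates as strings such as 2018-08-31, it may help to turn them into datetime.date objects before batching the rows. The sketch below uses only the standard library and does not call BigQuery; parse_iso_date and rows_with_dates are hypothetical helper names, and it assumes the client serializes datetime.date values for DATE fields, which is worth verifying against the client version in use.

```python
from datetime import date, datetime


def chunks(rows, chunk_size):
    """Yield successive slices of rows with at most chunk_size items."""
    for i in range(0, len(rows), chunk_size):
        yield rows[i:i + chunk_size]


def parse_iso_date(value):
    """Convert a 'YYYY-MM-DD' string to datetime.date; pass None through."""
    if value is None or value == "":
        return None
    return datetime.strptime(value, "%Y-%m-%d").date()


def rows_with_dates(records, date_index):
    """Return copies of the rows with the column at date_index parsed as dates."""
    converted = []
    for record in records:
        row = list(record)
        row[date_index] = parse_iso_date(row[date_index])
        converted.append(row)
    return converted


# Illustrative data only; in practice the rows would come from the DataFrame.
sample = [["a", "2018-08-31"], ["b", "2018-09-01"], ["c", None]]
prepared = rows_with_dates(sample, date_index=1)
batches = list(chunks(prepared, 2))
```

Each element of batches could then be passed to client.insert_rows in place of the raw row lists.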
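AFAIK, pandas-gbq doesn't seem to have support for the DATE type. So the best option is probably to export the column as a timestamp and then use a SQL query to convert it to a date.

    import pandas as pd
    from google.cloud import bigquery

    # Parse the date strings; values that do not match the format become NaT.
    df['PublishTimestamp'] = pd.to_datetime(
        df['PublishDate'], format='%Y-%m-%d', errors='coerce'
    )
    # Upload the DataFrame, including the new PublishTimestamp column.
    df.to_gbq("YOUR-DATASET.YOUR-TABLE", project_id="YOUR-PROJECT")

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig()
    table_ref = client.dataset("YOUR-DATASET").table("YOUR-TABLE")
    # Write the query result back over the same table.
    job_config.destination = table_ref
    job_config.write_disposition = "WRITE_TRUNCATE"

    # Note: SELECT * also returns the original PublishDate column if it was
    # uploaded; in that case use SELECT * EXCEPT (PublishDate) to avoid a
    # duplicate column name.
    sql = """
        SELECT *, DATE(PublishTimestamp) as PublishDate
        FROM `YOUR-PROJECT.YOUR-DATASET.YOUR-TABLE`
    """
    query_job = client.query(sql, job_config=job_config)
    query_job.result()  # Wait for the query to finish.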
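Because errors='coerce' silently turns unparseable values into NaT, it can be worth checking the parsed column before uploading. The following is a minimal sketch of that check on made-up data; parse_publish_dates is a hypothetical helper, and the sample values are illustrative only.

```python
import pandas as pd


def parse_publish_dates(df, source_col="PublishDate", target_col="PublishTimestamp"):
    """Return a copy of df with source_col parsed into a datetime column."""
    out = df.copy()
    out[target_col] = pd.to_datetime(
        out[source_col], format="%Y-%m-%d", errors="coerce"
    )
    return out


# Illustrative data: one valid date, one malformed string, one missing value.
df = pd.DataFrame({"PublishDate": ["2018-08-31", "not-a-date", None]})
parsed = parse_publish_dates(df)

# Rows whose date could not be parsed end up as NaT and can be inspected
# or dropped before the upload.
bad_rows = parsed[parsed["PublishTimestamp"].isna()]
print(len(bad_rows), "row(s) could not be parsed")
```

Whether to drop, fix, or upload such rows as NULL depends on the table, so the sketch only reports them.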