Answers to the question: JSON table schema to bigquery.TableSchema for BigQuerySin

1 Answer

Anonymous, 1 day ago

The snippet posted above by andreapierleoni works with older versions of the <code>google-cloud-bigquery</code> Python client, for example version <code>0.25.0</code>, which happens to be the one installed by <code>pip install apache-beam[gcp]</code>.

However, the BigQuery Python client API changed drastically in newer versions of <code>google-cloud-bigquery</code>. In version <code>1.8.0</code>, which I am currently using, neither <code>bigquery.TableFieldSchema()</code> nor the <code>bigquery.TableSchema</code> named in the question works.

If you are using a recent version of the <code>google-cloud-bigquery</code> package, here is how to get the required list of <code>SchemaField</code> objects (needed, for example, to create a table) from a JSON file. It is an adaptation of the code andreapierleoni posted (thanks!):

<pre><code>import json

from google.cloud import bigquery


def _get_field_schema(field):
    """Convert one JSON field definition into a SchemaField, recursing into nested fields."""
    name = field['name']
    field_type = field.get('type', 'STRING')
    mode = field.get('mode', 'NULLABLE')
    # RECORD columns carry their sub-columns under the 'fields' key
    subschema = [_get_field_schema(f) for f in field.get('fields', [])]
    return bigquery.SchemaField(name=name,
                                field_type=field_type,
                                mode=mode,
                                fields=subschema)


def parse_bq_json_schema(schema_filename):
    """Load a JSON schema file and return a list of SchemaField objects."""
    with open(schema_filename, 'r') as infile:
        jsonschema = json.load(infile)
    return [_get_field_schema(field) for field in jsonschema]
</code></pre>

Now, assume you have a table's <a href="https://cloud.google.com/bigquery/docs/schemas#specifying_a_json_schema_file" rel="nofollow noreferrer">schema already defined in JSON</a>. Say you have <a href="https://gist.github.com/nonbeing/9e1cbec94a4ad7a17cf3948db5ccb901" rel="nofollow noreferrer">this particular "schema.json" file</a>. Using the helper above, you can get the <code>SchemaField</code> representation the Python client needs by passing the file's path to <code>parse_bq_json_schema</code> and keeping the returned list as <code>res_schema</code>, the name the snippets below use.

Now, to <a href="https://cloud.google.com/bigquery/docs/tables#bigquery-create-table-python" rel="nofollow noreferrer">create a table having the above schema using the Python SDK</a>, you would do the following:

<pre><code>bqclient = bigquery.Client()  # assumption: default credentials and project are configured
dataset_ref = bqclient.dataset('YOUR_DATASET')
table_ref = dataset_ref.table('YOUR_TABLE')
table = bigquery.Table(table_ref, schema=res_schema)
</code></pre>

Optionally, you can set up time-based partitioning (if you need it) like this:

<pre><code>table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field='timestamp'  # name of column to use for partitioning
)
</code></pre>

Finally, create the table:

<pre><code>table = bqclient.create_table(table)
print('Created table {}, partitioned on column {}'.format(
    table.table_id, table.time_partitioning.field))
</code></pre>
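To try the recursive parsing without installing the client library or touching a real project, here is a minimal sketch of the same idea as <code>_get_field_schema</code>. Everything in it is an assumption made for illustration: <code>FieldSpec</code> is a hypothetical stand-in for <code>bigquery.SchemaField</code> (not the real class), <code>parse_schema_text</code> reads a string instead of a file, and <code>SAMPLE_SCHEMA</code> is a made-up schema, not the contents of the linked "schema.json".

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical stand-in for bigquery.SchemaField (not the real class)."""
    name: str
    field_type: str = "STRING"
    mode: str = "NULLABLE"
    fields: Tuple["FieldSpec", ...] = ()


def to_field_spec(field: dict) -> FieldSpec:
    """Convert one JSON schema entry, recursing into its nested 'fields'."""
    nested = tuple(to_field_spec(f) for f in field.get("fields", []))
    return FieldSpec(
        name=field["name"],
        field_type=field.get("type", "STRING"),  # same default as the answer's helper
        mode=field.get("mode", "NULLABLE"),      # same default as the answer's helper
        fields=nested,
    )


def parse_schema_text(text: str) -> List[FieldSpec]:
    """Parse a JSON schema document, which is a list of field objects."""
    return [to_field_spec(f) for f in json.loads(text)]


# Made-up sample schema for illustration only.
SAMPLE_SCHEMA = """
[
  {"name": "event_id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "timestamp", "type": "TIMESTAMP"},
  {"name": "user", "type": "RECORD", "fields": [
    {"name": "email"},
    {"name": "tags", "mode": "REPEATED"}
  ]}
]
"""

if __name__ == "__main__":
    for spec in parse_schema_text(SAMPLE_SCHEMA):
        # Print each top-level column with the names of its nested columns.
        print(spec.name, spec.field_type, spec.mode, [f.name for f in spec.fields])
```

Swapping <code>FieldSpec</code> back for <code>bigquery.SchemaField</code> should give the helper from the answer; the stand-in only exists so the defaulting of <code>type</code> and <code>mode</code> and the handling of nested <code>RECORD</code> fields can be checked locally first.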