Running a Google Dataflow Template with Python

Posted 2024-03-29 00:35:18


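I want to execute a Google Dataflow template from Python. So far I have been launching the template through the Dataflow REST API, integrated with Cloud Functions. This is the request I send from Postman to launch the template: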

URL: https://dataflow.googleapis.com/v1b3/projects/{{my-project-id}}/templates:launch?gcsPath=gs://{{my-cloud-storage-bucket}}/temp/cloud-dataprep-template

    {
        "jobName": "test-datfalow-job",
        "parameters": {
            "inputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-folder/**/*\"}",
            "outputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-output/output.csv\"}"
        },
        "environment": {
            "tempLocation": "gs://{{my-cloud-storage-bucket}}/tmp",
            "zone": "us-central1-f"
        }
    }

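I am not sure whether I can do this with the Google API Python client, or whether I have to use Python's requests.post together with Google Cloud authentication.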


1 answer

Answer #1 · posted 2024-03-29 00:35:18

You can use the templates launch method from the Dataflow API Client Library for Python, like this:

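    import googleapiclient.discovery
    from oauth2client.client import GoogleCredentials

    # PROJECT_ID, LOCATION, SOME_NAME and PATH_TO_TEMPLATE are placeholders
    # that you must define; {{my-cloud-storage-bucket}} stands for your bucket name.
    project = PROJECT_ID
    location = LOCATION  # defined here but not passed to launch() below

    # Pick up Application Default Credentials from the environment.
    credentials = GoogleCredentials.get_application_default()

    # Build a client for the Dataflow v1b3 API.
    dataflow = googleapiclient.discovery.build('dataflow', 'v1b3', credentials=credentials)

    # gcsPath is a query parameter; everything else goes in the request body.
    result = dataflow.projects().templates().launch(
        projectId=project,
        gcsPath=PATH_TO_TEMPLATE,
        body={
            "jobName": SOME_NAME,
            "parameters": {
                # The location maps are passed as JSON-encoded strings.
                "inputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-folder/**/*\"}",
                "outputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-output/output.csv\"}"
            },
            "environment": {
                "tempLocation": "gs://{{my-cloud-storage-bucket}}/tmp",
                "zone": "us-central1-f"
            }
        }
    ).execute()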
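
The question also asks about the requests.post route. The following is only a sketch of how that might look, not a tested recipe: it assumes the google-auth and requests packages are installed and that Application Default Credentials are configured. The helper names (build_launch_url, build_launch_body, launch_with_rest) are made up for illustration. Building the two location parameters with json.dumps avoids hand-escaping the nested JSON strings.

```python
import json
from urllib.parse import quote, urlencode


def build_launch_url(project_id, gcs_path):
    """Return the templates:launch URL shown in the question."""
    base = "https://dataflow.googleapis.com/v1b3/projects/{}/templates:launch".format(
        quote(project_id, safe="")
    )
    # gcsPath travels as a query parameter, not in the JSON body.
    return base + "?" + urlencode({"gcsPath": gcs_path})


def build_launch_body(job_name, input_locations, output_locations,
                      temp_location, zone):
    """Return the request body; the location maps become JSON-encoded strings."""
    return {
        "jobName": job_name,
        "parameters": {
            "inputLocations": json.dumps(input_locations),
            "outputLocations": json.dumps(output_locations),
        },
        "environment": {"tempLocation": temp_location, "zone": zone},
    }


def launch_with_rest(project_id, gcs_path, body):
    """POST the launch request using Application Default Credentials."""
    # Imported lazily so the two helpers above work without google-auth.
    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    # AuthorizedSession is a requests.Session that attaches the access token.
    session = AuthorizedSession(credentials)
    response = session.post(build_launch_url(project_id, gcs_path), json=body)
    response.raise_for_status()
    return response.json()
```

You would call build_launch_body with plain dicts such as {"location1": "gs://my-bucket/my-folder/**/*"} (the bucket name here is a placeholder) and pass the result to launch_with_rest. The same body dict can also be handed to the client library's launch() call above.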
