I am trying to run an Apache Beam pipeline on Google Cloud Platform using Cloud Dataflow. However, it does not seem to get past this line of code:
p | 'GetFile' >> beam.io.ReadFromText(input_filename)
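It prints this warning and then stays stuck there:
WARNING:root:Retry with exponential backoff: waiting for 5.14973849643 seconds before retrying exists because we caught exception: SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
Here is my code: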
import apache_beam as beam

PROJECT='xxxx'
BUCKET='xxxx'

class Split(beam.DoFn):
    def process(self, element):
        IATA,AIRPORT,CITY,STATE,COUNTRY,LATITUDE,LONGITUDE= element.split(",")
        return [{
            'IATA': IATA,
            'AIRPORT': AIRPORT,
            'CITY': CITY
        }]

def run():
    argv = [
        '--project={0}'.format(PROJECT),
        '--job_name=examplejob2',
        '--save_main_session',
        '--staging_location=gs://{0}/staging/'.format(BUCKET),
        '--temp_location=gs://{0}/staging/'.format(BUCKET),
        '--runner=DataflowRunner'
    ]
    p = beam.Pipeline(argv=argv)
    input_filename = 'gs://{0}/airports.csv'.format(BUCKET)
    output_filename = 'gs://{0}/output.txt'.format(BUCKET)
    # find all lines that contain the searchTerm
    (p
     |'GetFile' >> beam.io.textio.ReadFromText(input_filename)
     |'Split' >> beam.ParDo(Split())
     |'Write' >> beam.io.WriteToText(output_filename)
    )
    p.run()

if __name__ == '__main__':
    run()
Can anyone help me solve this problem?
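Since the warning is a certificate verification failure rather than a Beam error, a first diagnostic step might be to look at which CA certificates the local Python interpreter is set up to trust. The sketch below is not a fix and uses only the standard-library `ssl` module; `describe_ssl_setup` is a hypothetical helper name, not part of Beam. It only reports the interpreter's defaults, and the HTTP library used by the Beam SDK may ship its own CA bundle, so treat the output as a starting point.

```python
import os
import ssl
import sys


def describe_ssl_setup():
    """Collect the certificate settings the local Python interpreter uses.

    Hypothetical helper for diagnosis only; it does not change any settings.
    """
    paths = ssl.get_default_verify_paths()
    return {
        'python_version': sys.version.split()[0],
        'openssl_version': ssl.OPENSSL_VERSION,
        # CA file and directory that this OpenSSL build looks in by default.
        'default_cafile': paths.openssl_cafile,
        'default_capath': paths.openssl_capath,
        # Resolved locations on this machine (None if they do not exist).
        'cafile_found': paths.cafile,
        'capath_found': paths.capath,
        # Values of the environment variables that override the defaults.
        'cafile_env_value': os.environ.get(paths.openssl_cafile_env),
        'capath_env_value': os.environ.get(paths.openssl_capath_env),
    }


if __name__ == '__main__':
    for key, value in sorted(describe_ssl_setup().items()):
        print('{0}: {1}'.format(key, value))
```

If both `cafile_found` and `capath_found` come back as `None`, the interpreter has no default CA store to verify against. If they look fine, the cause may lie elsewhere, for example a proxy that re-signs HTTPS traffic.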