Apache Beam Python ReadFromText regex

1 answer

User

#1 · Posted on 2024-04-25 23:05:14

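I figured out how to read the data for the intended dates without using wildcards, by writing a Python function instead. The idea is to create an array containing all the read operations, then flatten that array and use it as the input to the pipeline.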

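    import logging
    from datetime import datetime, timedelta

    import apache_beam as beam


    def read_files(pipeline, intended_day):
        """Return one PCollection per day: intended_day and the day before it."""
        collections = []
        # Days are 'YYYYMMDD' strings; derive the previous day from the intended one.
        previous_day = (datetime.strptime(intended_day, '%Y%m%d') - timedelta(days=1)).strftime('%Y%m%d')

        days = [intended_day, previous_day]
        path = "gs://sensors/{}/<hash>/*"
        for day in days:
            try:
                file_name = path.format(day)
                # Each read gets its own label so the transform names stay unique.
                collection = pipeline | ('Read Past for %s' % day) >> beam.io.ReadFromText(file_name)
                collections.append(collection)
            except IOError:
                # A day whose pattern cannot be read is logged and skipped.
                logging.error("Failed to read for day %s", day)

        return collections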
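The date arithmetic in the function above can be looked at on its own. The following is only a sketch using the standard library: `day_patterns`, `days_back`, and `template` are hypothetical names that do not appear in the answer, and `<hash>` is left as the placeholder it is in the original path.

```python
from datetime import datetime, timedelta


def day_patterns(intended_day, days_back=1, template="gs://sensors/{}/<hash>/*"):
    """Return one file pattern per day, starting with intended_day.

    days_back and template are illustrative parameters, not part of the
    original answer; '<hash>' is kept as an unfilled placeholder.
    """
    start = datetime.strptime(intended_day, "%Y%m%d")
    days = [
        (start - timedelta(days=offset)).strftime("%Y%m%d")
        for offset in range(days_back + 1)
    ]
    return [template.format(day) for day in days]


# 2024 is a leap year, so the day before March 1 is February 29.
for pattern in day_patterns("20240301"):
    print(pattern)
```

Building the list of patterns first keeps the month and year rollover logic in `datetime`, which a glob or regex over the date string would not handle.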

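Then call the function when building the pipeline, and merge the returned collections into a single input with beam.Flatten().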
