Flatten JSON with a join on timestamp. This is similar to the question "Flattening a nested JSON to multiple rows", but I need solutions in Java, Scala, Spark, and PySpark.
Input JSON:
{ "Sensor": "seda_01", "Location": { "City": "Los Angeles", "State": "CA" }, "rain_value": [ [ "1564073521", "0.02" ], [ "1564073522", "0.01" ], [ "1564073523", "0.03" ] ], "sun_value": [ [ "1564073521", "0.11" ], [ "1564073522", "0.10" ], [ "1564073523", "0.13" ] ], "wind_value": [ [ "1564073521", "0.21" ], [ "1564073522", "0.21" ], [ "1564073523", "0.23" ] ] }
{ "Sensor": "seda_01", "Location": { "City": "Los Angeles", "State": "CA" }, "rain_value": [ [ "1564073521", "0.02" ], [ "1564073522", "0.01" ], [ "1564073523", "0.03" ] ], "sun_value": [ [ "1564073521", "0.11" ], [ "1564073522", "0.10" ], [ "1564073523", "0.13" ] ], "wind_value": [ [ "1564073521", "0.21" ], [ "1564073522", "0.21" ], [ "1564073523", "0.23" ] ] }
Output DataFrame:
| Sensor  | Location_City | Location_State | Rain_value_TS | Rain_value | Sun_value_TS | Sun_value |
|---------|---------------|----------------|---------------|------------|--------------|-----------|
| seda_01 | Los Angeles   | CA             | 1564073521    | 0.02       | 1564073521   | 0.11      |
| seda_01 | Los Angeles   | CA             | 1564073522    | 0.01       | 1564073522   | 0.10      |
Note: Rain_value_TS = Sun_value_TS, so either one can be used as the timestamp. If, for a given timestamp, we only have Rain_value, we can put NULL for Sun_value.
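The join logic described above can be sketched in plain Python (standard library only). This is not a Spark solution, just an illustration of the full outer join on timestamp that a Spark version would also need to perform. The function name `flatten_record`, the single `Timestamp` output column, and the shortened `sample` record (where the last `sun_value` reading is deliberately missing) are assumptions made for the example.

```python
import json

def flatten_record(record, series=("rain_value", "sun_value")):
    """Flatten one sensor record into one row per timestamp (hypothetical helper)."""
    # Turn each [[timestamp, value], ...] array into a {timestamp: value} lookup.
    lookups = {name: dict(record.get(name) or []) for name in series}

    # The union of all timestamps plays the role of a full outer join key.
    timestamps = sorted(set().union(*lookups.values()), key=int)

    rows = []
    for ts in timestamps:
        row = {
            "Sensor": record["Sensor"],
            "Location_City": record["Location"]["City"],
            "Location_State": record["Location"]["State"],
            "Timestamp": ts,  # assumption: one shared timestamp column
        }
        for name in series:
            # A missing reading becomes None, the Python stand-in for NULL.
            row[name.capitalize()] = lookups[name].get(ts)
        rows.append(row)
    return rows

# Shortened sample: sun_value has no reading for the last timestamp.
sample = json.loads("""
{ "Sensor": "seda_01",
  "Location": { "City": "Los Angeles", "State": "CA" },
  "rain_value": [ ["1564073521", "0.02"], ["1564073522", "0.01"], ["1564073523", "0.03"] ],
  "sun_value":  [ ["1564073521", "0.11"], ["1564073522", "0.10"] ] }
""")

for row in flatten_record(sample):
    print(row)
```

Passing `series=("rain_value", "sun_value", "wind_value")` would add a `Wind_value` column in the same way. In Spark, the equivalent idea would be to explode each array into (timestamp, value) rows and full-outer-join the results on the sensor and timestamp.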