Spark SQL: SQL syntax for the nth array item

1 answer

User

#1 · Posted on 2024-04-20 13:09:25

It is not clear what you mean by a JSON object, so let's consider two different cases:

Array of structs

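import tempfile

# Write a single JSON record to a temporary file.
path = tempfile.mktemp()
with open(path, "w") as fw:
    fw.write('''{"stuff": [{"a": 1, "b": 2, "c": 3}]}''')

# Let Spark infer the schema from the JSON, then expose the data to SQL as table "df".
df = sqlContext.read.json(path)
df.registerTempTable("df")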

df.printSchema()
## root
##  |-- stuff: array (nullable = true)
##  |    |-- element: struct (containsNull = true)
##  |    |    |-- a: long (nullable = true)
##  |    |    |-- b: long (nullable = true)
##  |    |    |-- c: long (nullable = true)

# [0] selects the first array element; .a reads field a of that struct.
sqlContext.sql("SELECT stuff[0].a FROM df").show()

## +---+
## |_c0|
## +---+
## |  1|
## +---+

Array of maps

# Note: schema inference from dictionaries has been deprecated
# don't use this in practice
df = sc.parallelize([{"stuff": [{"a": 1, "b": 2, "c": 3}]}]).toDF()
df.registerTempTable("df")

df.printSchema()
## root
##  |-- stuff: array (nullable = true)
##  |    |-- element: map (containsNull = true)
##  |    |    |-- key: string
##  |    |    |-- value: long (valueContainsNull = true)

# [0] selects the first array element; ['a'] looks up key 'a' in that map.
sqlContext.sql("SELECT stuff[0]['a'] FROM df").show()
## +---+
## |_c0|
## +---+
## |  1|
## +---+
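
To make the difference between the two cases easier to see side by side, here is a minimal sketch in plain Python that only builds the query strings used above. The helper name `nth_item_expr` and its parameters are hypothetical and not part of Spark. It assumes trusted column and field names (no quoting or escaping is done), and the non-negative index check is a choice made for this sketch rather than a statement about Spark's behavior.

```python
def nth_item_expr(column, index, field, element_kind):
    """Build a SQL expression that reads `field` from element `index` of `column`.

    Hypothetical helper for illustration only.
    element_kind "struct" uses dot access, "map" uses a quoted key in brackets.
    """
    if index < 0:
        # Restriction chosen for this sketch.
        raise ValueError("index must be non-negative")
    base = "{}[{}]".format(column, index)
    if element_kind == "struct":
        return "{}.{}".format(base, field)
    if element_kind == "map":
        return "{}['{}']".format(base, field)
    raise ValueError("element_kind must be 'struct' or 'map'")


# The same two queries as in the answer above.
struct_query = "SELECT {} FROM df".format(nth_item_expr("stuff", 0, "a", "struct"))
map_query = "SELECT {} FROM df".format(nth_item_expr("stuff", 0, "a", "map"))

print(struct_query)  # SELECT stuff[0].a FROM df
print(map_query)     # SELECT stuff[0]['a'] FROM df
```

Either string can then be passed to sqlContext.sql(...) exactly as in the examples above. Which form applies depends on what df.printSchema() reports for the array element type.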

See also: Querying Spark SQL DataFrame with complex types
