TypeError: not JSON serializable with Py2neo batch submit

Asked 2025-04-18 12:48

I am building a very large graph database with more than 1.4 million nodes and 160 million relationships. My code is as follows:

from py2neo import neo4j
# first we create all the nodes
batch = neo4j.WriteBatch(graph_db)
nodedata = []

for index, i in enumerate(words): # words is predefined
    batch.create({"term":i})
    if index%5000 == 0: #so as not to exceed the batch restrictions
        results = batch.submit()
        for x in results:
            nodedata.append(x)
        batch = neo4j.WriteBatch(graph_db)

results = batch.submit()
for x in results:
    nodedata.append(x)

#nodedata contains all the node instances now
#time to create relationships

batch = neo4j.WriteBatch(graph_db)
for iindex, i in enumerate(weightdata): #weightdata is predefined 
    batch.create((nodedata[iindex], "rel", nodedata[-iindex], {"weight": i})) #there is a different way how I decide the indexes of nodedata, but just as an example I put iindex and -iindex
    if iindex%5000 == 0: #again batch constraints
        batch.submit() #this is the line that shows error
        batch = neo4j.WriteBatch(graph_db)
batch.submit()

I get the following error:

Traceback (most recent call last):
  File "test.py", line 53, in <module>
    batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 351, in send
    rs = self._send_request(request.method, request.uri, request.body, request.$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 326, in _send_re$
    data = json.dumps(data, separators=(",", ":"))
  File "/usr/lib64/python2.6/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
    for chunk in self._iterencode_list(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 317, in _iterencode
    for chunk in self._iterencode_default(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 323, in _iterencode_default
    newobj = self.default(o)
  File "/usr/lib64/python2.6/json/encoder.py", line 344, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: 3448 is not JSON serializable

Can anyone tell me what exactly is happening here and how I can fix it? Any help is much appreciated. Thanks in advance! :)

2 Answers

1

I have never used py2neo, but going by its documentation, this call:

batch.create((nodedata[iindex], "rel", nodedata[-iindex], {"weight": i}))

is missing the rel() part:

batch.create(rel(nodedata[iindex], "rel", nodedata[-iindex], {"weight": i}))
1

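It is hard to diagnose this precisely, because we cannot run your code against the same data. However, the problem is most likely that the items in weightdata are of the wrong type.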

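Step through your code, or print the data types as it runs, to see what type i has in the {"weight": i} part. You will probably find that it is not a built-in int (or float), which is what the JSON encoder needs in order to write a number. If that guess is right, you need to find a way to convert the property value to an int before using it in the property set.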
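
The check and the conversion described above could look roughly like this. It is only a sketch in Python 3 using the standard json module, with no py2neo or Neo4j involved. Int64Like is a hypothetical stand-in for whatever non-built-in numeric type weightdata really holds, and is_json_serializable and to_native_number are illustrative helper names, not part of any library.

```python
import json


class Int64Like(object):
    """Hypothetical stand-in for a non-built-in integer type.

    It prints like a plain number but is not an int, so the json
    module refuses to encode it.
    """

    def __init__(self, value):
        self._value = value

    def __repr__(self):
        return str(self._value)

    def __int__(self):
        return int(self._value)

    def __float__(self):
        return float(self._value)


def is_json_serializable(value):
    """Return True if the standard json encoder accepts the value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


def to_native_number(value):
    """Coerce a number-like value to a built-in int or float."""
    if isinstance(value, (int, float)):
        return value
    as_float = float(value)
    # Keep whole numbers as int so they are written as JSON integers.
    return int(value) if as_float.is_integer() else as_float


# Illustrative data only; the real weightdata comes from the question.
weightdata = [Int64Like(3448), 7, 2.5]

# Step 1: print each item's type and whether json can encode it.
for i in weightdata:
    print(type(i).__name__, is_json_serializable(i))

# Step 2: convert everything before building the property dicts.
clean_weights = [to_native_number(i) for i in weightdata]
payload = json.dumps({"weight": clean_weights[0]})
```

If the first loop reports a type other than int or float for your real data, passing to_native_number(i) instead of i in {"weight": i} would be one way to apply the conversion, assuming the values really are numeric.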
