Answers to the question: Parsing JSON from a unicode string into a dictionary

Parsing JSON from a unicode string into a dictionary


1 answer

Anonymous, 1 day ago

Skilled in: python, mysql, java

As others have mentioned, your input data is not JSON. Ideally, that should be fixed upstream so that you receive valid JSON.

However, if that is beyond your control, you can convert the data to JSON.

The main problem is the keys that are not quoted. We can fix that by using a regular expression to look for a valid name in the first field of each line. If a valid name is found, we wrap it in double quotes.

<pre><code>import json
import re

source = u'''[{
attributes: {
    NAME: "Name_1ĂĂÎÎ",
    TYPE: "Tip1",
    LOC_JUD: "Bucharest",
    LAT_LON: "234343/432545",
    S70: "2342345",
    MAP: "Map_one",
    SCH: "1:5000",
    SURSA: "PPP"
}
}, {
attributes: {
    NAME: "NAME_2șțț",
    TYPE: "Tip2",
    LOC_JUD: "cea",
    LAT_LON: "123/54645",
    S70: "4324",
    MAP: "Map_two",
    SCH: "1:578000",
    SURSA: "PPP"
}
}
]
'''

# Split source into lines, then split lines into colon-separated fields
a = [s.strip().split(': ') for s in source.splitlines()]

# Wrap names in first field in double quotes
valid_name = re.compile(r'(^\w+$)')
for row in a:
    row[0] = valid_name.sub(r'"\1"', row[0])

# Recombine the data and load it
data = json.loads(' '.join([': '.join(row) for row in a]))

# Test (Python 2 print statements)
print data[0]["attributes"]
print '- ' * 30
print json.dumps(data, indent=4, ensure_ascii=False)
</code></pre>

Output

<pre><code>{u'LOC_JUD': u'Bucharest', u'NAME': u'Name_1\u0102\u0102\xce\xce', u'MAP': u'Map_one', u'SURSA': u'PPP', u'S70': u'2342345', u'TYPE': u'Tip1', u'LAT_LON': u'234343/432545', u'SCH': u'1:5000'}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
[
    {
        "attributes": {
            "LOC_JUD": "Bucharest",
            "NAME": "Name_1ĂĂÎÎ",
            "MAP": "Map_one",
            "SURSA": "PPP",
            "S70": "2342345",
            "TYPE": "Tip1",
            "LAT_LON": "234343/432545",
            "SCH": "1:5000"
        }
    },
    {
        "attributes": {
            "LOC_JUD": "cea",
            "NAME": "NAME_2șțț",
            "MAP": "Map_two",
            "SURSA": "PPP",
            "S70": "4324",
            "TYPE": "Tip2",
            "LAT_LON": "123/54645",
            "SCH": "1:578000"
        }
    }
]
</code></pre>

Note that this code is somewhat fragile. It handles data formatted as shown in the question, but it will not work if a line contains more than one key-value pair.

As mentioned earlier, the best way to fix this problem is upstream, where the non-JSON is being generated.
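If the line-by-line assumption is the main worry, one possible variation is to scan the whole text with a single regular expression instead of splitting it into lines. The sketch below is not part of the original answer: the names `TOKEN`, `quote_bare_keys` and `loads_relaxed` are hypothetical, and it assumes Python 3, keys that look like identifiers, and double-quoted string values. The pattern matches complete quoted strings first and returns them unchanged, so a colon inside a value such as "1:5000" is not mistaken for a key separator.

```python
import json
import re

# Hypothetical helper pattern (not from the original answer).
# Alternative 1 matches a complete double-quoted string, which is left as-is.
# Alternative 2 captures a bare identifier that is followed by a colon,
# which is treated as an unquoted key.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\b([A-Za-z_]\w*)(?=\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted identifier keys in double quotes, skipping string values."""
    def fix(match):
        key = match.group(1)
        # group(1) is None when the match was a quoted string
        return match.group(0) if key is None else '"%s"' % key
    return TOKEN.sub(fix, text)


def loads_relaxed(text):
    """Quote bare keys, then hand the result to the standard JSON parser."""
    return json.loads(quote_bare_keys(text))


# Several key-value pairs on one line, plus a value containing ", " and ": "
sample = ('[{attributes: {NAME: "Name_1ĂĂÎÎ", SCH: "1:5000", S70: "2342345"}}, '
          '{attributes: {NAME: "a, b: c"}}]')

data = loads_relaxed(sample)
print(data[0]["attributes"]["SCH"])  # prints 1:5000
```

This is still a workaround rather than a real parser: it does not handle single-quoted strings, comments, trailing commas, or keys that are not plain identifiers, so fixing the producer of the data remains the better option.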
