How do I change the encoding of a bytes object in Python 3?

>>> import urllib.request as urllib2
>>> print(urllib2.urlopen('http://www.ted.com/talks/subtitles/id/667/lang/fa').read().decode("utf-8"))
{"captions":[{"duration":4000,"content":"\u0627\u0645\u0631\u0648\u0632\u0647 \u062a\u0645\u0627\u0645 \u0628\u0646\u0627\u0647\u0627 \u06cc\u06a9 \u0686\u06cc\u0632 \u0645\u0634\u062a\u0631\u06a9 \u062f\u0627\u0631\u0646\u062f.","startOfParagraph"...

1 answer

User

#1 · posted 2024-06-16 14:10:35

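What you have there is JSON, with the Arabic-script characters escaped as permitted by RFC 7159. The bytes were already decoded correctly; the \uXXXX sequences are part of the JSON text itself, so you need to parse it with json.loads() to undo the escaping. Once you have done that, you should be able to extract the "content" values and print them (to a file, since the console on Windows can't always display Unicode properly). Something like this: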

>>> import json
>>> import urllib.request as urllib2
>>> # decode the raw bytes, then let the JSON parser resolve the \uXXXX escapes
>>> result = json.loads(urllib2.urlopen('...').read().decode('utf8'))
>>> with open('example.txt', 'w', encoding='utf8') as f:
...     print(result['captions'][0]['content'], file=f)

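You should then be able to open example.txt in the editor of your choice. If it doesn't display correctly, make sure the editor's encoding is set to UTF-8.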
