<p>What is the correct way to read a text file from the internet?
For example, the text file at <a href="https://gist.githubusercontent.com/deekayen/4148741/raw/01c6252ccc5b5fb307c1bb899c95989a8a284616/1-1000.txt" rel="nofollow noreferrer">https://gist.githubusercontent.com/deekayen/4148741/raw/01c6252ccc5b5fb307c1bb899c95989a8a284616/1-1000.txt</a></p>
<p>The code below works, but it produces an extra <code>b'</code> in front of each word:</p>
<pre><code>from urllib.request import urlopen
#url = 'https://raw.githubusercontent.com/first20hours/google-10000-english/master/google-10000-english.txt'
url = 'https://gist.githubusercontent.com/deekayen/4148741/raw/01c6252ccc5b5fb307c1bb899c95989a8a284616/1-1000.txt'
#data = urlopen(url)
#print('H w')
# it's a file like object and works just like a file
l = set()
data = urlopen(url)
for line in data: # files are iterable
    word = line.strip()
    print(word)
    l.add(word)
print(l)
</code></pre>
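<p>For context, the <code>b'</code> prefix is how Python prints <code>bytes</code> objects: iterating over the response returned by <code>urlopen</code> yields raw bytes lines rather than <code>str</code>. The following is only a minimal sketch of one possible way to decode them, assuming the file is UTF-8 encoded (that encoding is an assumption, not something the server response was checked for). The helper names <code>decode_words</code> and <code>read_words</code> are hypothetical and introduced purely for illustration.</p>

```python
from urllib.request import urlopen


def decode_words(byte_lines, encoding="utf-8"):
    """Decode an iterable of bytes lines into a set of stripped, non-empty str.

    `encoding` defaults to UTF-8, which is an assumption about the file.
    """
    words = set()
    for raw in byte_lines:
        # bytes -> str, then drop the trailing newline and surrounding spaces
        word = raw.decode(encoding).strip()
        if word:  # skip blank lines
            words.add(word)
    return words


def read_words(url, encoding="utf-8"):
    """Fetch `url` and return its lines as a set of str."""
    # The response object is iterable line by line and yields bytes
    with urlopen(url) as response:
        return decode_words(response, encoding)
```

<p>Calling <code>read_words(url)</code> with the <code>url</code> from the question should then give a set of plain strings without the <code>b'</code> prefix. Splitting the decoding into its own function keeps it usable on any iterable of bytes, not just a network response.</p>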