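<p>You didn't open the file in binary mode, so <code>f.read()</code> tries to decode it as UTF-8 text, and that fails because the file contains bytes that are not valid UTF-8. Since we are hashing bytes rather than a string, the encoding doesn't matter, nor does it matter whether the file is text at all: just open it in binary mode and read it as bytes.</p>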
<pre><code>>>> with open("test.h5.bz2","r") as f: print(hashlib.sha1(f.read()).hexdigest())
Traceback (most recent call last):
  File "<ipython-input-3-fdba09d5390b>", line 1, in <module>
    with open("test.h5.bz2","r") as f: print(hashlib.sha1(f.read()).hexdigest())
  File "/home/dsm/sys/pys/Python-3.5.1-bin/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 10: invalid start byte
</code></pre>
<p>But opening the same file in binary mode (<code>"rb"</code>) works:</p>
<pre><code>>>> with open("test.h5.bz2","rb") as f: print(hashlib.sha1(f.read()).hexdigest())
21bd89480061c80f347e34594e71c6943ca11325
</code></pre>
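<p>For large files you may not want to load everything into memory with a single <code>f.read()</code>. A minimal sketch of the same idea that reads in chunks is below; the helper name <code>sha1_of_file</code> and the 64 KiB chunk size are illustrative choices of mine, not part of the original question.</p>

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read as raw bytes.

    `sha1_of_file` and `chunk_size` are hypothetical names chosen for
    this sketch; only hashlib.sha1, update() and hexdigest() come from
    the standard library.
    """
    digest = hashlib.sha1()
    # "rb" is the important part: no text decoding is attempted.
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the loop.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

<p>Feeding the data to <code>update()</code> piece by piece gives the same digest as hashing the concatenated bytes in one go, so <code>sha1_of_file("test.h5.bz2")</code> should match the one-liner above.</p>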