How to fix TypeError: Unicode-objects must be encoded before hashing?

461 votes

10 answers

569879 views

Asked 2025-04-17 03:16

I am getting this error:

Traceback (most recent call last):
  File "python_md5_cracker.py", line 27, in <module>
  m.update(line)
TypeError: Unicode-objects must be encoded before hashing

when I try to run this code in Python 3.2.2:

import hashlib, sys
m = hashlib.md5()
hash = ""
hash_file = input("What is the file name in which the hash resides?  ")
wordlist = input("What is your wordlist?  (Enter the file name)  ")
try:
  hashdocument = open(hash_file, "r")
except IOError:
  print("Invalid file.")
  raw_input()
  sys.exit()
else:
  hash = hashdocument.readline()
  hash = hash.replace("\n", "")

try:
  wordlistfile = open(wordlist, "r")
except IOError:
  print("Invalid file.")
  raw_input()
  sys.exit()
else:
  pass
for line in wordlistfile:
  # Flush the buffer (this caused a massive problem when placed 
  # at the beginning of the script, because the buffer kept getting
  # overwritten, thus comparing incorrect hashes)
  m = hashlib.md5()
  line = line.replace("\n", "")
  m.update(line)
  word_hash = m.hexdigest()
  if word_hash == hash:
    print("Collision! The word corresponding to the given hash is", line)
    input()
    sys.exit()

print("The hash given does not correspond to any supplied word in the wordlist.")
input()
sys.exit()

10 Answers

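Programs often have to handle data that comes from different places, such as user input, files, or network requests. Before the program can work with that data, it has to be converted into a format the program can use.

For example, a user's name taken from a form arrives as a string. To use it, you store it in a variable so that later code can refer to it.

Sometimes the data also needs to be operated on: calculated, sorted, or filtered. The programming language provides tools and functions for this, much as knives, pots, and spatulas in a kitchen help you prepare the ingredients (the data).

In short, handling data means converting raw data into a format the program can understand and use, then applying the language's features to it. For hashing, that means encoding the string to bytes first: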

import hashlib
string_to_hash = '123'
hash_object = hashlib.sha256(str(string_to_hash).encode('utf-8'))
print('Hash', hash_object.hexdigest())

Answered 2025-04-17 by Python大师

183

You must specify an encoding, such as utf-8. Try this simple approach.

This example hashes a random 256-bit number with the SHA256 algorithm:

>>> import hashlib
>>> import random
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'
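To connect this back to the question, here is a minimal sketch that reproduces the error and then applies the same fix with `hashlib.md5`. The word `"password"` is only an illustrative value and does not come from the question.

```python
import hashlib

word = "password"  # illustrative value, not taken from the question

# On Python 3, hash objects accept only bytes-like input,
# so passing a str raises TypeError.
try:
    hashlib.md5().update(word)
    raised = False
except TypeError:
    raised = True

# Encoding the str to bytes first avoids the error.
digest = hashlib.md5(word.encode("utf-8")).hexdigest()

print("TypeError raised for str input:", raised)
print("MD5 of encoded word:", digest)
```

The digest is a `str` of 32 hexadecimal characters, so it can be compared directly with a hash read from a text file.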

Answered 2025-04-17 by Python大师

440

It is probably looking for a character encoding for the text read from wordlistfile.

wordlistfile = open(wordlist,"r",encoding='utf-8')

Or, if you are working line by line:

line.encode('utf-8')
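Two details are easy to miss here. Passing `encoding=` to `open` only controls how the file is decoded into `str`, so each line still has to be encoded before hashing. And `str.encode` returns a new `bytes` object and leaves `line` unchanged, so its result must be passed to `update` (or assigned). A minimal sketch of the question's loop with that change, using an in-memory list as a stand-in for the wordlist file (the words and the `md5_hex` helper are hypothetical):

```python
import hashlib

def md5_hex(word):
    """Return the MD5 hex digest of a str, encoded as UTF-8."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()

lines = ["apple\n", "banana\n"]   # stand-in for the wordlist file
target = md5_hex("banana")        # stand-in for the hash read from the hash file

match = None
for line in lines:
    line = line.replace("\n", "")
    m = hashlib.md5()               # fresh hash object for every word
    m.update(line.encode("utf-8"))  # encode here; line itself stays a str
    if m.hexdigest() == target:
        match = line
        break

print(match)
```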

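Edit

Per the comments below and this answer.

My answer above assumes that the desired output is a str from the wordlist file. If you are comfortable working with bytes, then you are better off using open(wordlist, "rb"). But remember that the hash file must NOT be opened with rb if you are comparing it to the output of hexdigest: hashlib.md5(value).hexdigest() returns a str, which cannot be compared directly with a bytes object: 'abc' != b'abc'. (There is a lot more to this topic, but I don't have time to go into it right now.)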
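A minimal sketch of the binary-mode variant, assuming a throwaway wordlist file created just for the demonstration (its contents and the target word are hypothetical). The wordlist is read as bytes, while the target hash stays a str, as it would be when the hash file is read in text mode.

```python
import hashlib
import os
import tempfile

# Create a temporary wordlist file for the demonstration.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"apple\nbanana\n")
    wordlist_path = tmp.name

# hexdigest() returns a str, matching a hash read from a text-mode file.
target = hashlib.md5(b"banana").hexdigest()

found = None
with open(wordlist_path, "rb") as wordlistfile:
    for line in wordlistfile:
        word = line.rstrip(b"\r\n")      # already bytes, no encode step needed
        if hashlib.md5(word).hexdigest() == target:
            found = word.decode("utf-8")  # decode only for display
            break

os.remove(wordlist_path)
print(found)
```

Both sides of the comparison are str here, because only the hex digest is compared, never the raw bytes of the word.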

Also note that this line:

line.replace("\n", "")

should probably be

line.strip()

which works for both bytes and str. But if you decide to convert to bytes only, you can change the line to:

line.replace(b"\n", b"")
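The difference between the two calls can be sketched as follows, with made-up sample values. One caveat that is not in the answer above: `strip()` removes all leading and trailing whitespace, not only the newline, so a word that legitimately starts or ends with a space would hash differently after stripping.

```python
text_line = "secret\n"
bytes_line = b"secret\n"

# strip() exists on both types and takes no argument here.
stripped_text = text_line.strip()
stripped_bytes = bytes_line.strip()

# replace() needs arguments of the same type as the object it is called on.
replaced_bytes = bytes_line.replace(b"\n", b"")

# strip() also removes surrounding spaces; rstrip("\n") removes only the newline.
padded = " secret \n"
fully_stripped = padded.strip()
newline_only = padded.rstrip("\n")

print(repr(stripped_text), repr(stripped_bytes), repr(replaced_bytes))
print(repr(fully_stripped), repr(newline_only))
```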

Answered 2025-04-17 by Python大师
