How do I convert utf8 to cp1251 for writing the ID3_V1 tags of an mp3 file?

Asked 2025-04-18 04:36

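ID3_V1 supports only the latin1 encoding. To write V1 tags containing Russian characters, cp1251 has to be used instead. I want to copy data from the V2 tags (which use unicode) into the V1 tags. I read the V2 tags with eyeD3 using the following code: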

tag.link(mp3path, v=eyeD3.ID3_V2)
mp3album_v2 = tag.getAlbum()
...
tag.link(mp3path, v=eyeD3.ID3_V1)
tag.setTextEncoding(eyeD3.LATIN1_ENCODING)
tag.setAlbum(mp3album_v2.encode('cp1251')) # ???
tag.update()

This returns:

>>> print mp3album_v2
Жить в твоей голове

>>> print type(mp3album_v2)
<type 'unicode'>

>>> print repr(mp3album_v2)
u'\u0416\u0438\u0442\u044c \u0432 \u0442\u0432\u043e\u0435\u0439 \u0433\u043e\u043b\u043e\u0432\u0435'

It looks like setAlbum expects a utf-8 string (?):

def setAlbum(self, a):
    self.setTextFrame(ALBUM_FID, self.strToUnicode(a));

def strToUnicode(self, s):
    t = type(s);
    if t != unicode and t == str:
        s = unicode(s, eyeD3.LOCAL_ENCODING);
    elif t != unicode and t != str:
        raise TagException("Wrong type passed to strToUnicode: %s" % str(t));
    return s;

But if I try tag.setAlbum(mp3album_v2.encode('cp1251').encode('utf-8')), I get the error UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 0: invalid continuation byte

1 Answer

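ID3v1 tags cannot reliably hold any non-ASCII characters. You can write cp1251-encoded bytes into an ID3v1 tag, but they will only be displayed as Cyrillic on an OS installed with a Russian locale, and even then not by every application.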
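To make the locale problem concrete, here is a small sketch that needs no eyeD3. It assumes the reading application decodes the tag bytes with cp1252, the Western European Windows code page; that is only one example of a non-Russian locale, chosen for illustration.

```python
# -*- coding: utf-8 -*-
# The first word of the album title ("Жить"), written with escapes so the
# snippet does not depend on the source file encoding.
word = u'\u0416\u0438\u0442\u044c'

# These are the bytes that would end up in the ID3v1 field.
tag_bytes = word.encode('cp1251')

# A reader that assumes cp1251 recovers the Cyrillic text.
seen_on_russian_locale = tag_bytes.decode('cp1251')

# A reader that assumes cp1252 (illustrative assumption) shows accented
# Latin letters instead, because the tag carries no encoding marker.
seen_on_western_locale = tag_bytes.decode('cp1252')
```

The same four bytes are valid in both code pages, so nothing fails loudly; the text is simply shown wrong.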

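EyeD3 works with Unicode strings internally and arbitrarily chooses latin1 (also known as ISO-8859-1) as the encoding for ID3v1 tags. That is probably not a good choice, because latin1 is never the default locale encoding on a Windows system (in Western Europe it is actually cp1252, which is similar but not the same).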
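A quick way to see the "similar but not the same" part, using only Python's built-in codecs: the two encodings disagree in the byte range 0x80-0x9F and agree on the accented letters above it.

```python
# Byte 0x80 is one of the positions where cp1252 and latin1 differ.
euro_byte = b'\x80'
as_cp1252 = euro_byte.decode('cp1252')  # the euro sign, U+20AC
as_latin1 = euro_byte.decode('latin1')  # a C1 control character, U+0080

# Byte 0xE9 (e with acute accent) decodes identically in both encodings.
shared_byte = b'\xe9'
same_in_both = shared_byte.decode('cp1252') == shared_byte.decode('latin1')
```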

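However, this choice of encoding has one useful property: every byte maps to the Unicode character with the same code point number. You can take advantage of this by building a Unicode string whose characters, when encoded as latin1, come out as the bytes of your chosen string in some other encoding.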

album_name = u'Жить в твоей голове'
mangled_name = album_name.encode('cp1251').decode('latin1')
tag.setAlbum(mangled_name) # will encode as latin1, resulting in cp1251 bytes
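If you want to convince yourself before touching a real file, the round trip can be checked without eyeD3. This is only a sketch: mangle_for_latin1 is a helper name made up for illustration, and the final latin1 encode stands in for what the library is described as doing when it writes the V1 tag.

```python
# -*- coding: utf-8 -*-
# Album title from the question, written with escapes so the snippet does
# not depend on the source file encoding.
album_name = (u'\u0416\u0438\u0442\u044c \u0432 \u0442\u0432\u043e\u0435\u0439 '
              u'\u0433\u043e\u043b\u043e\u0432\u0435')


def mangle_for_latin1(text, target_encoding='cp1251'):
    """Return a string whose latin1 encoding equals text in target_encoding.

    Hypothetical helper, not part of eyeD3.
    """
    return text.encode(target_encoding).decode('latin1')


mangled_name = mangle_for_latin1(album_name)

# Stand-in for the library's own latin1 encoding step when writing ID3v1.
written_bytes = mangled_name.encode('latin1')

# The bytes that reach the file are exactly the cp1251 bytes of the title.
round_trip_ok = written_bytes == album_name.encode('cp1251')
```

Note that mangled_name itself is gibberish when printed; it is only meaningful as a carrier for the bytes.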

This is a horrible hack of questionable benefit, and one of the reasons you should avoid ID3v1.

Answered 2025-04-18 by Python大师
