How to tell whether a string is Base64 or not

------=_NextPart_000_0091_01C940CC.EF5AC860
Content-Type: application/vnd.ms-excel; name="Copy of Book1.xls"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="Copy of Book1.xls"

------=_NextPart_000_0091_01C940CC.EF5AC860
Content-Type: application/vnd.ms-excel; name="=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?="

3 answers

User
#1 · edited 2024-04-25 12:11:14

Please note both Content-Transfer-Encoding have base64
That is irrelevant in this case. Content-Transfer-Encoding applies only to the body payload, not to the headers.
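To make the distinction concrete, here is a minimal sketch using Python's standard email package. The message part is made up for illustration (the report.bin name and the tiny payload are assumptions, not taken from the question). It shows that Content-Transfer-Encoding controls how the body is decoded, while the filename parameter is read without any Base64 step.

```python
import email

# Hypothetical MIME part: the body is Base64, the file name is plain ASCII.
raw = (
    'Content-Type: application/octet-stream; name="report.bin"\n'
    'Content-Transfer-Encoding: base64\n'
    'Content-Disposition: attachment; filename="report.bin"\n'
    '\n'
    'aGVsbG8=\n'
)

part = email.message_from_string(raw)

# decode=True applies the Content-Transfer-Encoding to the body only.
body = part.get_payload(decode=True)

# The file name comes straight from the header parameter.
filename = part.get_filename()

print(body, filename)
```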
=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?=
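This is an encoded header atom. The stdlib function for decoding it is email.header.decode_header. The result of that function still needs some post-processing to interpret, though: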
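import email.errors
import email.header

x = '=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?='
try:
    # Python 3: decode_header returns (value, charset) pairs; encoded parts are bytes
    name = ''.join(
        b.decode(e or 'ascii') if isinstance(b, bytes) else b
        for b, e in email.header.decode_header(x)
    )
except email.errors.HeaderParseError:
    name = x  # leave name as it was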
But...
Content-Type: application/vnd.ms-excel; name="=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?="
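This is completely wrong. What mailer created it? RFC 2047 encoding can only occur in atoms, and a quoted-string is not an atom. RFC 2047 section 5 explicitly forbids it: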
An 'encoded-word' MUST NOT appear within a 'quoted-string'.
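The accepted way to encode header parameters that contain long strings or Unicode characters is RFC 2231, which is a whole new world of pain. But you should be using a standard mail-parsing library, which will deal with that for you.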
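As a rough illustration of what an RFC 2231 parameter looks like and how a standard parser copes with it, the sketch below feeds a hand-written header to Python's email package. The header is an assumption: it is how the same gb2312 bytes from the question might look if the mailer had used RFC 2231, not something the original message contained.

```python
import email

# Hypothetical part using the RFC 2231 form: filename*=charset'language'percent-encoded-bytes
raw = (
    "Content-Type: application/vnd.ms-excel\n"
    "Content-Disposition: attachment; "
    "filename*=gb2312''%B8%B1%B1%BEhen%B8%B1%B1%BEg.xls\n"
    "\n"
)

part = email.message_from_string(raw)

# get_filename() undoes the percent-encoding and decodes with the declared charset.
filename = part.get_filename()
print(filename)
```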
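So, if you want, you can detect '=?' in the filename parameter and try to decode it via RFC 2047. Strictly speaking, though, the correct thing to do is to take the mailer at its word and really call the file =?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?=!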
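A lenient fallback along those lines might look like the sketch below. The function name decode_filename_leniently is hypothetical, and the choice to fall back to the raw string on any decoding problem is an assumption about what you want, not a rule from the RFCs.

```python
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def decode_filename_leniently(filename):
    """Try RFC 2047 decoding on a file name; return it unchanged on failure."""
    if '=?' not in filename:
        return filename  # nothing that looks like an encoded-word
    try:
        return str(make_header(decode_header(filename)))
    except (HeaderParseError, LookupError, UnicodeDecodeError):
        # Malformed payload, unknown charset, or bytes that do not fit the charset:
        # take the mailer at its word and keep the literal name.
        return filename


print(decode_filename_leniently('=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?='))
print(decode_filename_leniently('Copy of Book1.xls'))
```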

User
#2 · edited 2024-04-25 12:11:14

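@gnud, @edg: unless I've misunderstood, he is asking about the file name, not the file content. @setori: Content-Transfer-Encoding tells you how the file content is encoded, not the "filename".
I'm not an expert, but this part of the file name tells him about the characters that follow:
=?gb2312?B?
I went looking for the documentation in the RFCs... Ah! Here it is: http://tools.ietf.org/html/rfc2047
The RFC says:
Generally, an "encoded-word" is a sequence of printable ASCII characters that begins with "=?", ends with "?=", and has two "?"s in between.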
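That shape is easy to test for with a regular expression. The pattern below is a simplified sketch, not the full RFC 2047 grammar (it ignores the length limit and the exact set of characters allowed in each field), and the names ENCODED_WORD and looks_like_encoded_word are made up for this example.

```python
import re

# "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=")


def looks_like_encoded_word(text):
    """Return True if the whole string has the encoded-word shape."""
    return ENCODED_WORD.fullmatch(text) is not None


print(looks_like_encoded_word("=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?="))
print(looks_like_encoded_word("Copy of Book1.xls"))
```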
Another thing worth looking at is the code in SharpMimeTools, a MIME parser (in C#) that I use in my bug tracking application, BugTracker.NET.

User
#3 · edited 2024-04-25 12:11:14

The header value tells you this:

=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?=

"=?"     introduces an encoded value
"gb2312" denotes the character encoding of the original value
"B"      denotes that B-encoding (equal to Base64) was used (the alternative 
         is "Q", which refers to something close to quoted-printable)
"?"      functions as a separator
"uLG..." is the actual value, encoded using the encoding specified before
"?="     ends the encoded value

So splitting on "?" gives you exactly this (in JSON notation):

["=", "gb2312", "B", "uLGxvmhlbrixsb5nLnhscw==", "="]

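In the resulting array, if "B" is at position 2, then position 3 holds a Base64-encoded string. Once you have decoded it, be sure to take note of the encoding at position 1; it is probably best to use that information to convert the whole thing to UTF-8.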
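The split-and-decode steps above can be sketched in Python as follows. The helper name decode_b_encoded_word is hypothetical, and the sketch handles only the "B" case; a "Q" word, or a header mixing several encoded-words, is better left to a real MIME library.

```python
import base64


def decode_b_encoded_word(word):
    """Decode a single B-encoded word by splitting on '?'."""
    fields = word.split("?")
    # Expected layout: ["=", charset, encoding, payload, "="]
    if len(fields) != 5 or fields[0] != "=" or fields[4] != "=":
        raise ValueError("not an encoded-word")
    if fields[2].upper() != "B":
        raise ValueError("only B-encoding is handled in this sketch")
    raw = base64.b64decode(fields[3])   # position 3: the Base64 payload
    return raw.decode(fields[1])        # position 1: the original charset


name = decode_b_encoded_word("=?gb2312?B?uLGxvmhlbrixsb5nLnhscw==?=")
utf8_bytes = name.encode("utf-8")       # re-encode as UTF-8 for storage or display
print(name)
```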
