Print only the contents of a string

3 answers

User

Answer 1 · edited 2024-04-25 04:57:43

By default, this function will return the data as encoded bytes. The actual encoding of the output data may depend on the command being invoked, so the decoding to text will often need to be handled at the application level.
This behaviour may be overridden by setting universal_newlines to True as described below in Frequently Used Arguments.

If you follow the link to Frequently Used Arguments, it describes what universal_newlines=True does:

If universal_newlines is False the file objects stdin, stdout and stderr will be opened as binary streams, and no line ending conversion is done.
If universal_newlines is True, these file objects will be opened as text streams in universal newlines mode using the encoding returned by locale.getpreferredencoding(False). For stdin, line ending characters '\n' in the input will be converted to the default line separator os.linesep. For stdout and stderr, all line endings in the output will be converted to '\n'. For more information see the documentation of the io.TextIOWrapper class when the newline argument to its constructor is None.

For more details, see the io.TextIOWrapper documentation.
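As a rough sketch of the difference described in the quoted documentation, the following runs the same child command twice, once in the default binary mode and once with universal_newlines=True. The cmd list is an assumption made for portability (it uses the current interpreter instead of a shell command) and is not part of the original answer.

```python
import subprocess
import sys

# Hypothetical child command: the current interpreter printing a fixed string.
cmd = [sys.executable, "-c", "print('hello world!')"]

# Default: the output comes back as raw bytes, undecoded.
raw = subprocess.check_output(cmd)

# universal_newlines=True: the output is decoded to str and
# line endings are normalised to '\n'.
text = subprocess.check_output(cmd, universal_newlines=True)

print(type(raw).__name__, repr(raw))
print(type(text).__name__, repr(text))
```

The bytes result keeps whatever line ending the platform produced, which is why the text-mode result is usually the easier one to print.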

To run the echo -n "hello world!" shell command and get its output as text without check_output() and without universal_newlines=True:

#!/usr/bin/env python
import locale
from subprocess import Popen, PIPE

charset = locale.getpreferredencoding(False)
with Popen(['echo', 'Hello world!'], stdout=PIPE) as process:
    output = process.communicate()[0].decode(charset).strip()

There are a couple of code examples that show how subprocess pipes and the io.TextIOWrapper class could be used together.
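One possible way to combine a subprocess pipe with io.TextIOWrapper is sketched below. It is not the linked example, only an illustration under stated assumptions: the cmd list is hypothetical and uses the current interpreter so the sketch does not depend on echo being available.

```python
import io
import locale
import sys
from subprocess import Popen, PIPE

# Hypothetical child command, chosen only so the sketch is portable.
cmd = [sys.executable, "-c", "print('hello world!')"]

encoding = locale.getpreferredencoding(False)
with Popen(cmd, stdout=PIPE) as process:
    # Wrap the binary pipe so that iteration yields str lines, not bytes.
    # With the default newline=None, line endings are translated to '\n'.
    with io.TextIOWrapper(process.stdout, encoding=encoding) as reader:
        lines = [line.rstrip("\n") for line in reader]

print(lines)
```

Wrapping the pipe this way decodes incrementally, which can matter when the child produces more output than is convenient to hold as one bytes object.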

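To understand what is text and what is binary data in Python, read the Unicode HOWTO. Here is the most important part: there are two major string types in Python, bytestrings (sequences of bytes) that represent binary data, and Unicode strings (sequences of Unicode code points) that represent human-readable text. It is simple to convert one into the other (☯):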

unicode_text = bytestring.decode(character_encoding)
bytestring = unicode_text.encode(character_encoding)
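A small worked round trip may make the two conversions concrete. The sample string and the choice of UTF-8 are assumptions for illustration; the point is only that encode and decode must agree on the character encoding.

```python
unicode_text = "na\u00efve caf\u00e9"          # str: a sequence of code points
bytestring = unicode_text.encode("utf-8")      # bytes: a sequence of bytes

# Decoding with the same encoding restores the original text.
round_trip = bytestring.decode("utf-8")

# Decoding with an encoding that cannot represent the data fails loudly.
try:
    bytestring.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(type(unicode_text).__name__, type(bytestring).__name__)
print(round_trip == unicode_text, ascii_ok)
```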

User

Answer 2 · edited 2024-04-25 04:57:43

Also: I believe the first output is of type bytes, but what is the type of the second output? My guess is str with UTF-8 encoding.

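Close, but not quite right. In Python 3, the str type is indexed by Unicode code point (note that code points usually, but not always, have a 1:1 correspondence with user-perceived characters). The underlying encoding is therefore abstracted away when you use the str type; think of it as unencoded, even though that is fundamentally not the case. It is the bytes type that is indexed as a simple array of bytes and therefore must use a particular encoding. In this case (as in most similar usages), ASCII is sufficient to decode what the subprocess script generated.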
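The indexing difference can be seen directly. This is a minimal sketch with a made-up two-character string, encoded as UTF-8 purely for illustration:

```python
s = "\u00e9!"              # two code points: 'é' and '!'
b = s.encode("utf-8")      # three bytes: 0xC3 0xA9 0x21

# str is indexed by code point, so s[0] is the whole character.
first_char = s[0]

# bytes is indexed by byte, so b[0] is only the first byte of 'é', as an int.
first_byte = b[0]

print(len(s), len(b))
print(repr(first_char), hex(first_byte))
```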

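Python 2 has different defaults for how the str type is interpreted, so string literals are represented differently in that version of the language (this difference can be a big stumbling block when researching text handling).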

As someone who primarily uses C++, I found a linked write-up on how Unicode text is actually stored, encoded, and indexed extremely enlightening.

So the answer to the first part of the question is bytes.decode():

a = a.decode('ascii') ## convert from `bytes` to 'str' type

although simply using

a = a.decode() ## assumes UTF-8 encoding

will typically produce the same result, since ASCII is a subset of UTF-8.

Alternatively, you can use str() like so:

a = str(a,encoding='ascii')

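Note, however, that you must specify the encoding here if you want the "contents only" representation. Otherwise it actually builds a str that contains the quote characters (and the 'b' prefix) as part of its value, which is exactly what happens in the first output shown in the question.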
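The difference between the two str() calls is easy to check side by side. A minimal sketch, using the same sample value as the rest of this thread:

```python
a = b"hello world!"

# Without an encoding, str() returns the repr-style text, prefix and quotes included.
as_repr = str(a)

# With an encoding, str() decodes the bytes, equivalent to a.decode('ascii').
as_text = str(a, encoding="ascii")

print(as_repr)
print(as_text)
```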

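By default, subprocess.check_output() handles data in binary mode (returning the raw byte sequence), but the cryptic argument universal_newlines=True basically tells it to decode the data and represent it as text (using the str type). In Python 3, this conversion to str is necessary if you want to display the output, and only its contents, with Python's print function.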

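The funny thing about this conversion is that, for these purposes, it does not really do anything to the data. What happens under the hood is an implementation detail, but if the data is ASCII (which is very typical for this kind of program), it essentially just gets copied from one place to another without any meaningful translation. The decode operation is just hoop jumping to change the data type, and its seemingly pointless nature further obscures the larger vision behind Python's text handling (for the uninitiated). In addition, because the docs do not make the return type explicit (by name), it is hard to know where to start looking for the appropriate conversion function.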
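For ASCII data, the claim that decoding leaves the values untouched can be checked numerically. This sketch compares only the byte values with the code point values; it says nothing about how CPython stores either object internally.

```python
raw = b"hello world!"
text = raw.decode("ascii")

byte_values = list(raw)                    # integers 0..255, one per byte
code_points = [ord(ch) for ch in text]     # integers, one per code point

# For pure ASCII input the two sequences are identical; only the type changed.
same_values = byte_values == code_points
print(same_values, type(raw).__name__, type(text).__name__)
```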

User

Answer 3 · edited 2024-04-25 04:57:43

As Ignacio's original comment suggests, you can use decode:

>>> a = b"hello world!"
>>> print("a="+str(a))
a=b'hello world!'
>>> print("a="+a.decode())
a=hello world!
