Getting a Python str repr with double quotes

3 answers

User
Answer 1 · edited 2024-05-16 20:38:14

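If you are asking a Python str for its repr, I don't think the type of quote is really configurable. From the PyString_Repr function in the Python 2.6.4 source tree:
/* figure out which quote to use; single is preferred */
quote = '\'';
if (smartquotes &&
    memchr(op->ob_sval, '\'', Py_SIZE(op)) &&
    !memchr(op->ob_sval, '"', Py_SIZE(op)))
    quote = '"';
So repr uses double quotes only when the string contains a single quote and does not also contain a double quote; in every other case it uses single quotes.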
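The rule in that C snippet can be mirrored in a few lines of Python. This is only a sketch for illustration: `choose_quote` is a hypothetical helper name, and the check against `repr()` assumes Python 3, where str follows the same quote preference.

```python
def choose_quote(s):
    """Return the quote character that repr() is expected to pick for s."""
    # Single quotes are preferred; double quotes are used only when the
    # string contains a single quote and no double quote.
    if "'" in s and '"' not in s:
        return '"'
    return "'"


for sample in ["plain", "it's", 'say "hi"', "it's \"both\""]:
    # The first character of repr() is the quote CPython actually chose.
    assert repr(sample)[0] == choose_quote(sample)
    print(sample, "->", repr(sample))
```

Because the choice depends only on the string's contents, there is no argument or setting that forces double quotes.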
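I would try writing my own class to hold the string data instead of using the built-in string. One option is to derive a class from str and write your own repr:
class MyString(str):
    __slots__ = []

    def __repr__(self):
        # Wrap in double quotes and backslash-escape embedded double quotes.
        return '"%s"' % self.replace('"', r'\"')

print(repr(MyString(r'foo"bar')))
Or, don't use repr at all:
def ready_string(string):
    return '"%s"' % string.replace('"', r'\"')

print(ready_string(r'foo"bar'))
This simple quoting might not do the "right" thing if the string already contains escaped quotes.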
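One way to handle strings that already contain backslashes is to escape the backslashes first and the quotes second. The sketch below assumes Python 3; `double_quoted` is a hypothetical name, and control characters and non-ASCII text are deliberately left untouched.

```python
def double_quoted(s):
    """Wrap s in double quotes, escaping backslashes and double quotes."""
    # Escape backslashes first so that the backslashes added for the quotes
    # in the second step are not doubled again.
    escaped = s.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


print(double_quoted(r'foo"bar'))   # "foo\"bar"
print(double_quoted(r'foo\"bar'))  # "foo\\\"bar"
```

The order of the two replace calls matters: doing the quotes first would cause the backslash step to double the backslashes that were just inserted.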

User
Answer 2 · edited 2024-05-16 20:38:14

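It is better not to hack repr() and instead use the right encoding from the start. You can get repr's escaping directly with the string_escape codec (Python 2 only):
>>> "naïveté".encode("string_escape")
'na\\xc3\\xafvet\\xc3\\xa9'
>>> print _
na\xc3\xafvet\xc3\xa9
For escaping the "-quotes, I think a simple replace after the escape encoding is a completely unambiguous process:
>>> '"%s"' % 'data:\x00\x01 "like this"'.encode("string_escape").replace('"', r'\"')
'"data:\\x00\\x01 \\"like this\\""'
>>> print _
"data:\x00\x01 \"like this\""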

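The string_escape codec does not exist in Python 3, so the same idea has to be spelled out by hand there. The following is a rough sketch, not a drop-in replacement: `escape_bytes` is a hypothetical helper, it works on bytes (UTF-8 is assumed for text), and unlike string_escape it leaves single quotes unescaped.

```python
def escape_bytes(data, quote='"'):
    """Escape a bytes object roughly the way Python 2's string_escape did."""
    simple = {0x5C: "\\\\", 0x0A: "\\n", 0x0D: "\\r", 0x09: "\\t"}
    out = []
    for b in data:  # iterating over bytes yields ints in Python 3
        if b in simple:
            out.append(simple[b])
        elif b == ord(quote):
            out.append("\\" + quote)
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))          # printable ASCII passes through
        else:
            out.append("\\x%02x" % b)   # everything else as a hex escape
    return quote + "".join(out) + quote


print(escape_bytes("naïveté".encode("utf-8")))
print(escape_bytes(b'data:\x00\x01 "like this"'))
```

Backslashes are handled in the same pass as quotes, so no separate replace step is needed afterwards.
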
User
Answer 3 · edited 2024-05-16 20:38:14

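repr() isn't what you want. There is a fundamental problem: repr() is free to use any representation of the string that evaluates back to that string in Python. In theory, that means it could decide to use any number of other constructs that are not valid in C, such as """long strings""".

This code is probably a step in the right direction. I used a default wrap width of 140, which was a sensible value in 2009, but if you really want to wrap your code to 80 columns, just change it.

If unicode=True, it outputs an L"wide" string, which can meaningfully store Unicode escapes. Alternatively, you might want to convert Unicode characters to UTF-8 and output them escaped, depending on the program you are using them in.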

def string_to_c(s, max_length=140, unicode=False):
    """Return s as a C string literal, wrapped at about max_length columns."""
    ret = []

    # Try to split on whitespace, not in the middle of a word.
    split_at_space_pos = max_length - 10
    if split_at_space_pos < 10:
        split_at_space_pos = None

    position = 0
    if unicode:
        position += 1
        ret.append('L')

    ret.append('"')
    position += 1
    for c in s:
        newline = False
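        if c == "\n":
            # Emit a \n escape, then a backslash-newline so the C literal
            # continues on the next source line. A bare backslash-newline
            # would be spliced away by the C preprocessor and lose the newline.
            to_add = "\\n\\\n"
            newline = True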
        elif ord(c) < 32 or 0x80 <= ord(c) <= 0xff:
            to_add = "\\x%02x" % ord(c)
        elif ord(c) > 0xff:
            if not unicode:
                raise ValueError("string contains unicode character but unicode=False")
            to_add = "\\u%04x" % ord(c)
        elif c in '\\"':
            to_add = "\\%c" % c
        else:
            to_add = c

        ret.append(to_add)
        position += len(to_add)
        if newline:
            position = 0

        if split_at_space_pos is not None and position >= split_at_space_pos and c in " \t":
            ret.append("\\\n")
            position = 0
        elif position >= max_length:
            ret.append("\\\n")
            position = 0

    ret.append('"')

    return "".join(ret)

print(string_to_c("testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing", max_length=20))
print(string_to_c("Escapes: \"quote\" \\backslash\\ \x00 \x1f testing \x80 \xff"))
print(string_to_c(u"Unicode: \u1234", unicode=True))
print(string_to_c("""New
lines"""))
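The UTF-8 alternative mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than a replacement for string_to_c: `c_literal_utf8` is a hypothetical name, Python 3 is assumed, and line wrapping is left out. It uses octal escapes because a C hex escape (\x...) consumes every hex digit that follows it, so a byte such as 0x80 followed by the letter "a" would be misread, while an octal escape stops after three digits.

```python
def c_literal_utf8(s):
    """Return s as a C string literal holding its UTF-8 bytes, escaped in octal."""
    out = ['"']
    for b in s.encode("utf-8"):
        if b in (0x22, 0x5C):        # double quote or backslash
            out.append("\\" + chr(b))
        elif b == 0x0A:
            out.append("\\n")
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))       # printable ASCII passes through
        else:
            # Three-digit octal escapes cannot absorb the character that
            # follows them, unlike \x escapes.
            out.append("\\%03o" % b)
    out.append('"')
    return "".join(out)


print(c_literal_utf8("caf\u00e9 1"))  # "caf\303\251 1"
```

Whether UTF-8 bytes or an L"wide" literal is the better choice depends on how the consuming C program interprets its strings.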
