How do I lowercase a string in Python?

3 answers

User

Answer 1 · edited 2024-04-25 12:19:02

How to convert string to lowercase in Python?
Is there any way to convert an entire user inputted string from uppercase, or even part uppercase to lowercase?
E.g. Kilometers --> kilometers

The canonical Pythonic way of doing this is

>>> 'Kilometers'.lower()
'kilometers'

However, if the purpose is case-insensitive matching, you should use case folding instead:

>>> 'Kilometers'.casefold()
'kilometers'

Here's why:

>>> "Maße".casefold()
'masse'
>>> "Maße".lower()
'maße'
>>> "MASSE" == "Maße"
False
>>> "MASSE".lower() == "Maße".lower()
False
>>> "MASSE".casefold() == "Maße".casefold()
True

casefold is a str method in Python 3. In Python 2 it does not exist, so you will want to look at PyICU or py2casefold; several answers elsewhere address this.
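
As a minimal sketch of the case-insensitive matching described above (Python 3 only), the comparison can be wrapped in a small helper. The name equals_ignore_case is hypothetical and not part of the standard library, and the sketch does not attempt any Unicode normalization beyond what casefold does.

```python
def equals_ignore_case(left, right):
    """Compare two strings without regard to case, using case folding."""
    # casefold() is more aggressive than lower(): it maps 'ß' to 'ss', for example.
    return left.casefold() == right.casefold()


print(equals_ignore_case("MASSE", "Maße"))
print(equals_ignore_case("Kilometers", "KILOMETERS"))
```

Comparing folded copies leaves the original strings untouched, so the user's spelling can still be displayed as entered.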

Unicode in Python 3

Python 3 treats plain string literals as Unicode:

>>> string = 'Километр'
>>> string
'Километр'
>>> string.lower()
'километр'

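Python 2, plain string literals are bytes

In Python 2, pasting the code below into a shell encodes the literal as a string of bytes (UTF-8 in the session shown here).

lower does not map any characters that a byte string knows about, so we get the same string back.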

>>> string = 'Километр'
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.lower()
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.lower()
Километр
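
The same effect can be reproduced in Python 3, where the bytes type plays the role of the Python 2 str. This is only an illustrative sketch: bytes.lower() maps ASCII letters and leaves every other byte alone, so the UTF-8 encoded Cyrillic text comes back unchanged.

```python
word = 'Километр'
encoded = word.encode('utf-8')

# bytes.lower() only touches ASCII letters; the Cyrillic bytes are left as they are.
unchanged = encoded.lower() == encoded
print(unchanged)

# In mixed text, only the ASCII part is lowercased.
mixed = 'Km Километр'.encode('utf-8').lower()
print(mixed.decode('utf-8'))
```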

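In a script, Python objects to non-ASCII bytes in a string when no encoding is declared (an error as of Python 2.5, a warning in Python 2.4), because the intended encoding would be ambiguous. For more on that, see the Unicode HOWTO in the docs and PEP 263.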
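
A minimal sketch of a script that carries a PEP 263 encoding declaration is shown here. It assumes the file is saved as UTF-8, and it uses a u-prefixed literal so that the text is a unicode object rather than a byte string under Python 2.

```python
# -*- coding: utf-8 -*-
# The declaration above (PEP 263) has to sit on the first or second line of the file.

word = u'Километр'
lowered = word.lower()
print(lowered)
```

Python 3 assumes UTF-8 source by default, so there the declaration is harmless but not required.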

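Use Unicode literals, not `str` literals

So we need a unicode string to handle this conversion. That is easily done with a unicode string literal, which disambiguates with a u prefix (note that the u prefix also works in Python 3.3 and later):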

>>> unicode_literal = u'Километр'
>>> print(unicode_literal.lower())
километр

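Note that this representation is completely different from the str bytes: the escape sequence is '\u' followed by four hexadecimal digits, the 16-bit code point of each of these Unicode letters: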

>>> unicode_literal
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> unicode_literal.lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'

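If we only have the text in the form of a str, we need to convert it to unicode. Python's Unicode type is a universal representation for text, which has many advantages over working in any particular encoding. We can use either the unicode constructor or the str.decode method, together with the codec, to convert the str to unicode: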

>>> unicode_from_string = unicode(string, 'utf-8')  # decode the byte string into unicode
>>> print(unicode_from_string.lower())
километр
>>> string_to_unicode = string.decode('utf-8') 
>>> print(string_to_unicode.lower())
километр
>>> unicode_from_string == string_to_unicode == unicode_literal
True

Both methods convert to the unicode type, and the results are equal to unicode_literal.
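
The decode-then-lower step can be folded into one helper that accepts either kind of input. This is a sketch under assumptions: the name to_lower_text is hypothetical, and the default codec of UTF-8 is only a guess that the caller should override when the data uses another encoding.

```python
def to_lower_text(value, encoding="utf-8"):
    """Lowercase value, decoding it first if it arrives as a byte string."""
    if isinstance(value, bytes):
        # Byte strings carry no case information beyond ASCII, so decode before lowering.
        value = value.decode(encoding)
    return value.lower()


raw = b'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
print(to_lower_text(raw))
print(to_lower_text(u'Километр'))
```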

Best practice, use Unicode

It is recommended that you always work with text in Unicode.

Software should only work with Unicode strings internally, converting to a particular encoding on output.

Can encode back when necessary

However, to get the lowercased text back as type str, encode the unicode string to UTF-8 again:

>>> print string
Километр
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.decode('utf-8')
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower().encode('utf-8')
'\xd0\xba\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.decode('utf-8').lower().encode('utf-8')
километр

So in Python 2, Unicode can be encoded into Python byte strings (str), and Python byte strings can be decoded into the unicode type.
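
The decode, lowercase, encode chain above follows the advice to work in Unicode internally and convert only at the edges. A rough Python 3 sketch of that pattern is given here, with in-memory io.BytesIO streams standing in for real input and output; lowercase_stream is a hypothetical name, and UTF-8 is assumed on both sides.

```python
import io


def lowercase_stream(source, sink, encoding="utf-8"):
    """Read bytes from source, lowercase them as text, write bytes to sink."""
    text = source.read().decode(encoding)   # decode at the input boundary
    lowered = text.lower()                  # work with Unicode text internally
    sink.write(lowered.encode(encoding))    # encode only on output


source = io.BytesIO('Километр'.encode('utf-8'))
sink = io.BytesIO()
lowercase_stream(source, sink)
result = sink.getvalue()
print(result.decode('utf-8'))
```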

User

Answer 2 · edited 2024-04-25 12:19:02

Use .lower(), for example:

s = "Kilometer"
print(s.lower())

The official 2.x and 3.x documentation both describe this method under str.lower().
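
Since the question is about user-entered text, one possible sketch is to normalize the input once, right after reading it. The helper name normalize_unit is hypothetical, and a fixed string stands in for input() so that the example runs without a prompt.

```python
def normalize_unit(raw):
    """Strip surrounding whitespace and lowercase a unit name."""
    return raw.strip().lower()


# A fixed string stands in for input() so the sketch runs non-interactively.
raw = "  KiloMeters \n"
unit = normalize_unit(raw)
print(unit)
```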

User

Answer 3 · edited 2024-04-25 12:19:02

In Python 2, this does not work for non-English words encoded in UTF-8. In that case decode('utf-8') can help:

>>> s='Километр'
>>> print s.lower()
Километр
>>> print s.decode('utf-8').lower()
километр
