Python re.search for digits and decimals

Posted 2024-04-25 12:46:53

I am trying to use Python's regular expressions to extract the numeric values (100.00 and 200.00), but when I run my code it produces no output. I am using Python 2.7.

1) My file is named "file100", and I need to pick the values out of it:

# cat file100
Hi this doller 100.00
Hi this is doller 200.00

2) This is my Python code:

^{pr2}$

3) When I run this code, it produces no error and no output at all:

# python   count100.py

3 answers

If the values are always at the end of the line, just rsplit once and take the last element:

with open('file100', 'r') as f:
    for line in f:
        print(line.rsplit(None, 1)[1])

Output:

100.00
200.00

rsplit(None, 1) just means we split once on whitespace, starting from the end of the string, and then take the second element:

In [1]: s = "Hi this doller 100.00"

In [2]: s.rsplit(None,1)
Out[2]: ['Hi this doller', '100.00']

In [3]: s.rsplit(None,1)[1]
Out[3]: '100.00'

In [4]: s.rsplit(None,1)[0]
Out[4]: 'Hi this doller'
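
One caveat with this approach: a blank line splits into an empty list, and a one-word line into a single element, so indexing with [1] raises IndexError on either. A minimal sketch of a guarded variant follows; the helper name last_field is made up for illustration and is not part of the original answer.

```python
def last_field(line):
    """Return the last whitespace-separated field of line, or None if there is none."""
    parts = line.rsplit(None, 1)
    # rsplit returns [] for a blank line; [-1] also covers one-word lines.
    return parts[-1] if parts else None


sample = ["Hi this doller 100.00\n", "\n", "Hi this is doller 200.00\n"]
values = [last_field(line) for line in sample]
values = [v for v in values if v is not None]
print(values)
```

Note that for a one-word line this returns the word itself, so it only makes sense when the last field is known to be the value you want.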

If you really need a regex, use search:

import re

with open('file100', 'r') as f:
    for line in f:
        m = re.search(r"\b\d+\.\d{2}\b",line)
        if m:
            print(m.group())

Use re.search instead:

import re
file = open('file.txt', 'r')
for digit in file.readlines():
    myre = re.search(r'\s\b(\d*\.\d{2})\b', digit)
    if myre:
        print(myre.group(1))  # parenthesized form works on Python 2.7 and 3

Result:

100.00
200.00

From the documentation:

Scan through string looking for the first location where the regular expression pattern produces a match

If you decide to use a group, you also need the parentheses:

(...) Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(] [)].

re.match only works in the following case:

If zero or more characters at the beginning of string match the regular expression pattern
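
To make the difference concrete, here is a small sketch run against one of the sample lines. The pattern is a simplified stand-in, not the asker's original pattern.

```python
import re

line = "Hi this doller 100.00"
pattern = r"\d+\.\d{2}"

# re.match only tries position 0, where the line starts with "Hi", so it fails.
matched = re.match(pattern, line)

# re.search scans forward and finds the number wherever it sits in the line.
searched = re.search(pattern, line)

print(matched)
print(searched.group())
```

Because re.match returns None here, any code that prints only when there is a match stays silent, which is consistent with the behaviour described in the question.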

Write the regex as a raw string:

String literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and use different rules for interpreting backslash escape sequences.

...

Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C
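
A short sketch of why this matters for regexes in particular: without the r prefix, Python turns "\b" into a backspace character before the re module ever sees it. The literals below are illustrative only.

```python
import re

# Without the r prefix, "\b" is the single backspace character (0x08).
plain = "\b100\b"
# With the r prefix, the backslash and the "b" both survive, so the re
# module receives \b and treats it as a word boundary.
raw = r"\b100\b"

print(len(plain), len(raw))

text = "doller 100.00"
plain_hit = re.search(plain, text)   # looks for backspace, "100", backspace
raw_hit = re.search(raw, text)       # looks for "100" between word boundaries

print(plain_hit)
print(raw_hit.group())
```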

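Your main problem is that you are using re.match, which requires the match to start at the beginning of the string, instead of re.search, which allows the match to start at any point in the string. That said, I'll break my suggestions down as follows: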

import re

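There is no need to recompile on every loop iteration (Python actually caches some regexes for you, but to be safe, keep the compiled regex in a variable). I use the VERBOSE flag to break the regex down for you. Put an r in front of the string so that Python does not treat the backslashes as escapes for the characters that follow them when it reads the string: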

^{pr2}$
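
A minimal sketch of what such a compiled, verbose pattern can look like. The exact pattern here is an assumption, modeled on the \b\d+\.\d{2}\b pattern from the answer above; only the name regex is taken from the loop that follows.

```python
import re

# Compile once, outside the loop. With re.VERBOSE, whitespace and
# #-comments inside the pattern are ignored, so each part can be annotated.
regex = re.compile(r"""
    \b       # word boundary before the number
    \d+      # one or more digits
    \.       # a literal decimal point
    \d{2}    # exactly two digits after the point
    \b       # word boundary after the number
    """, re.VERBOSE)

print(regex.search("Hi this doller 100.00").group(0))
```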

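Use a context manager, and open the file in universal-newlines mode, 'rU', so that it can be read line by line no matter which platform it was created on. (On Python 3, plain 'r' already behaves this way, and the 'U' flag was removed in Python 3.11.)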

with open('file100', 'rU') as file:

Don't use readlines, which loads the whole file into memory at once. Instead, use the file object as an iterator:

    for line in file:
        myre = regex.search(line)
        if myre:
            print(myre.group(0)) # group(0) is the entire match; there are
                                 # no capture groups in this regex

My code prints:

100.00
200.00
