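I have built this small piece of code to extract pkeys (numbers) from a string (the string can be as long as a page) and then display the unique pkeys one per row. For example, if the string is "Failure: Cannot retrieve market data: [Historical correlation for instruments 48021088, 1029755 is older than 2M, Historical correlation for instruments 48021088, 102975554454 is older than 2M, Error while loading Structured Product market data"
the output should be: 48021088 1029755 5445454
but right now my output is [480210810297552] 544545]
Note: the 2 in the string "2M" should not be taken as a pkey; it means 2 months.
Also, whenever I paste a long string into this code, I have to put \ at the end of every line to make it run. If I copy a string from Outlook or any other source and paste it into this code, is there anything I can do so that it is formatted automatically and the \ is inserted for me?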
import re
import numpy as np
import pandas as pd
regex = ('\d+')
match = re.findall(regex, 'Failure: Cannot retrieve market data: [Historical correlation for instruments 48021088, 1029755 \
is older than 2M, Historical correlation for instruments 48021088, 1029755 is older than 2M, Error while loading Structured Product market data \
Failure: Cannot retrieve market data: [Historical correlation for instruments 52598110, 35602558 is older than 2M, Historical correlation for instruments \
52598110, 35602558 is older than 2M, Historical correlation for instruments 52598110, 35602558 is older than 2M, Historical correlation for instruments 52598110, \
35602558 is older than 2M, Error while loading Structured Product market data \
Failure: Cannot retrieve market data: [Historical correlation for instruments 48021088, 1029755 is older than 2M, Historical correlation for instruments 48021088, 1029755 \
is older than 2M, Error while loading Structured Product market data \
Failure: Cannot retrieve market data: [Historical correlation for instruments 612292, 52598110 is older than 2M, Historical correlation for instruments 612292, 52598110 is \
older than 2M, Historical correlation for instruments 612292, 52598110 is older than 2M, Historical correlation for instruments 612292, 52598110 is older than 2M, \
Error while loading Structured Product market data \
Failure: Cannot retrieve market data: [Historical correlation for instruments 489459, 104322960 is older than 2M, Historical correlation for instruments 489459, \
104322960 is older than 2M, Historical correlation for instruments 489459, 104322960 is older than 2M, Historical correlation for instruments 489459, \
104322960 is older than 2M, Error while loading Structured Product market data')
res = list(map(int,match))
x = res
# print(str(x))
unique_numbers = list(set(x))
print(np.transpose(unique_numbers))
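Strings delimited by ' ' or " " must fit on a single line. You can split them across several lines either by ending each line with \ or by delimiting them with ''' ''' or """ """ instead.
Regarding the regular expression, I see you use \d+, which means at least one digit. You can change it to at least n digits with \d{n,}.

This should do it: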
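The pattern that belongs at this point is not preserved in the text. As a minimal sketch of a lookaround-based pattern of the kind described in the explanation, assuming every pkey is preceded by whitespace and followed by a comma or whitespace, something like the following could work. The name `PKEY_PATTERN`, the helper `unique_pkeys`, and the exact lookbehind and lookahead are illustrative assumptions, not the answerer's original code.

```python
import re

# Hypothetical pattern (an assumption, not the original answer's):
# digits preceded by whitespace and followed by a comma or whitespace.
# The "2" in "2M" is skipped because it is followed by "M".
PKEY_PATTERN = re.compile(r"(?<=\s)(\d+)(?=[,\s])")

text = (
    "Failure: Cannot retrieve market data: [Historical correlation for "
    "instruments 48021088, 1029755 is older than 2M, Historical correlation "
    "for instruments 612292, 52598110 is older than 2M, Error while loading "
    "Structured Product market data"
)


def unique_pkeys(s):
    """Return the distinct pkeys found in s, in order of first appearance."""
    matches = PKEY_PATTERN.findall(s)
    # dict.fromkeys drops duplicates while keeping insertion order
    return list(dict.fromkeys(int(m) for m in matches))


# Print one pkey per row
for pkey in unique_pkeys(text):
    print(pkey)
```

Under these assumptions, a number at the very start or very end of the string would not be matched, because the lookbehind and lookahead each require a neighbouring character.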
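Explanation
From the Python regex docs:
(?<=...) matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion.
(?=...) matches if ... matches next, but doesn't consume any of the string. This is called a lookahead assertion.
Together these form a lookaround assertion. The pattern then has the shape <look behind>(\d+)<look ahead>. The () marks the capturing group, which is what the findall call returns. \d+ is the same as before: one or more digits.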