line1 = " The median income for a household in the city was $64,411, and the median income for a family was $78,940. The per capita income for the city was $22,466. About 4.3% of families and 5.9% of the population were below the poverty line, including 7.0% of those under age 18 and 12.3% of those age 65 or over."
line2 = " The median income for a household in the city was $31,893, and the median income for a family was $38,508. Males had a median income of $30,076 versus $20,275 for females. The per capita income for the city was $16,336. About 14.1% of families and 16.7% of the population were below the poverty line, including 21.8% of those under age 18 and 21.0% of those age 65 or over."
Expected output:
household median income: $64,411
family median income: $78,940
per capita income: $22,466
import re
[householdIncome, familyIncome, perCapitalIncome] = re.findall(r"\d+,\d+", line1)
Line 1 works fine. Line 2 raises:
ValueError: too many values to unpack (expected 3)
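The main goal is to identify the first number/value that follows a given keyword.
Some lines have no per capita income; an empty string "" is acceptable for those.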
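In line 2, findall finds more than three matches, and you are trying to unpack them into only three variables.
One way around this is to match each figure by its keyword instead of unpacking every number at once.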
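A minimal sketch of keyword-anchored matching, assuming every figure is written as a dollar amount with comma thousands separators. The helper name `first_amount_after` and the shortened `sample` string are hypothetical and not from the original post.

```python
import re

def first_amount_after(keyword, text):
    """Return the first dollar amount that follows `keyword`, or "" if there is none."""
    # [^$]*? skips ahead to the first dollar sign after the keyword;
    # the capture group takes digits with optional ,ddd groups (no trailing comma).
    match = re.search(re.escape(keyword) + r"[^$]*?\$(\d+(?:,\d{3})*)", text)
    return match.group(1) if match else ""

# Shortened copy of line2 from the question.
sample = (" The median income for a household in the city was $31,893, and the "
          "median income for a family was $38,508. Males had a median income of "
          "$30,076 versus $20,275 for females. The per capita income for the city "
          "was $16,336.")

household = first_amount_after("median income for a household", sample)
family = first_amount_after("median income for a family", sample)
per_capita = first_amount_after("per capita income", sample)
print(household, family, per_capita)
```

Because each value is looked up on its own, a sentence that lacks a keyword gives "" for that field rather than raising ValueError.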
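Running
re.findall(r"\d+,\d+", line2)
returns ['31,893', '38,508', '30,076', '20,275', '16,336'].
So the immediate problem is that the regular expression yields five results while you allow for only three. There is a slightly deeper problem, though: the two sentences have different structures. In the first, household income, family income, and per capita income are the first three figures; in the second, the male and female medians come before the per capita income. In other words, you need a somewhat more sophisticated analysis of the sentence.
As others have pointed out, you will need some additional programming logic: use regular expressions to find the relevant values, and compute the median where necessary.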
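As one possible shape for that extra logic, here is a sketch that parses each sentence into a dictionary using one pattern per field. The field names, the `PATTERNS` table, and `parse_incomes` are hypothetical, and the phrases are assumed to match the wording of the two sample lines. It covers only the extraction step; no median computation is attempted.

```python
import re

# A dollar amount such as $31,893; the capture group excludes the dollar sign.
AMOUNT = r"\$(\d+(?:,\d{3})*)"

# One pattern per field, each with exactly one capture group (assumed phrasing).
PATTERNS = {
    "household": r"median income for a household[^$]*" + AMOUNT,
    "family": r"median income for a family[^$]*" + AMOUNT,
    "male": r"Males had a median income of\s*" + AMOUNT,
    "female": r"versus\s*" + AMOUNT + r"\s*for females",
    "per_capita": r"per capita income[^$]*" + AMOUNT,
}

def parse_incomes(text):
    """Map each field to an int, or to "" when the sentence omits that figure."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        result[field] = int(match.group(1).replace(",", "")) if match else ""
    return result

# Shortened copies of line1 and line2 from the question.
short1 = (" The median income for a household in the city was $64,411, and the "
          "median income for a family was $78,940. The per capita income for the "
          "city was $22,466.")
short2 = (" The median income for a household in the city was $31,893, and the "
          "median income for a family was $38,508. Males had a median income of "
          "$30,076 versus $20,275 for females. The per capita income for the city "
          "was $16,336.")

for text in (short1, short2):
    print(parse_incomes(text))
```

Keeping the patterns in a table means a new sentence structure needs only a new entry, and the order in which the figures appear no longer matters.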