Filtering rows based on a string with nested conditions

Posted 2021-10-17 15:15:10

I have rows of data in a CSV file, where the first stored row is the header. For example:

first line -> [a,b,c,d,e]

second line -> [0,1,2,1,2]

third line -> [4,2,4,1,5]

I also have a condition string related to the data, in the following format:

condition = (((a = d) OR (a = c)) AND (c < e))

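The output should be only the third line. How can I evaluate this condition and separate out all the nested sub-conditions? I was thinking of a recursive function that walks through the parentheses, but my code is a bit of a mess :(. Thanks for your answers, and sorry for my poor English!

PS: I don't want to use pandas or the csv library. PS2: The condition above is just an example; there may be other, more deeply nested conditions such as (((a = d) AND (c > e)) OR (b = c) AND (e < d)), or sometimes just (a = d).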

2 answers
User
Answer 1

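Since the requirements were updated/changed, it is better to post a new answer.

The quickest way to evaluate a condition given as a string is the built-in function eval. That way you don't have to do heavy, costly parsing yourself (lexical analysis, syntactic analysis).

Here is the sample code: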

condition1 = '(((a = d) OR (a = c)) AND (c < e))'

def evalCondition(condition, *args):
    '''
    1) If the condition already follows Python grammar, the replacement below is not needed.
    2) Assumes there is no '>=' or '<='; otherwise a more sophisticated replacement
       method is needed, e.g. using a regular expression.
    '''
    condition = condition.replace('=', '==').replace('OR', 'or').replace('AND', 'and')

    # Bind the five column values to the names used inside the condition string.
    a, b, c, d, e = args
    # eval() without explicit namespaces sees the local names a..e.
    return eval(condition)

with open('input.csv', 'r') as fi:
    # Python 3: the built-in filter() is lazy, like Python 2's itertools.ifilter.
    # list(map(...)) is needed because fields is unpacked here and printed again below.
    results = filter(
        lambda fields: evalCondition(condition1, *fields),
        (list(map(int, rawline.split(','))) for rawline in fi.readlines()[1:]))
    for fields in results:
        print(','.join(map(str, fields)))

With input.csv as:

a,b,c,d,e
0,1,2,1,2
4,2,4,1,5

The output is:

4,2,4,1,5
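
The docstring above warns that the chained str.replace calls break on '>=' and '<='. One possible way around this, sketched here under assumptions, is a regular expression with lookarounds that rewrites only a lone '='. The helper names to_python_expr and eval_condition are hypothetical and not part of the original answer, and the rows are inlined rather than read from input.csv to keep the example self-contained.

```python
import re

def to_python_expr(condition):
    """Translate the question's condition syntax into a Python expression."""
    # Replace a lone '=' with '==', leaving '>=', '<=', '!=' and '==' untouched.
    expr = re.sub(r'(?<![<>!=])=(?!=)', '==', condition)
    # Lowercase the boolean keywords; \b avoids touching longer names like 'ORDER'.
    expr = re.sub(r'\bOR\b', 'or', expr)
    expr = re.sub(r'\bAND\b', 'and', expr)
    return expr

def eval_condition(condition, row):
    """Evaluate the condition against a dict mapping column name -> value."""
    # An empty __builtins__ narrows what the expression can reach; it is NOT a sandbox.
    return bool(eval(to_python_expr(condition), {'__builtins__': {}}, row))

header = ['a', 'b', 'c', 'd', 'e']
rows = [[0, 1, 2, 1, 2], [4, 2, 4, 1, 5]]
cond = '(((a = d) OR (a = c)) AND (c <= e))'
matches = [r for r in rows if eval_condition(cond, dict(zip(header, r)))]
print(matches)  # [[4, 2, 4, 1, 5]]
```

Passing the row as a dict also removes the need to hardcode a, b, c, d, e. Even so, eval should only be used on condition strings from a trusted source.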
User
Answer 2

Here is a quick solution:

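def matches(fds):
    # (a = d OR a = c) AND (c < e), with columns a..e at indices 0..4
    return (fds[0] == fds[3] or fds[0] == fds[2]) and (fds[2] < fds[4])

with open('input.csv', 'r') as fi:
    # Pair each raw line with its parsed integer fields, skipping the header.
    lines = ((rawline, list(map(int, rawline.split(',')))) for rawline in fi.readlines()[1:])
    # Python 3 lambdas cannot unpack tuple parameters, so index the pair instead.
    results = filter(lambda pair: matches(pair[1]), lines)
    for (rawline, _) in results:
        print(rawline.rstrip('\n'))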

With input.csv as:

a,b,c,d,e
0,1,2,1,2
4,2,4,1,5

The output is:

4,2,4,1,5

Update: a shorter, more compact implementation:

with open('input.csv', 'r') as fi:
    # Python 3: built-in filter() replaces itertools.ifilter; list() makes fds indexable.
    results = filter(
        lambda fds: (fds[0] == fds[3] or fds[0] == fds[2]) and (fds[2] < fds[4]),
        (list(map(int, rawline.split(','))) for rawline in fi.readlines()[1:]))
    for fields in results:
        print(','.join(map(str, fields)))
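
The question asks about a recursive function that walks the parentheses. A minimal sketch of that idea, which avoids both eval and a hardcoded condition, might look like the following. It is only an illustration under assumptions: the function name evaluate is hypothetical, operands are assumed to be column names or integer literals, and AND is assumed to bind tighter than OR when both appear at the same nesting level (as in Python and SQL).

```python
import operator
import re

# Tokens: parentheses, comparison operators, and words (names, integers, AND/OR).
# Characters the pattern does not match (e.g. spaces) are silently skipped.
TOKEN = re.compile(r'\(|\)|<=|>=|!=|=|<|>|\w+')
OPS = {'=': operator.eq, '!=': operator.ne, '<': operator.lt,
       '>': operator.gt, '<=': operator.le, '>=': operator.ge}

def evaluate(condition, row):
    """Evaluate a condition string against row (dict: column name -> int)."""
    tokens = TOKEN.findall(condition)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError('unexpected end of condition')
        pos += 1
        return tok

    def operand():
        tok = take()
        return row[tok] if tok in row else int(tok)

    def factor():
        # Either a parenthesised sub-condition (recursion) or a single comparison.
        if peek() == '(':
            take()
            value = expr()
            if take() != ')':
                raise ValueError('expected )')
            return value
        left = operand()
        compare = OPS[take()]
        return compare(left, operand())

    def term():
        value = factor()
        while peek() == 'AND':
            take()
            right = factor()  # parse first so the tokens are always consumed
            value = value and right
        return value

    def expr():
        value = term()
        while peek() == 'OR':
            take()
            right = term()
            value = value or right
        return value

    result = expr()
    if peek() is not None:
        raise ValueError('unexpected token: ' + peek())
    return result

header = ['a', 'b', 'c', 'd', 'e']
rows = [[0, 1, 2, 1, 2], [4, 2, 4, 1, 5]]
cond = '(((a = d) OR (a = c)) AND (c < e))'
print([r for r in rows if evaluate(cond, dict(zip(header, r)))])  # [[4, 2, 4, 1, 5]]
```

Each nesting level of parentheses corresponds to one recursive call of expr, so deeper conditions need no extra code.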
