Python data extraction and search

for i in lines:
    for word in i:
        if word.find(':')!=-1:
            att = word.split(':', 1)[-1]
            idx = word.split(':', 1)[0]
            for j in lines:
                clas = j.split(' ', 1)[0]
                if word.find(':')!=-1:
                    if idx ==word.split(':', 1)[0]:
                        if att ==word.split(':', 1)[0]:
                            if clas>0:
                                ifattandyes = ifattandyes+1
                            else:
                                ifattandno = ifattandno+1

3 answers

User

Answer 1 · edited 2024-04-26 04:01:16

Here is a suggestion (assuming I understood the question correctly):

#!/usr/bin/env python3
from collections import defaultdict

positives = defaultdict(int)
negatives = defaultdict(int)

with open('data') as f:
    for line in f:
        # True when the line is labelled +1, False otherwise
        theclass = line[0:2] == '+1'
        for pair in line[2:].split():
            # a bool counts as 1 or 0 when added to an int
            positives[pair] += theclass
            negatives[pair] += not theclass

for key in positives:
    print(key, "\t+1:", positives[key], "\t-1:", negatives[key])

Applied to the following data:

$ cat data
+1 1:4 2:11 3:3 4:11 5:1 6:13 7:4 8:2 9:2 10:13
-1 1:2 2:7 3:4 4:12 5:3 6:4 7:3 8:12 9:2 10:12
+1 1:4 2:6 3:3 4:2 5:3 6:5 7:4 8:2 9:3 10:6

it prints the following (the order of the lines follows dict iteration order, so it may differ):

$ python t.py 
9:2     +1: 1   -1: 1
9:3     +1: 1   -1: 0
8:2     +1: 2   -1: 0
10:6    +1: 1   -1: 0
6:13    +1: 1   -1: 0
10:13   +1: 1   -1: 0
10:12   +1: 0   -1: 1
2:7     +1: 0   -1: 1
2:6     +1: 1   -1: 0
4:11    +1: 1   -1: 0
4:12    +1: 0   -1: 1
4:2     +1: 1   -1: 0
1:2     +1: 0   -1: 1
1:4     +1: 2   -1: 0
3:3     +1: 2   -1: 0
5:1     +1: 1   -1: 0
3:4     +1: 0   -1: 1
5:3     +1: 1   -1: 1
8:12    +1: 0   -1: 1
7:4     +1: 2   -1: 0
7:3     +1: 0   -1: 1
2:11    +1: 1   -1: 0
6:5     +1: 1   -1: 0
6:4     +1: 0   -1: 1
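
If a stable, readable order is preferred over dict order, one option is to sort the pairs numerically by index and then by value. The following is only a minimal sketch: the function names `count_pairs` and `sort_key` are made up for illustration, the sample lines are a shortened version of the data above, and it assumes both halves of every `index:value` pair are integers.

```python
from collections import defaultdict


def count_pairs(lines):
    """Return (positives, negatives) counts for each 'index:value' pair."""
    positives = defaultdict(int)
    negatives = defaultdict(int)
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        is_positive = parts[0] == '+1'
        for pair in parts[1:]:
            # a bool counts as 1 or 0 when added to an int
            positives[pair] += is_positive
            negatives[pair] += not is_positive
    return positives, negatives


def sort_key(pair):
    """Sort '10:6' after '2:7' by comparing the numbers, not the strings."""
    index, value = pair.split(':', 1)
    return int(index), int(value)  # assumption: both parts are integers


# Shortened sample in the same format as the data file above.
sample = [
    "+1 1:4 2:11 3:3",
    "-1 1:2 2:7 3:4",
    "+1 1:4 2:6 3:3",
]

positives, negatives = count_pairs(sample)
for pair in sorted(positives, key=sort_key):
    print(pair, "\t+1:", positives[pair], "\t-1:", negatives[pair])
```

Reading from a list of strings rather than a file keeps the sketch self-contained; an open file object can be passed to `count_pairs` in the same way.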

User

Answer 2 · edited 2024-04-26 04:01:16

I'm not sure I've got this right.

# input_file is assumed to be an open file (or any iterable of lines);
# the snippet itself does not define it.
tot_up = {}
tot_dn = {}
for line in input_file:
    parts = line.split()   # split on whitespace
    up_or_down = parts[0]
    parts = parts[1:]
    # pick the dict that matches the line's label
    if up_or_down == '-1':
        store = tot_dn
    else:
        store = tot_up
    for part in parts:
        store[part] = store.get(part, 0) + 1
print("Total +1s: ", sum(tot_up.values()))
print("Total -1s: ", sum(tot_dn.values()))

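What this does not do, but could easily be made to do, is strip out the attribute:value pairs for which no match was found.

But I'm not sure I understood your requirements correctly.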
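
The stripping step mentioned in this answer is not spelled out, so the following is only one possible reading: a pair counts as "matched" when it appears under both labels, and everything else is dropped. The helper names `split_counts` and `matched_only` are hypothetical, and the sample lines are the three lines of the data file shown in the first answer.

```python
def split_counts(lines):
    """Count each pair separately for +1 lines and -1 lines."""
    tot_up, tot_dn = {}, {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        store = tot_dn if parts[0] == '-1' else tot_up
        for part in parts[1:]:
            store[part] = store.get(part, 0) + 1
    return tot_up, tot_dn


def matched_only(tot_up, tot_dn):
    """Keep only the pairs seen under both labels (assumed meaning of 'match')."""
    shared = tot_up.keys() & tot_dn.keys()
    return ({k: tot_up[k] for k in shared},
            {k: tot_dn[k] for k in shared})


sample = [
    "+1 1:4 2:11 3:3 4:11 5:1 6:13 7:4 8:2 9:2 10:13",
    "-1 1:2 2:7 3:4 4:12 5:3 6:4 7:3 8:12 9:2 10:12",
    "+1 1:4 2:6 3:3 4:2 5:3 6:5 7:4 8:2 9:3 10:6",
]

up, dn = matched_only(*split_counts(sample))
for pair in sorted(up):
    print(pair, "+1:", up[pair], "-1:", dn[pair])
```

With this data only `5:3` and `9:2` survive, which agrees with the two rows in the first answer's output that have non-zero counts on both sides.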

User

Answer 3 · edited 2024-04-26 04:01:16

I'm making this a community wiki because it is too close to the answers already posted (in spirit, anyway), but it has some advantages:

from collections import Counter

with open("datafile.dat") as fp:
    counts = {}
    for line in fp:
        parts = line.split()
        sign, keys = parts[0], parts[1:]
        # one Counter per distinct label
        counts.setdefault(sign, Counter()).update(keys)

all_keys = set().union(*counts.values())
for key in sorted(all_keys):
    print('{:8}'.format(key), end=' ')
    print(' '.join('{}: {}'.format(c, counts[c].get(key, 0)) for c in counts))

which produces

10:12    +1: 0 -1: 1
10:13    +1: 1 -1: 0
10:6     +1: 1 -1: 0
1:2      +1: 0 -1: 1
1:4      +1: 2 -1: 0
[etc.]

Note that there is no reference to +1 or -1 anywhere; sign can be anything.
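
To illustrate that last point, here is a small sketch of the same Counter-based approach run on lines whose labels are arbitrary words. The function name `count_by_label` and the labels `spam`, `ham`, and `eggs` are invented for the example and do not come from the original data.

```python
from collections import Counter


def count_by_label(lines):
    """Map each label (first token of a line) to a Counter of its pairs."""
    counts = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        label, keys = parts[0], parts[1:]
        counts.setdefault(label, Counter()).update(keys)
    return counts


# Hypothetical input: three different labels instead of +1 / -1.
sample = [
    "spam 1:4 2:11",
    "ham 1:4 2:7",
    "eggs 2:7",
    "spam 1:4",
]

counts = count_by_label(sample)
all_keys = set().union(*counts.values())
for key in sorted(all_keys):
    row = ' '.join('{}: {}'.format(c, counts[c].get(key, 0)) for c in counts)
    print('{:8}'.format(key), row)
```

Each label gets its own column automatically, in the order the labels first appear in the input.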
