How to increment a tuple value in Python and search for a string in a loop

Posted 2024-06-12 00:48:26


I have this code.

arfffile = []

inputed = raw_input("Enter Evaluation for name including file extension...")

reader = open(inputed, 'r')

verses = []

for line in reader:
    verses.append(line)

for line in verses:
    if line.split('@') == "@":
        verses.pop(line)


numclusters = int(raw_input("Enter the number of clusters"))

clusters = {}

for i in range(1,numclusters+1):
    clusters["cluster"+str(i)] = 0



print clusters
 # If verse belongs to a cluster, increment the cluster count by one in the dictionary value.
for verse in verses:
    for k in clusters:
        if k in verse:
            clusters[k] += 1
        else:
            print "not in"

print clusters

yeslist = []

for verse in verses:
    for k in clusters:
        if k not in yeslist:
            yeslist.append((k,0))
        elif k in yeslist:
            print "already in" + k


for verse in verses:
    for k in clusters:
        if k in verse and "Yes" in verse:
            yeslist.append(yeslist.index(k), +1)


    # iterate through dictionary and iterate through the lines
    # need to read in file line by line, 



    # if "yes" and cluster x increment cluster 
    # need to work out percentage of positive verses in each cluster.

An example of the ARFF file:

[sample file not preserved]

As it stands, the program reads in data lines such as

0,1,0,0,0,0,0,0,0,1,1,No,cluster3
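
A row like this can be split on commas to recover its label and cluster. The following is a minimal sketch for Python 3, assuming (as in the sample row) that the last field is the cluster name and the field before it is the Yes/No label. parse_row is a hypothetical helper name, not part of the original code.

```python
def parse_row(row):
    """Return (cluster, label) from one comma-separated data row.

    Assumption: the last field is the cluster name and the field
    before it is the Yes/No label, as in the sample row.
    """
    fields = row.strip().split(",")
    return fields[-1], fields[-2]


cluster, label = parse_row("0,1,0,0,0,0,0,0,0,1,1,No,cluster3")
print(cluster, label)
```

Reading the two fields by position avoids searching the whole line for a substring.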

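I created a dictionary to track how many clusters there are in the data file. In this example there are three: cluster1, cluster2 and cluster3. The code adds each cluster name, as a string, as a key in the dictionary "clusters".
I then loop over all the verses and, for each line, count which cluster it belongs to.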
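
The per-cluster count described above could also be sketched as follows (Python 3). The rows are hypothetical sample data in the same shape as the data line shown earlier, and the sketch assumes the cluster name is always the last field. It compares the exact field rather than testing for a substring, because a test like "cluster1" in verse would also match a row that ends in "cluster10".

```python
from collections import Counter

# Hypothetical sample rows in the same shape as the data line shown earlier.
verses = [
    "0,1,0,0,0,0,0,0,0,1,1,No,cluster3",
    "1,0,0,0,0,0,0,0,0,1,0,Yes,cluster1",
    "0,0,1,0,0,0,0,0,0,0,1,Yes,cluster10",
]

# Count on the exact last field of each row, not on a substring match.
clusters = Counter(verse.strip().split(",")[-1] for verse in verses)
print(clusters)
```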

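My next step is to count, for each cluster, how many lines contain "Yes". So if, say, 10 of the lines in the data contain "Yes", the code should be able to count those occurrences.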

The code I have so far is:

for verse in verses:
    for k in clusters:
        if k in verse and "Yes" in verse:
            yeslist.append(yeslist.index(k), +1)

I am building a list of tuples called "yeslist", with values such as [(cluster1,0),(cluster2,3)]

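So for each line (a string), check whether it contains "Yes"; if it does, check which cluster it belongs to and increase that cluster's tuple value by one.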

I can't work out how to do this... Can anyone help?

Thanks.


1 answer
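import collections

inputed = raw_input("Enter Evaluation for name including file extension...")

# Keep only data rows: skip blank lines and ARFF header lines starting with '@'.
with open(inputed, 'r') as reader:
    stripped = (line.strip() for line in reader)
    verses = [line for line in stripped if line and not line.startswith('@')]

cluster_count = collections.defaultdict(int)
yes_count = collections.defaultdict(int)

# Each row ends with "...,<Yes|No>,<cluster>"; keep (cluster, label).
verse_infos = [(split_verse[-1], split_verse[-2]) for split_verse
               in (verse.split(",") for verse in verses)]

for verse in verse_infos:
    cluster_count[verse[0]] += 1
    # The data uses a capitalised "Yes"/"No" label.
    if verse[1] == 'Yes':
        yes_count[verse[0]] += 1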

You end up with two dictionaries, cluster_count and yes_count, each keyed by cluster name.
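
The comments in the question mention working out the percentage of positive verses in each cluster. With the two dictionaries that could look roughly like the sketch below (Python 3). The counts are hypothetical and only mirror the shape of cluster_count and yes_count; yes_percent is an illustrative name.

```python
# Hypothetical counts in the shape of cluster_count and yes_count.
cluster_count = {"cluster1": 4, "cluster2": 5, "cluster3": 1}
yes_count = {"cluster1": 1, "cluster3": 1}

# Percentage of "Yes" rows per cluster; clusters with no "Yes" rows get 0.0.
yes_percent = {
    name: 100.0 * yes_count.get(name, 0) / total
    for name, total in cluster_count.items()
}
print(yes_percent)
```

Every key in cluster_count was counted at least once, so the division never sees a zero total.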

If you really need a list of tuples:

yes_tuples = sorted(yes_count.iteritems())
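
Tuples are immutable, so a (cluster, count) pair cannot be incremented in place; it is simpler to count in a dictionary and build the tuples once at the end. The sketch below (Python 3, so items-style access instead of iteritems) uses hypothetical counts and also lists clusters that have no "Yes" rows, which a list built from yes_count alone would omit.

```python
from collections import defaultdict

# Hypothetical counts in the shape of the answer's two dictionaries.
cluster_count = {"cluster1": 4, "cluster2": 5, "cluster3": 1}
yes_count = defaultdict(int, {"cluster1": 1, "cluster3": 1})

# Build the (cluster, yes-count) tuples once, after counting is done.
# .get() is used so that missing clusters are not inserted into the defaultdict.
yeslist = [(name, yes_count.get(name, 0)) for name in sorted(cluster_count)]
print(yeslist)
```

Note that sorted orders the names as strings, so "cluster10" would come before "cluster2".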
