Python 人气云无法工作
我做了一个人气云,但它运行得不太好。我的文本文件是这样的:
1 Top Gear
3 Scrubs
3 The Office (US)
5 Heroes
5 How I Met Your Mother
5 Legend of the Seeker
5 Scrubs
.....
在我的人气云中,名字会根据出现的次数来显示。比如,《寻道者传奇》这个名字出现了5次,所以它的大小会变大。每个词本来应该只出现一次,大小应该根据它的人气数字(5)来决定。但每个词应该只出现一次,大小也应该根据它的人气来调整。我该怎么修复这个问题呢?
另外,我的程序还应该满足以下条件:
出现频率相同的词通常会显示成同一种颜色,比如高尔夫和空手道。不同频率的词则会显示成不同的颜色,比如篮球、板球和冰球。在每个云的底部,应该用显示这些词的颜色来输出它们的频率/数量。
我的代码在这里:
#!/usr/bin/python
import string
def main():
# get the list of tags and their frequency from input file
taglist = getTagListSortedByFrequency('tv.txt')
# find max and min frequency
ranges = getRanges(taglist)
# write out results to output, tags are written out alphabetically
# with size indicating the relative frequency of their occurence
writeCloud(taglist, ranges, 'tv.html')
def getTagListSortedByFrequency(inputfile):
inputf = open(inputfile, 'r')
taglist = []
while (True):
line = inputf.readline()[:-1]
if (line == ''):
break
(count, tag) = line.split(None, 1)
taglist.append((tag, int(count)))
inputf.close()
# sort tagdict by count
taglist.sort(lambda x, y: cmp(x[1], y[1]))
return taglist
def getRanges(taglist):
mincount = taglist[0][1]
maxcount = taglist[len(taglist) - 1][1]
distrib = (maxcount - mincount) / 4;
index = mincount
ranges = []
while (index <= maxcount):
range = (index, index + distrib-1)
index = index + distrib
ranges.append(range)
return ranges
def writeCloud(taglist, ranges, outputfile):
outputf = open(outputfile, 'w')
outputf.write("<style type=\"text/css\">\n")
outputf.write(".smallestTag {font-size: xx-small;}\n")
outputf.write(".smallTag {font-size: small;}\n")
outputf.write(".mediumTag {font-size: medium;}\n")
outputf.write(".largeTag {font-size: large;}\n")
outputf.write(".largestTag {font-size: xx-large;}\n")
outputf.write("</style>\n")
rangeStyle = ["smallestTag", "smallTag", "mediumTag", "largeTag", "largestTag"]
# resort the tags alphabetically
taglist.sort(lambda x, y: cmp(x[0], y[0]))
for tag in taglist:
rangeIndex = 0
for range in ranges:
url = "http://www.google.com/search?q=" + tag[0].replace(' ', '+') + "+site%3Asujitpal.blogspot.com"
if (tag[1] >= range[0] and tag[1] <= range[1]):
outputf.write("<span class=\"" + rangeStyle[rangeIndex] + "\"><a href=\"" + url + "\">" + tag[0] + "</a></span> ")
break
rangeIndex = rangeIndex + 1
outputf.close()
if __name__ == "__main__":
main()