Answers to: Coursera Python final project sentiment classifier


<p>Finally, copy your previous functions and write code that opens the file project_twitter_data.csv, which contains fake generated Twitter data (the text of each tweet, the number of retweets of that tweet, and the number of replies to that tweet). Your task is to build a sentiment classifier that detects how positive or negative each tweet is. Copy the code from the code windows above and put it at the top of this code window. Now you will write code to create a CSV file called resulting_data.csv, which contains the Number of Retweets, Number of Replies, Positive Score (how many happy words are in the tweet), Negative Score (how many angry words are in the tweet), and Net Score (how positive or negative the text is overall) for each tweet. The file should have those headers in that order. Remember that there is another component to this project. You will upload the CSV file to Excel or Google Sheets and produce a graph of the Net Score vs Number of Retweets. If you are accessing this textbook from Coursera, please see the assignment section on Coursera.</p>
<p>I need help answering this question. I have been stuck on it for about a week. Please help, this is the final project.</p>
<pre><code>punctuation_chars = ["'", '"', ",", ".", "!", ":", ";", '#', '@']

def strip_punctuation(a):
    for x in punctuation_chars:
        if x in a:
            a = a.replace(x,"")
    return(a)

positive_words = []
with open("positive_words.txt") as pos_f:
    for lin in pos_f:
        if lin[0] != ';' and lin[0] != '\n':
            positive_words.append(lin.strip())

def get_pos(c):
    pos = 0
    b = c.lower()
    b = strip_punctuation(b)
    lst = b.split(" ")
    for i in positive_words:
        for j in lst:
            if i == j:
                pos+=1
    return pos

negative_words = []
with open("negative_words.txt") as pos_f:
    for lin in pos_f:
        if lin[0] != ';' and lin[0] != '\n':
            negative_words.append(lin.strip())

def get_neg(c):
    neg = 0
    b = c.lower()
    b = strip_punctuation(b)
    lst = b.split(" ")
    for i in negative_words:
        for j in lst:
            if i == j:
                neg+=1
    return neg

file = open("project_twitter_data.csv", "r")
e = file.read()
nega = posi = 0
for f in e:
    nega += get_neg(f)
    negat = nega*-1
    posi += get_pos(f)
negat = nega*-1
ne = str(nega)
po = str(posi)
net = posi + negat
netd = str(net)
filer = open('resulting_data.csv','w')
result = filer.write('Number of Retweets, Number of Replies, Positive Score, Negtive Score, Net Score\n')
result = filer.write('0, 0, ' + ne +', ' + po +", " + netd + '\n')
</code></pre>
<p>This is all I could come up with. I can't use import csv here; it isn't allowed.</p>
<p>Some positive words:</p>
<p/><div class="snippet" data-lang="js" data-hide="false" data-console="true" data-babel="false"> <div class="snippet-code"> <pre class="snippet-code-html lang-html prettyprint-override"><code>a+
abound
abounds
abundance
abundant
accessable
accessible
acclaim
acclaimed
acclamation</code></pre>
</div> </div> These words are stored in the file positive_words.txt. Some negative words:
<p/><div class="snippet" data-lang="js" data-hide="false" data-console="true" data-babel="false"> <div class="snippet-code"> <pre class="snippet-code-html lang-html prettyprint-override"><code>2-faced
2-faces
abnormal
abolish
abominable
abominably
abominate
abomination
abort</code></pre> </div> </div>
<p>These words are stored in negative_words.txt. The Twitter data:</p>
<p/><div class="snippet" data-lang="js" data-hide="false" data-console="true" data-babel="false"> <div class="snippet-code"> <pre class="snippet-code-html lang-html prettyprint-override"><code>tweet_text,retweet_count,reply_count
@twitteruser: On now - @Fusion scores first points #FirstFinals @overwatchleague @umich @umsi Michigan Athletics made out of emojis. #GoBlue,3,0
BUNCH of things about crisis respons… available July 8th… scholarship focuses on improving me… in North America! A s… and frigid temperatures,1,0
FREE ice cream with these local area deals: chance to </code></pre> </div> </div>
<p>Also, after this I have to save the results in a CSV file.</p>

1 Answer

Anonymous, 1 day ago

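<p>Thanks for updating your question. First, I would define an entry point for the program, such as <code>main</code>. Then do a preliminary (very simple) parse of the CSV. This just prints some information about each entry in the CSV to verify that we are parsing it correctly:</p>
<pre><code>def main():

    with open("project_twitter_data.csv", "r") as file:
        # Skip the first line (the header row)
        next(file)
        for tweet, retweet_count, reply_count in map(lambda line: line.strip().split(","), file):
            print(f"tweet: {tweet[:20]}...\nretweet_count: {retweet_count}\nreply_count: {reply_count}\n")

if __name__ == "__main__":
    main()
</code></pre>
<p>Output:</p>
<pre><code>tweet: @twitteruser: On now...
retweet_count: 3
reply_count: 0

tweet: BUNCH of things abou...
retweet_count: 1
reply_count: 0

>>>
</code></pre>
<p>My CSV file only has two entries, but this should work for any number of entries (as long as the tweets contain no commas).</p>
<p>Next, you need to load your positive and negative words. I'm assuming the files aren't too large, so you can read all the words into lists. There are many ways to count the positive and negative words in each tweet. For example, you could split the current tweet on whitespace to get a list of "words". I say "words" because technically these strings may contain punctuation, so you would have to account for that somehow. Another approach is to use a regular expression with word boundaries to produce a list of words from the current tweet. What I've done below is just look for substrings in the current tweet, which is a bit naive (for example, <code>abound</code> is also counted inside <code>abounds</code>). Unless there is a unit test that deliberately checks that you didn't use this approach, it should be good enough.</p>
<pre><code>def main():

    # Skip blank lines and ';' comment lines in the word files;
    # an empty string would otherwise match everywhere in tweet.count().
    with open("positive_words.txt", "r") as file:
        positive_words = [w for w in file.read().splitlines() if w and not w.startswith(";")]

    with open("negative_words.txt", "r") as file:
        negative_words = [w for w in file.read().splitlines() if w and not w.startswith(";")]

    with open("project_twitter_data.csv", "r") as file:
        # Skip the first line (the header row)
        next(file)
        for tweet, retweet_count, reply_count in map(lambda line: line.strip().split(","), file):
            # The sample word lists are lowercase, so compare against lowercase text
            text = tweet.lower()
            positive_count = sum(text.count(word) for word in positive_words)
            negative_count = sum(text.count(word) for word in negative_words)
            net_count = positive_count - negative_count
            # Write retweet_count, reply_count, positive_count, negative_count and net_count to resulting_data.csv

if __name__ == "__main__":
    main()
</code></pre>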
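<p>To make the remaining steps concrete, here is a minimal sketch of one way to finish the job without <code>import csv</code>. It is not the approach from the answer above: it swaps the substring search for whole-token matching (a simple regular expression that produces lowercase tokens, then a set lookup), and it splits each row from the right so that commas inside a tweet do not break parsing. The names <code>load_words</code>, <code>count_hits</code>, <code>score_lines</code>, <code>format_csv</code>, <code>TOKEN_RE</code> and <code>HEADER</code> are hypothetical, and the sketch assumes that the last two columns are always integers and that the file uses no CSV quoting.</p>

```python
import re

# Header text taken from the question's code (with "Negative" spelled out).
HEADER = "Number of Retweets, Number of Replies, Positive Score, Negative Score, Net Score"

# Assumption: a token is a run of lowercase letters, digits, '+' or '-',
# so lexicon entries such as "a+" and "2-faced" can still match.
TOKEN_RE = re.compile(r"[a-z0-9+-]+")


def load_words(lines):
    """Build a set of words, skipping blank lines and ';' comment lines."""
    return {line.strip() for line in lines
            if line.strip() and not line.startswith(";")}


def count_hits(tweet, lexicon):
    """Count how many tokens of the lowercased tweet appear in the lexicon."""
    return sum(1 for token in TOKEN_RE.findall(tweet.lower()) if token in lexicon)


def score_lines(csv_lines, positive_words, negative_words):
    """Return (retweets, replies, positive, negative, net) for each data row."""
    lines = iter(csv_lines)
    next(lines, None)  # skip the header row
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore empty lines
        # Split from the right: only the last two commas separate fields,
        # so commas inside the tweet text are left alone.
        tweet, retweets, replies = line.rsplit(",", 2)
        positive = count_hits(tweet, positive_words)
        negative = count_hits(tweet, negative_words)
        rows.append((int(retweets), int(replies), positive, negative,
                     positive - negative))
    return rows


def format_csv(rows):
    """Render the header and the rows as the text of resulting_data.csv."""
    out = [HEADER]
    out.extend(", ".join(str(value) for value in row) for row in rows)
    return "\n".join(out) + "\n"


def main():
    # File names are the ones given in the question.
    with open("positive_words.txt") as f:
        positive_words = load_words(f)
    with open("negative_words.txt") as f:
        negative_words = load_words(f)
    with open("project_twitter_data.csv") as f:
        rows = score_lines(f, positive_words, negative_words)
    with open("resulting_data.csv", "w") as f:
        f.write(format_csv(rows))
```

<p>Calling <code>main()</code> in the directory that holds the three input files should produce resulting_data.csv. Whole-token matching will generally give lower counts than substring counting, so if the grader expects the numbers from a particular method, the scores may need to be computed the same way.</p>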
