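In my scenario, I am trying to use AWS Lambda Python code to get the word count and the language of a specific text file stored in AWS S3.
I tried the code below, which so far only counts the lines: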
import boto3

def lambda_handler(event, context):
    # create the s3 resource
    s3 = boto3.resource('s3')
    # get the file object
    obj = s3.Object('bucket name', 'sample.txt')
    # read the file contents into memory; read() returns bytes on Python 3,
    # so decode them to text (assumes the file is UTF-8 encoded)
    file_contents = obj.get()["Body"].read().decode('utf-8')
    # count the occurrences of the newline character to get the number of lines
    return {
        'Line Count': file_contents.count('\n')
    }
Current Response:
{ "Line Count": 48 }

Expected Response:
{
    "Line Count": 48,
    "Word Count": ?,   // Here I want to show the word count
    "Language": ?      // Here the language name
}
To get the word count, you can try any of the approaches listed here: How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?
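As a rough illustration of that idea, here is a minimal sketch using only the standard `re` module. The name `count_words` and the exact pattern are my own assumptions, not taken from the linked question: a "word" here is a run of letters, optionally joined by an apostrophe or hyphen, so numbers and punctuation are ignored.

```python
import re

# Assumed definition of a word: one or more letters (Unicode-aware),
# optionally joined by an apostrophe or hyphen, so "it's" and "re-run"
# each count once. Digits and underscores are excluded.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")

def count_words(text):
    """Return the number of words in text, ignoring numbers and punctuation."""
    return len(WORD_RE.findall(text))

print(count_words("Hello, world! 123 it's 4 you."))  # prints 4
```

In the Lambda handler you could then return `count_words(file_contents)` under a 'Word Count' key, provided `file_contents` is text (a `str`) and not raw bytes.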
To detect the language, you can try the approaches listed here: NLTK and language detection
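One common idea for this is to compare the text against lists of stopwords for each candidate language and pick the language with the most matches. The sketch below is a toy version of that idea using only the standard library. The stopword lists are tiny and hand-written for illustration, and `detect_language` is a hypothetical helper name; a real solution would use much fuller lists (for example from an NLP library) and would still be unreliable on short texts.

```python
import re

# Hypothetical, deliberately tiny stopword lists, for illustration only.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "that", "on", "for"},
    "french": {"le", "la", "les", "et", "est", "de", "un", "une", "il", "sur", "dans"},
    "german": {"der", "die", "das", "und", "ist", "ein", "eine", "nicht", "auf", "mit"},
    "spanish": {"el", "la", "los", "y", "es", "de", "un", "una", "en", "que"},
}

def detect_language(text):
    """Guess the language by counting stopword hits; return None if nothing matches."""
    # Lowercase the text and split it into runs of letters.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by how many tokens are in its stopword list.
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(detect_language("the cat is on the table and it is happy"))
print(detect_language("le chat est sur la table et il est content"))
```

The result could be returned under a 'Language' key in the Lambda response. Languages with shared stopwords (such as "la" in French and Spanish) can tie or be confused on short inputs, which is one reason this task is harder than it looks.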
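Unfortunately, your question is rather broad. Also, detecting the language of a text is a fairly hard task. Counting words is easy, but the result depends heavily on how you define a word.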
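To make that last point concrete, here is a small sketch that counts the same sentence under three different definitions of a word. The sample sentence and the variable names are made up for illustration.

```python
import re

sample = "Order 66: don't re-run!"

# Definition 1: anything separated by whitespace is a word.
by_whitespace = sample.split()
# Definition 2: runs of letters, digits and underscores.
by_word_chars = re.findall(r"\w+", sample)
# Definition 3: runs of letters only (numbers are not words).
by_letters = re.findall(r"[^\W\d_]+", sample)

print(len(by_whitespace), len(by_word_chars), len(by_letters))  # prints 4 6 5
```

The same text gives 4, 6 or 5 words depending on whether "66" counts and whether "don't" and "re-run" are one word or two, so it is worth deciding on the definition before writing the counter.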