A driver for LIWC2015 analysis. The LIWC2015 dictionary is not included.
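LIWC Analysis
This package is a driver for the liwc2015.txt dictionary. The dictionary is not included; it can be purchased directly from LIWC.
Usage
Usage is fairly straightforward. First, import the package: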
import liwcanalysis
Next, create an instance of liwcanalysis.liwc, passing it the path to the dictionary .txt file:
LIWCLocation = "/Users/Eric/repositories/transcript-analysis/LIWC/LIWC.2015.all.txt"
LIWC = liwcanalysis.liwc(LIWCLocation)
You can then pass in a list of strings to analyze. analyze() returns a tuple of result dictionaries and count dictionaries:
transcripts = {
    "Example1": "This is a single transcript. Red hat angry.",
    "Example2": "This is another single transcript. Dog boy cat.",
}
str_list = []
for key in transcripts:
    str_list.append(transcripts[key])
result_dics, count_dics = LIWC.analyze(str_list)
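Note that analyze() accepts either a single string or a list of strings. For example:
# this is valid
result_dics, count_dics = LIWC.analyze(["this is a string", "here is another", "one more"])
# this is also valid
result_dics, count_dics = LIWC.analyze("this is a string")
result_dics is a list of dictionaries. Each dictionary corresponds to one of the strings passed to analyze() and has the form "LIWC Category": [list, of, words, matched]. For example, the dictionary for one string might look like: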
{
"FUNCTION": ["is", "a"],
"QUANT": ["single"],
...
}
count_dics is very similar to result_dics, but instead of the list of matched words it gives the length of each list:
{
"FUNCTION": 2,
"QUANT": 1,
...
}
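The relationship between the two structures can be sketched in a few lines. This is only an illustration of the description above, not the package's own code; the helper name counts_from_results and the sample data are hypothetical.

```python
def counts_from_results(result_dic):
    """Map each category to the number of words matched for it."""
    return {category: len(words) for category, words in result_dic.items()}


# Hand-written sample shaped like one entry of result_dics (not real LIWC output).
sample_result = {"FUNCTION": ["is", "a"], "QUANT": ["single"]}
sample_counts = counts_from_results(sample_result)
print(sample_counts)  # {'FUNCTION': 2, 'QUANT': 1}
```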
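Finally, you can print the results with:
LIWC.print(output_dir, titles)
You need to specify the output directory and a list of titles, one per string. See the full example for more details.
You can also call LIWC.get_categories() to retrieve the list of LIWC categories sorted alphabetically (a -> z).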
Full example
import liwcanalysis

transcripts = {
    "Example1": "This is a single transcript. Red hat angry.",
    "Example2": "This is another single transcript. Dog boy cat.",
}
# Collect the transcript texts in the same order as their titles
str_list = []
for key in transcripts:
    str_list.append(transcripts[key])

LIWCLocation = "/Users/Downloads/LIWC/LIWC.2015.all.txt"
output_dir = "/Path/to/my/file/"

LIWC = liwcanalysis.liwc(LIWCLocation)
result_dics, count_dics = LIWC.analyze(str_list)
# Use the dictionary keys as the column titles
LIWC.print(output_dir, list(transcripts.keys()))
Calling print produces the following tables.
/Path/to/my/file/liwccounts.csv:
Category | Example1 | Example2 |
---|---|---|
ADJ | 1 | 1 |
ARTICLE | 1 | |
AUXVERB | 1 | 1 |
FOCUSPRESENT | 1 | 1 |
FUNCTION | 2 | 2 |
IPRON | | 1 |
MALE | | 1 |
NUMBER | 1 | 1 |
PRONOUN | | 1 |
QUANT | 1 | 2 |
SOCIAL | | 1 |
VERB | 1 | 1 |
WORK | 1 | 1 |
TOTAL | 8 | 8 |
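A table of this shape can be assembled from a list of count dictionaries with the standard csv module. The sketch below is an assumption about how such a layout could be produced, not the package's implementation: counts_table and the sample data are hypothetical, the TOTAL row is left out, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io


def counts_table(count_dics, titles):
    """Build rows for a Category x title table; missing categories stay blank."""
    # Union of all category names across the dictionaries, sorted a -> z.
    categories = sorted(set().union(*count_dics))
    rows = [["Category"] + list(titles)]
    for category in categories:
        rows.append([category] + [d.get(category, "") for d in count_dics])
    return rows


# Hand-written sample counts (not real LIWC output).
sample_counts = [{"ARTICLE": 1, "QUANT": 1}, {"IPRON": 1, "QUANT": 2}]
rows = counts_table(sample_counts, ["Example1", "Example2"])

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
print(buffer.getvalue())
```

A category that a transcript never matched shows up as an empty cell, which is how the blank entries in the table above can be read.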
/Path/to/my/file/liwcwords.csv:
Category | Example1 | Example2 |
---|---|---|
ADJ | ['single'] | ['single'] |
ARTICLE | ['a'] | |
AUXVERB | ['is'] | ['is'] |
FOCUSPRESENT | ['is'] | ['is'] |
FUNCTION | ['is', 'a'] | ['is', 'another'] |
IPRON | | ['another'] |
MALE | | ['boy'] |
NUMBER | ['single'] | ['single'] |
PRONOUN | | ['another'] |
QUANT | ['single'] | ['another', 'single'] |
SOCIAL | | ['boy'] |
VERB | ['is'] | ['is'] |
WORK | ['transcript.'] | ['transcript.'] |
/Path/to/my/file/liwcrelativerefreq.csv:
Category | Example1 | Example2 |
---|---|---|
ADJ | 0.125 | 0.125 |
ARTICLE | 0.125 | |
AUXVERB | 0.125 | 0.125 |
FOCUSPRESENT | 0.125 | 0.125 |
FUNCTION | 0.25 | 0.25 |
IPRON | | 0.125 |
MALE | | 0.125 |
NUMBER | 0.125 | 0.125 |
PRONOUN | | 0.125 |
QUANT | 0.125 | 0.25 |
SOCIAL | | 0.125 |
VERB | 0.125 | 0.125 |
WORK | 0.125 | 0.125 |
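The values in this table are consistent with dividing each category count by the transcript's total word count (8 for both examples). A minimal sketch of that arithmetic follows; it assumes words are split on whitespace, which matches the TOTAL row but is not confirmed by the package, and relative_frequencies is a hypothetical helper.

```python
def relative_frequencies(count_dic, total_words):
    """Divide each category count by the total number of words."""
    if total_words <= 0:
        return {}
    return {category: count / total_words for category, count in count_dic.items()}


sample_text = "This is a single transcript. Red hat angry."
total = len(sample_text.split())  # 8 whitespace-separated tokens (assumed tokenization)

# Hand-written sample counts (not real LIWC output).
sample_counts = {"FUNCTION": 2, "QUANT": 1}
print(relative_frequencies(sample_counts, total))  # {'FUNCTION': 0.25, 'QUANT': 0.125}
```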
If you have any questions or feature requests, please let me know. Feel free to open a pull request, file an issue, or send me an email at ericwiener3@gmail.com.