ArabicTokenizer supports various orthographic normalization options that can be configured
in ArabicSegmenter using the -orthoOptions flag. The argument to -orthoOptions is a comma-separated list of
normalization options. The following options are supported:
...
removeDiacritics : Strip all diacritics
removeTatweel : Strip tatweel elongation character
removeQuranChars : Remove diacritics that appear in the Quran
Source: GitHub
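To make the effect of the options quoted above concrete, here is a minimal Python sketch that mimics three of them on a plain string. It is not the ArabicTokenizer implementation: the `normalize` helper, the `NORMALIZERS` table, and the Unicode ranges chosen for each option are assumptions made for illustration, and the real tokenizer may define its character classes differently.

```python
import re

# Assumed character classes; the tokenizer's actual definitions may differ.
DIACRITICS = re.compile("[\u064B-\u0652]")   # fathatan through sukun
TATWEEL = re.compile("\u0640")               # tatweel (elongation) character
QURAN_CHARS = re.compile("[\u06D6-\u06ED]")  # Quranic annotation signs (assumed range)

# Hypothetical mapping from option name to a normalization function.
NORMALIZERS = {
    "removeDiacritics": lambda s: DIACRITICS.sub("", s),
    "removeTatweel": lambda s: TATWEEL.sub("", s),
    "removeQuranChars": lambda s: QURAN_CHARS.sub("", s),
}


def normalize(text, ortho_options):
    """Apply each option named in a comma-separated list, in order."""
    for name in ortho_options.split(","):
        name = name.strip()
        if not name:
            continue  # tolerate empty entries such as a trailing comma
        if name not in NORMALIZERS:
            raise ValueError(f"option not covered by this sketch: {name}")
        text = NORMALIZERS[name](text)
    return text


# "kitab" written with two short-vowel marks and two tatweel characters.
sample = "\u0643\u0650\u062A\u064E\u0640\u0640\u0627\u0628"
cleaned = normalize(sample, "removeDiacritics,removeTatweel")
print(cleaned)
```

The option string passed to `normalize` has the same comma-separated shape as the argument to `-orthoOptions`, for example `removeDiacritics,removeTatweel`.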