Answers to: Need some help with code translation (from Python to C#)



Good evening, everyone.

This question leaves me a little embarrassed because I know I should be able to find the answer on my own. However, my knowledge of Python is barely more than nothing, so I need help from someone more experienced with it than I am.

The following code is from the <a href="http://norvig.com/ngrams" rel="nofollow">Norvig's "Natural Language Corpus Data"</a> chapter of a recently published book. It is about transforming a string such as "likethisone" into "[like, this, one]" (that is, segmenting the words correctly).

I have ported all of the code to C# (in fact, I rewrote the program myself) except for the function <code>segment</code>, whose syntax I have a lot of trouble even understanding. Can someone please help me translate it into a more readable form in C#?

Thank you very much in advance.

<pre><code>################ Word Segmentation (p. 223)

@memo
def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
    candidates = ([first]+segment(rem) for first,rem in splits(text))
    return max(candidates, key=Pwords)

def splits(text, L=20):
    "Return a list of all possible (first, rem) pairs, len(first)<=L."
    return [(text[:i+1], text[i+1:])
            for i in range(min(len(text), L))]

def Pwords(words):
    "The Naive Bayes probability of a sequence of words."
    return product(Pw(w) for w in words)

#### Support functions (p. 224)

def product(nums):
    "Return the product of a sequence of numbers."
    return reduce(operator.mul, nums, 1)

class Pdist(dict):
    "A probability distribution estimated from counts in datafile."
    def __init__(self, data=[], N=None, missingfn=None):
        for key,count in data:
            self[key] = self.get(key, 0) + int(count)
        self.N = float(N or sum(self.itervalues()))
        self.missingfn = missingfn or (lambda k, N: 1./N)
    def __call__(self, key):
        if key in self: return self[key]/self.N
        else: return self.missingfn(key, self.N)

def datafile(name, sep='\t'):
    "Read key,value pairs from file."
    for line in file(name):
        yield line.split(sep)

def avoid_long_words(key, N):
    "Estimate the probability of an unknown word."
    return 10./(N * 10**len(key))

N = 1024908267229 ## Number of tokens

Pw = Pdist(datafile('count_1w.txt'), N, avoid_long_words)
</code></pre>


1 Answer

Anonymous, 1 day ago


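I don't know C# at all, but I can explain how the Python code works.

<pre><code>@memo
def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
    candidates = ([first]+segment(rem) for first,rem in splits(text))
    return max(candidates, key=Pwords)
</code></pre>

The first line, <code>@memo</code>, is a <a href="http://docs.python.org/glossary.html#term-decorator" rel="nofollow">decorator</a>. It causes the function defined on the following lines to be wrapped in another function. Decorators are commonly used to filter inputs and outputs. In this case, judging by its name and the role of the function it wraps, I gather that it <a href="http://en.wikipedia.org/wiki/Memoization" rel="nofollow">memoizes</a> calls to <code>segment</code>, that is, it caches the result for each distinct argument so that it is computed only once.

Next:

<pre><code>def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
</code></pre>

This declares the function proper, gives it a <a href="http://docs.python.org/glossary.html#term-docstring" rel="nofollow">docstring</a>, and sets the termination condition for the recursion: an empty string segments into an empty list.

Next is the most complicated line, and probably the one that gave you trouble:

<pre><code>    candidates = ([first]+segment(rem) for first,rem in splits(text))
</code></pre>

The outer parentheses, combined with the <code>for..in</code> construct, create a <a href="http://docs.python.org/glossary.html#term-generator-expression" rel="nofollow">generator expression</a>. This is an efficient way of iterating over a sequence, which in this case is <code>splits(text)</code>. A generator expression is a sort of compact for-loop that yields values. Here, those values become the elements of the iterable <code>candidates</code>. "Genexps" are similar to <a href="http://en.wikipedia.org/wiki/List_comprehension" rel="nofollow">list comprehensions</a>, but are more memory-efficient because they do not retain every value they produce.

So for each value in the sequence returned by <code>splits(text)</code>, the generator expression produces a list.

Each value from <code>splits(text)</code> is a <code>(first, rem)</code> pair.

Each produced list starts with the object <code>first</code>; this is expressed by putting <code>first</code> inside a list literal, i.e. <code>[first]</code>. Another list is then added to it; that second list is determined by a recursive call to <code>segment</code>. Adding lists in Python concatenates them, i.e. <code>[1, 2] + [3, 4]</code> gives <code>[1, 2, 3, 4]</code>.

Finally, in

<pre><code>    return max(candidates, key=Pwords)
</code></pre>

the recursively built iterable <code>candidates</code> and a key function are passed to <code>max</code>. The key function is called on each list in the iterable, and <code>max</code> returns the list for which the key function gives the highest value. Here the key is <code>Pwords</code>, so the result is the candidate segmentation with the highest probability.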
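The generator expression and the <code>max(..., key=...)</code> call described above can also be unrolled into an explicit loop, which may be closer in shape to what a C# port would look like. The following is only a minimal Python 3 sketch under stated assumptions, not the book's code: <code>memo</code> is a guess at what the decorator does (a plain dictionary cache), <code>TOY_COUNTS</code>, <code>word_prob</code> and <code>sequence_prob</code> are hypothetical stand-ins for <code>count_1w.txt</code>, <code>Pw</code> and <code>Pwords</code>, and the counts are invented for illustration.

```python
import functools
import math

# Hypothetical toy counts standing in for count_1w.txt (invented for illustration).
TOY_COUNTS = {"like": 50, "this": 80, "one": 60, "on": 70, "i": 90, "is": 85}
TOTAL = sum(TOY_COUNTS.values())


def word_prob(word):
    """Stand-in for Pw: relative frequency, with a length penalty for unknown words."""
    if word in TOY_COUNTS:
        return TOY_COUNTS[word] / TOTAL
    return 10.0 / (TOTAL * 10 ** len(word))


def sequence_prob(words):
    """Stand-in for Pwords: the product of the per-word probabilities."""
    return math.prod(word_prob(w) for w in words)


def memo(func):
    """Assumed behaviour of @memo: cache each result, keyed by the argument."""
    cache = {}

    @functools.wraps(func)
    def wrapper(text):
        if text not in cache:
            cache[text] = func(text)
        return cache[text]

    return wrapper


def splits(text, max_len=20):
    """All (first, rem) pairs with 1 <= len(first) <= max_len."""
    return [(text[:i + 1], text[i + 1:]) for i in range(min(len(text), max_len))]


@memo
def segment(text):
    """Best segmentation of text, written as an explicit loop instead of max()."""
    if not text:
        return []
    best, best_score = None, -1.0
    for first, rem in splits(text):
        candidate = [first] + segment(rem)   # list concatenation
        score = sequence_prob(candidate)     # the role of key=Pwords
        if score > best_score:               # keep the first maximum, as max() does
            best, best_score = candidate, score
    return best


print(segment("likethisone"))  # prints ['like', 'this', 'one']
```

Because every suffix of the input is segmented only once thanks to the cache, the loop does not redo work when different split points share the same remainder. In a C# port, the cache could presumably be a dictionary keyed by the remaining string, and the loop body would carry over almost line for line.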
