Unit test design

Asked 2025-04-16 19:50

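A few days ago I wrote some code as part of an interview. The task was: given a body of text, determine whether a query string occurs in it. I stored the text in a hash table (the keys are the words in the text, the values are the positions of those words in the text). Given a query string, I can then look up the positions of the query words that occur in the text and display the snippet of text that contains the most query words. I thought everything was fine.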
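A minimal sketch of the word-to-positions index described above. This is not the code from the interview: the names `build_index` and `find_positions` are hypothetical, and the tokenization (lowercasing, splitting on anything that is not a letter, digit, or apostrophe) is an assumption made for illustration.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each lowercased word to the list of its word positions in the text."""
    index = defaultdict(list)
    # Assumed tokenization: runs of letters, digits and apostrophes are words.
    for position, word in enumerate(re.findall(r"[a-z0-9']+", text.lower())):
        index[word].append(position)
    return index


def find_positions(index, query):
    """Return {query word: positions} for the query words present in the index."""
    return {word: index[word] for word in query.lower().split() if word in index}


index = build_index("Best tacos I have ever had. I ate these tacos with beer.")
print(find_positions(index, "burrito tacos"))  # prints {'tacos': [1, 9]}
```

With the positions in hand, choosing a snippet comes down to finding the window of text that covers the most distinct query words, which is the part the tests then need to pin down.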

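But I was also asked to write unit tests for this code. Although I had never written unit tests before, I know how important they are in the development process. So I created some test cases covering the average case and the edge cases. What I'm not clear about is whether you need to know the correct output in advance when writing test cases.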

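I started by collecting some texts to use as input to the program, together with the corresponding expected output, put them in a class, and later read them as input to my program.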

Here is the unit test code:

import unittest
import random
import generate_random_test
import highlight  # module under test; provides m_Highlight_doc


class c_Known_Output():
    Input_Text1 = '''We ordered the traditional deep dish pizza and a Manchego salad. Were started off with a complimentary bread, that looks like a really big hamburger bun top at first glance. Even though it was free bread, it was soft and slightly sweet and delicious. I liked dipping it in the balsamic reduction and olive oil from the salad plate. The salad dish was perfectly fine, but I wish the Manchego slices on top were somehow sliced a bit thinner. The deep dish traditional pizza came out a bit later (remember the 40 min. cooking time, folks!), piping hot and smelling delicious. At first bite, I wasnt sure how much I liked it.'''

    Output_Text1 = '''I liked dipping it in the balsamic reduction and olive oil from the salad plate. The salad [[HIGHLIGHT]]dish[[ENDHIGHLIGHT]] was perfectly fine, but I wish the Manchego slices on top were somehow sliced a bit thinner. The [[HIGHLIGHT]]deep dish[[ENDHIGHLIGHT]] traditional pizza came out a bit later (remember the 40 min. cooking time, folks!), piping hot and smelling delicious.'''

    Input_Text2 = '''Best tacos I have ever had Lived down the road from this truck for years. Watched almost every episode of BSG eating these tacos with beer. Moved to Az and El Chato is one of the things I miss the most! ANYONE that is around them, you have to go here.'''

    Output_Text2 = '''Best [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] I have ever had Lived down the road from this truck for years. Watched almost every episode of BSG eating these [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] with beer. Moved to Az and El Chato is one of the things I miss the most!'''

    # Input_Text3, Output_Text3, Input_Text4 and Output_Text4 are used by the tests but are not shown here.

    Query_Not_found = '''Query Not Found'''


class c_myTest( unittest.TestCase ):
    Generator = generate_random_test.TestCaseGenerator()
    KnowOutput = c_Known_Output()

    def testAverageCase1(self):
        """keywords present...highlighted snippet expected"""
        output = highlight.m_Highlight_doc( self.KnowOutput.Input_Text1, 'deep dish')
        print("\nTest Case 1")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text1)

    def testAverageCase2(self):
        output = highlight.m_Highlight_doc( self.KnowOutput.Input_Text2, 'burrito taco take out')
        print("\nTest Case 2")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text2)

    def testSnippetLength(self):
        """ if the search word is present only once in the text...check if the snippet is of optimum length...optimum length is defined as one sentence before
        and after the sentence in which the query word is present"""
        output = highlight.m_Highlight_doc( self.KnowOutput.Input_Text3, 'tacos')
        print("\nTest Case 3")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text3)

    def testSmallText(self):
        """The text is just one sentence, with the query present in it. The same sentence should be the output"""
        output = highlight.m_Highlight_doc( self.KnowOutput.Input_Text4, 'deep dish pizzas')
        print("\nTest Case 4")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text4)

    def testBadInput(self):
        """no keywords present...no highlight"""
        output = highlight.m_Highlight_doc( self.KnowOutput.Input_Text4, 'tacos')
        print("\nTest Case 5")
        print(output)
        self.assertEqual(output, self.KnowOutput.Query_Not_found)

    #Now test with randomly generated text
    def testDistantKeywords(self):
        """the search queries are very distant in the text. 6 query words are generated. First 4 of these queries are inserted in one paragraph and the last two
        queries are inserted in another. The snippet should be of the first paragraph which has the maximum number of query words present in it."""
        query = self.Generator.generateSentence(6, 5)
        text1 = self.Generator.generateTextwithQuery( query[0:4], 10, 10, 5, 3 )
        # the last two query words go into the second paragraph
        text2 = self.Generator.generateTextwithQuery( query[4:], 10, 10, 5, 3 )
        text1.append('\n')
        text1.extend(text2)
        print("\nTest Case 6")
        print("=========================TEXT==================")
        print(' '.join(text1))
        print("========================QUERY==================")
        print(' '.join(query))
        print(" ")
        output_text = highlight.m_Highlight_doc( ' '.join(text1), ' '.join(query))
        print("=======================SNIPPET=================")
        print(output_text)
        print(" ")


if __name__=='__main__':
    unittest.main()

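Obviously I failed, but nobody told me why, and now I'm trying to work out whether something is wrong with this code. Can someone help me find the problems in these unit tests? And how do you write unit tests if you don't know the code's output in advance? For example, can we write unit tests for a random number generator?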

Thanks in advance!

2 Answers

0

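Current software development practice holds that almost all software should be tested. A Google search turns up plenty of answers about testing random number generators. You can look at this Wikipedia page and this Google search result.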

1

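I think that if you know what your code is supposed to do, you can write unit tests for it. In your text-search case, you can certainly identify the expected output for a given set of inputs. Your interviewer was probably more concerned with the coding of the text matcher and the tests than with the principles you used. As for random number generators: yes, you can test one, as long as you keep in mind that a computer only gives you a pseudo-random number generator. Some practical and useful things to test: the generator should produce the same output for the same seed, and its period should not be shorter than the value you specified. You may also care whether a particular seed produces a predetermined sequence; that should be reflected in the test suite and in the documentation.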
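The two generator checks mentioned above might look roughly like this. It is only a sketch: the same-seed test uses Python's standard `random.Random`, while `TinyLcg` and `cycle_length` are hypothetical helpers invented here so that the period check has a small, known answer (this toy generator cycles through all 16 states).

```python
import random
import unittest


class TinyLcg:
    """Hypothetical toy generator x -> (5 * x + 3) % 16, for illustration only."""

    def __init__(self, seed):
        self.state = seed % 16

    def next(self):
        self.state = (5 * self.state + 3) % 16
        return self.state


def cycle_length(generator, limit):
    """Count draws until the starting state reappears; None if not within limit."""
    start = generator.state
    for draws in range(1, limit + 1):
        if generator.next() == start:
            return draws
    return None


class GeneratorTests(unittest.TestCase):
    def test_same_seed_same_sequence(self):
        # Two generators seeded identically must produce identical output.
        a, b = random.Random(42), random.Random(42)
        self.assertEqual(
            [a.random() for _ in range(50)],
            [b.random() for _ in range(50)],
        )

    def test_period_not_shorter_than_documented(self):
        # The documented minimum period of the toy generator is 16 draws.
        self.assertGreaterEqual(cycle_length(TinyLcg(0), limit=100), 16)
```

Saved in a test module, these run with `python -m unittest`. Neither test needs to know a "random" value in advance; each one asserts a property the generator is supposed to have.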

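My approach is to start with the tests and then write code until they all pass (see test-driven development). This not only gives you good test coverage, it also helps you define what the code should do before you write it.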
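As a rough illustration of that test-first order, the expected outputs below are written down before the function exists. `highlight_words` is a hypothetical, much simpler function than the `m_Highlight_doc` in the question (it only marks matching words and does not pick a snippet); the marker strings are borrowed from the question's expected output.

```python
import re
import unittest


class HighlightWordsTest(unittest.TestCase):
    # Step 1: known inputs paired with the output we expect for each.
    CASES = [
        ("Best tacos ever", "tacos",
         "Best [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] ever"),
        ("Best tacos ever", "pizza", "Best tacos ever"),
        ("Tacos and more tacos", "tacos",
         "[[HIGHLIGHT]]Tacos[[ENDHIGHLIGHT]] and more "
         "[[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]]"),
    ]

    def test_known_outputs(self):
        for text, query, expected in self.CASES:
            with self.subTest(text=text, query=query):
                self.assertEqual(highlight_words(text, query), expected)


# Step 2: the smallest implementation that makes the tests above pass.
def highlight_words(text, query):
    """Wrap each whole-word, case-insensitive match of a query word in markers."""
    words = [re.escape(word) for word in query.split()]
    if not words:
        return text
    pattern = re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(
        lambda match: "[[HIGHLIGHT]]" + match.group(0) + "[[ENDHIGHLIGHT]]", text
    )
```

Writing the table of cases first forces a decision on details such as case sensitivity and what happens when nothing matches, before any matching code is written.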
