Generate CSV files containing random words from a dictionary
Detailed description of the Python project z3c.gibberish
Get the script from the buildout:
>>> import os
>>> gibberish = os.path.join(
...     reduce(lambda path, _: os.path.dirname(path),
...            range(3), __file__), 'bin', 'gibberish')
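
The `reduce` call above climbs three directory levels up from `__file__` before appending `bin/gibberish`. As a rough sketch of the same idea written as an explicit loop (the helper name `ancestor_dir` and the example path are made up for illustration and are not part of z3c.gibberish):

```python
import os


def ancestor_dir(path, levels):
    """Return the directory `levels` steps above `path` (hypothetical helper)."""
    for _ in range(levels):
        path = os.path.dirname(path)
    return path


# Made-up path standing in for __file__; only the nesting depth matters here.
example = os.path.join('project', 'z3c', 'gibberish', 'README.txt')
root = ancestor_dir(example, 3)
script = os.path.join(root, 'bin', 'gibberish')
```

With this example path, `root` ends up as the top-level `project` directory, which is where a buildout would normally place its `bin` directory.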
Print the help:
>>> from zc.buildout.testing import system
>>> print system(gibberish+' --help'),
usage: gibberish [options] LINES COLUMN [COLUMN ...]
<BLANKLINE>
Generate lines of CSV consisting of random words from a dictionary. The number
of lines of CSV must be specified either as a single integer to specify a
fixed number of lines or two integers separated by a dash to specify that a
random number of lines between the two integers should be used. The columns
are specified in the same manner where the numbers represent the number of
words in that column for a given line.
<BLANKLINE>
options:
  -h, --help            show this help message and exit
  -w WORDS, --words=WORDS
                        File containing the words to be chosen from [default:
                        /usr/share/dict/words]
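
The help text describes two forms for `LINES` and each `COLUMN`: a single integer, or two integers separated by a dash. A minimal sketch of how such a specification could be interpreted follows. The names `parse_spec` and `pick_count` are hypothetical, and treating both bounds as inclusive is an assumption; the script's actual parsing code is not shown in this document.

```python
import random


def parse_spec(spec):
    """Parse 'N' or 'LOW-HIGH' into a (low, high) pair (hypothetical helper)."""
    low, sep, high = spec.partition('-')
    if not sep:
        # A single integer means a fixed count.
        return int(low), int(low)
    return int(low), int(high)


def pick_count(spec, rng=random):
    """Pick a count for `spec`; bounds are assumed to be inclusive."""
    low, high = parse_spec(spec)
    return rng.randint(low, high)
```

Under these assumptions, `pick_count('10')` always yields 10, while `pick_count('1-10')` yields some integer from 1 to 10.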
Make a simple file with one line and one column containing a single word:
>>> import cStringIO, csv
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 1'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split())
1
Make sure the newline is stripped:
>>> result[0][0][-1] != '\n'
True
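
The check above guards against words that keep the trailing newline from the word file. One way a word list might be loaded so that this cannot happen is sketched below. `load_words` is a hypothetical helper, not the function z3c.gibberish actually uses, and skipping blank lines is an added assumption.

```python
import io


def load_words(stream):
    """Return the non-empty lines of `stream` without surrounding whitespace."""
    return [line.strip() for line in stream if line.strip()]


# An in-memory stand-in for a word file; the last line has no newline.
words = load_words(io.StringIO(u'foo\nbar\n\nbaz'))
```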
Two words in the column:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 2'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split())
2
A random number of words in the column:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 1-10'))))
>>> len(result)
1
>>> len(result[0])
1
>>> 1 <= len(result[0][0].split()) <= 10
True
Ten lines:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 10 2'))))
>>> len(result)
10
>>> len(result[0])
1
>>> len(result[0][0].split())
2
A random number of lines:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1-10 2'))))
>>> 1 <= len(result) <= 10
True
>>> len(result[0])
1
>>> len(result[0][0].split())
2
Two columns:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 2 3'))))
>>> len(result)
1
>>> len(result[0])
2
>>> len(result[0][0].split())
2
>>> len(result[0][1].split())
3
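
Each `COLUMN` argument becomes one CSV cell holding that many space-separated words. The sketch below shows how a row of this shape could be produced and read back using only the standard library. `make_row` and the three-word list are illustrative assumptions, and drawing words with replacement via `random.choice` is a guess rather than the script's documented behaviour.

```python
import csv
import io
import random


def make_row(words, column_sizes, rng=random):
    """Build one row; each cell holds `size` randomly chosen words."""
    return [' '.join(rng.choice(words) for _ in range(size))
            for size in column_sizes]


words = ['foo', 'bar', 'baz']  # stand-in for the dictionary file

# Write a single row with a 2-word column and a 3-word column.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(make_row(words, [2, 3]))

# Read it back the same way the tests in this document do.
rows = tuple(csv.reader(io.StringIO(buffer.getvalue(), newline='')))
```

Reading the row back gives one line with two cells, matching the shape checked by `gibberish 1 2 3`.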
A random number of words in the column, where the range includes zero:
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 0-1'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split()) in (0, 1)
True
Use a small dictionary to test exhausting the dictionary:
>>> import tempfile
>>> _, tmp_path = tempfile.mkstemp()
>>> tmp = file(tmp_path, 'w')
>>> tmp.write('foo')
>>> tmp.close()
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' -w %s 1 1' % tmp_path))))
>>> result
(['foo'],)
>>> os.remove(tmp_path)
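
A one-word dictionary is the smallest case in which a word source could run out. How the script selects words internally is not shown in this document, so the following only illustrates the general difference between the two usual strategies in the standard `random` module: drawing with replacement never exhausts the list, while drawing without replacement fails once more words are requested than exist.

```python
import random

words = ['foo']  # a dictionary with a single entry

# Drawing with replacement succeeds however small the list is.
with_replacement = [random.choice(words) for _ in range(3)]

# Drawing without replacement raises ValueError when the request
# is larger than the population.
try:
    random.sample(words, 3)
    exhausted = False
except ValueError:
    exhausted = True
```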