Generate CSV files containing random words from a dictionary

Detailed description of the z3c.gibberish Python project



Get the script from the buildout:

>>> import os
>>> gibberish = os.path.join(
...     reduce(lambda path, _: os.path.dirname(path),
...            range(3), __file__), 'bin', 'gibberish')

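The `reduce` call above applies `os.path.dirname` three times to `__file__` and then appends `bin/gibberish`. As a rough illustration only, the same walk can be written as a plain loop. This sketch targets Python 3, where `reduce` lives in `functools`; the helper name `ancestor` and the directory layout are hypothetical.

```python
import os
from functools import reduce


def ancestor(path, levels):
    """Return the path left after stripping `levels` trailing components."""
    for _ in range(levels):
        path = os.path.dirname(path)
    return path


# Hypothetical layout: root/a/b/README.txt, with the script in root/bin.
readme = os.path.join('root', 'a', 'b', 'README.txt')
script = os.path.join(ancestor(readme, 3), 'bin', 'gibberish')

# The same walk written with reduce, mirroring the doctest above.
same = reduce(lambda path, _: os.path.dirname(path), range(3), readme)
```

Both forms yield the same directory; the loop is arguably easier to read.
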
Print the help:

>>> from zc.buildout.testing import system
>>> print system(gibberish+' --help'),
usage: gibberish [options] LINES COLUMN [COLUMN ...]
<BLANKLINE>
Generate lines of CSV consisting of random words from a
dictionary.  The number of lines of CSV must be specified either
as a single integer to specify a fixed number of lines or two
integers separated by a dash to specify that a random number of
lines between the two integers should be used.  The columns are
specified in the same manner where the numbers represent the
number of words in that column for a given line.
<BLANKLINE>
options:
  -h, --help            show this help message and exit
  -w WORDS, --words=WORDS
                        File containing the words to be chosen
                        from [default: /usr/share/dict/words]

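The help text says that LINES and each COLUMN share one format: a single integer, or two integers separated by a dash. The sketch below illustrates how such a value could be parsed and turned into a count. It is not the tool's actual code: the names `parse_spec` and `pick_count` are hypothetical, and treating both bounds as inclusive is an assumption based on checks such as `1 <= ... <= 10` in this document.

```python
import random


def parse_spec(spec):
    """Turn 'N' or 'N-M' into a (low, high) pair of integers."""
    low, sep, high = spec.partition('-')
    if not sep:
        # A single integer means a fixed count.
        return int(low), int(low)
    return int(low), int(high)


def pick_count(spec, rng=random):
    """Pick a count from the range described by spec (bounds assumed inclusive)."""
    low, high = parse_spec(spec)
    return rng.randint(low, high)
```

With this sketch, `pick_count('10')` always returns 10, while `pick_count('1-10')` returns some integer from 1 through 10.
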
Make a simple file with one line and one column containing one word:

>>> import cStringIO, csv
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 1'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split())
1

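The examples in this document all check the output the same way: parse it with `csv.reader`, then count the whitespace-separated words in each cell. A minimal sketch of that check for Python 3, where `cStringIO` is no longer available and `io.StringIO` takes its place, might look like this. The helper name `word_counts` and the sample text are hypothetical; the sample is not real output from the script.

```python
import csv
import io


def word_counts(csv_text):
    """Return, for every CSV row, the number of words in each column."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[len(cell.split()) for cell in row] for row in reader]


# Hypothetical sample: one line with two columns.
sample = 'alpha beta,gamma delta epsilon\r\n'
counts = word_counts(sample)
```

The outer list has one entry per line and each inner list has one entry per column, which matches the `len(result)`, `len(result[0])` and `split()` checks used throughout.
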
Make sure the newline is stripped:

>>> result[0][0][-1] != '\n'
True

Two words in the column:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 2'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split())
2

A random number of words in the column:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 1-10'))))
>>> len(result)
1
>>> len(result[0])
1
>>> 1 <= len(result[0][0].split()) <= 10
True

Ten lines:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 10 2'))))
>>> len(result)
10
>>> len(result[0])
1
>>> len(result[0][0].split())
2

A random number of lines:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1-10 2'))))
>>> 1 <= len(result) <= 10
True
>>> len(result[0])
1
>>> len(result[0][0].split())
2

Two columns:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 2 3'))))
>>> len(result)
1
>>> len(result[0])
2
>>> len(result[0][0].split())
2
>>> len(result[0][1].split())
3

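To make the line and column semantics concrete, here is a minimal Python 3 sketch of how rows with this shape could be produced. It is an illustration under stated assumptions, not the implementation of z3c.gibberish: the function name `make_csv` is hypothetical, the word list is passed in directly, and drawing words with `random.choice` (that is, with replacement) is an assumption, since this document does not say how words are selected.

```python
import csv
import io
import random


def make_csv(words, line_count, column_counts, rng=random):
    """Build CSV text with line_count rows; column i holds column_counts[i] words."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for _ in range(line_count):
        # Each cell is a space-separated run of randomly chosen words.
        row = [' '.join(rng.choice(words) for _ in range(count))
               for count in column_counts]
        writer.writerow(row)
    return buffer.getvalue()


# One line, two columns: two words in the first, three in the second.
text = make_csv(['foo', 'bar', 'baz'], 1, [2, 3])
```

Parsing `text` back with `csv.reader` gives one row of two cells, mirroring the `gibberish 1 2 3` example above.
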
A random number of words in the column, possibly zero:

>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' 1 0-1'))))
>>> len(result)
1
>>> len(result[0])
1
>>> len(result[0][0].split()) in (0, 1)
True

Use a small dictionary to test exhausting the dictionary:

>>> import tempfile
>>> fd, tmp_path = tempfile.mkstemp()
>>> tmp = os.fdopen(fd, 'w')
>>> tmp.write('foo')
>>> tmp.close()
>>> result = tuple(csv.reader(cStringIO.StringIO(
...     system(gibberish+' -w %s 1 1' % tmp_path))))
>>> result
(['foo'],)
>>> os.remove(tmp_path)

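The one-word dictionary and the earlier newline check both depend on how the word file is read. As a hedged sketch in Python 3, a loader that strips newlines and skips blank lines could look like the following. The name `load_words` is hypothetical, and whether the real script skips blank lines is not stated in this document.

```python
import os
import random
import tempfile


def load_words(path):
    """Read one word per line, dropping newlines and blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Write a one-word dictionary, as in the example above.
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as tmp:
    tmp.write('foo')

try:
    words = load_words(tmp_path)
    # With a single entry, the only possible choice is that entry.
    chosen = random.choice(words)
finally:
    os.remove(tmp_path)
```

Opening the descriptor returned by `mkstemp` with `os.fdopen` avoids leaving it open, and the `finally` clause removes the temporary file even if loading fails.
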