Creating a word generator with yield

[Contents found in words.txt instead of words: a truncated dump of binary-looking data that resembles compiled Python bytecode, with readable fragments such as create_wordlist, lib\attacks\bruteforce\bf_attack.py, colorlog\logging.py and C:\Python27\lib\ctypes\wintypes.py]

import itertools

def word_generator(length_min=6, length_max=12, perms=False):
    chrs = 'abc'
    for n in range(length_min, length_max + 1):
        for xs in itertools.product(chrs, repeat=n):
            yield ''.join(xs)

def create_wordlist(max_words=100000):
    with open("words.txt", "a+") as lib:
        while len(lib.readlines()) <= max_words:
            lib.write(next(word_generator()))

2 answers

User

#1 · Edited 2024-05-15 08:37:19

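I can only guess at the problem, but judging from your code, here are some possibilities:

Your text editor or shell may be set to an encoding that is not compatible with ASCII.

If you open the file in a text editor, check the editor's encoding. If you read the file in a shell, check the encoding of the shell you are using.

If you are using Python 2.X and have not changed the system default encoding, the strings will be written to the file as ASCII. Python 3.X is slightly different: open accepts an explicit encoding, as in open('...', 'a+', encoding='utf-8'). So if you are on 3.X, try specifying the file's encoding in open and see what happens.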
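The explicit-encoding suggestion can be sketched as follows, assuming Python 3. The temporary directory and the file name words_utf8.txt are placeholders chosen for this illustration, not part of the original question.

```python
import os
import tempfile

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "words_utf8.txt")

# Write with an explicit encoding instead of relying on the platform default.
with open(path, "a+", encoding="utf-8") as lib:
    lib.write("abc\n")

# Read back with the same encoding so the bytes are decoded consistently.
with open(path, encoding="utf-8") as lib:
    content = lib.read()

print(repr(content))
```

If the file still looks wrong after this, the encoding is probably not the cause.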

User

#2 · Edited 2024-05-15 08:37:19

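First of all, when I ran your code I got something completely different from what you posted. The program went into an infinite loop, writing 'a' characters to the 'words.txt' file. I don't know what caused the weird string, but I can see three problems in your code.

Your word_generator looks fine. The problems are in create_wordlist.

Problem 1: The expression next(word_generator()) does not get the next element of an existing sequence; it creates a new sequence and then takes its next element. Because the sequence is brand new, its next element is its first element, which here is aaaaaa. Instead of creating a new sequence on every iteration, create it once and call next on it repeatedly. For example: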

wgen = word_generator()
while some_condition:
    lib.write(next(wgen))
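The difference between the two call patterns can be checked without touching any file. The sketch below reuses the word_generator from the question; the names fresh and reused are introduced only for this illustration.

```python
import itertools

def word_generator(length_min=6, length_max=12, perms=False):
    # Same generator as in the question.
    chrs = 'abc'
    for n in range(length_min, length_max + 1):
        for xs in itertools.product(chrs, repeat=n):
            yield ''.join(xs)

# A new generator on every call: next() always returns the first word.
fresh = [next(word_generator()) for _ in range(3)]

# One generator created once and reused: next() advances through the words.
wgen = word_generator()
reused = [next(wgen) for _ in range(3)]

print(fresh)   # ['aaaaaa', 'aaaaaa', 'aaaaaa']
print(reused)  # ['aaaaaa', 'aaaaab', 'aaaaac']
```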

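Problem 2: Since you count words by the size of lib.readlines(), I assume you want the file to contain one word per line. That is not what the line lib.write(next(word_generator())) does, because no '\n' character is written. If you want one word per line, either add a lib.write('\n') call to your code or append '\n' to the word: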

wgen = word_generator()
while some_condition:
    lib.write(next(wgen) + '\n')
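To see why the newline matters for counting, here is a small sketch that uses an in-memory io.StringIO in place of the real file (an assumption made only to keep the example self-contained). The three sample words are the first three the generator would produce.

```python
import io

words = ["aaaaaa", "aaaaab", "aaaaac"]

# Without '\n', all words run together into a single line.
no_newline = io.StringIO()
for w in words:
    no_newline.write(w)
no_newline.seek(0)
lines_without = no_newline.readlines()

# With '\n' appended, readlines() returns one entry per word.
with_newline = io.StringIO()
for w in words:
    with_newline.write(w + "\n")
with_newline.seek(0)
lines_with = with_newline.readlines()

print(len(lines_without))  # 1
print(len(lines_with))     # 3
```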

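Problem 3: When you open 'words.txt' in 'a+' mode, the stream position is set to the end of the file, and subsequent calls to lib.write() keep it there. A call to lib.readlines() therefore reads from the end of the file and always returns an empty list. That makes while len(lib.readlines()) <= max_words: an infinite loop.

To solve this, either find another way to count the words in the file, or move to the beginning of the file with lib.seek(0, 0) before calling lib.readlines() (see the documentation on seek).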
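The effect of the stream position can be observed directly. This is a minimal sketch assuming Python 3, where a file opened in 'a+' mode starts positioned at its end; the temporary path is a placeholder for this illustration.

```python
import os
import tempfile

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "words.txt")

# Prepare a file that already contains two words.
with open(path, "w") as f:
    f.write("aaaaaa\naaaaab\n")

with open(path, "a+") as lib:
    at_end = lib.readlines()      # position is at the end: nothing left to read
    lib.seek(0, 0)                # move back to the beginning of the file
    from_start = lib.readlines()  # now both lines are returned

print(at_end)      # []
print(from_start)  # ['aaaaaa\n', 'aaaaab\n']
```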

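Reading every line of the file on each iteration is very inefficient, so the solution below takes a different approach and counts the initial number of lines only once: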

def create_wordlist(max_words=100000):
    with open("words.txt", "a+") as lib:
        wgen = word_generator() # Creates the sequence of words

        lib.seek(0, 0)  # Goes to the beginning of the file
        line_count = len(lib.readlines())   # Counts how many lines the file has

        # lib.readlines() set the stream position to the end,
        #   so now following 'lib.write()' calls will write to the end as expected.

        # For each missing line before reaching 'max_words' lines
        for i in range(line_count, max_words):
            lib.write(next(wgen) + '\n')    # Writes the next word in the sequence
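For completeness, here is a self-contained variant that can be run as-is under Python 3. It is a sketch, not the answer's exact code: the path parameter, the use of itertools.islice, the explicit seek to the end before writing, and the temporary file are all additions made for this illustration.

```python
import itertools
import os
import tempfile

def word_generator(length_min=6, length_max=12, perms=False):
    # Same generator as in the question.
    chrs = 'abc'
    for n in range(length_min, length_max + 1):
        for xs in itertools.product(chrs, repeat=n):
            yield ''.join(xs)

def create_wordlist(path, max_words=100000):
    # 'path' is an added parameter so the example does not touch a fixed file.
    with open(path, "a+") as lib:
        lib.seek(0, 0)                       # read from the beginning
        line_count = len(lib.readlines())    # count existing lines once
        missing = max(0, max_words - line_count)
        lib.seek(0, 2)                       # explicitly return to the end before writing
        # Take only as many words from the generator as are still missing.
        for word in itertools.islice(word_generator(), missing):
            lib.write(word + "\n")

path = os.path.join(tempfile.mkdtemp(), "words.txt")
create_wordlist(path, max_words=5)
create_wordlist(path, max_words=5)  # file is already full: nothing is added

with open(path) as f:
    lines = f.read().splitlines()

print(lines)  # ['aaaaaa', 'aaaaab', 'aaaaac', 'aaaaba', 'aaaabb']
```

Note that each call starts the generator from the first word again, so topping up a partially filled file would repeat words already present; that is also true of the version above.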
