How do I search a file for all text that starts with a specific format and move each match onto a new line

1:16 And God made two great lights; the greater light to rule the day, and the lesser light to rule the night: he made the stars also. 1:17 And God set them in the firmament of the heaven to give light upon the earth, 1:18 And to rule over the day and over the night, and to divide the light from the darkness: and God saw that it was good.

3 Answers

User

Answer 1 · edited 2024-06-08 07:34:19

Let's look at the fragment around 9:3:

stand before the children of Anak! 9:3 Understand therefore this day,

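If you search for children of Anak, the code you posted (assuming the regex can be fixed) would return 9:3, even though the answer should be 9:2. So we need to rethink how to approach the problem.

I suggest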

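import re
contents = book.read()  # book: the already-open file object for the Bible text
parts = re.split(r'(\d+:\d+)', contents)  # keep the result; the capturing group retains the markers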

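This splits the whole text at the chapter:verse numbers. Because the pattern is a capturing group, the markers themselves stay in the resulting list, each one followed by its verse text.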
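As a rough illustration of the split-based idea (a sketch only, not the original test.py; the helper names `split_verses` and `search` and the abbreviated sample text are assumptions made here), each chapter:verse marker can be paired with the text that follows it, so a hit is reported under the verse it actually belongs to:

```python
import re


def split_verses(contents):
    """Pair every chapter:verse marker with the text that follows it."""
    # With a capturing group, re.split returns
    # [text_before_first_marker, marker, text, marker, text, ...].
    parts = re.split(r'(\d+:\d+)', contents)
    verses = []
    for i in range(1, len(parts), 2):
        # Collapse line wraps so a phrase broken across lines still matches.
        text = " ".join(parts[i + 1].split())
        verses.append((parts[i], text))
    return verses


def search(contents, phrase):
    """Return (reference, verse text) for every verse containing phrase."""
    return [(ref, text) for ref, text in split_verses(contents) if phrase in text]


# Abbreviated sample built around the 9:3 fragment (hypothetical test data).
sample = ("9:2 Who can\nstand before the children of Anak! "
          "9:3 Understand therefore this day,")
print(search(sample, "children of Anak"))
```

With this sample the phrase is reported under 9:2 rather than 9:3, because the text is attributed to the marker that precedes it.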


Running test.py on "consuming fire" produces:

% test.py 
                               King James Bible                               
Enter a word to search: consuming fire
Deuteronomy 4:24
For the LORD thy God is a consuming fire, even a jealous God.

Deuteronomy 9:3
Understand therefore this day, that the LORD thy God is he which goeth over
before thee; as a consuming fire he shall destroy them, and he shall bring
them down before thy face: so shalt thou drive them out, and destroy them
quickly, as the LORD hath said unto thee.

Hebrews 12:29
For our God is a consuming fire.

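Hard-coding the first_line numbers of the books is fragile; don't use them. (What happens if someone decides to strip the header text that ships with the Gutenberg file, or accidentally inserts a few blank lines somewhere, and so on?)

All you really need is the order of the books, because each new book starts with chapter_verse 1:1.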
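A minimal sketch of that idea follows. The function name `label_verses`, the `book_names` list, and the shortened sample text are assumptions for illustration. It also assumes that "1:1" appears only as a verse marker and that the list holds at least as many names as there are books in the text:

```python
import re


def label_verses(contents, book_names):
    """Attach a book name to each verse by counting 1:1 markers."""
    parts = re.split(r'(\d+:\d+)', contents)
    book_index = -1          # no book has started yet
    labeled = []
    for i in range(1, len(parts), 2):
        ref = parts[i]
        if ref == "1:1":     # every new book opens with chapter 1, verse 1
            book_index += 1
        if book_index < 0:   # marker-like text before the first book: skip it
            continue
        verse = " ".join(parts[i + 1].split())
        labeled.append((book_names[book_index], ref, verse))
    return labeled


# Shortened, hypothetical sample: header text followed by two books.
sample = ("Title page text\n"
          "1:1 In the beginning God created the heaven and the earth. "
          "1:2 And the earth was without form, and void;\n"
          "1:1 Now these are the names of the children of Israel,")
for book, ref, verse in label_verses(sample, ["Genesis", "Exodus"]):
    print(book, ref)
```

Since nothing depends on line numbers, removing the header or adding blank lines does not change which book a verse is assigned to.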

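User

Answer 2 · edited 2024-06-08 07:34:19

This is a fairly complicated problem. Since I don't know Python, here is a Perl
solution, one of possibly many regex solutions. This is what I came up with
in 5 minutes; I'm sure it could be refactored to be more efficient, but you should
get the idea.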

use strict;
use warnings;

my $str = '
1:16 And God made two great lights; the greater light to rule the day,
and the lesser light to rule the night: he made the stars also.

1:17 And God set them in the firmament of the heaven to give light
upon the earth, 1:18 And to rule over the day and over the night, and
to divide the light from the darkness: and God saw that it was good.
';

my $word_search = 'God';

while ( $str =~ /

  (?:^|\s)
  (\d+) : (\d+)    # group 1,2
  (?:\s|$)
  (                # group 3
    (?:
        (?!
           \s+ \d+ : \d+ (?:\s|$)
        )
        .
    )*
    $word_search
    (?:
       (?!
          \s+ \d+ : \d+ (?:\s|$)
       )
       .
    )*
  )

/xsg )

{
  print "\nChapter $1, Verse $2\n";
  print "Verse: $3\n";
}

__END__

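Output:

Chapter 1, Verse 16
Verse: And God made two great lights; the greater light to rule the day,
and the lesser light to rule the night: he made the stars also.

Chapter 1, Verse 17
Verse: And God set them in the firmament of the heaven to give light
upon the earth,

Chapter 1, Verse 18
Verse: And to rule over the day and over the night, and
to divide the light from the darkness: and God saw that it was good.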

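Edit: compressed, it looks like this: /(?:^|\s)(\d+):(\d+)(?:\s|$)((?:(?!\s+\d+:\d+(?:\s|$)).)*$word_search(?:(?!\s+\d+:\d+(?:\s|$)).)*)/sg

The flags (/sg) are "single line" (the dot also matches newlines) and "global" (find every match).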
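Since the question was about Python, here is a hedged port of the compressed pattern. The function name `find_verses` is an assumption, and `re.escape` is an addition not present in the Perl version (it makes the search term match literally). `re.DOTALL` plays the role of /s and `finditer` the role of /g:

```python
import re


def find_verses(text, word):
    """Return (chapter, verse, verse text) for each verse containing word."""
    # Lookahead target: the start of the next chapter:verse marker.
    marker = r"\s+\d+:\d+(?:\s|$)"
    # Any run of characters that does not run into the next marker.
    body = rf"(?:(?!{marker}).)*"
    pattern = re.compile(
        rf"(?:^|\s)(\d+):(\d+)(?:\s|$)({body}{re.escape(word)}{body})",
        re.DOTALL,  # like Perl's /s: '.' also matches newlines
    )
    return [(int(m.group(1)), int(m.group(2)), m.group(3).strip())
            for m in pattern.finditer(text)]


text = """
1:16 And God made two great lights; the greater light to rule the day,
and the lesser light to rule the night: he made the stars also.

1:17 And God set them in the firmament of the heaven to give light
upon the earth, 1:18 And to rule over the day and over the night, and
to divide the light from the darkness: and God saw that it was good.
"""

for chapter, verse, body_text in find_verses(text, "God"):
    print(f"Chapter {chapter}, Verse {verse}")
    print(f"Verse: {body_text}\n")
```

Searching for a word that occurs in only one verse, such as "firmament", returns just that verse, because the lookahead stops each match before the next marker.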

User

Answer 3 · edited 2024-06-08 07:34:19

Try changing the regex to:

^(\d+):(\d+)

The ^ should anchor the match to the start of the text.
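One caveat worth checking in Python: by default ^ matches only at the very start of the string, and with `re.MULTILINE` it matches at the start of every line. The short sketch below uses a trimmed version of the sample text (the variable names are assumptions) to show both behaviours. Note that a marker in the middle of a line, like 1:18 here, is found by neither:

```python
import re

# Trimmed sample: 1:16 and 1:17 start a line, 1:18 sits mid-line.
text = ("1:16 And God made two great lights;\n"
        "1:17 And God set them in the firmament of the heaven to give light\n"
        "upon the earth, 1:18 And to rule over the day")

# Default: ^ anchors only at the start of the whole string.
start_only = re.findall(r'^(\d+):(\d+)', text)

# re.MULTILINE: ^ anchors at the start of each line.
per_line = re.findall(r'^(\d+):(\d+)', text, flags=re.MULTILINE)

print(start_only)
print(per_line)
```

So the anchored pattern is only sufficient when every verse begins on its own line.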

