Replacing a passage of text in Python knowing only its start and end words

0 votes

3 answers

3405 views

Asked 2025-04-16 12:15

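In Python, is it possible to cut a passage out of a document when you know only the words it starts and ends with?

For example, taking the Bill of Rights as a sample document, I would like to find "Amendment 3" and delete all the text up to "Amendment 4", without knowing or caring what lies between the two.

I ask because I want to use this Python script to modify my other Python programs when I upload them to a client's computer, removing the code between the comments "#chop-begin" and "#chop-end". I don't want the client to have access to every feature before paying for the better version of the code.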

Tags: text processing, automation scripts, code protection, keyword search, document editing, comment management, text cutting

3 Answers

Here is a file named data.txt:

do_something_public()

#chop-begin abcd
get_rid_of_me() #chop-end

#chop-beginner this should stay!

#chop-begin
do_something_private()
#chop-end   The rest of this comment should go too!

but_you_need_me()  #chop-begin  
last_to_go()
#chop-end

And here is some code:

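import re

class Chopper:
    # The default markers are assembled from two pieces, presumably so that
    # this script never contains them literally and can safely be run on itself.
    def __init__(self, start=r'\s*#ch' + r'op-begin\b', end=r'#ch' + r'op-end\b.*?$'):
        # DOTALL lets "." span newlines; MULTILINE lets "$" match at each line end,
        # so the rest of the line after the end marker is removed as well.
        self.re = re.compile('{0}.*?{1}'.format(start, end), flags=re.DOTALL | re.MULTILINE)

    def chop(self, s):
        # Remove every marked region from the string s.
        return self.re.sub('', s)

    def chopFile(self, infname, outfname=None):
        # With no output name given, the input file is overwritten in place.
        if outfname is None:
            outfname = infname

        with open(infname) as inf:
            data = inf.read()

        with open(outfname, 'w') as outf:
            outf.write(self.chop(data))

ch = Chopper()
ch.chopFile('data.txt')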

After running this code, data.txt contains:

do_something_public()

#chop-beginner this should stay!

but_you_need_me()
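
The \b after the begin marker in the pattern above is what lets the "#chop-beginner" line survive. A minimal sketch of that difference follows; the sample string is made up for illustration and is not the data.txt shown above.

```python
import re

# Hypothetical sample text, not the data.txt from the answer.
sample = "#chop-beginner keep me\n#chop-begin\nsecret()\n#chop-end\n"

# Without \b, "#chop-begin" also matches the start of "#chop-beginner".
loose = re.compile(r'#chop-begin.*?#chop-end', re.DOTALL)
# With \b, each marker must end at a word boundary.
strict = re.compile(r'#chop-begin\b.*?#chop-end\b', re.DOTALL)

loose_result = loose.sub('', sample)
strict_result = strict.sub('', sample)

print(repr(loose_result))   # the "keep me" line is lost
print(repr(strict_result))  # the "keep me" line is preserved
```

Without the word boundary, the match starts inside "#chop-beginner" and swallows the line that was meant to stay.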

Answered 2025-04-16 by Python大师


Use a regular expression:

import re

string = re.sub('#chop-begin.*?#chop-end', '', string, flags=re.DOTALL)

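The expression .*? matches any run of characters non-greedily, so each match stops at the first #chop-end that follows a #chop-begin. The re.DOTALL flag lets . match newlines as well, so a marked region can span several lines.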
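
To see why both the non-greedy quantifier and re.DOTALL matter here, the sketch below runs three variants of the pattern over the same text. The sample text and variable names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical text: two marked regions with public code between them.
text = "a()\n#chop-begin\nx()\n#chop-end\nb()\n#chop-begin\ny()\n#chop-end\nc()\n"

# Greedy: .* runs to the LAST #chop-end, so b() is swallowed too.
greedy = re.sub(r'#chop-begin.*#chop-end', '', text, flags=re.DOTALL)

# Non-greedy: .*? stops at the FIRST #chop-end after each #chop-begin.
lazy = re.sub(r'#chop-begin.*?#chop-end', '', text, flags=re.DOTALL)

# No DOTALL: "." cannot cross a newline, so multi-line regions never match.
no_dotall = re.sub(r'#chop-begin.*?#chop-end', '', text)

print(repr(greedy))
print(repr(lazy))
print(no_dotall == text)
```

Only the non-greedy pattern with re.DOTALL removes each region separately and keeps b() in place.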

Answered 2025-04-16 by Python大师


You can use Python's re module.

I wrote a sample script that removes the marked sections of code from a file:

import re

# Create regular expression pattern; re.DOTALL lets "." match newlines too
chop = re.compile('#chop-begin.*?#chop-end', re.DOTALL)

# Read the whole file
with open('data', 'r') as f:
    data = f.read()

# Chop text between #chop-begin and #chop-end
data_chopped = chop.sub('', data)

# Save result, overwriting the original file
with open('data', 'w') as f:
    f.write(data_chopped)
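
Since the script above overwrites its input, one possible variation for the use case in the question is to write the chopped text to a separate file, so that the full version stays untouched. The following is only a sketch under assumptions: chop_copy, full.py and client.py are hypothetical names, and the \b boundaries are an addition not present in the script above.

```python
import re
import tempfile
from pathlib import Path

# Same markers as above, with \b added so "#chop-beginner" would not match.
CHOP = re.compile(r'#chop-begin\b.*?#chop-end\b', re.DOTALL)

def chop_copy(src, dst):
    """Write a chopped copy of src to dst; return how many regions were removed."""
    text = Path(src).read_text()
    chopped, count = CHOP.subn('', text)  # subn also reports the number of substitutions
    Path(dst).write_text(chopped)
    return count

# Demo inside a temporary directory so that no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'full.py'      # hypothetical full version
    dst = Path(tmp) / 'client.py'    # hypothetical client version
    src.write_text("public()\n#chop-begin\nprivate()\n#chop-end\nalso_public()\n")
    removed = chop_copy(src, dst)
    result = dst.read_text()

print(removed, repr(result))
```

Returning the count from re.subn gives a cheap sanity check: if it comes back as 0 for a file that should contain marked regions, the markers were probably misspelled.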

Answered 2025-04-16 by Python大师

