Python: check a file's format based on 3 values, then perform tasks

with open('CrimeReport20150518.txt', 'r') as f:
    content = f.read()

print(content.index('**UPDATES**'))
print(content.index('**INCIDENTS**'))
print(content.index('**ARRESTS**'))

updatesLine = content.index('**UPDATES**')
incidentsLine = content.index('**INCIDENTS**')
arrestsLine = content.index('**ARRESTS**')

#print content[updatesLine:incidentsLine]
updates = content[updatesLine:incidentsLine]
#print updates
incidents = content[incidentsLine:arrestsLine]
#print incidents
arrests = content[arrestsLine:]
print(arrests)

2 answers

User

Answer 1 · edited 2024-04-20 06:28:31

You are currently using .index() to locate the headings in the text. The documentation states:

Like find(), but raise ValueError when the substring is not found.

This means you need to catch the exception in order to handle a missing heading. For example:

try:
    updatesLine = content.index('**UPDATES**')
    print("Found updates heading at %d" % updatesLine)
except ValueError:
    # .index() raises ValueError when the heading is absent
    print("Note: no updates")
    updatesLine = -1

From here, you can work out the correct indices for slicing the string based on which sections are present.
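To apply the same try/except handling to every heading, the lookup can be wrapped in a small helper. This is only a minimal sketch, assuming Python 3; `heading_position` and the sample text are hypothetical and not part of the original code:

```python
def heading_position(content, heading):
    """Return the offset of heading in content, or None if it is absent."""
    try:
        return content.index(heading)
    except ValueError:
        # str.index raises ValueError when the substring is missing
        return None


# Hypothetical sample report with no **INCIDENTS** section
sample = "**UPDATES**\nnothing new\n**ARRESTS**\none arrest\n"

positions = {
    name: heading_position(sample, name)
    for name in ("**UPDATES**", "**INCIDENTS**", "**ARRESTS**")
}
print(positions)
```

Returning None rather than -1 is a design choice here: -1 is a valid slice index in Python, so passing it into a slice by accident would fail silently.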

Alternatively, you can use the .find() method, which the documentation for .index() refers to:

Return -1 if sub is not found.

With find you can test the value it returns:

updatesLine = content.find('**UPDATES**')
incidentsLine = content.find('**INCIDENTS**')
arrestsLine = content.find('**ARRESTS**')
# the following is straightforward, but unwieldy
if updatesLine != -1:
    if incidentsLine != -1:
        updates = content[updatesLine:incidentsLine]
    elif arrestsLine != -1:
        updates = content[updatesLine:arrestsLine]
    else:
        updates = content[updatesLine:]

Either way, you have to handle every combination of present and missing sections to determine the correct slice boundaries.
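One way to avoid spelling out every combination by hand is to collect the offsets of the headings that are actually present, sort them, and let each section run until the next heading found. The following is a sketch under those assumptions, written for Python 3; `split_sections`, `HEADINGS`, and the sample text are hypothetical names introduced for illustration:

```python
HEADINGS = ("**UPDATES**", "**INCIDENTS**", "**ARRESTS**")


def split_sections(content, headings=HEADINGS):
    """Map each heading that is present to the text of its section."""
    # Pair every heading with its offset, then keep only the ones found
    found = [(content.find(h), h) for h in headings]
    found = sorted((pos, h) for pos, h in found if pos != -1)

    sections = {}
    for i, (pos, heading) in enumerate(found):
        # A section ends where the next found heading starts,
        # or at the end of the text for the last one
        end = found[i + 1][0] if i + 1 < len(found) else len(content)
        sections[heading] = content[pos:end]
    return sections


# Hypothetical sample report with no **INCIDENTS** section
sample = "**UPDATES**\nnothing new\n**ARRESTS**\none arrest\n"
sections = split_sections(sample)
print(sorted(sections))
```

As in the original slices, each section's text still includes its own heading line. Sorting by offset also means this sketch does not rely on the sections appearing in a fixed order.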

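I prefer to handle this with a state machine. Read the file line by line and append each line to the appropriate list. When a header is found, update the state. Here is an untested demonstration of the principle: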

data = {
    'updates': [],
    'incidents': [],
    'arrests': [],
    }

state = None
with open('CrimeReport20150518.txt', 'r') as f:
    for line in f:
        # lines read from a file keep their trailing newline, so strip before comparing
        header = line.strip()
        if header == '**UPDATES**':
            state = 'updates'
        elif header == '**INCIDENTS**':
            state = 'incidents'
        elif header == '**ARRESTS**':
            state = 'arrests'
        else:
            if state is None:
                print("Warn: no header seen; skipping line")
            else:
                data[state].append(line)

# join is a method of the separator string, not of the list
print(''.join(data['arrests']))
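The same state machine can be packaged as a function that accepts any iterable of lines, which makes it possible to try it on in-memory text without the real report file. This is a sketch only, assuming Python 3; `parse_report`, `HEADERS`, and the sample text are hypothetical names, and unlike the version above it skips lines before the first header without printing a warning:

```python
import io

# Hypothetical mapping from header line to the state it switches to
HEADERS = {
    "**UPDATES**": "updates",
    "**INCIDENTS**": "incidents",
    "**ARRESTS**": "arrests",
}


def parse_report(lines):
    """Group lines under the most recently seen header."""
    data = {name: [] for name in HEADERS.values()}
    state = None
    for line in lines:
        key = line.strip()
        if key in HEADERS:
            state = HEADERS[key]      # header line: switch state
        elif state is not None:
            data[state].append(line)  # body line: file under current state
        # lines before any header are silently skipped
    return data


sample = io.StringIO(
    "preamble\n**UPDATES**\nnothing new\n**ARRESTS**\none arrest\n"
)
parsed = parse_report(sample)
print(parsed)
```

Because the function only iterates over its argument, an open file object can be passed in place of the `io.StringIO` sample.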

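User

Answer 2 · edited 2024-04-20 06:28:31

Try using content.find() instead of content.index(). When the string is not present it does not raise an exception; it returns -1 instead. You can then do something like this: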

updatesLine = content.find('**UPDATES**')
incidentsLine = content.find('**INCIDENTS**')
arrestsLine = content.find('**ARRESTS**')

if incidentsLine != -1 and arrestsLine != -1:
    # Both sections present: do what you normally do
    updates = content[updatesLine:incidentsLine]
    incidents = content[incidentsLine:arrestsLine]
    arrests = content[arrestsLine:]

elif incidentsLine != -1:
    # Do whatever you need to do to files that don't have an arrests section here
    pass

elif arrestsLine != -1:
    # Handle files that don't have an incidents section here
    pass

else:
    # Handle files that are missing both
    pass

You will probably need to handle each of the four possible combinations slightly differently.
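If the four cases do need separate treatment, one option is to key the decision on a pair of booleans so that every combination is listed explicitly. A small sketch, assuming Python 3; `classify` and its labels are hypothetical stand-ins for whatever handling each case really needs:

```python
def classify(content):
    """Describe which of the two optional sections a report contains."""
    has_incidents = content.find("**INCIDENTS**") != -1
    has_arrests = content.find("**ARRESTS**") != -1

    # One entry per combination, so a missing case is easy to spot
    labels = {
        (True, True): "incidents and arrests",
        (True, False): "incidents only",
        (False, True): "arrests only",
        (False, False): "neither",
    }
    return labels[(has_incidents, has_arrests)]


print(classify("**UPDATES**\n...\n**ARRESTS**\n...\n"))
```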

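Your solution looks generally fine to me, as long as the sections always appear in the same order and the files do not get too large. You can get real feedback at Code Review on Stack Exchange: https://codereview.stackexchange.com/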
