How do I check for a list containing tab characters in Python?

Posted 2024-04-20 12:23:32


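I have a data.csv file with the content below, and there are some extra newlines at the end of the file. I want to read this file and get the values of specific columns from the last row.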

Connecting to the ControlService endpoint

Found 3 rows.
Requests List:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Client ID                                                                   | Client Type                  | Service Type | Status               | Trust Domain              | Data Instance Name | Data Version | Creation Time              | Last Update                | Scheduled Time | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866                              | ROUTINGTIER_ARTIFACTS | SYSTEM       | COMPLETED            | RRA Bulk Client    | soa_server1       | 18.2.2.0.0  | 2016-06-14 03:49:55 -07:00 | 2016-06-14 03:49:57 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 500333443                                                          | CREATE                        | [FA_GSI]     | COMPLETED            | holder       | soa_server1       | 18.3.2.0.0  | 2018-08-07 11:59:57 -07:00 | 2018-08-07 12:04:37 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 500333446                                                          | CREATE                        | [FA_GSI]     | COMPLETED            | holder-test  | soa_server1       | 18.3.2.0.0  | 2018-08-07 12:04:48 -07:00 | 2018-08-07 12:08:52 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

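I want to parse the file above and extract values from its last row. Specifically, I want the values of the "Client ID" and "Trust Domain" columns in the last row, i.e.: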

Client ID: 500333446
Trust Domain: holder-test

I came up with the Python script below, but it fails because of the newlines at the end of the CSV file. If the CSV file has no trailing newlines, it works fine.

import csv

lines_to_skip = 4
with open('data.csv', 'r') as f:
    reader = csv.reader(f, delimiter='|')
    for i in range(lines_to_skip):
        next(reader)

    data = []
    for line in reader:
        if line[0].find("---") != 0:
            print(line)
            data.append(line)

print("{}={}".format(data[-1][0].replace(" ",""),data[-1][4].replace(" ","")))

If the CSV file has some newlines at the end, I get this error at the if line:

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    if line[0].find("---") != 0:
IndexError: list index out of range

This is the last row that gets printed:

[' \t\t']

3 answers

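If there are blank lines at the end of the file, csv.reader yields empty rows (empty lists) for them, so you have to write code to handle that. If you evaluate line[0] for every row, including the empty ones, you get exactly the exception you are seeing.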

All you need to do is check whether line is empty before you try to inspect line[0]:

if line:
    if line[0].find(" -") != 0:

...or, more concisely:

if line and line[0].find(" -") != 0:
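To see why the guard is needed, here is a minimal sketch that feeds csv.reader a small in-memory sample instead of data.csv. The two-column sample and the names SAMPLE, rows and data are made up for illustration, and the separator test uses the "---" prefix from the question's script. Note that this guard only covers completely blank lines; a row such as [' \t\t'] still passes it and needs an extra whitespace check.

```python
import csv
import io

# Hypothetical two-column sample with trailing blank lines, standing in for data.csv.
SAMPLE = (
    "-----\n"
    " 500333446 | holder-test | \n"
    "-----\n"
    "\n"
    "\n"
)

rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="|"))

# Completely blank lines come back as empty lists, so line[0] would raise IndexError.
print(rows[-1])

data = []
for line in rows:
    # `line and ...` short-circuits: line[0] is only evaluated for non-empty rows.
    if line and line[0].find("---") != 0:
        data.append(line)

print("{}={}".format(data[-1][0].strip(), data[-1][1].strip()))
```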

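You could try splitting every line that contains | into items, building a list of dictionaries from them, and printing only the Client ID and Trust Domain of the last row: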

with open('data.txt') as f:

    # collect rows of interest
    rows = []
    for line in f:
        if '|' in line:
            items = [item.strip() for item in line.split('|')]
            rows.append(items)

    # first item will be headers
    headers = rows[0]

    # put each row into dictionary
    data = [dict(zip(headers, row)) for row in rows[1:]]

    # print out last row information of interest
    print('Client ID:', data[-1]['Client ID'])
    print('Trust Domain:', data[-1]['Trust Domain'])

Output:

Client ID: 500333446
Trust Domain: holder-test

As requested in the comments, if you want to print 500333446=holder-test instead, you can change the final print statements to:

print('%s=%s' % (data[-1]['Client ID'], data[-1]['Trust Domain']))
# 500333446=holder-test

Before processing a row, you should strip any unwanted characters and verify that it is a row you actually want.

What you can do is:

if line and line[0].strip(" \t") and not line[0].startswith(" -"):

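Or, another way, though note that the list is built in full before all() runs, so line[0] is still evaluated for an empty row; this form is only safe once empty rows have already been filtered out: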

if all([line, line[0].strip(" \t"), not line[0].startswith(" -")]):
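  1. if line checks that line is not an empty list, so that step 2 does not raise an error.
  2. line[0].strip(" \t") checks that the first value does not consist solely of unwanted characters.
  3. not line[0].startswith(" -") is the same as your line[0].find(" -") != 0.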
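The three checks can also be wrapped in a small helper, sketched below against an in-memory sample that ends with a whitespace-only line and a blank line. The function name is_data_row and the two-column sample are hypothetical, and stripping before testing for a leading dash is an assumption made so that both "---" and " -" separator prefixes are caught.

```python
import csv
import io


def is_data_row(line):
    """Return True for rows that hold data rather than blanks or separators."""
    if not line:                      # [] produced by a completely blank line
        return False
    first = line[0].strip(" \t")
    if not first:                     # whitespace-only first cell, e.g. [' \t\t']
        return False
    return not first.startswith("-")  # separator line made of dashes


# Hypothetical two-column sample standing in for data.csv.
SAMPLE = (
    "-----\n"
    " 500333443 | holder | \n"
    "-----\n"
    " 500333446 | holder-test | \n"
    "-----\n"
    " \t\t\n"
    "\n"
)

reader = csv.reader(io.StringIO(SAMPLE), delimiter="|")
data = [row for row in reader if is_data_row(row)]

# Take the first two cells of the last data row and trim the padding.
client_id, trust_domain = (cell.strip() for cell in data[-1][:2])
print("{}={}".format(client_id, trust_domain))
```

Because the checks run in order and return early, line[0] is never touched for an empty row, which is the same protection the chained `and` gives.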
