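<p>Here is a non-regex solution (it does rely on the line breaks inside each string being saved in the file as literal text; see Armanli's comment). No regex is needed because the strings all share a similar structure. The solution loops over the lines of the file, splits each one on <code>\\r\\n</code>, and extracts <code>Detected</code>, <code>Traces</code>, or any of the gases from the list. It stores the values in a list of dicts that can be loaded into pandas:</p>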
<pre><code>import numpy as np
import pandas as pd

gasses = ['Helium', 'Oxygen', 'Nitrogen', 'Carbon monoxide', 'Argon']

def get_data(gas, line):
    # Return [text before the gas name, percentage as a float]
    return [line.split(f' {gas} (')[0].strip(), float(line.split(f' {gas} (')[1].split('%')[0])]

all_data = []
with open("filename.txt", "r") as f:
    # One record per file line; split it on the literal \r\n text
    d = [i.split('\\r\\n') for i in f.readlines()]
    for i in d:
        tmp_dict = {}
        for z in i[:-1]:  # skip the leftover piece after the last separator
            if 'Detected' in z:
                # The second space-separated token is the count
                tmp_dict['Detected'] = int(z.split(" ")[1])
            elif 'Traces' in z:
                # Drop the first 10 characters, then split the gas names
                tr = z[10:].split(', ')
                for t in tr:
                    tmp_dict[f'{t.strip()} (txt)'] = 'Traces'
            else:
                # Find which known gas this piece mentions
                gas = [ele for ele in gasses if ele in z][0]
                r = get_data(gas, z)
                tmp_dict[f'{gas} (txt)'] = r[0]
                tmp_dict[f'{gas} (%)'] = r[1]
        all_data.append(tmp_dict)
df = pd.DataFrame(all_data)
</code></pre>
<p>Output:</p>
<div class="s-table-container">
^{tb1}$
</div>
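<p>To try the parsing logic without a file, the same steps can be sketched as a small function that works on one string at a time. This is only an illustration: the sample records are hypothetical, written to match the shape the code above expects (a <code>Detected</code> count, gas entries with a percentage in parentheses, and a <code>Traces of</code> list), and the names <code>parse_record</code>, <code>GASES</code> and <code>samples</code> are made up for this example.</p>

```python
GASES = ['Helium', 'Oxygen', 'Nitrogen', 'Carbon monoxide', 'Argon']

def parse_record(record, sep='\\r\\n'):
    """Turn one record string into a flat dict (a sketch, not a robust parser)."""
    row = {}
    for part in record.split(sep):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the last separator
        if 'Detected' in part:
            # Assumes the count is the second space-separated token
            row['Detected'] = int(part.split(' ')[1])
        elif 'Traces' in part:
            # Assumes the piece starts with the 10-character prefix 'Traces of '
            for name in part[len('Traces of '):].split(', '):
                row[f'{name.strip()} (txt)'] = 'Traces'
        else:
            gas = next((g for g in GASES if g in part), None)
            if gas is None:
                continue  # unknown piece; ignored in this sketch
            label, _, rest = part.partition(f' {gas} (')
            row[f'{gas} (txt)'] = label.strip()
            row[f'{gas} (%)'] = float(rest.split('%')[0])
    return row

# Hypothetical records; the separators are literal backslash sequences, not real newlines
samples = [
    'Detected 2 gasses.\\r\\nHigh Helium (12.5%)\\r\\nTraces of Oxygen, Argon\\r\\n',
    'Detected 1 gasses.\\r\\nLow Carbon monoxide (3%)\\r\\n',
]
rows = [parse_record(s) for s in samples]
```

<p>Passing <code>rows</code> to <code>pd.DataFrame</code> should then give one row per record, with <code>NaN</code> in the columns for gases a record does not mention.</p>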