Extracting multiple values from a string at once
I am trying to clean up my data, which is formatted like this:
'Ti': ['88.115', '199.2', '44.4', '39.0', '1.89', '89', '0.870']
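I want to extract value [0], value [1], and value [-2]. This works when I process a single string by hand, but in my code the result shows value [1] twice instead of value [-2].
Input data: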
corrected Dataframes: ('Peak'{'Al': ['90.569', '533.3', '32.6', '28.8', '1.02', '115', '9.588'], 'Ca': ['107.254', '5759.7', '57.5', '53.0', '0.30', '83', '102.945'], 'Cr': ['73.359', '-0.2', '89.8', '76.0', '100.00', '', '100', '0.000'], 'Fe': ['134.750', '581.0', '20.3', '16.8', '0.96', '164', '7.775'], 'K': ['119.624', '-4.7', '37.4', '30.0', '100.00', '', '91', '0.000'], 'Mg': ['107.507', '5699.9', '40.7', '33.5', '0.30', '51', '31.063'], 'Mn': ['146.274', '20.3', '12.3', '13.5', '7.49', '143', '0.256'], 'Na': ['129.500', '64.2', '17.6', '12.0', '4.77', '71', '2.252'], 'Si': ['77.272', '7183.2', '92.6', '56.8', '0.27', '196', '75.399'], 'Ti': ['88.115', '199.2', '44.4', '39.0', '1.89', '89', '0.870']})
Here is the code. element_data is a dataframe, and Peak is a key in that dataframe:
if 'Peak' in element_data:
    for key in element_data['Peak'].keys():
        element_data['Peak'][key] = [element_data['Peak'][key][0], element_data['Peak'][key][1], element_data['Peak'][key][-2]]
Output:
'Peak' data Al: ['90.569', '533.3', '533.3']
'Peak' data Ca: ['107.254', '5759.7', '5759.7']
'Peak' data Cr: ['73.359', '-0.2', '-0.2']
'Peak' data Fe: ['134.750', '581.0', '581.0']
'Peak' data K: ['119.624', '-4.7', '-4.7']
'Peak' data Mg: ['107.507', '5699.9', '5699.9']
'Peak' data Mn: ['146.274', '20.3', '20.3']
'Peak' data Na: ['129.500', '64.2', '64.2']
'Peak' data Si: ['77.272', '7183.2', '7183.2']
'Peak' data Ti: ['88.115', '199.2', '199.2']
Desired output:
'Peak' data Al: ['90.569', '533.3', '115']
'Peak' data Ca: ['107.254', '5759.7', '83']
'Peak' data Cr: ['73.359', '-0.2', '100']
'Peak' data Fe: ['134.750', '581.0', '164']
'Peak' data K: ['119.624', '-4.7', '91']
'Peak' data Mg: ['107.507', '5699.9', '51']
'Peak' data Mn: ['146.274', '20.3', '143']
'Peak' data Na: ['129.500', '64.2', '71']
'Peak' data Si: ['77.272', '7183.2', '196']
'Peak' data Ti: ['88.115', '199.2', '89']
2 Answers
0
I think what you have is a Python dictionary rather than a pandas DataFrame. If that is the case:
data = {
    "Peak": {
        "Al": ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"],
        "Ca": ["107.254", "5759.7", "57.5", "53.0", "0.30", "83", "102.945"],
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
        "Fe": ["134.750", "581.0", "20.3", "16.8", "0.96", "164", "7.775"],
        "K": ["119.624", "-4.7", "37.4", "30.0", "100.00", "", "91", "0.000"],
        "Mg": ["107.507", "5699.9", "40.7", "33.5", "0.30", "51", "31.063"],
        "Mn": ["146.274", "20.3", "12.3", "13.5", "7.49", "143", "0.256"],
        "Na": ["129.500", "64.2", "17.6", "12.0", "4.77", "71", "2.252"],
        "Si": ["77.272", "7183.2", "92.6", "56.8", "0.27", "196", "75.399"],
        "Ti": ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"],
    }
}
# Look up "Peak" once; skip the loop entirely if the key is missing.
if (d := data.get("Peak")) is not None:
    for k, v in d.items():
        # Build a new list instead of overwriting the original one.
        s = [v[0], v[1], v[-2]]
        print(f"'Peak' data {k}: {s}")
Output:
'Peak' data Al: ['90.569', '533.3', '115']
'Peak' data Ca: ['107.254', '5759.7', '83']
'Peak' data Cr: ['73.359', '-0.2', '100']
'Peak' data Fe: ['134.750', '581.0', '164']
'Peak' data K: ['119.624', '-4.7', '91']
'Peak' data Mg: ['107.507', '5699.9', '51']
'Peak' data Mn: ['146.274', '20.3', '143']
'Peak' data Na: ['129.500', '64.2', '71']
'Peak' data Si: ['77.272', '7183.2', '196']
'Peak' data Ti: ['88.115', '199.2', '89']
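If the trimmed lists need to be kept rather than just printed, one option is to collect them in a new dictionary and leave the source data untouched. This is only a minimal sketch under the same assumption (a plain dict of lists); `pick_fields`, `raw`, and `trimmed` are hypothetical names, and only two of the elements are shown for brevity:

```python
def pick_fields(values):
    """Return the first, second, and second-to-last entries of a list."""
    return [values[0], values[1], values[-2]]


# Two entries from the question's data; note that "Cr" has 8 items, "Al" has 7.
raw = {
    "Peak": {
        "Al": ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"],
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
    }
}

# Build a separate dict; raw["Peak"] keeps its full-length lists.
trimmed = {
    name: pick_fields(values)
    for name, values in raw.get("Peak", {}).items()
}

for name, values in trimmed.items():
    print(f"'Peak' data {name}: {values}")
```

Because the negative index counts from the end, the differing list lengths for Al and Cr do not matter here.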
0
I could not reproduce the problem. Maybe the issue is somewhere else?
import pandas as pd  # imported as in the question's setup, but not used below

element_data = {
    "Peak": {
        "Al": ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"],
        "Ca": ["107.254", "5759.7", "57.5", "53.0", "0.30", "83", "102.945"],
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
        "Fe": ["134.750", "581.0", "20.3", "16.8", "0.96", "164", "7.775"],
        "K": ["119.624", "-4.7", "37.4", "30.0", "100.00", "", "91", "0.000"],
        "Mg": ["107.507", "5699.9", "40.7", "33.5", "0.30", "51", "31.063"],
        "Mn": ["146.274", "20.3", "12.3", "13.5", "7.49", "143", "0.256"],
        "Na": ["129.500", "64.2", "17.6", "12.0", "4.77", "71", "2.252"],
        "Si": ["77.272", "7183.2", "92.6", "56.8", "0.27", "196", "75.399"],
        "Ti": ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"],
    }
}
if "Peak" in element_data:
    # Overwrite each list in place with its three selected entries.
    for key in element_data["Peak"].keys():
        element_data["Peak"][key] = [
            element_data["Peak"][key][0],
            element_data["Peak"][key][1],
            element_data["Peak"][key][-2],
        ]
    for k, v in element_data["Peak"].items():
        print(k, v)
Output:
Al ['90.569', '533.3', '115']
Ca ['107.254', '5759.7', '83']
Cr ['73.359', '-0.2', '100']
Fe ['134.750', '581.0', '164']
K ['119.624', '-4.7', '91']
Mg ['107.507', '5699.9', '51']
Mn ['146.274', '20.3', '143']
Na ['129.500', '64.2', '71']
Si ['77.272', '7183.2', '196']
Ti ['88.115', '199.2', '89']
Addendum: your code may contain a redundant repetition. Suppose the following runs after the code above:
...
# A redundant repeat of the above code
if "Peak" in element_data:
    for key in element_data["Peak"].keys():
        element_data["Peak"][key] = [
            element_data["Peak"][key][0],
            element_data["Peak"][key][1],
            element_data["Peak"][key][-2],
        ]
    for k, v in element_data["Peak"].items():
        print(k, v)
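Then you get the output from your question, because after the first pass each list has only three elements, so index -2 refers to the same element as index 1: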
Al ['90.569', '533.3', '533.3']
Ca ['107.254', '5759.7', '5759.7']
Cr ['73.359', '-0.2', '-0.2']
Fe ['134.750', '581.0', '581.0']
K ['119.624', '-4.7', '-4.7']
Mg ['107.507', '5699.9', '5699.9']
Mn ['146.274', '20.3', '20.3']
Na ['129.500', '64.2', '64.2']
Si ['77.272', '7183.2', '7183.2']
Ti ['88.115', '199.2', '199.2']
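The effect can be isolated on a single list. The sketch below is only an illustration, with `trim` as a hypothetical helper: applying the same selection twice reproduces the duplicated value, since on a three-element list the indices 1 and -2 point at the same position.

```python
def trim(values):
    """Keep the first, second, and second-to-last entries."""
    return [values[0], values[1], values[-2]]


row = ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"]

once = trim(row)    # selection applied to the original 7-item list
twice = trim(once)  # selection applied again to the 3-item result

print(once)   # ['90.569', '533.3', '115']
print(twice)  # ['90.569', '533.3', '533.3']
```

If the in-place loop has to stay, checking where it might run more than once (for example, inside another loop or in a function called repeatedly) is probably the first thing to look at.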