Editing an .xml file with pandas

Posted 2024-05-16 06:39:08


I have an XML file with the following content:

<CompanyData><Identifier>Country Context Afghanistan</Identifier>
<LanguageCode>2057</LanguageCode><DataTypeId>CCO</DataTypeId><ISOAlpha3>AFG</ISOAlpha3>
<DataSource>EDWH</DataSource><BuildDate>2019-09-17T18:53:59.973</BuildDate>
<DataSet><Name>Country context</Name><SourceName>Source</SourceName>
<SourceDate>2019-09-17T18:53:59.973</SourceDate><Data>
<Name>Total population (2019)</Name><Value>35,688,787</Value></Data><Data>
<Name>Birth cohort (2019)</Name><Value>1,083,460</Value></Data><Data>
<Name>Surviving Infants (surviving to 1 year per year, 2019)</Name><Value>1,151,687</Value></Data><Data>
<Name>Infant mortality rate (deaths &lt; 1 year per 1000 births, 2015)</Name><Value>66/1000</Value></Data><Data>
<Name>Child mortality rate (deaths &lt; 5 years per 1000 births, 2015)</Name><Value>91/1000</Value></Data><Data>
<Name>World Bank Index, IDA (2015)</Name><Value>2.69</Value></Data><Data>
<Name>Gross Nation Income (per capita US$, 2015)</Name><Value>610</Value></Data><Data>
<Name>No. of districts/territories (2018)</Name><Value>407</Value></Data></DataSet></CompanyData>

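I need to change values in this .xml file (for example, the total population). I was thinking of converting the .xml into a DataFrame, making the changes there, and converting it back into the .xml structure, but I haven't found a way to convert it to a DataFrame and rebuild the .xml from it. There may also be another approach, such as editing the .xml directly.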


Tags: name, data, value, xml, year, country, dataset, identifier

2 answers

The code below replaces the values of two elements with dummy values.

All you need to do is fill the dict new_data with the data you want.

from xml.etree import ElementTree as ET


xml = '''<?xml version="1.0" encoding="UTF-8"?>
<CompanyData>
   <Identifier>Country Context Afghanistan</Identifier>
   <LanguageCode>2057</LanguageCode>
   <DataTypeId>CCO</DataTypeId>
   <ISOAlpha3>AFG</ISOAlpha3>
   <DataSource>EDWH</DataSource>
   <BuildDate>2019-09-17T18:53:59.973</BuildDate>
   <DataSet>
      <Name>Country context</Name>
      <SourceName>Source</SourceName>
      <SourceDate>2019-09-17T18:53:59.973</SourceDate>
      <Data>
         <Name>Total population (2019)</Name>
         <Value>35,688,787</Value>
      </Data>
      <Data>
         <Name>Birth cohort (2019)</Name>
         <Value>1,083,460</Value>
      </Data>
      <Data>
         <Name>Surviving Infants (surviving to 1 year per year, 2019)</Name>
         <Value>1,151,687</Value>
      </Data>
      <Data>
         <Name>Infant mortality rate (deaths &lt; 1 year per 1000 births, 2015)</Name>
         <Value>66/1000</Value>
      </Data>
      <Data>
         <Name>Child mortality rate (deaths &lt; 5 years per 1000 births, 2015)</Name>
         <Value>91/1000</Value>
      </Data>
      <Data>
         <Name>World Bank Index, IDA (2015)</Name>
         <Value>2.69</Value>
      </Data>
      <Data>
         <Name>Gross Nation Income (per capita US$, 2015)</Name>
         <Value>610</Value>
      </Data>
      <Data>
         <Name>No. of districts/territories (2018)</Name>
         <Value>407</Value>
      </Data>
   </DataSet>
</CompanyData>'''

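# Map each <Name> to the new <Value> text it should get
new_data = {'Total population (2019)': 10000, 'World Bank Index, IDA (2015)': 7.45}
root = ET.fromstring(xml)
for field_name, new_value in new_data.items():
    # Select the <Data> element whose <Name> child matches field_name
    data_element = root.find(".//Data[Name='{}']".format(field_name))
    data_element.find('Value').text = str(new_value)
# Print the modified document to stdout
ET.dump(root)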

Output

<CompanyData>
   <Identifier>Country Context Afghanistan</Identifier>
   <LanguageCode>2057</LanguageCode>
   <DataTypeId>CCO</DataTypeId>
   <ISOAlpha3>AFG</ISOAlpha3>
   <DataSource>EDWH</DataSource>
   <BuildDate>2019-09-17T18:53:59.973</BuildDate>
   <DataSet>
      <Name>Country context</Name>
      <SourceName>Source</SourceName>
      <SourceDate>2019-09-17T18:53:59.973</SourceDate>
      <Data>
         <Name>Total population (2019)</Name>
         <Value>10000</Value>
      </Data>
      <Data>
         <Name>Birth cohort (2019)</Name>
         <Value>1,083,460</Value>
      </Data>
      <Data>
         <Name>Surviving Infants (surviving to 1 year per year, 2019)</Name>
         <Value>1,151,687</Value>
      </Data>
      <Data>
         <Name>Infant mortality rate (deaths &lt; 1 year per 1000 births, 2015)</Name>
         <Value>66/1000</Value>
      </Data>
      <Data>
         <Name>Child mortality rate (deaths &lt; 5 years per 1000 births, 2015)</Name>
         <Value>91/1000</Value>
      </Data>
      <Data>
         <Name>World Bank Index, IDA (2015)</Name>
         <Value>7.45</Value>
      </Data>
      <Data>
         <Name>Gross Nation Income (per capita US$, 2015)</Name>
         <Value>610</Value>
      </Data>
      <Data>
         <Name>No. of districts/territories (2018)</Name>
         <Value>407</Value>
      </Data>
   </DataSet>
</CompanyData>
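
If the result should be returned as a string rather than printed, and a misspelled name should fail loudly instead of raising an AttributeError on None, the same idea can be wrapped in a small helper. This is only a sketch: update_values and SAMPLE are hypothetical names introduced here, and the sample is a shortened version of the question's file.

```python
from xml.etree import ElementTree as ET

# Shortened sample with the same structure as the question's file (assumption).
SAMPLE = """<CompanyData>
  <DataSet>
    <Data><Name>Total population (2019)</Name><Value>35,688,787</Value></Data>
    <Data><Name>World Bank Index, IDA (2015)</Name><Value>2.69</Value></Data>
  </DataSet>
</CompanyData>"""


def update_values(xml_text, new_data):
    """Return xml_text with the <Value> of each named <Data> entry replaced."""
    root = ET.fromstring(xml_text)
    # Index the <Data> elements by their <Name> text. Comparing in Python
    # avoids quoting problems when a name contains an apostrophe.
    by_name = {data.findtext("Name"): data for data in root.iter("Data")}
    for field_name, new_value in new_data.items():
        if field_name not in by_name:
            raise KeyError("no <Data> entry named {!r}".format(field_name))
        by_name[field_name].find("Value").text = str(new_value)
    # encoding="unicode" makes tostring return str instead of bytes
    return ET.tostring(root, encoding="unicode")


updated = update_values(SAMPLE, {"Total population (2019)": 10000})
print(updated)
```

Entries that are not listed in new_data are left untouched, so the dict only needs to contain the values that actually change.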

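I'd suggest using something other than pandas. Python actually ships with an XML library that you will probably find sufficient for your needs: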

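from xml.etree import ElementTree as et
# xml_data must be a file path or file object; use et.fromstring for a string
tree = et.parse(xml_data)
# CompanyData is the root element, so Identifier is addressed relative to it
tree.find('Identifier').text = "New Country Context"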

Look up a guide on XPath to learn more about selecting elements, but the approach above should let you change the data without needing pandas.
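
As a rough end-to-end sketch of the direct-editing approach, the following reads an XML file, changes one top-level element and one nested value, and writes the file back. The temporary file and its shortened content are assumptions made only so the example is self-contained; in practice path would point at the real .xml file.

```python
import os
import tempfile
from xml.etree import ElementTree as ET

# Hypothetical file, created only so the sketch can run on its own.
path = os.path.join(tempfile.mkdtemp(), "country.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(
        "<CompanyData><Identifier>Country Context Afghanistan</Identifier>"
        "<DataSet><Data><Name>Total population (2019)</Name>"
        "<Value>35,688,787</Value></Data></DataSet></CompanyData>"
    )

tree = ET.parse(path)   # parse() takes a file path or file object
root = tree.getroot()   # the <CompanyData> element

# Direct child of the root: a plain tag name is enough.
root.find("Identifier").text = "New Country Context"

# Nested element: pick the <Data> whose <Name> matches, then its <Value>.
value = root.find(".//Data[Name='Total population (2019)']/Value")
value.text = "10000"

# Write the modified tree back over the same file.
tree.write(path, encoding="utf-8", xml_declaration=True)

# Re-read the file to confirm that the changes were saved.
reloaded = ET.parse(path).getroot()
print(reloaded.findtext("Identifier"))
print(reloaded.findtext(".//Data/Value"))
```

ElementTree supports only a limited subset of XPath, but child steps, `.//` descendant searches, and `[tag='text']` predicates like the ones used here are part of it.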
