Read a CSV file and replace XML tags

Posted 2024-04-26 02:21:38


I want to read a CSV file and replace tags in an XML file with the second column of the CSV file. The "name" tag values are in the first column.

A         |    B

Value1    |    ValueX
Value2    |    ValueX
Value3    |    ValueY

The XML structure looks like this:

^{pr2}$

Python code:

import csv 
import collections
import xml.etree.ElementTree
tree = xml.etree.ElementTree.parse("jolly.xml").getroot()

with open('file.csv', 'r') as f:
    reader = csv.DictReader(f)# read rows into a dictionary format
    reader = csv.reader(f, dialect=csv.excel_tab)
    list = list(reader)
    columns = collections.defaultdict(list)# each value in each column is appended to a list

for (k, v) in row.items(): #go over each column name and value
    columns[k].append(v)# append the value into the appropriate list

print columns['A']
print columns['B']
for elem in tree.findall('.//name'):
    if elem.attrib['name'] == columns['A']:
        elem.attrib['name'] == columns['B']

What should I do?

Here is what the CSV file looks like:

Reading CSV file looks like

The output should look like this:

Value1 should be replaced with ValueX

OK, my solution is:

import lxml.etree as ET


arr = ["Value1", "Value2", "Value3"]
arr2 = ["ValuX", "ValuX", "ValueY"]

with open('file.xml', 'rb+') as f:
    tree = ET.parse(f)
    root = tree.getroot()
    for i, item in enumerate(arr):
         for elem in root.findall('.//Value1'):
             print(elem);
             if elem.tag:
                 print(item)
                 print(arr2[i])

                 elem.text = elem.text.replace(item, arr2[i])



    f.seek(0)
    f.write(ET.tostring(tree, encoding='UTF-8', xml_declaration=True))
    f.truncate()

I used arrays. I can copy the values from the file into the arrays, but for large files this needs better code.
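One way to avoid the hard-coded arrays is to load the CSV into a dictionary and look each element up in it. The following is only a minimal sketch using the standard library: it assumes the values to replace are the text of `<name>` elements, which the question does not show, and the helper names `load_mapping` and `replace_name_text` as well as the sample data are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def load_mapping(csv_file):
    """Return {first column: second column} for every row with two or more cells."""
    mapping = {}
    for row in csv.reader(csv_file):
        if len(row) >= 2:
            mapping[row[0].strip()] = row[1].strip()
    return mapping


def replace_name_text(root, mapping):
    """Replace the text of each <name> element found in mapping; return the count."""
    replaced = 0
    for elem in root.iter("name"):  # assumption: values live in <name> element text
        key = (elem.text or "").strip()
        if key in mapping:
            elem.text = mapping[key]
            replaced += 1
    return replaced


# Hypothetical in-memory stand-ins for file.csv and the XML file.
sample_csv = io.StringIO("Value1,ValueX\nValue2,ValueX\nValue3,ValueY\n")
sample_xml = "<items><name>Value1</name><name>Value3</name><name>Other</name></items>"

mapping = load_mapping(sample_csv)
root = ET.fromstring(sample_xml)
count = replace_name_text(root, mapping)
result = ET.tostring(root, encoding="unicode")
print(count, result)
```

A dictionary lookup handles each element in one step, so the XML is walked once no matter how many rows the CSV has. With real files, `open('file.csv', newline='')` would take the place of the `io.StringIO` object.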


1 answer
User
#1 · Posted 2024-04-26 02:21:38

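Consider using XSLT, a special-purpose declarative language designed to restructure XML files. Like most general-purpose languages (including ASP, C, Java, PHP, Perl, VB), Python has an XSLT 1.0 processor available, specifically in the lxml module.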

For your purposes, you can dynamically build an XSLT string to use for the transformation. The only loop needed is the one over the CSV data:

import csv
import lxml.etree as ET

# READ IN CSV DATA AND APPEND TO LIST
csvdata = []
with open('file.csv', 'r') as csvfile:
    readCSV = csv.reader(csvfile)
    for line in readCSV:
        csvdata.append(line)

# DYNAMICALLY CREATE XSLT STRING
xsltstr = '''<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
            <xsl:output version="1.0" encoding="UTF-8" indent="yes" />
            <xsl:strip-space elements="*"/>

              <!-- Identity Transform -->
              <xsl:template match="@*|node()">
                <xsl:copy>
                  <xsl:apply-templates select="@*|node()"/>
                </xsl:copy>
              </xsl:template>

        '''

for i in range(len(csvdata)):
    xsltstr = xsltstr + \
              '''<xsl:template match="name[.='{0}']">
                  <xsl:element name="{1}">
                     <xsl:apply-templates />
                  </xsl:element>
              </xsl:template>

              '''.format(*csvdata[i])

xsltstr = xsltstr + '</xsl:transform>'

# PARSE ORIGINAL FILE AND XSLT STRING
dom = ET.parse('jolly.xml')
xslt = ET.fromstring(xsltstr)

# TRANSFORM XML
transform = ET.XSLT(xslt)
newdom = transform(dom)

# OUTPUT FINAL XML (PRETTY PRINT)
tree_out = ET.tostring(newdom, encoding='UTF-8', pretty_print=True,  xml_declaration=True)

with open('final.xml', 'wb') as xmlfile:
    xmlfile.write(tree_out)
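The string-building step above can also be wrapped in a small function that skips rows which should not become templates. This is a sketch under stated assumptions, not part of the original answer: `build_xslt` and its `header` parameter are hypothetical names, the CSV is assumed to start with an `A,B` header row, and the values are assumed to contain no quotes or XML special characters (they are inserted without escaping).

```python
import xml.etree.ElementTree as StdET  # used only to check well-formedness

XSLT_HEADER = '''<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Identity transform: copy everything not matched by a rule below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
'''

# One rule per CSV row: rename <name> elements whose text equals column A.
RULE = '''  <xsl:template match="name[.='{0}']">
    <xsl:element name="{1}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
'''


def build_xslt(rows, header=("A", "B")):
    """Build the stylesheet text, skipping short rows and the assumed header row."""
    parts = [XSLT_HEADER]
    for row in rows:
        if len(row) < 2 or tuple(row[:2]) == header:
            continue
        # Assumption: values need no escaping for XPath or XML.
        parts.append(RULE.format(row[0], row[1]))
    parts.append("</xsl:transform>")
    return "".join(parts)


# Hypothetical rows as csv.reader would yield them, including a blank line.
rows = [["A", "B"], ["Value1", "ValueX"], [], ["Value3", "ValueY"]]
xsltstr = build_xslt(rows)
doc = StdET.fromstring(xsltstr)  # raises ParseError if the string is malformed
```

The resulting `xsltstr` can be passed to lxml's `ET.fromstring` and `ET.XSLT` exactly as in the answer's code. Joining a list of parts once avoids rebuilding the whole string on every row, which helps when the CSV is large.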

Output

^{pr2}$
