Is it possible to convert each row of a DataFrame into a predefined text file?

Posted 2024-05-16 20:13:25


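My DataFrame looks like this:

[Image: DataFrame]

I would like to insert each row into a predefined text file, so that the values end up at specific positions in the document. This is what I have come up with: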

for i in range(len(df)):
with open("%s.xml" %index, "w") as f:
    f.write(
     """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
  <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
  <cbc:ID> """df[Factuurdatum[i]]" </cbc:ID>
  <cbc:IssueDate> Totaal </cbc:IssueDate>
  <cbc:DueDate> Factuurdatum[i] </cbc:DueDate>"
  <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
  <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
  <cac:AccountingSupplierParty>

My desired output for the first row:

<?xml version="1.0"?>
    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
      <cbc:UBLVersionID>2.1</cbc:UBLVersionID>         <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
      <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
      <cbc:ID> ""0606194584" </cbc:ID>
      <cbc:IssueDate> 12.93 </cbc:IssueDate>
      <cbc:DueDate> 2020-09-18 </cbc:DueDate>"
      <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
      <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
      <cac:AccountingSupplierParty>

My desired output for the second row:

<?xml version="1.0"?>
    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
      <cbc:UBLVersionID>2.1</cbc:UBLVersionID>         <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
      <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
      <cbc:ID> ""20200633369" </cbc:ID>
      <cbc:IssueDate> 30.25 </cbc:IssueDate>
      <cbc:DueDate> 2020-06-26 </cbc:DueDate>"
      <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
      <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
      <cac:AccountingSupplierParty>

And so on, for every row. Is there a workable way to do this? Can anyone help me?


2 answers

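If you iterate over the rows, you get an (index, series) tuple for each one, where the series holds the column values of that single row. The series can be unpacked into a str.format call on a string that holds the XML template you want to generate. A simple example: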

>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['A','B','C'])
>>> df
   A  B  C
0  1  2  3
1  4  5  6
>>> template = "<xml>\n  <a>{A}</a>\n  <b>{B}</b>\n  <c>{C}</c>\n</xml>"
>>> for row in df.iterrows():
...     print(template.format(**row[1]))
... 
<xml>
  <a>1</a>
  <b>2</b>
  <c>3</c>
</xml>
<xml>
  <a>4</a>
  <b>5</b>
  <c>6</c>
</xml>
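One caveat with plain str.format templating is that values are inserted verbatim, so a cell containing `&` or `<` would produce malformed XML. The following is a minimal sketch of one way to guard against that, using the standard library's `xml.sax.saxutils.escape`. The helper name `render_row` and the sample values are hypothetical, and a plain dict stands in for a DataFrame row.

```python
from xml.sax.saxutils import escape

template = "<xml>\n  <a>{A}</a>\n  <b>{B}</b>\n</xml>"


def render_row(row):
    """Fill the template after XML-escaping every value in the row mapping."""
    # str() first, because escape() only accepts strings.
    safe = {key: escape(str(value)) for key, value in row.items()}
    return template.format(**safe)


# Hypothetical data: a plain dict stands in for one DataFrame row.
rendered = render_row({"A": "Smith & Sons", "B": "1 < 2"})
print(rendered)
```

Because a pandas Series also provides `.items()`, the same helper should accept the `row` objects yielded by `df.iterrows()`.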

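To extend this example, I split the desired document into boilerplate (the enclosing XML document) and a {fac_details} format variable (the per-row information). I did not know a good name for this data, so I called it "fac"; you will want something more descriptive. I tried to make the XML more complete, but did not cover all of the columns you are interested in.

Note: the OP did not provide a complete runnable program, so this is untested pseudocode.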

# XML document to be expanded with the per-row details
fac_doc_template = """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
  <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
  {fac_details}
</Invoice>"""

# per row details
# todo: expand for all of the column values you want
# (each placeholder name must match a DataFrame column name exactly)
fac_details_xml_template = """
<cbc:ID>{Factuurnumer}</cbc:ID>
<cbc:IssueDate>{Factuurdatum}</cbc:IssueDate>
"""

def series_to_fac_details_xml(s):
    return fac_details_xml_template.format(**s)

for index, row in df.iterrows():
    details = series_to_fac_details_xml(row)
    with open(f"{index}.xml", "w") as f:
        f.write(fac_doc_template.format(fac_details=details))
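For reference, here is a small self-contained sketch of the same loop that can be run end to end. The column names (`Factuurnummer`, `Totaal`, `Factuurdatum`), the sample values, and the shortened template are assumptions, since the original DataFrame was only shown as an image; the files go to a temporary directory rather than the working directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the real column names may differ.
df = pd.DataFrame(
    {
        "Factuurnummer": ["0606194584", "20200633369"],
        "Totaal": [12.93, 30.25],
        "Factuurdatum": ["2020-09-18", "2020-06-26"],
    }
)

# Shortened stand-in for the full UBL invoice template (not valid UBL).
invoice_template = """<?xml version="1.0"?>
<Invoice>
  <ID>{Factuurnummer}</ID>
  <IssueDate>{Factuurdatum}</IssueDate>
  <Total>{Totaal}</Total>
</Invoice>
"""

out_dir = Path(tempfile.mkdtemp())
written = []
for index, row in df.iterrows():
    # Each row is a Series; ** unpacks it as column-name -> value pairs.
    path = out_dir / f"{index}.xml"
    path.write_text(invoice_template.format(**row), encoding="utf-8")
    written.append(path)

print(written[0].read_text(encoding="utf-8"))
```

Naming the files after the index mirrors the answer above; if the index is not unique, a column such as the invoice number would probably make a safer file name.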

You are almost there. You can use string formatting to insert values into a string, like this:

data = "some data i want to insert"

result = "This is what I want to say: {}".format(data)
# or
result = f"This is what I want to say: {data}"
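For a template with many fields, named placeholders are usually easier to keep straight than positional ones. The sketch below illustrates this with a hypothetical `values` dict standing in for one row, and also shows that literal braces in a template need to be doubled.

```python
# Named placeholders are filled from keyword arguments or an unpacked dict.
line_template = "<cbc:ID>{invoice_id}</cbc:ID>"
values = {"invoice_id": "0606194584"}  # hypothetical row values
filled = line_template.format(**values)

# Literal braces must be doubled so that format() leaves them in place.
braces = "{{literal}} {value}".format(value=42)

print(filled)
print(braces)
```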

References:

https://docs.python.org/3/library/stdtypes.html?highlight=format#str.format

https://docs.python.org/3/library/string.html#formatstrings
