Python class returns an empty dictionary

Published 2024-05-16 09:54:03


Beginner here; I need some help making my code object-oriented.

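I'm trying to write a class with several methods for working with XML files. One of the methods should return a dictionary in which the filenames of the embedded attachments are the keys and the encoded data strings are the values.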

I have managed to get this working outside a class:

import xml.etree.ElementTree as ET

tree = ET.parse('invoice.xml')
root = tree.getroot()

namespace = {
    'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
    'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
    'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
    'ccts': 'urn:un:unece:uncefact:documentation:2',
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
}

attachments = {}

for document in root.findall('cac:AdditionalDocumentReference', namespace):
    filename = document.find('cbc:ID', namespace).text
    print(filename)

    # Find the embedded file
    for child in document.findall('cac:Attachment', namespace):
        attachment = child.find('cbc:EmbeddedDocumentBinaryObject', namespace).text
        attachments[filename] = attachment

But I can't turn it into a method of a class, because the method returns an empty dictionary. The code I'm working on:

import xml.etree.ElementTree as ET

class Invoice:
    """
    Common tasks in relation to EHF invoices.
    """

    namespace = {
            'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
            'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
            'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
            'ccts': 'urn:un:unece:uncefact:documentation:2',
            'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
        }

    attachments = {}

    def __init__(self, invoice):
        """Initialize invoice attributes."""
        self.invoice = invoice

        # Dictionary for namespace used in EHF invoices
        self.namespace = self.namespace

    def encoded_attachments(self):
        """
        Return the embedded attachments from the EHF invoice in encoded form
        as a dictionary.

        Keys = filenames
        Value = base64 encoded files
        """
        
        for document in self.invoice.findall('cac:AdditonalDocumentReference', self.namespace):
            # Find filename
            filename = document.find('cbc:ID', self.namespace).text
        
            # Find the embedded file
            for child in document.findall('cac:Attachment', namespace):
                attachment = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text

                # Add filename and attachment to dictionary
                self.attachments[filename] = attachment
        
        return(self.attachments)

tree = ET.parse('invoice.xml')
root = tree.getroot()

ehf = Invoice(root)

attach_dict = ehf.encoded_attachments()
print(attach_dict)

I think I'm missing something important about classes. Thanks for your help.

Edit:

Part of the XML file. The encoded data has been replaced with a dummy text string.

<?xml version="1.0" encoding="UTF-8"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
    xmlns:ccts="urn:un:unece:uncefact:documentation:2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <cbc:CustomizationID>urn:cen.eu:en16931:2017#compliant#urn:fdc:peppol.eu:2017:poacc:billing:3.0</cbc:CustomizationID>
    <cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
    <cbc:ID>1060649</cbc:ID>
    <cbc:IssueDate>2020-01-23</cbc:IssueDate>
    <cbc:DueDate>2020-02-07</cbc:DueDate>
    <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
    <cbc:TaxPointDate>2020-01-23</cbc:TaxPointDate>
    <cbc:DocumentCurrencyCode>NOK</cbc:DocumentCurrencyCode>
    <cbc:BuyerReference>N/A</cbc:BuyerReference>
    <cac:AdditionalDocumentReference>
        <cbc:ID>invoice_attachment_filename.pdf</cbc:ID>
        <cbc:DocumentTypeCode>130</cbc:DocumentTypeCode>
        <cbc:DocumentDescription>CommercialInvoice</cbc:DocumentDescription>
        <cac:Attachment>
            <cbc:EmbeddedDocumentBinaryObject mimeCode="application/pdf" filename="1060649.pdf">BASE64ENCODEDTEXT</cbc:EmbeddedDocumentBinaryObject>
        </cac:Attachment>
    </cac:AdditionalDocumentReference>
</Invoice>

3 answers

self is used inconsistently:

for child in document.findall('cac:Attachment', namespace):  # bare name, no self
    attachment = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text  # uses self
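
To see why the bare namespace matters, here is a minimal sketch. The Demo class and its placeholder URI are hypothetical, not taken from the question. A name defined in the class body is not in scope inside a method, so it has to be reached through self (or the class name):

```python
class Demo:
    # Class attribute: lives on the class, not in the method's scope.
    namespace = {"cbc": "urn:example"}

    def bare(self):
        # Looks for a local or global called `namespace`; the class
        # attribute is not searched, so this raises NameError here.
        return namespace

    def qualified(self):
        # Attribute lookup on the instance falls back to the class.
        return self.namespace


demo = Demo()
print(demo.qualified())

try:
    demo.bare()
except NameError as exc:
    print("NameError:", exc)
```

In the question's code this error never surfaces, because the outer loop matches nothing and the inner loop body is never executed.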

The answer is (drum roll…) that everything is correct, but compare the old and new code here:

old: for document in root.findall('cac:AdditionalDocumentReference', namespace)
new: for document in self.invoice.findall('cac:AdditonalDocumentReference', self.namespace)
                                                    ^
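
The reason the typo produces an empty dictionary rather than an error is that findall simply returns an empty list when nothing matches. A small sketch, assuming a cut-down document with a single empty element (the XML string and the NS name are made up for illustration):

```python
import xml.etree.ElementTree as ET

NS = {
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
}

# Hypothetical, stripped-down stand-in for the invoice file.
XML = (
    '<Invoice xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:'
    'CommonAggregateComponents-2">'
    "<cac:AdditionalDocumentReference/>"
    "</Invoice>"
)

root = ET.fromstring(XML)

correct = root.findall("cac:AdditionalDocumentReference", NS)
misspelled = root.findall("cac:AdditonalDocumentReference", NS)

print(len(correct))     # 1
print(len(misspelled))  # 0, and no exception is raised
```

Since the loop over an empty list never runs, nothing is ever added to the dictionary.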

By the way, you can drop the line self.namespace = self.namespace.

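You made two mistakes here.
One is that you are using class variables, so the attachments dictionary is shared by every Invoice instance (read about them here: https://docs.python.org/3/tutorial/classes.html).
The second is the one Gokai pointed out.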
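
A rough illustration of the class-variable point, using two hypothetical classes that are not from the question: a dictionary defined in the class body is a single object seen by all instances, while one created in __init__ belongs to each instance separately.

```python
class Shared:
    attachments = {}  # one dict object, shared by every instance

    def add(self, name, data):
        self.attachments[name] = data


class PerInstance:
    def __init__(self):
        self.attachments = {}  # a fresh dict for each instance

    def add(self, name, data):
        self.attachments[name] = data


a, b = Shared(), Shared()
a.add("a.pdf", "AAAA")
print(b.attachments)  # {'a.pdf': 'AAAA'}

c, d = PerInstance(), PerInstance()
c.add("c.pdf", "CCCC")
print(d.attachments)  # {}
```

With the shared version, parsing a second invoice would return the first invoice's attachments as well.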

This should work:

import xml.etree.ElementTree as ET


class Invoice:
    """
    Common tasks in relation to EHF invoices.
    """

    def __init__(self, invoice):
        """Initialize invoice attributes."""
        self.invoice = invoice
        self.namespace = {
            'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
            'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
            'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
            'ccts': 'urn:un:unece:uncefact:documentation:2',
            'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
        }
        self.attachments = {}

    def encoded_attachments(self):
        """
        Return the embedded attachments from the EHF invoice in encoded form
        as a dictionary.

        Keys = filenames
        Value = base64 encoded files
        """

        for document in self.invoice.findall('cac:AdditionalDocumentReference', self.namespace):
            # Find filename
            filename = document.find('cbc:ID', self.namespace).text

            # Find the embedded file
            for child in document.findall('cac:Attachment', self.namespace):
                # Add filename and attachment to dictionary
                self.attachments[filename] = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text

        return self.attachments
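
One way to check the extraction without an invoice file is to parse a trimmed copy of the sample XML from the question with ET.fromstring. The sketch below is only an illustration: extract_attachments, NS and SAMPLE are hypothetical names, and it uses a plain function that builds a fresh dictionary on every call instead of the class above.

```python
import xml.etree.ElementTree as ET

NS = {
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Trimmed copy of the sample invoice from the question.
SAMPLE = """\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
    <cbc:ID>1060649</cbc:ID>
    <cac:AdditionalDocumentReference>
        <cbc:ID>invoice_attachment_filename.pdf</cbc:ID>
        <cac:Attachment>
            <cbc:EmbeddedDocumentBinaryObject mimeCode="application/pdf" filename="1060649.pdf">BASE64ENCODEDTEXT</cbc:EmbeddedDocumentBinaryObject>
        </cac:Attachment>
    </cac:AdditionalDocumentReference>
</Invoice>
"""


def extract_attachments(root):
    """Map each attachment filename to its base64-encoded content."""
    attachments = {}
    for document in root.findall("cac:AdditionalDocumentReference", NS):
        filename = document.findtext("cbc:ID", namespaces=NS)
        for child in document.findall("cac:Attachment", NS):
            data = child.findtext("cbc:EmbeddedDocumentBinaryObject", namespaces=NS)
            # Skip entries where either element is missing.
            if filename is not None and data is not None:
                attachments[filename] = data
    return attachments


result = extract_attachments(ET.fromstring(SAMPLE))
print(result)  # {'invoice_attachment_filename.pdf': 'BASE64ENCODEDTEXT'}
```

Using findtext instead of find(...).text avoids an AttributeError when an element is absent, which is a design choice of this sketch rather than something the answers require.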
