How do I convert XML (Dublin Core) to CSV in Python?

Posted 2024-04-28 01:50:11


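I have a bunch of biodiversity metadata records in the format shown below. I want to cluster them with a machine learning algorithm, and I thought it might be a good idea to first convert the data from XML to CSV. Is that a good idea? And is there an easy way in Python to convert it to CSV, or to some other more machine-readable format?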

<?xml version="1.0" encoding="UTF-8"?>
<dataset xmlns="urn:pangaea.de:dataportals" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:pangaea.de:dataportals http://ws.pangaea.de/schemas/pansimple/pansimple.xsd">
  <dc:title>Sorex araneus Linnaeus, 1758, a preserved specimen record of the Mammals (recent) dataset [ID: ZMB_Mam_050329 ]</dc:title>
  <dc:description>The Animal Sound Archive at the Museum fuer Naturkunde Berlin (German: Tierstimmenarchiv) is one of the oldest and largest worldwide. Founded in 1951 by Professor Guenter Tembrock the collection consists now of around 130 000 records of animal voices.</dc:description>
  <dc:contributor>Christiane Funk</dc:contributor>
  <dc:contributor>MfN</dc:contributor>
  <dc:contributor>Museum für Naturkunde Berlin – Leibniz Institute for Research on Evolution and Biodiversity, Berlin</dc:contributor>
  <dc:contributor>Zenker, R</dc:contributor>
  <dc:contributor>Kulicke</dc:contributor>
  <dc:contributor>Zenker, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:contributor>Kulicke, R</dc:contributor>
  <dc:publisher>Data Center MfN</dc:publisher>
  <dataCenter>Data Center MfN</dataCenter>
  <dc:type>ABCD_Unit</dc:type>
  <dc:type>preserved specimen</dc:type>
  <dc:format>text/html</dc:format>
  <dc:identifier>ZMB_Mam_050329</dc:identifier>
  <dc:source>MfN Custodians (2017). MfN Zoological Collections - Mammals (recent). [Dataset]. Data Publisher: Museum für Naturkunde Berlin (MfN) - Leibniz Institute for Research on Evolution and Biodiversity.</dc:source>
  <linkage type="metadata">http://biocase.naturkundemuseum-berlin.de/current?dsa=mfn_Mammalia&amp;detail=unit&amp;schema=http://www.tdwg.org/schemas/abcd/2.06&amp;cat=ZMB_Mam_050329</linkage>
  <dc:coverage xsi:type="CoverageType">
    <northBoundLatitude>52.4256800000</northBoundLatitude>
    <westBoundLongitude>14.2540400000</westBoundLongitude>
    <southBoundLatitude>52.4256800000</southBoundLatitude>
    <eastBoundLongitude>14.2540400000</eastBoundLongitude>
    <startDate>1961-08-03T00:00:00</startDate>
    <endDate>1961-08-03T00:00:00</endDate>
  </dc:coverage>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Mammalia</dc:subject>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Soricidae</dc:subject>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Sorex</dc:subject>
  <dc:subject type="kingdom" xsi:type="SubjectType">Animalia</dc:subject>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Soricomorpha</dc:subject>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Chordata</dc:subject>
  <dc:subject type="taxonomy" xsi:type="SubjectType">Sorex araneus Linnaeus, 1758</dc:subject>
  <dc:subject type="parameter" xsi:type="SubjectType">Date</dc:subject>
  <dc:subject type="parameter" xsi:type="SubjectType">Locality</dc:subject>
  <dc:rights>CC-BY-SA</dc:rights>
  <dc:relation>http://coll.mfn-berlin.de/c/ZMB_MAM</dc:relation>
  <parentIdentifier>urn:gfbio.org:abcd:1_66_118</parentIdentifier>
  <additionalContent>MfN, ZMB_Mam, ZMB_Mam_050329, 2013-10-29T17:22:54, Mammalia, class, Soricidae, family, Sorex, genusgroup, Animalia, kingdom, Soricomorpha, order, Chordata, phylum, Sorex araneus Linnaeus, 1758, Sorex, araneus, Linnaeus, 1758, Preserved specimen, Skull, 1961-08-03T00:00:00, Kulicke, Kulicke, Schwenow, Germany, country, Germany, DE, state, Brandenburg, Schwenow, Male </additionalContent>
</dataset>

If you have any suggestions, please let me know.
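
One possible starting point, sketched with only the Python standard library (`xml.etree.ElementTree` and `csv`): read one record like the one above and flatten it into a single CSV row. The function names `record_to_row` and `rows_to_csv`, the choice of columns, and the decision to join repeated elements such as `dc:contributor` with `"; "` (dropping exact duplicates) are all assumptions made for illustration, not something the schema prescribes. The two namespace URIs are taken from the `<dataset>` element in the sample.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample record; the "pan" prefix is our own label
# for the document's default namespace.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "pan": "urn:pangaea.de:dataportals",
}

# Hypothetical column selection: adjust to whatever the clustering needs.
SINGLE_FIELDS = ["title", "identifier", "publisher", "rights"]
MULTI_FIELDS = ["contributor", "type", "subject"]
COVERAGE_FIELDS = [
    "northBoundLatitude", "westBoundLongitude",
    "southBoundLatitude", "eastBoundLongitude",
    "startDate", "endDate",
]
FIELDNAMES = SINGLE_FIELDS + MULTI_FIELDS + COVERAGE_FIELDS


def record_to_row(xml_source):
    """Flatten one <dataset> record (str or bytes) into a dict of strings."""
    root = ET.fromstring(xml_source)
    row = {}
    # Elements expected once: take the first match, or "" if absent.
    for name in SINGLE_FIELDS:
        row[name] = root.findtext(f"dc:{name}", default="", namespaces=NS).strip()
    # Repeated elements: drop exact duplicates (keeping order), then join.
    for name in MULTI_FIELDS:
        values = [el.text.strip() for el in root.findall(f"dc:{name}", NS) if el.text]
        row[name] = "; ".join(dict.fromkeys(values))
    # Children of dc:coverage live in the default (pangaea) namespace.
    coverage = root.find("dc:coverage", NS)
    for name in COVERAGE_FIELDS:
        if coverage is None:
            row[name] = ""
        else:
            row[name] = coverage.findtext(f"pan:{name}", default="", namespaces=NS).strip()
    return row


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text; the csv module quotes commas."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

With many files, calling `record_to_row` on each file's contents and passing the collected rows to `rows_to_csv` would give one line per record. Whether CSV is the right intermediate format depends on the clustering method: the text columns would likely still need to be turned into numeric features afterwards, while the coordinates and dates can be parsed directly.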

