Python regular expression to find tags in XML text

#<..... some XML .....> <sec id="aj387295s3"> <label>3.</label> <title><italic>CHANDRA</italic> OBSERVATIONS</title> <p>The 13 candidates were observed with the Advanced CCD Imaging Spectrometer (ACIS; Burke et al. <xref ref-type="bibr" rid="aj387295r8">1997</xref>) on board <italic>Chandra</italic> (Weisskopf et al. <xref ref-type="bibr" rid="aj387295r46">1996</xref>). We chose the S3 chip to image the sources because of its better low-energy sensitivity. The standard TIMED readout with a frame time of 3.2 s was used, and the data were collected in VFAINT mode. In 12 cases, our <italic>Chandra</italic> observations led us to conclude that the RASS detection was not of a candidate INS (see Table <xref ref-type="table" rid="aj387295t1">1</xref>; the <xref ref-type="sec" rid="aj387295app1">Appendix</xref> includes a case-by-case discussion of these sources).</p> #<..... more XML ....>

2 answers

User

#1 · edited 2024-05-13 13:44:58

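If you have already located the correct <sec> tag, all you need is the text inside its <label> and <title> children. Use itertext() for the title, because it contains nested markup (<italic>) that a plain .text lookup would not return in full.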

title = '{} {}'.format(sec.findtext('label'), ''.join(sec.find('title').itertext()))
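The one-liner above assumes `sec` is already an element object. A minimal, self-contained sketch of the whole flow might look like the following, assuming the standard-library `xml.etree.ElementTree` parser (the answer does not say which parser it uses). The `<body>` wrapper and the `section_heading` helper are hypothetical names added for illustration, and the sample is a shortened copy of the XML from the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <sec> element from the question, wrapped in a
# hypothetical <body> root so it parses as a standalone document.
xml_text = """
<body>
  <sec id="aj387295s3">
    <label>3.</label>
    <title><italic>CHANDRA</italic> OBSERVATIONS</title>
    <p>The 13 candidates were observed ...</p>
  </sec>
</body>
""".strip()

root = ET.fromstring(xml_text)


def section_heading(sec):
    """Return '<label> <title>' for a <sec> element (hypothetical helper)."""
    label = sec.findtext('label', default='')
    title_el = sec.find('title')
    # itertext() walks nested elements such as <italic>, so the full
    # title text is recovered even though it is split across children.
    title = ''.join(title_el.itertext()) if title_el is not None else ''
    return '{} {}'.format(label, title).strip()


# root.iter('sec') finds <sec> elements at any depth.
headings = [section_heading(sec) for sec in root.iter('sec')]
print(headings)
```

The `default=''` and the `None` check are defensive additions for sections that lack a label or title. The original one-liner would raise an AttributeError on a section with no <title>.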

User

#2 · edited 2024-05-13 13:44:58

As noted in the comments, reading XML values with regular expressions is not recommended. If you still want to use them:

<tag>[\s\S]*?<\/tag>

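The part between these tags is the value. To extract it, wrap [\s\S]*? in a capturing group. The lazy quantifier *? stops at the first closing tag, so the pattern does not handle nested tags of the same name.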
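As a rough sketch of how that pattern could be applied with Python's `re` module: the version below adds a capturing group and, as an assumption beyond the original pattern, `\b[^>]*` so that opening tags with attributes also match. The `tag_values` helper name is hypothetical, and the sample is a shortened copy of the XML from the question.

```python
import re

# Shortened copy of the XML from the question.
xml_text = (
    '<sec id="aj387295s3"> <label>3.</label> '
    '<title><italic>CHANDRA</italic> OBSERVATIONS</title> '
    '<p>The 13 candidates were observed ...</p> </sec>'
)


def tag_values(text, tag):
    """Return the raw text between each <tag ...> and </tag> pair (hypothetical helper)."""
    # \b[^>]* tolerates attributes on the opening tag; the lazy
    # [\s\S]*? stops at the first matching closing tag.
    pattern = re.compile(
        r'<{0}\b[^>]*>([\s\S]*?)</{0}>'.format(re.escape(tag))
    )
    return pattern.findall(text)


label = tag_values(xml_text, 'label')[0]
raw_title = tag_values(xml_text, 'title')[0]
# The captured title still contains nested markup such as <italic>;
# remove any remaining tags to get plain text.
title = re.sub(r'<[^>]+>', '', raw_title)
print(label, title)
```

This works on well-behaved input like the sample, but it ignores comments, CDATA sections, entities, and nested tags of the same name, which is why a real XML parser is usually the safer choice.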

See another question.
