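<p>I'm working on a project that uses Python to search the XML of a research paper for a specific string. That part is done, but I also need the section heading that most immediately precedes each search result, i.e. the title and label tags and their contents.</p>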
<pre><code>#<..... some XML .....>
<sec id="aj387295s3">
<label>3.</label>
<title><italic>CHANDRA</italic> OBSERVATIONS</title>
<p>The 13 candidates were observed with the Advanced CCD Imaging
Spectrometer (ACIS; Burke et&nbsp;al. <xref ref-type="bibr"
rid="aj387295r8">1997</xref>) on board <italic>Chandra</italic>
(Weisskopf et&nbsp;al. <xref ref-type="bibr"
rid="aj387295r46">1996</xref>). We chose the S3 chip to image the
sources because of its better low-energy sensitivity. The standard
TIMED readout with a frame time of 3.2 s was used, and the data were
collected in VFAINT mode. In 12 cases, our <italic>Chandra</italic>
observations led us to conclude that the RASS detection was not of a
candidate INS (see Table&nbsp;<xref ref-type="table"
rid="aj387295t1">1</xref>; the <xref ref-type="sec"
rid="aj387295app1">Appendix</xref> includes a case-by-case discussion
of these sources).</p>
#<..... more XML ....>
</code></pre>
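<p>I have a regex that gets the line containing "Chandra", but I keep banging my head trying to get "3. CHANDRA OBSERVATIONS". It's probably something quite obvious, but I don't have much training in regex. My regex for Chandra and the rest of the line is "(.*)(C|c)handra\b"</p>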
<p>Thank you! -Jenny</p>
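<p>For reference, here is a minimal sketch of one way the "nearest preceding heading" lookup could be done with regexes alone. It assumes that every heading is a label tag followed directly by a title tag, as in the excerpt above. The names SAMPLE, strip_tags and headings_for_matches are hypothetical, and the sample text is a shortened stand-in for the real paper, not the actual file.</p>

```python
import re

# Hypothetical, shortened stand-in for the paper's XML.
SAMPLE = """<sec id="s1">
<label>1.</label>
<title>INTRODUCTION</title>
<p>Isolated neutron stars are rare.</p>
</sec>
<sec id="aj387295s3">
<label>3.</label>
<title><italic>CHANDRA</italic> OBSERVATIONS</title>
<p>The 13 candidates were observed on board <italic>Chandra</italic>.</p>
</sec>"""

# Assumption: a heading is a <label> followed directly by a <title>.
HEADING_RE = re.compile(r"<label>(.*?)</label>\s*<title>(.*?)</title>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(fragment):
    """Drop inline markup such as <italic> and collapse whitespace."""
    return " ".join(TAG_RE.sub("", fragment).split())


def headings_for_matches(xml_text, pattern):
    """Pair each regex hit with the last heading that starts before it."""
    headings = [
        (m.start(), f"{strip_tags(m.group(1))} {strip_tags(m.group(2))}")
        for m in HEADING_RE.finditer(xml_text)
    ]
    results = []
    for hit in re.finditer(pattern, xml_text):
        current = None  # stays None if no heading precedes the hit
        for start, text in headings:
            if start > hit.start():
                break
            current = text
        results.append((hit.group(0), current))
    return results


for word, heading in headings_for_matches(SAMPLE, r"[Cc]handra\b"):
    print(word, "->", heading)
```

<p>Searching the whole document as one string, rather than line by line, is what makes it possible to compare the position of each hit with the positions of the headings. If the file is well-formed XML with its entities declared, a real XML parser would likely be more robust than this regex sketch.</p>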