How do I scrape a table with BeautifulSoup?

Posted 2024-05-28 23:39:03

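I am trying to scrape a table, following this question: Python BeautifulSoup scrape tables

Starting from the top-voted solution there, I tried the approach below.

The HTML: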

<div class="table-frame small">
    <table id="rfq-display-line-items-list" class="table">
        <thead id="rfq-display-line-items-header">
          <tr>
          <th>Mfr. Part/Item #</th>
          <th>Manufacturer</th>
          <th>Product/Service Name</th>
          <th>Qty.</th>
          <th>Unit</th>
          <th>Ship Address</th>
        </tr>
      </thead>
      <tbody id="rfq-display-line-item-0">

        <tr>
            <td><span class="small">43933</span></td>
            <td><span class="small">Anvil International</span></td>
            <td><span class="small">Cap Steel Black 1-1/2"</span></td>
            <td><span class="small">800</span></td>
            <td><span class="small">EA</span></td>
            <td><span class="small">1</span></td>
        </tr>
      <!----><!---->
      </tbody><tbody id="rfq-display-line-item-1">

        <tr>
            <td><span class="small">330035205</span></td>
            <td><span class="small">Anvil International</span></td>
            <td><span class="small">1-1/2" x 8" Black Steel Nipple</span></td>
            <td><span class="small">400</span></td>
            <td><span class="small">EA</span></td>
            <td><span class="small">1</span></td>
        </tr>
      <!----><!---->
      </tbody><!---->
    </table><!---->
</div>

Based on that solution, this is what I tried:

for tr in soup.find_all('table', {'id': 'rfq-display-line-items-list'}):
    tds = tr.find_all('td')
    print(tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text, tds[5].text)

But this only prints the first row:

43933 Anvil International Cap Steel Black 1-1/2" 800 EA 1

I later realized that all of the <td> elements end up in one list. I want to print every row.

Expected output:

43933      Anvil International Cap Steel Black 1-1/2" 800 EA 1
330035205  Anvil International 1-1/2" x 8" Black Steel Nipple 400 EA 1         

1 Answer
Anonymous user
#1 · Posted 2024-05-28 23:39:03

Start from the tr tags, then go down to the td cells in each row:

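from bs4 import BeautifulSoup

# `html` is a string holding the markup shown in the question.
soup = BeautifulSoup(html, "html.parser")

# Walk the table row by row, then join the cells of each row.
table = soup.find("table", id="rfq-display-line-items-list")
for tr in table.find_all("tr"):
    tds = tr.find_all("td")
    if tds:  # the header row contains only <th> cells, so skip it
        print(" ".join(td.text for td in tds))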

43933 Anvil International Cap Steel Black 1-1/2" 800 EA 1
330035205 Anvil International 1-1/2" x 8" Black Steel Nipple 400 EA 1
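
As a follow-up sketch (not part of the original answer), the snippet below shows why the loop in the question printed a single row, and one possible way to pad the first column so the output lines up as in the expected output. The HTML is an abbreviated copy of the question's table with only three columns kept, and the names `rows`, `width`, and `lines` are illustrative choices.

```python
from bs4 import BeautifulSoup

# Abbreviated copy of the table from the question (two body rows, three columns kept).
html = """
<table id="rfq-display-line-items-list" class="table">
  <thead><tr><th>Mfr. Part/Item #</th><th>Manufacturer</th><th>Qty.</th></tr></thead>
  <tbody><tr>
    <td><span class="small">43933</span></td>
    <td><span class="small">Anvil International</span></td>
    <td><span class="small">800</span></td>
  </tr></tbody>
  <tbody><tr>
    <td><span class="small">330035205</span></td>
    <td><span class="small">Anvil International</span></td>
    <td><span class="small">400</span></td>
  </tr></tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Why the question's loop printed one row: find_all('table', ...) yields a single
# <table>, and find_all('td') on it returns every cell of every row in one list.
tables = soup.find_all("table", {"id": "rfq-display-line-items-list"})
all_cells = tables[0].find_all("td")
print(len(tables), len(all_cells))  # prints: 1 6

# Collect one list of cell texts per <tr>, skipping the <th>-only header row.
rows = []
for tr in tables[0].find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

# Pad the first column to the width of its longest value so the rows line up.
width = max(len(row[0]) for row in rows)
lines = [row[0].ljust(width) + "  " + " ".join(row[1:]) for row in rows]
for line in lines:
    print(line)
```

With the full six-column table from the question, the same loop would produce one list of six strings per row. Keeping the rows in a list also makes it easy to hand them to csv.writer or a similar tool instead of printing them.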
