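<p>For the second table, the actual <code>tr</code>, <code>th</code>, and <code>td</code> elements are not nested under a <code>table</code> tag. Scraping every <code>tr</code>, <code>th</code>, and <code>td</code> tag in the document therefore yields the required data, and applying <code>itertools.groupby</code> recovers the original table structure:</p>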
<pre><code>import itertools
import requests
from bs4 import BeautifulSoup as soup

d = soup(requests.get('https://www.fpi.nsdl.co.in/web/Reports/Latest.aspx').text, 'html.parser')
# For each row, take the header cells (th) if present, otherwise the data cells (td)
table_data = [[j.text for j in (i.find_all('th') or i.find_all('td'))] for i in d.find_all('tr')]
# Split the rows into alternating groups: a title row, then the rows that follow it
final_table = [list(b) for _, b in itertools.groupby(table_data, key=lambda x: x[0].startswith('Daily Trends'))]
# Join each title group with its body group to rebuild the two tables
table1, table2 = [final_table[i] + final_table[i+1] for i in range(0, len(final_table), 2)]
</code></pre>
<p>Output:</p>
<p><code>table1</code>:</p>
^{pr2}$
<p><code>table2</code>:</p>
<pre><code>[['Daily Trends in FPI Derivative Trades on 08-Aug-2018'], ['Reporting Date', 'Derivative Products', 'Buy', 'Sell', 'Open Interest at the'], ['Open Interest at the'], ['No. of Contracts', 'Amount in Crore', 'No. of Contracts', 'Amount in Crore', 'No. of Contracts', 'Amount in Crore'], ['08-Aug-2018', 'Index Futures', '18797.00', '1732.24', '16696.00', '1600.94', '303684.00', '26636.51'], ['Index Options', '495820.00', '50403.69', '512765.00', '52075.29', '673371.00', '60394.18'], ['Stock Futures', '176472.00', '11999.53', '178301.00', '12020.70', '1116162.00', '83275.79'], ['Stock Options', '98471.00', '6949.88', '101906.00', '7204.18', '116286.00', '8824.33'], ['Interest Rate Futures', '0.00', '0.00', '0.00', '0.00', '2530.00', '47.57'], ['The above report is compiled on the basis of reports submitted to depositories by NSE and BSE on 08-Aug-2018 and constitutes FPIs/FIIs trading / position of the previous trading day.']]
</code></pre>
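<p>The grouping step can be tried without a network request. The following is a minimal sketch on made-up rows; <code>rows</code>, <code>split_tables</code>, and the <code>marker</code> parameter are hypothetical names introduced for illustration. It assumes the first row is a title row and that every title row is followed by at least one body row, as in the scraped data above.</p>

```python
import itertools

# Hypothetical flattened rows, standing in for the scraped `table_data`
rows = [
    ['Daily Trends in A'],
    ['Date', 'Buy', 'Sell'],
    ['08-Aug-2018', '1', '2'],
    ['Daily Trends in B'],
    ['Date', 'Product'],
    ['08-Aug-2018', 'Index Futures'],
]

def split_tables(rows, marker='Daily Trends'):
    """Rebuild tables from a flat row list; each table starts at a title row."""
    groups = [
        list(g)
        for _, g in itertools.groupby(rows, key=lambda r: r[0].startswith(marker))
    ]
    # Groups alternate: [title row], [body rows], [title row], [body rows], ...
    return [groups[i] + groups[i + 1] for i in range(0, len(groups), 2)]

table1, table2 = split_tables(rows)
```

<p>Because <code>itertools.groupby</code> only merges consecutive items with equal keys, each title row ends up in its own group and the rows after it form the next group, which is why the groups are joined in pairs.</p>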