Unable to scrape card details and arrange them in a list using regular expressions, Beautiful Soup and Python

Posted 2024-06-10 21:51:13


Link to scrape: https://www.idbibank.in/royal-credit-card.asp

Expected output: it must be of this form

For example: [Title: Description…]

[ #This is an array of Strings... 
    
"Earn while you spend: Earn 10,000 bonus reward points on spending ₹ 50,000 within 60 days & 20,000 bonus reward points on spending ₹ 5,00,000 in a year.",
"Interest Free Credit: 1% fuel surcharge waiver at all fuel stations across India on transactions between Rs.400 and Rs.5,000 (Max. Rs. 250 per statement cycle). Note -No Reward Points are earned on fuel transactions.",
"Drive On:  3 reward points for every ₹ 100 spent on any other category.",
"Air Travel Accident Insurance Cover: Redeem your reward points as cashback and other exciting options.All your accumulated reward points can be redeemed for cashback @ 1 reward point = ₹ 0.25",
"In-built insurance cover: Get free Personal Accidental Death Cover to ensure financial protection of your family (Air: 1 Crs, Non-Air: 10 Lakhs) ",
"Airport Lounge Access: Report loss of card immediately to ensure zero liability on any fraudulent transactions",
"Wider Acceptance: Convert purchase of > 2,500/- on your card into easy EMIs of 6/12 months"
]

The items I get from that link with the code below were shown in an attached image.

I wrote the following code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

html = urlopen("https://www.idbibank.in/royal-credit-card.asp")
soup = BeautifulSoup(html, 'lxml')

l = []

# Select every <td> inside #FB that directly follows another <td>
for x in soup.select('#FB td+ td'):
    # Collapse each run of non-word characters into a single space
    l.append(re.sub(r'\W+', ' ', x.get_text()))

print(l)


Actual output

['Earn More while you Spend The IDBI Bank Royale Signature Card understands your lifestyle and rewards you for your needs You will earn 3 Delight Points for every 100 spent by you using your IDBI Bank Royale Signature Credit Card so enjoy on travelling shopping dining out or watching movies get rewarded for all your spends Additionally you will earn a welcome gift of 750 Delight Points on first usage of the card within 30 days days or 400 Delight Points on usage between 31 to 90 days from the card issuance date The minimum eligible transaction value for the welcome gift is 1 500 ', 'Interest Free Credit Enjoy interest free credit of up to 48 days on your purchases and manage your payments as per your convenience You can choose to pay the total outstanding amount minimum amount due or any other amount higher than the minimum amount due Refer MITC for more details', 'Drive On Waiver of 1 fuel surcharge every time you fill fuel using IDBI Bank Royale Signature Card across all fuel stations for transactions in the range of 400 to 5000 So keep driving and enjoying with friends and family Maximum waiver of upto 500 per month', 'Air Travel Accident Insurance Cover IDBI Bank Royale Signature Credit Card provides an air travel accident insurance cover worth 25 lakhs to put your mind at ease while you travel for business or leisure ', 'Airport Lounge Access Enjoy luxury at select airport lounges in India No matter which airline or class you fly your Royale card offers you well deserved rest and use of facilities You will surely find it useful to relieve travel fatigue and escape airport chaos The airport lounge access is being provided by VISA and is subject to change as per VISA s discretion from time to time without any prior notice For the list of participating Lounges please click here ', 'Wider Acceptance Get more freedom with your Royale Credit Card You can use it at over 9 lakh merchants in India and over 29 million merchants abroad ', 'Zero Lost Card Liability In 
case of loss of credit card report it immediately at our 24 Hour Customer Care 1800 425 7600 Toll free 022 4042 6013 Non toll free The card will be immediately blocked for all transactions to prevent card misuse Any fraudulent transaction after reporting the loss will be covered by the Bank ', 'Family Cards Make IDBI Bank your family Bank by gifting your family members only for members above 18 years an add on Royale Signature Credit Card The add on card enables your family members to avail all the benefits and features applicable on the primary card The Delight Points earned on the add on cards will be clubbed with those of the primary card and can be redeemed as per the higher eligibility ', 'VISA offers Customers will be eligible for exclusive offers on airport shopping hotel car rentals etc which are provided for VISA Signature cardholders Customers can enjoy travel concierge services provided by VISA for their travel related services These offers are provided by VISA and may change as per VISA s discretion from time to time without any prior notice For more details please click here ']
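
Note that the actual output has lost its punctuation as well as its line breaks (for example "1 500" where the amount was presumably "1,500"), because `\W+` matches every non-word character, not only whitespace. A minimal sketch of the difference, using a made-up sample string rather than text fetched from the page:

```python
import re

# Hypothetical cell text; an assumption for illustration, not copied from the site.
sample = "Drive On\r\n\r\n  Waiver of 1% fuel surcharge for ₹ 400 to ₹ 5,000.  "

# The question's approach: every run of non-word characters becomes one space,
# so '%', '₹', ',' and '.' disappear along with the line breaks.
stripped = re.sub(r"\W+", " ", sample)

# Whitespace-only alternative: split on whitespace and re-join with single spaces,
# which keeps the punctuation intact.
collapsed = " ".join(sample.split())

print(repr(stripped))
print(repr(collapsed))
```

If the expected output should keep symbols such as ₹ and the thousands separators, the whitespace-only variant is probably the closer fit.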

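My main concern is getting a ":" between the title and the description of each string in the list.

How can this be done?

Please help; any help is much appreciated.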


2 answers

Code

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

html = urlopen("https://www.idbibank.in/aspire-credit-card.asp")
soup = BeautifulSoup(html, 'lxml')

l1 = []  # titles
l2 = []  # cleaned full text of each cell
l3 = []  # cleaned text with ':' added after the title

# The title of each cell is wrapped in <strong>
for x in soup.select('#FB td+ td strong'):
    l1.append(x.get_text())

for x in soup.select('#FB td+ td'):
    l2.append(re.sub(r'\W+', ' ', x.get_text()))

# Add ':' after the first occurrence of the title only
for x in range(len(l2)):
    l3.append(l2[x].replace(l1[x], l1[x] + ':', 1))

print(l3)

Output

['Earn More while you Spend: The IDBI Bank Aspire Platinum Card understands your lifestyle and rewards you for your needs You will earn 2 Delight Points for every 150 spent by you using your Aspire Platinum Card so enjoy on travelling shopping dining out or watching movies get rewarded for all your spends Additionally you will earn a welcome gift of 500 Delight Points on first usage of the card within 30 days or 300 Delight Points on usage between 31 to 90 days from the card issuance date The minimum eligible transaction value for the welcome gift is 1 500 ', 'Interest Free Credit: Enjoy interest free credit of up to 48 days on your purchases and manage your payments as per your convenience You can choose to pay the total outstanding amount minimum amount due or any other amount higher than the minimum amount due Refer MITC for more details', 'Drive On: Waiver of 1 fuel surcharge every time you fill fuel using IDBI Bank Aspire Platinum Card across all fuel stations for transactions in the range of 400 to 4000 So keep driving and enjoying with friends and family Maximum waiver of upto 300 per month', 'Wider Acceptance: Get more freedom with your Aspire Credit Card You can use it at over 9 lakh merchants in India and over 29 million merchants abroad ', 'Zero Lost Card Liability: In case of loss of credit card report it immediately at our 24 Hour Customer Care 1800 425 7600 Toll free 022 4042 6013 Non toll free The card will be immediately blocked for all transactions to prevent card misuse Any fraudulent transaction after reporting the loss will be covered by the Bank ', 'Family Cards: Make IDBI Bank your family Bank by gifting your family only for members above 18 years an add on Aspire Platinum Credit Card The add on card enables your family members to avail all the benefits and features applicable on the primary card The Delight Points earned on the add on cards will be clubbed with those of the primary card and can be redeemed as per the higher eligibility ', 
'VISA offers: Customers will be eligible for exclusive offers on airport shopping hotel car rentals etc which are provided for VISA Platinum cardholde Customers can enjoy travel concierge services provided by VISA for their travel related services These offers are provided by VISA and may change as per VISA s discretion from time to time without any prior notice For more details please click here ']
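
One caveat with the replace-based approach above: the titles are taken verbatim, while the cell text has already been passed through `re.sub`, so a title containing punctuation (a hyphen, say) would no longer be found and would get no colon. The following is a minimal sketch of one way around that, assuming the title always comes first in the cell. The helper names `normalise` and `add_colon` and the sample strings are hypothetical, chosen only for illustration:

```python
import re

def normalise(text):
    """Collapse every run of non-word characters into one space and trim the ends."""
    return re.sub(r"\W+", " ", text).strip()

def add_colon(title, cell_text):
    """Return 'title: description', or the cleaned text unchanged if the title is not a prefix."""
    clean_title = normalise(title)
    clean_text = normalise(cell_text)
    if clean_title and clean_text.startswith(clean_title):
        rest = clean_text[len(clean_title):].strip()
        return "{}: {}".format(clean_title, rest)
    return clean_text

# Made-up inputs standing in for the scraped titles and cells.
titles = ["Drive On", "In-built insurance cover"]
cells = [
    "Drive On\n\nKeep driving on. Waiver of 1% fuel surcharge.",
    "In-built insurance cover\n\nGet free cover.",
]

# zip() stops at the shorter list, so a cell without a title cannot raise IndexError.
result = [add_colon(t, c) for t, c in zip(titles, cells)]
print(result)
```

Because both strings go through the same normalisation, the hyphenated title still matches, and slicing off the prefix avoids touching a later repeat of the title inside the description.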

Select each td, get the title from its strong element, and get the description with .next_sibling (or .contents)

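from urllib.request import urlopen
from bs4 import BeautifulSoup

# Build the soup as in the question
html = urlopen("https://www.idbibank.in/royal-credit-card.asp")
soup = BeautifulSoup(html, 'lxml')

l = []

for x in soup.select('#FB td + td'):
    # Title: text of the first <strong> in the cell
    head = x.select_one('strong').text
    # Description: the node right after the second of two consecutive <br> tags
    desc = x.select_one('br + br').next_sibling
    # or
    # desc = x.contents[4]
    content = "{}: {}".format(head.strip(), desc.strip())
    l.append(content)

print(l)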
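
The snippet above raises an AttributeError as soon as one cell lacks a strong element or a pair of consecutive br tags. Below is a small offline sketch of the same selector technique with those cases guarded. The markup in `SAMPLE_HTML` is an assumption about the page layout (title in strong, two br tags, then the description), not a copy of the real page, and `extract_features` is a hypothetical helper name. It uses the built-in "html.parser" backend so that lxml is not required:

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the assumed layout of the feature table.
SAMPLE_HTML = """
<div id="FB"><table>
<tr><td><img src="a.png"></td><td><strong>Drive On</strong><br><br>Waiver of 1% fuel surcharge. </td></tr>
<tr><td><img src="b.png"></td><td><strong> Family Cards </strong><br><br>Add-on cards for family members.</td></tr>
<tr><td><img src="c.png"></td><td>No title in this cell</td></tr>
</table></div>
"""

def extract_features(html):
    """Return 'Title: description' strings for cells that follow the assumed layout."""
    soup = BeautifulSoup(html, "html.parser")
    features = []
    for cell in soup.select("#FB td + td"):
        title = cell.select_one("strong")
        second_br = cell.select_one("br + br")
        if title is None or second_br is None:
            continue  # skip cells without a title or without the double <br>
        desc = second_br.next_sibling
        if not isinstance(desc, str):
            continue  # skip when the next node is missing or is a tag, not text
        features.append("{}: {}".format(title.get_text(strip=True), desc.strip()))
    return features

features = extract_features(SAMPLE_HTML)
print(features)
```

Unlike the regex-based cleaning, this keeps punctuation such as "%" in the description, since the text node is only stripped of surrounding whitespace.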
