Deleting a certain number of lines around a specific word

Posted 2024-04-30 03:37:15


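I have a program that scrapes information from a website and writes it to a text file. Once the back end works, I will format the output into something more useful.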

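The information is in chronological order, but it is just one string, so it is basically raw data. I want the program to read it line by line and, when it hits a keyword, delete the rest of the information. Right now it only deletes the lines containing the keyword, which is not useful because it leaves a lot of data behind.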

The keyword is day. Once the time since a listing is shown in days rather than hours, that entry is no longer useful to me.

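import os

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Raw string so the backslashes in the Windows path are not treated as escapes
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)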

driver.get("https://coinmarketcap.com/new")
try:
    main = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '//*[@id="__next"]/div[1]/div[1]/div[2]/div/div[2]'))
    )
except Exception:
    # Close the browser, then re-raise so the script does not continue without `main`
    driver.quit()
    raise

time = "hours"
txt = main.text
if time in main.text:
    print(main.text)
    print("Newer cryptos found")
else:
    print("No newer cryptos found")
driver.quit()



f = open("CoinMC.txt", "w")
f.write(txt)
f.close()

lines = []
with open("CoinMC.txt", 'r') as fp:
    lines=fp.readlines()

with open("CoinMC.txt", 'w') as fp:
    for number, line in enumerate(lines):
        if number not in [0,1,2,3,4,5,6,7]:
            fp.write(line)

with open("CoinMC.txt", "r") as input:
    with open("CoinMCtemp.txt", "w") as output:
        for line in input:
            if "day" not in line.strip("\n"):
                output.write(line)

os.replace('CoinMCtemp.txt', "CoinMC.txt")

I have it delete the first 8 lines (indices 0 to 7) because they are not needed. This is what it prints:

1
OEC SHIB
SHIBK
$0.000007142 1.20% 0.00%
--
$6,259
OKExChain
1 hours ago
2
OEC UNI
UNIK
$24.15 5.97% 0.00%
--
$852,208
OKExChain
1 hours ago
3
OEC FIL
FILK
$59.03 3.55% 0.00%
--
$125,513
OKExChain
1 hours ago
4
Asia Coin
ASIA
$0.1168 0.12% 0.00% $11,679,825 $69,817
Ethereum
1 hours ago
5
BabyDogeX
BDOGEX
$0.000002676 63.03% 0.00% $267,565 $132,537
Binance Coin
1 hours ago
6
Everest Token
EVRT
$0.32 39.45% 0.00% $32,003,854 $127,147
Avalanche
1 hours ago
7
Kurobi
KURO
$0.1113 104.31% 0.00% $44,527,852 $145,578
Solana
1 hours ago
8
Octaplex Network
PLX
$2.73 2.41% 0.00% $2,727 $72,107
Binance Coin
2 hours ago
9
PizzaBucks
PIZZAB
$0.000003016 20.01% 0.00% $603,224 $121,475
Binance Coin
6 hours ago
10
Little Angry Bunny v2
LAB V2
$0 0.00% 0.00%
--
$264,358
Binance Coin
6 hours ago
11
VPEX Exchange
VPX
$0.08024 28.08% 0.00% $76,225,869 $91,560
Binance Coin
7 hours ago
12
Synapse
SYN
$2.05 8.13% 0.00% $111,205,453 $33,736,237
Ethereum
9 hours ago
13
Block Farm
BFC
$2.26 7.46% 0.00% $677,248,899 $1,094,921
Binance Coin
13 hours ago
14
BabySafeMoon
BSFM
$0.01843 13.72% 0.00% $1,842,753 $1,439,599
Binance Coin
18 hours ago
15
Happiness
HPNS
$0.02939 0.08% 0.00% $15,139,185 $64,731
18 hours ago
16
SUCCESS INU
SUCCESS
$0.000000004294 27.82% 50.33% $4,281,116 $1,149,340
Binance Coin
17
GravitX
GRX
$0.1682 47.04% 733.58% $14,761,186 $1,949,928
Binance Coin
18
Prelax
PEA
$0.003075 0.50% 33.13% $1,848,248 $1,566,051
Binance Coin
19
Moonkafe Finance
KAFE
$22.14 0.23% 7.35%
--
$151,714
Moonriver
20
Mini Floki
MINIFLOKI
$0.0000001023 12.92% 21.45% $1,023,333 $969,215
Binance Coin
21
NFTrade
NFTD
$0.5436 4.07% 18.20% $73,387,288 $1,043,752
Binance Coin
22
SafeMoon-AVAX
$SAFEMOONA
$0.000000001448 0.51% 1.50% $1,448,031 $16,582
Avalanche
23
FlyPaper
STICKY
$0.001935 19.19% 102.35% $967,562 $1,020,175
Binance Coin
24
Toll Free Swap
TOLL
$3,936.02 0.48% 8.13%
--
$39,092
Ethereum
25
Fruits Eco
FRTS
$0.7254 0.76% 0.41% $290,176,247 $894,943
Ethereum
26
Decentralized data crypto system
DCS
$4.62 0.31% 3.68% $277,286,959 $1,108,303
Binance Coin
27
Sombra
SMBR
$0.01444 0.80% 13.06% $1,444,036 $92,698
Binance Coin
28
Ether Matrix
ETHMATRIX
$0.0007299 19.49% 80.95% $729,936 $546,747
Binance Coin
29
ForeverFOMO
FOREVERFOMO
$0.0001849 4.18% 683.81%
--
$3,561,670
Binance Coin
30
Mars Panda World
MPT
$0.2678 2.09% 29.60% $23,802,499 $82,606
Binance Coin

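Almost half of the results are unnecessary and add clutter. You can see how it is laid out: 1, then a few lines of information, then 2, more information, including the time since listing, and so on. Starting from the first entry that was listed a day ago, I want to delete everything below it, and possibly also some of the lines above the keyword "day".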

The expected result would be:

1
OEC SHIB
SHIBK
$0.000007142 1.20% 0.00%
--
$6,259
OKExChain
1 hours ago
2
OEC UNI
UNIK
$24.15 5.97% 0.00%
--
$852,208
OKExChain
1 hours ago
3
OEC FIL
FILK
$59.03 3.55% 0.00%
--
$125,513
OKExChain
1 hours ago
4
Asia Coin
ASIA
$0.1168 0.12% 0.00% $11,679,825 $69,817
Ethereum
1 hours ago
5
BabyDogeX
BDOGEX
$0.000002676 63.03% 0.00% $267,565 $132,537
Binance Coin
1 hours ago
6
Everest Token
EVRT
$0.32 39.45% 0.00% $32,003,854 $127,147
Avalanche
1 hours ago
7
Kurobi
KURO
$0.1113 104.31% 0.00% $44,527,852 $145,578
Solana
1 hours ago
8
Octaplex Network
PLX
$2.73 2.41% 0.00% $2,727 $72,107
Binance Coin
2 hours ago
9
PizzaBucks
PIZZAB
$0.000003016 20.01% 0.00% $603,224 $121,475
Binance Coin
6 hours ago
10
Little Angry Bunny v2
LAB V2
$0 0.00% 0.00%
--
$264,358
Binance Coin
6 hours ago
11
VPEX Exchange
VPX
$0.08024 28.08% 0.00% $76,225,869 $91,560
Binance Coin
7 hours ago
12
Synapse
SYN
$2.05 8.13% 0.00% $111,205,453 $33,736,237
Ethereum
9 hours ago
13
Block Farm
BFC
$2.26 7.46% 0.00% $677,248,899 $1,094,921
Binance Coin
13 hours ago
14
BabySafeMoon
BSFM
$0.01843 13.72% 0.00% $1,842,753 $1,439,599
Binance Coin
18 hours ago
15
Happiness
HPNS
$0.02939 0.08% 0.00% $15,139,185 $64,731
18 hours ago

The second half would be removed, because the program would write/print until it reaches the word "day" and then stop writing.


1 answer
User
#1 · Posted 2024-04-30 03:37:15

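Some suggestions for your code first. You can use range() instead of writing out the list. The stop value is exclusive, so range(8) covers indices 0 through 7, the same as your list:

with open("CoinMC.txt", 'w') as fp:
    for number, line in enumerate(lines):
        if number not in range(8):
            fp.write(line)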
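
Since the lines are already in a list, slicing is another way to skip the header. This is only a sketch: `drop_header` and its `count` parameter are hypothetical names chosen for illustration, and the default of 8 simply mirrors the indices 0 to 7 used in the question.

```python
def drop_header(lines, count=8):
    """Return a new list without the first `count` lines."""
    # Slicing past the end of the list is safe and yields an empty list.
    return lines[count:]


# Small in-memory demonstration instead of reading CoinMC.txt
sample = [f"line {i}\n" for i in range(10)]
print(drop_header(sample))
```

The result can then be written back with `fp.writelines(...)`, which avoids testing every index inside the loop.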

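Avoid naming variables after Python built-ins such as input. input is not a reserved keyword, so the code still runs, but the name shadows the built-in input() function for the rest of that scope. You did that here: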

with open("CoinMC.txt", "r") as input:
    with open("CoinMCtemp.txt", "w") as output:
        for line in input:
            if "day" not in line.strip("\n"):
                output.write(line)
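
One possible rewrite of that block with non-shadowing names is sketched below. The function name `filter_lines` and the handle names `infile` and `outfile` are hypothetical; the behavior is meant to match the original loop, which drops only the lines containing the keyword.

```python
def filter_lines(src_path, dst_path, keyword="day"):
    """Copy src_path to dst_path, skipping lines that contain `keyword`."""
    # `infile` and `outfile` leave the built-in input() untouched
    with open(src_path, "r") as infile, open(dst_path, "w") as outfile:
        for line in infile:
            if keyword not in line:
                outfile.write(line)
```

It could be called as `filter_lines("CoinMC.txt", "CoinMCtemp.txt")` before the `os.replace` step.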

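To answer your question, try the following:

text = "Your Data"  # placeholder for the scraped text
parts = text.split("ago")
del parts[-1]
text = "ago".join(parts) + "ago"

This splits the string at every ago, drops the last piece (everything after the final ago), and rejoins the rest with ago, adding back the final ago that the split removed. It relies on the lines containing day having already been removed, as your code does, so that the final ago belongs to the last entry measured in hours.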
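
If you would rather stop at the first "day" entry directly, a line-based sketch is shown here. It makes two assumptions that the question does not confirm: each entry ends with a line ending in "ago", and the unfiltered text contains lines such as "1 day ago". The function name `keep_recent` is hypothetical. It must run on the lines before the "day" filter is applied, and it also discards the lines of the first "day" entry that sit above the keyword.

```python
def keep_recent(lines, keyword="day"):
    """Return the lines of all entries that come before the first `keyword` entry."""
    kept = []    # lines of complete entries that are recent enough
    record = []  # lines of the entry currently being read
    for line in lines:
        record.append(line)
        # Assumption: the "... ago" line closes an entry
        if line.rstrip().endswith("ago"):
            if keyword in line:
                break  # first entry measured in days: drop it and everything after
            kept.extend(record)
            record = []
    return kept


# Shortened sample in the same layout as the scraped text
sample = [
    "1\n", "Happiness\n", "HPNS\n", "18 hours ago\n",
    "2\n", "SUCCESS INU\n", "SUCCESS\n", "Binance Coin\n", "1 day ago\n",
    "3\n", "GravitX\n", "GRX\n", "Binance Coin\n", "1 day ago\n",
]
print("".join(keep_recent(sample)))
```

Checking the keyword only on the "ago" line avoids cutting the list short if a coin name happens to contain "day".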
