Tkinter Python button command problem

Posted 2024-04-29 11:40:51

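I have a program that searches the SEC Edgar database for annual reports (10-K) and returns a list of 40 different items in a listbox. I want to create a "Next 40" button that displays the next 40 items in the listbox, which the following code does: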

def Next():
    global entryWidget

    # The first results page is rebuilt from the search box on every click
    page = 'http://www.sec.gov/cgi-bin/browse-edgar?company=&match=&CIK=' + entryWidget.get().strip() + '&filenum=&State=&Country=&SIC=&owner=exclude&Find=Find+Companies&action=getcompany'
    sock = urllib.urlopen(page)
    raw = sock.read()
    soup = BeautifulSoup(raw)

    # Cut the "Next 40" target out of the button's markup
    npar = str(soup.find(value="Next 40"))
    index = npar.find('/cgi')
    index2 = npar.find('count=40') + len('count=40')
    nextpage = 'http://www.sec.gov' + npar[index:index2]

    sock2 = urllib.urlopen(nextpage)
    raw2 = sock2.read()
    soup2 = BeautifulSoup(raw2)

    psoup = str(soup2.findAll(nowrap=True))

    myparser = MyParser()
    myparser.parse(psoup)

    filinglist = myparser.get_descriptions()
    linklist = myparser.get_hyperlinks()

    # Drop entries that are not filing descriptions
    filinglist = [s for s in filinglist if s != 'Documents']
    filinglist = [s for s in filinglist if s != 'Documents Interactive Data']
    filinglist = [s for s in filinglist if not re.match(r'\d{3}-', s)]

    linklist = [s for s in linklist if not s.startswith('/cgi-')]

    # Replace the listbox contents with the new page's items
    Lb1.delete(0, END)

    counter = 0

    while counter < len(filinglist):
        Lb1.insert(counter, filinglist[counter])
        counter += 1

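As you can see, when the button is pressed it reads the original link (page), then looks for the "Next 40" hyperlink in that HTML page. It then parses the new HTML document (nextpage) and extracts the item names and their associated links. This code successfully goes from the original page to the next page, but it can only ever show that one next page.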

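So how can I turn (nextpage) into the original (page) each time the "Next" button is pressed, so that I can then list the items from the (nextnextpage) HTML document? Sorry if this is confusing; I don't really know any other way to explain it.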

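To clarify further, here is the actual site link I am trying to parse: http://www.sec.gov/cgi-bin/browse-edgar...getcompany. I want the "Next" button to keep retrieving the HTML hyperlink from that site's "Next 40" button.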

Here is my entire program code in case you need it:

[full program listing not preserved in this copy]

1 answer
User
#1 · Posted 2024-04-29 11:40:51

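Use an application class instead of globals. Right now you always download the first page. Your application class should instead cache the current page's soup, which next then uses to get the onClick value from the "Next 40" form button: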

class Application(Frame):
    def __init__(self, parent=None):
        Frame.__init__(self, parent)
        self.pack()

        self.top = Frame(self)
        self.bottom = Frame(self)
        self.bottom2 = Frame(self)
        self.top.pack(side=TOP)
        self.bottom.pack(side=BOTTOM, fill=BOTH, expand=True)
        self.bottom2.pack(side=BOTTOM, fill=BOTH, expand=True)
        #... 
        self.submitbutton = Button(self, text="Submit", command=self.submit)
        self.submitbutton.pack(in_=self.bottom2, side=TOP)
        #...

    #...

    def submit(self):
        page = ('http://www.sec.gov/cgi-bin/browse-edgar?company=&match=&CIK=' + 
                 self.entryWidget.get().strip() + 
                '&filenum=&State=&Country=&SIC=&owner=exclude' 
                '&Find=Find+Companies&action=getcompany')
        #...
        self.soup = ...

    def next(self):
        #...
        #there must be a better way than this to extract the onclick value
        #but I don't use/know BeautifulSoup to help with this part

        npar = str(self.soup.find(value="Next 40"))
        index1 = npar.find('/cgi')
        index2 = npar.find('count=40') + len('count=40')  
        page = 'http://www.sec.gov' + npar[index1:index2]

        sock = urllib.urlopen(page)
        raw = sock.read()
        self.soup = BeautifulSoup(raw)

        #...

if __name__ == '__main__':
    root = Tk()
    root.title("SEC Edgar Search")
    root["padx"] = 10
    root["pady"] = 25

    app = Application(root)

    app.mainloop()
    root.destroy()
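
The comment in next() asks for a better way to pull out the onclick value. One possible approach, sketched here with only the Python 3 standard library rather than BeautifulSoup, is to read the button's onclick attribute directly and take the quoted path out of it. The sample markup, the placeholder CIK 1234567, and the names NextButtonParser and next_page_url are assumptions for illustration; the real page's markup may differ.

```python
import re
from html.parser import HTMLParser


class NextButtonParser(HTMLParser):
    """Remembers the onclick attribute of the <input> whose value is 'Next 40'."""

    def __init__(self):
        super().__init__()
        self.onclick = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute names arrive lower-cased
        if tag == "input" and attrs.get("value") == "Next 40":
            self.onclick = attrs.get("onclick")


def next_page_url(html, base="http://www.sec.gov"):
    """Return the absolute URL behind the 'Next 40' button, or None if absent."""
    parser = NextButtonParser()
    parser.feed(html)
    if parser.onclick is None:
        return None
    # Assumed onclick shape: parent.location='/cgi-bin/...'
    match = re.search(r"'([^']+)'", parser.onclick)
    return base + match.group(1) if match else None


# Hypothetical markup modelled on the button described in the question
sample = (
    '<input type="button" value="Next 40" '
    'onClick="parent.location=\'/cgi-bin/browse-edgar?action=getcompany'
    '&amp;CIK=1234567&amp;start=40&amp;count=40\'">'
)
print(next_page_url(sample))
```

Returning None when the button is missing gives next() a simple way to detect the last page instead of building a broken URL.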

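For each new page, the onClick link updates the &Start parameter. So you could instead keep a counter for it in your class and increment that, without bothering to parse the current soup to get the value.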
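
A minimal sketch of the counter idea, assuming the results page accepts start and count query parameters (the exact names and letter case should be checked against the real onClick link). ResultPager and the CIK 1234567 are hypothetical, and the sketch is written for Python 3, whereas the code above uses Python 2's urllib.urlopen.

```python
from urllib.parse import urlencode

BASE = "http://www.sec.gov/cgi-bin/browse-edgar"


class ResultPager:
    """Tracks the offset of the current results page for one search."""

    def __init__(self, cik, page_size=40):
        self.cik = cik
        self.page_size = page_size
        self.start = 0  # offset of the first item on the current page

    def url(self):
        """Build the URL for the current page (parameter names are assumed)."""
        query = urlencode({
            "action": "getcompany",
            "CIK": self.cik,
            "owner": "exclude",
            "start": self.start,
            "count": self.page_size,
        })
        return BASE + "?" + query

    def next(self):
        """Advance by one page and return the new page's URL."""
        self.start += self.page_size
        return self.url()


pager = ResultPager("1234567")
print(pager.url())   # first page, start=0
print(pager.next())  # second page, start=40
```

In the application class, submit would create a fresh ResultPager for the entered CIK, and the "Next 40" button's command would fetch pager.next() and refill the listbox from that page.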
