mechanize: why does my list of forms contain only one element?

0 votes
3 answers
934 views
Asked 2025-04-18 01:19

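I'm new to mechanize and not a particularly advanced Python user, but I want to automate a task: entering some input into a web page. The problem is that the "Submit" button has no control name assigned. I looked around and found that there is a way to set values on a specific form, but to do that I first have to find the form I want to set the values on. So my code looks like this: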

forms = [f for f in br.forms()]
print(forms[0].controls[0].name)

I expected to be able to access a form by writing forms[x] and then do something with it, such as:

forms[x].set_value("VALUE", nr=5)

But the error I get is:

    forms[54].set_value("VALUE",nr=100)
IndexError: list index out of range

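This may be a slightly silly question, probably because I don't really understand the functions I'm using, but since there is no real documentation, I would appreciate any help.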

By the way, I can print all the forms with the following code:

for f in br.forms():
    print(f)

The output is:

 <CheckboxControl(lookup=[yes])>
  <TextControl(fld=NoName)>
  <TextControl(pixemail=)>
  <IgnoreControl(<None>=<None>)>
  <TextControl(ra=00 00 00.0)>
  <TextControl(dec=00 00 00.0)>
  <SelectControl(equinox=[*J2000.0, B1950.0])>
  <TextControl(offra=0.0)>
  <TextControl(offdec=0.0)>
  <TextControl(epoch=2000.0)>
  <SubmitControl(<None>= Retrieve Data ) (readonly)>
  <RadioControl(cextract=[*rect, circle])>
  <TextControl(rawid=10.0)>
  <TextControl(decwid=10.0)>
  <SelectControl(wunits=[Degrees, *Minutes, Seconds])>
  <TextControl(cirrad=10.0)>
  <SelectControl(cat=[UCAC 2, UCAC 3, NOMAD, *USNO B1.0, USNO A2.0, ACT])>
  <SelectControl(surims=[None, *All Surveys, POSS-I (103aO, 103aE), POSS-II (IIIaJ, IIIaF, IV-N), SOUTH, AAO-R, POSS-IO, POSS-IE, POSS-IIJ, POSS-IIF, POSS-IIN, SRC-J, SERC-EJ, ESO-R, SERC-ER])>
  <CheckboxControl(getcat=[*yes])>
  <CheckboxControl(getfin=[*yes])>
  <CheckboxControl(pixflg=[yes])>
  <CheckboxControl(colbits=[All, *cb_id, *cb_altid, *cb_ra, *cb_sigra, cb_mep, *cb_mura, cb_muprob, *cb_smura, cb_sfitra, *cb_fitpts, cb_err, *cb_flg, *cb_mag, cb_smag, *cb_mflg, *cb_fldid, *cb_sg, cb_xres, cb_pltidx, *cb_xi, *cb_dstctr, *cb_gall])>
  <RadioControl(skey=[*ra, dec, sigra, sigdec, mep, mura, mudec, muprob, smura, smudec, sfitra, sfitdec, fitpts, err, flg, mag, smag, mflg, fldid, sg, xres, yres, pltidx, clr, sigpos, mutot, sigmu, xi, eta, dstctr, gall, galb])>
  <SelectControl(slf=[*hh/dd mm ss, hh/dd:mm:ss, hh.hhh/dd.ddd, ddd.ddd/dd.ddd])>
  <TextControl(minnpts=0)>
  <TextControl(maxnpts=10)>
  <SelectControl(clr=[B1, R1, B2, *R2, I2, B, V, R, J, H, K])>
  <TextControl(bri=0)>
  <TextControl(fai=100)>
  <SelectControl(clr0m1A=[B1, R1, *B2, R2, I2, B, V, R, J, H, K])>
  <SelectControl(clr0m1B=[B1, R1, B2, *R2, I2, B, V, R, J, H, K])>
  <TextControl(bmrmin=-100)>
  <TextControl(bmrmax=100)>
  <TextControl(minposnerr=0.0)>
  <TextControl(maxposnerr=10000.0)>
  <TextControl(mumin=0.0)>
  <TextControl(mumax=10000.0)>
  <TextControl(minmuerr=0.0)>
  <TextControl(maxmuerr=10000.0)>
  <TextControl(minsep=0.0)>
  <HiddenControl(minmagerr=0.0) (readonly)>
  <HiddenControl(maxmagerr=1.0) (readonly)>
  <SelectControl(opstars=[Yes, *No])>
  <SelectControl(whorbl=[Light Stars/Dark Sky, *Dark Stars/Light Sky])>
  <SelectControl(pixgraph=[Progressive JPEG, *JPEG, GIF, PDF, Large JPEG (1 Survey Only), Large GIF (1 Survey Only), PS (1 Survey Only)])>
  <SelectControl(pixfits=[Yes, *No])>
  <SelectControl(ori=[NE - North Up, East Right, *NW - North Up, East Left, SE - North Down, East Right, SW - North Down, East Left, EN - East Up, North Right, ES - East Up, North Left, WN - East Down, North Right, WS - East Down, North Left])>
  <SelectControl(tck=[N and E marks, *Tick Marks, Grid Lines])>
  <SelectControl(starlbl=[Yes, *No])>
  <SelectControl(cmrk=[*None, 5.0 sec Box, 10.0 sec Box, 30.0 sec Box, 1.0 min Box, 2.0 min Box, 5.0 min Box, 10.0 min Box, 5.0 sec Circle, 10.0 sec Circle, 30.0 sec Circle, 1.0 min Circle, 2.0 min Circle, 5.0 min Circle, 10.0 min Circle])>
  <TextControl(aobj=none)>
  <SelectControl(pcl=[*P - Points, L - Points + Labels, C - Connected Points, A - Connected Points + Labels])>
  <TextareaControl(atbl=  )>
  <IgnoreControl(<None>=<None>)>
  <SubmitControl(<None>= Retrieve Data ) (readonly)>
  <SelectControl(gzf=[*Yes, No])>
  <SelectControl(cftype=[*ASCII, XML/VO])>>

The one I want to operate on is <SubmitControl(<None>=Retrieve Data ) (readonly)>, which is the third from the bottom.

3 Answers

0

I'm sure there's a better way to do this, but I'm not very familiar with Mechanize. You could try something like this:

# br.forms() yields whole forms, so look inside each form's controls
submit_controls = [c for f in br.forms() for c in f.controls if c.type == 'submit']
if submit_controls:
    print(submit_controls[0])

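If there are multiple matches, you will obviously get more than one. This is probably the oddest way to achieve what you want. Also, assuming the form doesn't change much, you could use requests instead of Mechanize. That would look something like this: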

import requests

r = requests.post("http://form/action/url/goes/here", data={"lookup": "yes",
                                                            # all the other elements
                                                            })
print(r.status_code)
print(r.text)
# Do something else with r.text, e.g. scrape values with beautifulsoup or something
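
To get a rough idea of what the data dict turns into when it is sent, here is a minimal sketch using only the standard library. requests form-encodes a dict passed as data=, and urlencode produces the same kind of string. The handful of fields and their values are copied from the form listing in the question as an illustration; the real form has many more fields, and whether the server accepts these exact values is an assumption.

```python
from urllib.parse import urlencode

# Illustrative subset of the form's fields; values are taken from the
# listing in the question, not checked against the live page.
payload = {
    "lookup": "yes",
    "ra": "00 00 00.0",
    "dec": "00 00 00.0",
    "equinox": "J2000.0",
}

# Form-encode the fields the way an HTML form POST body is encoded:
# key=value pairs joined by "&", with spaces written as "+".
encoded = urlencode(payload)
print(encoded)
```

Note that the unnamed submit button does not appear in the payload at all, so with this approach there is no need to locate it.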

0

Try this. You can search for an unnamed control by its type. From memory, it should be something like this:

br.form.find_control(type='submit', nr=1)

I think this syntax is right... I'll double-check to make sure.
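
For what it's worth, the listing in the question looks like a single form with many controls, which would explain why forms[54] is out of range: the items to index are in forms[0].controls, not in the list of forms. The sketch below imitates selecting a control by type and position using plain stand-in objects rather than a live mechanize form. The helper name nth_control_of_type and the LookupError are made up for illustration, and nr is treated as zero-based, which is how mechanize counts it.

```python
from types import SimpleNamespace


def nth_control_of_type(controls, control_type, nr=0):
    """Return the nr-th (zero-based) control whose .type matches control_type."""
    matches = [c for c in controls if c.type == control_type]
    if nr >= len(matches):
        # Stand-in error; a real mechanize lookup raises its own exception type.
        raise LookupError("no %r control with nr=%d" % (control_type, nr))
    return matches[nr]


# Stand-ins for a few of the controls of the single form shown in the question.
controls = [
    SimpleNamespace(type="text", name="ra", value="00 00 00.0"),
    SimpleNamespace(type="submit", name=None, value=" Retrieve Data "),
    SimpleNamespace(type="textarea", name="atbl", value=""),
    SimpleNamespace(type="submit", name=None, value=" Retrieve Data "),
    SimpleNamespace(type="select", name="gzf", value=["Yes"]),
]

# nr=1 selects the second submit control, even though it has no name.
second_submit = nth_control_of_type(controls, "submit", nr=1)
print(second_submit.value)
```

With a real browser object, the corresponding steps would be br.select_form(nr=0) followed by the find_control call above. Since the form in the question contains two unnamed submit controls, nr=1 should pick the later one.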

0

Try this:

import mechanize

br = mechanize.Browser()

# Insert the desired URL here
br.open('http://www.nofs.navy.mil/data/fchpix/cfch.html#fchmenu')
br.select_form(nr=0)

br["ra"] = "input 1"
br["dec"] = "input 2"
br["pixfits"] = ["Yes"]

br.find_control("pixflg").items[0].selected=True

response = br.submit()
