Python 3 - urllib problem

0 votes
2 answers
1426 views
Asked 2025-04-17 17:26

I'm using Python 3.3.0 on Windows 7.

I have two files: dork.txt and fuzz.py.

dork.txt contains the following:

/about.php?id=1
/en/company/news/full.php?Id=232
/music.php?title=11

fuzz.py contains the following:

srcurl = "ANY-WEBSITE"
drkfuz = open("dorks.txt", "r").readlines()
print("\n[+] Number of dork names to be fuzzed:",len(drkfuz))

for dorks in drkfuz:
    dorks = dorks.rstrip("\n")
    srcurl = "http://"+srcurl+dorks

    requrl = urllib.request.Request(srcurl) 

    #httpreq = urllib.request.urlopen(requrl)

    # Starting the request
    try:
        httpreq = urllib.request.urlopen(requrl)
    except urllib.error.HTTPError as  e:
        print ("[!] Error code: ",  e.code)
        print("")
        #sys.exit(1)

    except urllib.error.URLError as  e:
        print ("[!] Reason: ",  e.reason)
        print("")
        #sys.exit(1)  

    #if e.code != 404:
    if httpreq.getcode() == 200:
        print("\n*****srcurl********\n",srcurl)
        return srcurl

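When I enter a valid website name, one that has the /about.php?id=1 page, it works fine.

But when I supply a website that has the /en/company/news/full.php?Id=232 page, it first prints Error code: 404 and then gives me one of the following errors: UnboundLocalError: local variable 'e' referenced before assignment, or UnboundLocalError: local variable 'httpreq' referenced before assignment.

I can understand that if the website does not have a page at /about.php?id=1 it gives Error code: 404, but why doesn't it go back to the for loop and check the remaining entries in the text file? Why does it stop here and throw an error?

I want to write a script that, given a website address such as www.xyz.com, finds its valid pages.

2 Answers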

0
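import urllib.error
import urllib.request

srcurl = "ANY-WEBSITE"
drkfuz = open("dorks.txt", "r").readlines()
print("\n[+] Number of dork names to be fuzzed:", len(drkfuz))

for dorks in drkfuz:
    dorks = dorks.rstrip("\n")
    # Build a fresh URL each time; reassigning srcurl would keep appending to it
    url = "http://" + srcurl + dorks

    try:
        requrl = urllib.request.Request(url)
        httpreq = urllib.request.urlopen(requrl)
        # Only reached when urlopen() succeeded, so httpreq is always bound here
        if httpreq.getcode() == 200:
            print("\n*****srcurl********\n", url)
            break  # "return" is only valid inside a function
    except urllib.error.URLError as e:
        # HTTPError is a subclass of URLError, so a 404 is caught here
        # and the loop moves on to the next dork
        print("[!] Request failed:", e)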

This code has not been tested, but logically it should work.
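A variant of the same idea, as a minimal sketch rather than a definitive fix: keep the two except clauses from the question and move the status check into an else clause, which runs only when the request did not raise. The function name find_valid_pages and its opener parameter are made up for this example; opener exists only so the loop can be tried without network access and defaults to urllib.request.urlopen.

```python
import urllib.error
import urllib.request


def find_valid_pages(host, paths, opener=urllib.request.urlopen):
    """Return the URLs under `host` that answer with HTTP 200.

    `opener` is a hypothetical hook added for this sketch so the loop can
    be exercised offline; by default it is urllib.request.urlopen.
    """
    found = []
    for path in paths:
        # Build each URL from the unchanged host, never from the previous URL
        url = "http://" + host + path.strip()
        try:
            response = opener(url)
        except urllib.error.HTTPError as err:
            # The server answered, but with an error status such as 404
            print("[!] Error code:", err.code, "for", url)
        except urllib.error.URLError as err:
            # No usable answer at all (DNS failure, refused connection, ...)
            print("[!] Reason:", err.reason, "for", url)
        else:
            # Runs only when opener() did not raise, so response is bound
            if response.getcode() == 200:
                found.append(url)
    return found
```

Calling find_valid_pages("www.xyz.com", open("dorks.txt").readlines()) would then check every line of the file, because a failed request only prints a message and the loop continues with the next path.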

2

When the line urllib.request.urlopen(requrl) raises an error, the variable httpreq is never assigned. You can set it to None before the try statement and then check afterwards whether it is still None:

httpreq = None

try:
    httpreq = urllib.request.urlopen(requrl)

# ...

if httpreq is not None and httpreq.getcode() == 200:
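Put together, the fragment above might look like the following sketch. The helper name probe and its opener parameter are assumptions made for illustration (opener just lets the function be tried offline); the except clauses are the ones from the question.

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen):
    """Return True if `url` answers with HTTP 200, otherwise False.

    `opener` is a hypothetical hook for this sketch; it defaults to
    urllib.request.urlopen.
    """
    httpreq = None  # stays None if the request fails
    try:
        httpreq = opener(url)
    except urllib.error.HTTPError as e:
        print("[!] Error code:", e.code)
    except urllib.error.URLError as e:
        print("[!] Reason:", e.reason)

    # Python 3 unbinds `e` when its except block ends, so only httpreq
    # can be inspected safely at this point
    return httpreq is not None and httpreq.getcode() == 200
```

This also accounts for the other message in the question: the name bound by "except ... as e" is deleted as soon as the except block finishes, so a check such as e.code != 404 placed after the try statement fails with UnboundLocalError.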
