Function works fine, except when called through another module, for no apparent reason

Posted 2024-04-29 00:07:35


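There is a problem I don't understand. I have two custom modules in the same directory. Note: the code below has been deliberately shortened. I show one function that does not work and a similar one that does. Let's have a look: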

module_parsing.py

import re

def hippodrome_numreu_prix(response):
    """Parse the hippodrome's name, number of the meeting and prize's name.
    response -- html page. Arrival table.
    returns a tuple as (hippodrome's name, nb of meeting, prize's name)
    """
    #"hippodrome" means racecourse, but this word exists in English too.
    #There are several races on a hippodrome on a particular day; that is called a meeting ("reunion" in French).
    #On a particular day there are several meetings too, which is why we speak of the meeting number: n°1, n°2...
    #A race is run for a prize ("prix" in French). This prize has a name.
    hip_num = response.xpath("//html//h1[@class='CourseHeader-title']/text()").extract()
    hip_num = ''.join(hip_num)
    #HIPPODROME
    #Raw strings keep backslashes such as \s from being read as string escapes.
    if re.search(r'-\s{0,5}([A-zÀ-ÿ|-|\s]+)\s{0,5}/', hip_num):
        hippo = re.search(r'-\s{0,5}([A-zÀ-ÿ|-|\s]+)\s{0,5}/', hip_num).group(1).lower().replace(' ','')
    else:
        hippo = None
    #NUMBER OF MEETING
    if re.search(r'R[0-9]+', hip_num):
        num_reunion = re.search(r'R[0-9]+', hip_num).group()
    else:
        num_reunion = 'PMH'
    #PRIZE
    prix = response.xpath("//html//h1[@class='CourseHeader-title']/strong/text()").extract_first().lower()
    return (hippo,num_reunion,prix)

def allocation_devise(response):
    """Parse the amount of allowance and currency (€, £, $, etc)
    response -- html page. Arrival table.
    returns a tuple as (allowance's amount, currency's symbol)
    """
    #"allocation" means allowance. That is the sum of the prizes for the winners: 1st, 2nd, 3rd, etc.
    #"devise" means currency. Depending on the country of the hippodrome, the allowance is expressed in different currencies.
    alloc_devise = response.xpath("//html//div[@class='row-fluid row-no-margin text-left']/p[2]/text()[2]").extract_first()
    #ALLOWANCE
    if re.search(r'[0-9]+',alloc_devise.replace('.','')):
        alloc = int(re.search(r'[0-9]+',alloc_devise.replace('.','')).group())
    else:
        alloc = None
    #CURRENCY
    if re.search(r'([A-Z|a-z|£|$|€]+)',alloc_devise):
        devise = re.search(r'([A-Z|a-z|£|$|€]+)',alloc_devise).group()
    else:
        devise = None
    return (alloc, devise)
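
Both functions above call re.search() twice with the same pattern, once to test and once to extract. A minimal sketch of the same idea with a single search per pattern is given here. It works on a plain string instead of a Scrapy response, and the names parse_title, HIPPO_RE and REUNION_RE, the tidied character class, and the sample title are all assumptions made for illustration, not taken from the original module.

```python
import re

# Assumed rewrite of the title pattern as a raw string with an explicit
# character class; this is NOT the original regex from module_parsing.py.
HIPPO_RE = re.compile(r'-\s{0,5}([A-Za-zÀ-ÿ\s-]+)\s{0,5}/')
REUNION_RE = re.compile(r'R[0-9]+')


def parse_title(title):
    """Return (hippodrome, meeting number) parsed from a title string."""
    # Search once and keep the match object, so the test and the
    # extraction can never disagree with each other.
    hippo_match = HIPPO_RE.search(title)
    hippo = hippo_match.group(1).lower().replace(' ', '') if hippo_match else None

    reunion_match = REUNION_RE.search(title)
    num_reunion = reunion_match.group() if reunion_match else 'PMH'
    return hippo, num_reunion


# Hypothetical title text; the real page may be laid out differently.
sample = 'R1 - Châteaubriant / 9 juillet 2019'
print(parse_title(sample))
print(parse_title('no separator here'))
```

Keeping the match object in a variable also makes it easier to inspect in the shell when a pattern unexpectedly fails.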

module_correction.py depends on the previous module:

from module_parsing import *

def fonction_parse_correction(
        champ_corrige, response):

    dico_fonction_parse = {
    'allocation': allocation_devise,
    'devise': allocation_devise,
    'hippodrome':hippodrome_numreu_prix,
    'reunion' : hippodrome_numreu_prix,
    'prix': hippodrome_numreu_prix,
    }
    if champ_corrige in {'allocation','hippodrome'}:
        return dico_fonction_parse[champ_corrige](response)[0]
    elif champ_corrige in {'devise','reunion'}:
        return dico_fonction_parse[champ_corrige](response)[1]
    elif champ_corrige in {'prix'}:
        return dico_fonction_parse[champ_corrige](response)[2]
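
The dispatch above pairs each field with a parser and then picks a tuple index in a separate if/elif chain. As a sketch only, the two pieces of information could live in one table. The stand-in parsers below are hypothetical stubs that return fixed tuples instead of reading a Scrapy response (the currency symbol in particular is a placeholder), and FIELD_PARSERS is an assumed name.

```python
# Hypothetical stand-ins: the real parsers take a Scrapy response.
def hippodrome_numreu_prix(response):
    return ('châteaubriant', 'R1', 'prix synergie')


def allocation_devise(response):
    return (35000, '€')  # the currency symbol here is a placeholder


# field name -> (parser, index of that field in the parser's returned tuple)
FIELD_PARSERS = {
    'allocation': (allocation_devise, 0),
    'devise': (allocation_devise, 1),
    'hippodrome': (hippodrome_numreu_prix, 0),
    'reunion': (hippodrome_numreu_prix, 1),
    'prix': (hippodrome_numreu_prix, 2),
}


def fonction_parse_correction(champ_corrige, response):
    """Return the single corrected field named by champ_corrige."""
    try:
        parser, index = FIELD_PARSERS[champ_corrige]
    except KeyError:
        # Fail loudly instead of silently returning None for an unknown field.
        raise ValueError(f'unknown field: {champ_corrige!r}') from None
    return parser(response)[index]


print(fonction_parse_correction('hippodrome', None))
print(fonction_parse_correction('devise', None))
```

With this layout, adding a field means adding one table entry, and a misspelled field name raises an error instead of falling through the chain.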

Now, when I test my function in the scrapy shell:

scrapy shell https://www.paris-turf.com/programme-courses/2019-07-09/reunion-chateaubriant/resultats-rapports/prix-synergie-1150124
#Here I change the path with sys.path.insert() and only import module_correction
In [1]: from module_correction import *
In [2]: fonction_parse_correction('hippodrome',response)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-34-e82fe121aab0> in <module>
----> 1 fonction_parse_correction('hippodrome',response)

~/.../scrapy_project/scrapy_project/module_correction.py in fonction_parse_correction(champ_corrige, response)
    103         if champ_corrige in {'allocation','hippodrome'}:
--> 104             return dico_fonction_parse[champ_corrige](response)[0]
    105         elif champ_corrige in {'devise','reunion'}:

~/.../scrapy_project/scrapy_project/module_parsing.py in hippodrome_numreu_prix(response)
    157     #HIPPODROME 
--> 158     if re.search('-\s{0,5}([A-zÀ-ÿ|-|\s]+)\s{0,5}/', hip_num):
    160         hippo = re.search('-\s{0,5}([A-zÀ-ÿ|-|\s]+)\s{0,5}/', hip_num).group(1).lower().replace(' ','')
    161     else:

AttributeError: 'NoneType' object has no attribute 'group'

It gets strange when I run allocation_devise() through fonction_parse_correction(), because that works:

In [3]: fonction_parse_correction('allocation',response)      
Out[3]: 35000

Even stranger, when I simply copy and paste the hippodrome_numreu_prix() function into the shell and run it on its own:

In [4]: hippodrome_numreu_prix(response)
Out[4]: ('châteaubriant', 'R1', 'prix synergie')

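So here I can clearly see that there is no problem with a None type, because search() obviously finds a match. It also does not seem to be a problem with going through the dict, because the similar function allocation_devise() works fine, even on the same response argument. What problem am I not seeing?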

Note: Ubuntu 18.04, Scrapy 1.5.2, Python 3.7.1
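
One possibility worth ruling out, offered as a guess and not confirmed anywhere in the post: the traceback jumps from line 158 to line 160, and it blames an `if` line that never calls `.group()`. That pattern is what a long-running shell tends to show when it is still executing an older copy of module_parsing that was imported before the file was edited, because tracebacks display the source currently on disk while the interpreter runs the code it loaded earlier. The sketch below reproduces the effect with a throwaway module; the module name demo_stale_module and its contents are invented for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

# Keep the demo independent of cached .pyc files.
sys.dont_write_bytecode = True


def demo_reload():
    """Show that editing a file does not change an already imported module."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / 'demo_stale_module.py'
        path.write_text("def answer():\n    return 'old'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module('demo_stale_module')
            before = module.answer()

            # Edit the file on disk, as one would in an editor.
            path.write_text("def answer():\n    return 'new code'\n")
            stale = module.answer()  # still the code loaded at import time

            module = importlib.reload(module)  # re-executes the file
            after = module.answer()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop('demo_stale_module', None)
    return before, stale, after


result = demo_reload()
print(result)
```

If this is what happened, restarting the scrapy shell would be the simplest check. Within a running session, importlib.reload() would have to be applied to module_parsing first and then to module_correction, followed by re-running `from module_correction import *`, since names bound by a star import keep pointing at the old function objects.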

