Identifying real phone numbers

Posted 2024-04-29 22:34:53


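I have a dataset with a dedicated column for phone numbers. My task is to validate that column, because it contains bogus entries such as "9999999999", "0123456789", and many others of that kind. I want to tackle this by looking up the carrier name, since entries like those should be easy to filter out: they have no carrier. I came across a package called phonenumbers and used the code below: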

import phonenumbers
from phonenumbers import carrier
ro_number = phonenumbers.parse("+91xxxxxxxxxx") # number is redacted purposely
carrier.name_for_number(ro_number, "en")

The output is 'BSNL MOBILE'. I want to run this over an entire DataFrame column, creating a new column that records the carrier name for each number.

I tried a for loop:

for i in df['phone_number']:
    ro_number = phonenumbers.parse(i)
    carrier.name_for_number(ro_number, "en")

but got the error below:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-80-af01b9d8c9ef> in <module>
      1 for i in merged_Data['SELLER_NUMBER']:
----> 2     ro_number = phonenumbers.parse(i)
      3     carrier.name_for_number(ro_number, "en")

~\anaconda3\lib\site-packages\phonenumbers\phonenumberutil.py in parse(number, region, keep_raw_input, numobj, _check_region)
   2834         raise NumberParseException(NumberParseException.NOT_A_NUMBER,
   2835                                    "The phone number supplied was None.")
-> 2836     elif len(number) > _MAX_INPUT_STRING_LENGTH:
   2837         raise NumberParseException(NumberParseException.TOO_LONG,
   2838                                    "The string supplied was too long to parse.")

TypeError: object of type 'int' has no len()

I'm not sure whether iterating over the whole column like this is the right approach. Any help is much appreciated.


2 answers
TypeError: object of type 'int' has no len()

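The error means len() is being called on an int: phonenumbers.parse calls len() on its argument, and the column holds integers. Convert each value to a string first: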

len(str(x))

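Made two modifications to the code:

  1. Use the is_valid_number method to check whether the number belongs to an assigned exchange
  2. Specify a region (such as "US"), because None does not work for the test case "18004444444" (an MCI telephone test number)

Code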

import phonenumbers
from phonenumbers import carrier

def valid_number(number, region = "US"):
    ''' check validity of phone numbers (default to US region)
        
        Used default region as US since some numbers did not work using None
    '''
    # Parsing String to Phone number
    phone_number = phonenumbers.parse(number, region)
  
    # Validating a phone number (i.e. it's in an assigned exchange)
    return phonenumbers.is_valid_number(phone_number)

Testing with a list

data = ["+442083661177", "+123456789", "18004444444"]

for i in data:
    print(i, valid_number(i))

# Output
+442083661177 True
+123456789 False
18004444444 True    # note: this number doesn't work with default region = None

Testing with a DataFrame

import pandas as pd

df = pd.DataFrame({"phone_number": data})
df['valid'] = df['phone_number'].apply(valid_number)

# Resulting df
    phone_number  valid
0  +442083661177   True
1     +123456789  False
2    18004444444   True
