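Where is the circular reference?
I am writing some Python classes that I want to convert to JSON. When I try to serialize my objects to JSON, I get an error that mentions a "circular reference". I have a rough idea of what a circular reference is, but I cannot find one anywhere in my code.
Relationships between the objects (has-a / is-a)
- Signup has a
- Registrant has an
- Address
Code (Python):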
class Address:
    def __init__(self, address1, address2, city, state, zip):
        self.address1 = address1
        self.address2 = address2
        self.city = city
        self.state = state
        self.zip = zip

class Signup:
    def __init__(self, registrant, classId, date, time, paid, seatCost, notes, className, seats, groupId, agentName, agentCompany):
        self.registrant = registrant
        self.classId = classId
        self.date = date
        self.time = time
        self.paid = paid
        self.seatCost = seatCost
        self.notes = notes
        self.className = className
        self.seats = seats
        self.groupId = groupId
        self.agentName = agentName
        self.agentCompany = agentCompany

class Registrant:
    def __init__(self, firstName, lastName, address, phone, email):
        self.firstName = firstName
        self.lastName = lastName
        self.address = address
        self.phone = phone
        self.email = email
def scrape(br):
    signups = []
    soup = libStuff.getSoup(br, 'http://thepaintmixer.com/admin/viewdailysignups.php')
    table = soup.find(id='Calendar')
    rows = table.find_all('tr')
    rowNumber = 0
    for row in rows:
        if rowNumber == 0:
            rowNumber = rowNumber + 1
            continue
        cells = row.find_all('td')
        cellNumber = 0
        for cell in cells:
            if cellNumber == 0:
                try:
                    firstName = cell.contents[0]
                except IndexError:
                    firstName = None
            elif cellNumber == 1:
                try:
                    lastName = cell.contents[0]
                except IndexError:
                    lastName = None
            elif cellNumber == 2:
                try:
                    address1 = cell.contents[0]
                except IndexError:
                    address1 = None
            elif cellNumber == 3:
                try:
                    address2 = cell.contents[0]
                except IndexError:
                    address2 = None
            elif cellNumber == 4:
                try:
                    city = cell.contents[0]
                except IndexError:
                    city = None
            elif cellNumber == 5:
                try:
                    state = cell.contents[0]
                except IndexError:
                    state = None
            elif cellNumber == 6:
                try:
                    zip = cell.contents[0]
                except IndexError:
                    zip = None
            elif cellNumber == 7:
                try:
                    phone = cell.contents[0]
                except IndexError:
                    phone = None
            elif cellNumber == 8:
                try:
                    email = cell.contents[0]
                except IndexError:
                    email = None
            elif cellNumber == 9:
                try:
                    classId = cell.contents[0]
                except IndexError:
                    classId = None
            elif cellNumber == 10:
                try:
                    date = cell.contents[0]
                except IndexError:
                    date = None
            elif cellNumber == 11:
                try:
                    time = cell.contents[0]
                except IndexError:
                    time = None
            elif cellNumber == 12:
                try:
                    paid = cell.contents[0]
                except IndexError:
                    paid = None
            elif cellNumber == 13:
                try:
                    seatCost = cell.contents[0]
                except IndexError:
                    seatCost = None
            elif cellNumber == 14:
                try:
                    notes = cell.contents[0]
                except IndexError:
                    notes = None
            elif cellNumber == 15:
                try:
                    className = cell.contents[0]
                except IndexError:
                    className = None
            elif cellNumber == 16:
                try:
                    seats = cell.contents[0]
                except IndexError:
                    seats = None
            elif cellNumber == 17:
                try:
                    groupId = cell.contents[0]
                except IndexError:
                    groupId = None
            elif cellNumber == 18:
                try:
                    agentName = cell.contents[0]
                except IndexError:
                    agentName = None
            elif cellNumber == 19:
                try:
                    agentCompany = cell.contents[0]
                except IndexError:
                    agentCompany = None
            cellNumber = cellNumber + 1
        address = Address(address1, address2, city, state, zip)
        registrant = Registrant(firstName, lastName, address, phone, email)
        signup = Signup(registrant, classId, date, time, paid, seatCost, notes, className, seats, groupId, agentName, agentCompany)
        signups.append(signup)
    return signups

# I then call json.dumps() on that returned list
json.dumps(scrape(br), default=lambda o: o.__dict__)
Is something wrong with my constructors? Am I passing something I shouldn't?
2 Answers
0
I could not find the error, so I reorganized the code to use a named tuple (thanks to @metatoaster). After the reorganization, the problem went away.
def scrape(br):
    signups = []
    soup = libStuff.getSoup(br, 'http://thepaintmixer.com/admin/viewdailysignups.php')
    table = soup.find(id='Calendar')
    rows = table.find_all('tr')
    rowNumber = 0
    for row in rows:
        if rowNumber == 0:
            rowNumber = rowNumber + 1
            continue
        cells = row.find_all('td')
        cells = [cell.string if cell.string is not None else '' for cell in cells]
        signup = Signup(*cells)
        signups.append(signup)
    return signups
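The Signup named tuple itself is not shown above. As a minimal sketch of what such a definition and its serialization might look like, assuming only three columns for brevity (the field list and the sample rows below are hypothetical, not taken from the real table):

```python
import json
from collections import namedtuple

# Hypothetical, shortened definition; the real table has 20 columns.
Signup = namedtuple("Signup", ["firstName", "lastName", "email"])

# Stand-in for the plain strings extracted from each table row.
rows = [
    ["Jane", "Doe", "jane@example.com"],
    ["John", "Roe", ""],
]

signups = [Signup(*cells) for cells in rows]

# A named tuple is still a tuple, so json.dumps() would emit it as a JSON array.
# _asdict() keeps the field names in the output instead.
payload = json.dumps([s._asdict() for s in signups])
print(payload)
```

Because every field is now a plain string, json.dumps() never has to fall back to a default function, so there is nothing left that could point back at the parse tree.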
2
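The most likely cause is that cell.contents[0] returns a complex BeautifulSoup object rather than plain text. BeautifulSoup objects hold references to their parent, their siblings, the parser, their attributes, and other objects that may be shared or may refer back to one another.
This typically happens when a <td> element contains nested HTML. That is common in tables, for example when a cell's content is bold or italic.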
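To see why a parent link is enough to trigger the error, here is a small sketch that uses only the standard library. Node is a hypothetical stand-in for a parsed element, not a BeautifulSoup class; it only mimics the idea that a child knows its parent and the parent knows its children:

```python
import json

class Node:
    """Hypothetical stand-in for a parsed HTML element (not a BeautifulSoup class)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # link up to the enclosing element
        self.children = []        # links down to nested elements
        if parent is not None:
            parent.children.append(self)

td = Node("td")
bold = Node("b", parent=td)       # models <td><b>...</b></td>

try:
    # Same default as in the question: dump every unknown object's __dict__.
    json.dumps(bold, default=lambda o: o.__dict__)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

The encoder walks from the b node to its parent, then into the parent's children, and arrives back at the same b node, which is the cycle it reports.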
A good way to fix this is to use BeautifulSoup's .text, which guarantees that you get only the text and not the nested BeautifulSoup elements:
columns = [col.text for col in row.findAll('td')]
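By the way, here is a simple diagnostic that shows what is actually happening. Just replace the default function passed to json.dumps() with one that prints what it is given: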
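import json
import pprint

def view_dict(obj):
    # Print each object json.dumps() cannot serialize, then hand back its __dict__.
    print('--------------')
    print('Type:', obj.__class__)
    d = obj.__dict__
    pprint.pprint(d)
    return d

json.dumps(scrape(br), default=view_dict)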
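That should make the circular reference obvious. I hope this clears up the confusion (your code itself looks fine and never explicitly creates a circular reference).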
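A related option, sketched here under assumptions, is a stricter default function that only expands classes you wrote yourself and fails loudly on anything else, so a stray parser object is reported by type instead of being walked into. The Address and Registrant classes below are shortened, hypothetical versions of the ones in the question, and Foreign is a made-up placeholder for any unexpected object:

```python
import json

class Address:
    def __init__(self, city, state):
        self.city = city
        self.state = state

class Registrant:
    def __init__(self, firstName, address):
        self.firstName = firstName
        self.address = address

class Foreign:
    """Made-up placeholder for an object that should never reach the encoder."""

# Only these classes may be expanded into their attribute dictionaries.
OWN_CLASSES = (Address, Registrant)

def strict_default(obj):
    if isinstance(obj, OWN_CLASSES):
        return obj.__dict__
    # Name the offending type rather than descending into its attributes.
    raise TypeError("unexpected %s in JSON data" % type(obj).__name__)

good = json.dumps(Registrant("Jane", Address("Austin", "TX")), default=strict_default)

try:
    json.dumps(Registrant(Foreign(), Address("Austin", "TX")), default=strict_default)
    error = None
except TypeError as exc:
    error = str(exc)

print(good)
print(error)
```

The trade-off is that every new class has to be added to OWN_CLASSES by hand, but the error then points at the unexpected type directly.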