Python: comparing a get_text item with list items

Published 2024-04-25 20:22:44


Continuing with my Python project, I've hit an even more frustrating stage.

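I have a code snippet that finds the last post date on the forum and saves it both in a temporary variable (which I hope to use for checking each date) and in a public/global variable, so it can be used further throughout the whole scope.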

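However, the approach I'm trying is to fetch all the last-post dates from the forum and compare them with the dates already in the .csv file, to see whether there are any new posts; if there are none, the data should not be scraped/mined.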

That is exactly what I'm struggling with: I can't compare my scraped (get_text) element with the items in the list read from the .csv.

Any ideas are welcome. I've tried several approaches, and the last one I'm left with still doesn't work.

Code:

#Preparing csv file to be read through to check if dates match
storedDates = open(os.path.expanduser("PostDates.csv"))
csv_storedDates = csv.reader(storedDates)
dateRow = list(csv_storedDates) #Storing all the dates as a "List" object
listLength = len(dateRow) #Grabbing the csv List length
startingDate = 0 #Variable for looping through each date for each post.

lPostDate = lPostDate2 = ""

#Looping through 6 times (As that's how many pages each forum has, and collecting Next Page Link,Each Thread Title, It's Link
#.. last post date (To know how recent it is) and assigning next page link to current url, and continuing loop.
while number < 6:
    for postDate in soup.find_all(title=re.compile("^Replies:")):
        tempData = ""
        tempData += (postDate.get_text("\n", strip=True)[0:10] + "\n")
        lPostDate += (postDate.get_text("\n", strip=True)[0:10] + "\n")
        if any(tempData in s for s in dateRow[startingDate]):
            print("Matched a date" + tempData + "to one from database" + dateRow[startingDate])
            startingDate +=1
        else :
            startingDate += 1
            print("Date " + tempData + "was not matched to anything" + str(dateRow[startingDate]))

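This is only part of the code, but it is the only part I'm trying to get working right now. Assume PostDates.csv already contains data. The output looks like this: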

Date 02-11-2017
was not matched to anything['02-11-2017']
Date 01-10-2017
was not matched to anything['01-10-2017']
Date 02-12-2017
was not matched to anything['02-12-2017']
Date 10-01-2016
was not matched to anything['10-01-2016']
Date 09-30-2016
was not matched to anything['09-30-2016']
Date 08-10-2016
was not matched to anything['08-10-2016']
Date 10-01-2015
was not matched to anything['10-01-2015']
Date 10-01-2015
was not matched to anything['10-01-2015']
Date 08-29-2015
was not matched to anything['08-29-2015']
Date 03-16-2015
was not matched to anything['03-16-2015']
Date 07-16-2014
was not matched to anything['07-16-2014']
Date 07-13-2014
was not matched to anything['07-13-2014']
Date 02-11-2014
was not matched to anything['02-11-2014']
Date 07-02-2013
was not matched to anything['07-02-2013']
Date 06-28-2013
was not matched to anything['06-28-2013']
Date 04-22-2013
was not matched to anything['04-22-2013']
Date 05-28-2012
was not matched to anything['05-28-2012']
Date 05-25-2012
was not matched to anything['05-25-2012']
Date 05-09-2012
was not matched to anything['05-09-2012']
Date 06-10-2010
was not matched to anything['06-10-2010']
Date 01-18-2010
was not matched to anything['01-18-2010']
Date 01-18-2010
was not matched to anything['01-18-2010']
Date 12-29-2009
was not matched to anything['12-29-2009']
Date 06-08-2009
was not matched to anything['06-08-2009']
Date 02-02-2009
was not matched to anything['02-02-2009']
Date 11-24-2008
was not matched to anything['11-24-2008']
Date 09-02-2008
was not matched to anything['09-02-2008']
Date 08-07-2008
was not matched to anything['08-07-2008']
Date 06-05-2008
was not matched to anything['06-05-2008']
Date 05-22-2008
was not matched to anything['05-22-2008']
Date 04-21-2008
was not matched to anything['04-21-2008']
Date 03-29-2008
was not matched to anything['03-29-2008']
1
Date 02-11-2017
was not matched to anything['02-11-2017']
Date 01-10-2017
was not matched to anything['01-10-2017']
Date 11-07-2007
was not matched to anything['11-07-2007']
Date 11-07-2007
was not matched to anything['11-07-2007']
Date 09-19-2007
was not matched to anything['09-19-2007']
Date 09-01-2007
was not matched to anything['09-01-2007']
Date 08-31-2007
was not matched to anything['08-31-2007']
Date 08-31-2007
was not matched to anything['08-31-2007']
Date 08-30-2007
was not matched to anything['08-30-2007']
Date 08-24-2007
was not matched to anything['08-24-2007']
Date 08-19-2007
was not matched to anything['08-19-2007']
Date 08-08-2007
was not matched to anything['08-08-2007']
Date 08-03-2007
was not matched to anything['08-03-2007']
Date 07-29-2007
was not matched to anything['07-29-2007']
Date 07-18-2007
was not matched to anything['07-18-2007']
Date 06-26-2007
was not matched to anything['06-26-2007']
Date 06-26-2007
was not matched to anything['06-26-2007']
Date 01-12-2007
was not matched to anything['01-12-2007']
Date 12-05-2006
was not matched to anything['12-05-2006']
Date 11-16-2006
was not matched to anything['11-16-2006']
Date 11-05-2006
was not matched to anything['11-05-2006']
Date 11-05-2006
was not matched to anything['11-05-2006']
Date 11-03-2006
was not matched to anything['11-03-2006']
Date 09-19-2006
was not matched to anything['09-19-2006']
Date 09-19-2006
was not matched to anything['09-19-2006']
Date 09-19-2006
was not matched to anything['09-19-2006']
Date 09-12-2006
was not matched to anything['09-12-2006']
Date 08-17-2006
was not matched to anything['08-17-2006']
Date 08-07-2006
was not matched to anything['08-07-2006']
Date 08-02-2006
was not matched to anything['08-02-2006']
Date 07-16-2006
was not matched to anything['07-16-2006']
Date 07-07-2006
was not matched to anything['07-07-2006']

I've stopped pasting the output after page 2, since it runs to 6 pages and that is a lot of data.

This is what the data looked like when it was previously scraped and stored in the .csv file (the dateRow variable):

Date,
02-11-2017
01-10-2017
02-12-2017
10-01-2016
09-30-2016
08-10-2016
10-01-2015
10-01-2015
08-29-2015
03-16-2015
07-16-2014
07-13-2014
02-11-2014
07-02-2013
06-28-2013
04-22-2013
05-28-2012
05-25-2012
05-09-2012
06-10-2010
01-18-2010
01-18-2010
12-29-2009
06-08-2009
02-02-2009
11-24-2008
09-02-2008
08-07-2008
06-05-2008
05-22-2008
04-21-2008
03-29-2008
02-11-2017
01-10-2017
11-07-2007
11-07-2007
09-19-2007
09-01-2007
08-31-2007
08-31-2007

Any suggestions on how to handle this so that it finds the matching dates would be greatly appreciated. Thanks!


1 answer
User
#1 · Posted 2024-04-25 20:22:44

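To summarize our conversation: you wrote any(tempData in s for s in dateRow[startingDate]), and I suspected this had to be a type mismatch. That turned out to be the case. It comes down to how any() is defined: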

any(iterable) Return True if any element of the iterable is true. If the iterable is empty, return False. Equivalent to:

def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
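
As a quick illustration of that definition (the values below are made up for the example), any() returns True as soon as one element is truthy and returns False for an empty iterable:

```python
# Sample values chosen for illustration only.
rows = ["02-11-2017", "01-10-2017"]

# True: at least one element of the iterable is truthy.
found = any("01-10" in s for s in rows)

# False: no element contains the substring.
missing = any("12-25" in s for s in rows)

# False: an empty iterable has no truthy element.
empty = any([])

print(found, missing, empty)
```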

When you pull the expression apart, you get this:

>>> # The parentheses make it syntactically valid on its own
>>> iterable = (tempData in s for s in dateRow[startingDate]) 
>>> any(iterable)
False
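
One plausible reason for the False here, judging from the question's code, is that tempData has "\n" appended before the comparison, so it can never be a substring of a stored date that has no newline. A minimal reproduction, using one date from the question's output as a sample value:

```python
# Sample values taken from the question's output; the row mimics one
# entry of dateRow as produced by csv.reader (a list of strings).
temp_data = "02-11-2017" + "\n"   # the question appends "\n" to tempData
row = ["02-11-2017"]

# The trailing newline is part of the substring being searched for.
with_newline = any(temp_data in s for s in row)

# Stripping whitespace first lets the same check succeed.
stripped = any(temp_data.strip() in s for s in row)

print(with_newline, stripped)
```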

But is this really a list? Let's see:

>>> type(iterable)
<class 'generator'>

It isn't! Ha! It's a generator. But this:

>>> type([tempData in s for s in dateRow[startingDate]])
<class 'list'>

is a list, and it has `__iter__`:

>>> hasattr([tempData in s for s in dateRow[startingDate]], '__iter__')
True
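
For comparison, the same check on a generator expression also comes back True, since generators implement `__iter__` as well, so any() accepts either form. A small sketch with a sample row:

```python
from collections.abc import Iterable

# Sample row for illustration only.
row = ["02-11-2017"]

gen = ("02-11-2017" in s for s in row)

# A generator expression exposes __iter__, just like a list does.
gen_has_iter = hasattr(gen, "__iter__")
gen_is_iterable = isinstance(gen, Iterable)

# any() gives the same answer for the generator form and the list form.
same_result = (
    any("02-11-2017" in s for s in row)
    == any(["02-11-2017" in s for s in row])
)

print(gen_has_iter, gen_is_iterable, same_result)
```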

Problem solved: just remember to put square brackets around the generator expression so that any() receives a list.
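
If the dates still fail to match, the values being compared are worth a look: tempData carries a trailing "\n", and the first row of the CSV is the `Date,` header, so row indices are shifted by one relative to the scraped dates. Below is a minimal sketch of an alternative check. It is an assumption about the intent, not the original code: it loads the stored dates into a set and tests each scraped date for membership. The `stored_csv` text and the `scraped` list are stand-in sample data.

```python
import csv
import io

# Stand-in for PostDates.csv; the real file would be opened with open().
stored_csv = "Date,\n02-11-2017\n01-10-2017\n02-12-2017\n"

reader = csv.reader(io.StringIO(stored_csv))
next(reader)  # skip the "Date," header row

# Each row is a list of strings; keep the first cell when it is non-empty.
stored_dates = {row[0].strip() for row in reader if row and row[0].strip()}

# Stand-in for the text built from get_text(...)[0:10] + "\n".
scraped = ["02-11-2017\n", "03-01-2017\n"]

# A date counts as new if it is absent from the stored set.
new_dates = [d.strip() for d in scraped if d.strip() not in stored_dates]

print(new_dates)  # ['03-01-2017']
```

A set ignores order and duplicates, which suits a "have I seen this date before" check; comparing position by position would instead need the header offset handled explicitly.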
