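A web crawler library for fetching and parsing data from the Boston College Agora Portal
Detailed description of the pygora-phchcc Python project
pygora
A web crawler library for fetching and parsing data from the BC Agora Portal.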
Getting started (Python 3):
pip install pygora-phchcc
Examples
Log in to Agora, then download and print the links to all subject pages
    from pygora import *

    session, gen_time = get_session("myAgoraUsername", "myAgoraPassword", check_valid=True)
    # if gen_time == 0, something went wrong (maybe you did not enter the correct credentials)
    print(gen_time)

    subjects = download_subjects(session, simple=True)  # simple: each subject is a string
    for i, line in enumerate(subjects):
        print(i, line)
    # subjects = download_subjects(session)  # each subject is a dict, with more information
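The gen_time check mentioned in the comment above can be wrapped in a small helper so that a script stops early on a failed login. This is a minimal sketch and not part of pygora: ensure_logged_in is a hypothetical name, and it relies only on what the example states, namely that get_session returns a (session, gen_time) pair and that gen_time == 0 signals a failure.

```python
def ensure_logged_in(session_and_time):
    """Return the session, or raise if gen_time reports a failed login.

    Hypothetical helper (not part of pygora). It relies only on the
    convention shown in the example: gen_time == 0 means something went wrong.
    """
    session, gen_time = session_and_time
    if gen_time == 0:
        raise RuntimeError("Agora login failed; check your username and password")
    return session
```

With pygora installed, it could be used as session = ensure_logged_in(get_session("myAgoraUsername", "myAgoraPassword", check_valid=True)).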
Cache your username and password so that you do not have to write them explicitly in your scripts
    from pygora import *

    # to set the credential, run this once so that the username & password are stored locally
    set_credential("myAgoraUsername", "myAgoraPassword")

    # to clear the stored credential
    set_credential("", "")
Example of parse_subject_page: print all biology courses (school and subject codes can be found in subject.txt), assuming you have already run set_credential
    from pygora import *

    session, gen_time = get_session(*get_credential(), check_valid=True)
    # if you are confident that your username & password are correct, do
    # session, gen_time = get_session(*get_credential())

    url = SUBJECT_URL.format('2MCAS', '2BIOL')  # build the URL string
    resp = session.get(url)  # use your session to HTTP GET the URL
    courses = parse_subject_page(resp)  # parse the subject page
    for course in courses:
        print(course)
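SUBJECT_URL is used above as a format template with two positional fields, the school code followed by the subject code. The sketch below shows the same pattern applied to several subjects at once. The template string and the '2XXXX' code are placeholders made up for illustration; the real template is the SUBJECT_URL constant from pygora, and real codes are listed in subject.txt.

```python
# Placeholder template for illustration only; the real one is pygora's SUBJECT_URL.
EXAMPLE_SUBJECT_URL = "https://example.invalid/subject?school={}&subject={}"


def build_subject_urls(template, school, subjects):
    """Fill the two positional fields of the template for each subject code."""
    return [template.format(school, subject) for subject in subjects]


# '2MCAS' and '2BIOL' come from the example above; '2XXXX' is a made-up code.
urls = build_subject_urls(EXAMPLE_SUBJECT_URL, "2MCAS", ["2BIOL", "2XXXX"])
for url in urls:
    print(url)
```

Each resulting URL could then be passed to session.get and parse_subject_page in the same way as the single-subject example.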
Example of parse_course_page: print all information on a course page (course codes can be found in the output of parse_subject_page)
    from pygora import *

    session, gen_time = get_session(*get_credential())
    url = COURSE_URL.format('ACCT102101')

    # a dummy dict in this example, could be your data fetched from a database
    info_dict = dict()
    resp = session.get(url)
    parse_course_page(resp, info_dict)  # update the dict
    for key, value in info_dict.items():
        print(key, value)
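As the example shows, parse_course_page updates the dict it is given rather than returning a new one, so the same call can extend a record you already hold. The sketch below illustrates that calling pattern for several courses. fake_parse_course_page is a stand-in written for this example, and the key it writes is invented, so the fields produced by the real parser will differ.

```python
def fake_parse_course_page(resp, info_dict):
    """Stand-in for parse_course_page: updates info_dict in place."""
    info_dict["page_length"] = len(resp)  # invented field, for illustration only


def collect_course_info(pages, parse=fake_parse_course_page):
    """Map each course code to a dict filled in by the parser.

    `pages` maps course code -> response object (plain strings in this sketch).
    """
    results = {}
    for code, resp in pages.items():
        info = {"code": code}  # data you already have, e.g. from a database
        parse(resp, info)      # the parser adds its fields to the same dict
        results[code] = info
    return results


info = collect_course_info({"ACCT102101": "<html>...</html>"})
print(info)
```

With pygora installed, parse_course_page could be passed as the parse argument, with real responses from session.get as the values of pages.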
Related projects
Backend of EagleVision
Backend of New PEPS (planning)
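Join the development team / contact us:
Open an issue on GitHub announcing the feature or bug you want to work on
Or send an email to: (haochen) phchcc_at_gmail_dot_com
Or search for our names in the BC directory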
Special thanks
Special thanks to those who keep EagleVision (the prototype of this project) and pygora alive (names in alphabetical order):
Baichuan (Patrick) Guo: the original "Honest Team"
David Shen: EagleVision dev team
Estevan Feliz: the original "Honest Team" & EagleVision dev team
Roger Wang: EagleVision dev team
Yuning (Tommy) Yang: the original "Honest Team"
Yuxuan (Jacky) Jin: EagleVision dev team