BYR Achieve · Mirror Forum

The boss wants some movies downloaded, so I'm scraping the BT site

2015/10/23 · Mirror sync · 3 replies

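1. Login

Logging in requires a CAPTCHA, which is too much hassle, so I just copied the cookies out of Chrome.

2. URL

It's best to sort by seeder count and leave out dead torrents, so the boss doesn't end up trying to download something that has no seeds.

    URL = 'http://bt.byr.cn/torrents.php?cat408=1&incldead=1&spstate=0&inclbookmarked=0&search=&search_area=0&search_mode=0&sort=7&type=desc&page=%d'

cat408=1 selects the category, and the trailing page parameter is the page number within the sorted listing.

Finally, write the results to a file and import that into Excel.

    from requests import session
    from bs4 import BeautifulSoup

    URL = 'http://bt.byr.cn/torrents.php?cat408=1&incldead=1&spstate=0&inclbookmarked=0&search=&search_area=0&search_mode=0&sort=7&type=desc&page=%d'

    # Cookies copied from a logged-in Chrome session (values redacted)
    cookie = {
        'c_secure_login': 'XXXXX',
        'c_secure_pass': 'XXXXXXXXX',
        'c_secure_ssl': 'XXXXX',
        'c_secure_tracker_ssl': 'XXXXXX',
        'c_secure_uid': 'XXXXX'
    }

    with session() as c:
        # One "title|tid" line per torrent
        with open('torrents.dat', 'w', encoding='utf-8') as f:
            for pn in range(15):  # first 15 pages of the sorted listing
                print('Process %d' % pn)
                r = c.get(URL % pn, cookies=cookie)
                # Each torrent's name cell is a <table class="torrentname">
                items = BeautifulSoup(r.text, 'html.parser').find_all(
                    'table', {'class': 'torrentname'})
                for item in items:
                    link = item.find('a')
                    title = link.get('title')
                    tid_string = link.get('href')
                    # Skip the 15-character prefix (presumably 'details.php?id=')
                    # and stop at the first '&'
                    tid = tid_string[15:tid_string.find('&')]
                    f.write(title + '|' + tid + '\n')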
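Two spots in the script are a bit fragile: the torrent id is cut out of the href at a fixed offset of 15, and titles are joined with '|', which breaks the columns if a title itself contains that character. Below is a rough Python 3 sketch of one way around both, using only the standard library. The names extract_tid and write_rows are made up for illustration, and it assumes the id sits in an `id` query parameter of the href, which the post does not actually confirm.

```python
import csv
from urllib.parse import urlparse, parse_qs


def extract_tid(href):
    """Return the value of the `id` query parameter in href, or None.

    Assumption: the torrent id is carried in an `id` query parameter.
    """
    query = parse_qs(urlparse(href).query)
    values = query.get('id')
    return values[0] if values else None


def write_rows(rows, path):
    """Write (title, tid) pairs to a CSV file with a header row.

    The csv module quotes fields as needed, so titles may contain
    commas or '|'. The 'utf-8-sig' encoding prepends a BOM, which
    Excel generally uses to recognise the file as UTF-8.
    """
    with open(path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(['title', 'tid'])
        writer.writerows(rows)
```

In the scraping loop this would mean collecting (title, extract_tid(href)) pairs into a list and calling write_rows(rows, 'torrents.csv') once at the end. With a hypothetical href such as 'details.php?id=12345&hit=1', extract_tid returns the string '12345'.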
