
Using Python to scrape torrents from torrentkitty

Source: http://www.6cu.com

Author: 论坛外链


2021-03-26 06:29:49

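Without further ado, here is the source code. The only thing you need to install is the lxml library.

The program is completely hands-free, and it spares you pop-up pages and similar annoyances.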

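# -*- coding: utf-8 -*-
__author__ = 'JianqingJiang'

import os
import urllib.request

from lxml import etree

# Base search URL; the page number is appended to it.
pre_url = 'http://torrentkitty/search/tokyohot/'

# The output file is written relative to this directory.
os.chdir('/Users/JianqingJiang/Downloads/')


def steve(page_num, file_name):
    """Fetch one search-result page and append its magnet links to file_name."""
    url = pre_url + str(page_num)
    print(url)
    ht = urllib.request.urlopen(url).read()
    # Lowercase the markup so the XPath match is case-insensitive
    # (this also lowercases the href values themselves).
    content = etree.HTML(ht.lower().decode('utf-8'))
    mags = content.xpath("//a[@rel='magnet']")
    # Note: append mode, so run only once or every link is written again.
    # The file is opened as UTF-8 to avoid encoding errors on write.
    with open(file_name, 'a', encoding='utf-8') as p:
        for mag in mags:
            p.write("%s \n \n" % mag.attrib['href'] + "\n")
            print("%s \n \n" % mag.attrib['href'])


for page_num in range(0, 10):
    print(page_num)
    steve(page_num, 'steve.txt')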
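The XPath query `//a[@rel='magnet']` selects every anchor whose `rel` attribute is `magnet`, and the loop then reads each anchor's `href`. The same selection can be sketched offline with only the standard library's `html.parser`. This is a swapped-in parser for illustration, not the lxml call the script uses, and it assumes Python 3. The sample markup, the magnet URIs, and the class name `MagnetCollector` are all made up here; the real page layout may differ.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the real page layout may differ.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/information/abc">Detail</a>
          <a href="magnet:?xt=urn:btih:aaaa1111" rel="magnet">Open</a></td></tr>
  <tr><td><a href="magnet:?xt=urn:btih:bbbb2222" rel="magnet">Open</a></td></tr>
</table>
"""


class MagnetCollector(HTMLParser):
    """Collect the href of every <a rel="magnet"> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names
        # arrive lowercased from the parser.
        attributes = dict(attrs)
        if tag == 'a' and attributes.get('rel') == 'magnet' and 'href' in attributes:
            self.links.append(attributes['href'])


collector = MagnetCollector()
collector.feed(SAMPLE_HTML)
for link in collector.links:
    print(link)
```

Only the two anchors marked `rel="magnet"` are collected; the detail link is skipped, which is the filtering the XPath expression performs in the script.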

It crawls about 10 pages, and that is all there is to it.
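Because the script opens steve.txt in append mode, running it a second time writes every link again. One possible way around that, offered as an assumption rather than the author's method, is to read the links already in the file and append only the new ones. The helper name `append_new_links` is hypothetical, the links are made up, and the demo writes into a temporary directory so nothing on disk is touched. Python 3 is assumed.

```python
import os
import tempfile


def append_new_links(file_name, links):
    """Append links not already present in file_name; return how many were added."""
    seen = set()
    if os.path.exists(file_name):
        with open(file_name, encoding='utf-8') as existing:
            seen = {line.strip() for line in existing if line.strip()}
    added = 0
    # Explicit UTF-8 avoids encoding errors if a link contains non-ASCII text.
    with open(file_name, 'a', encoding='utf-8') as out:
        for link in links:
            if link not in seen:
                out.write(link + '\n')
                seen.add(link)
                added += 1
    return added


# Demo with made-up links in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'steve.txt')
    first = append_new_links(path, ['magnet:?xt=urn:btih:aaaa1111',
                                    'magnet:?xt=urn:btih:bbbb2222'])
    second = append_new_links(path, ['magnet:?xt=urn:btih:bbbb2222',
                                     'magnet:?xt=urn:btih:cccc3333'])
    print(first, second)
```

The second call adds only the one link that was not already in the file, so repeated runs no longer duplicate earlier results.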
