宜配屋

Scraping JD.com prices with Python to analyze product price trends

yipeiwu_com · 5 years ago (2020-03-06) · Python Crawlers

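# Python 2 script: uses the creepy crawler and BeautifulSoup 3.
from creepy import Crawler
from BeautifulSoup import BeautifulSoup
import urllib2
import json


class MyCrawler(Crawler):
    def process_document(self, doc):
        # Only handle pages that were fetched successfully.
        if doc.status != 200:
            return
        print '[%d] %s' % (doc.status, doc.url)
        try:
            # Treat the page as GB18030 and re-encode it as UTF-8 before parsing.
            soup = BeautifulSoup(doc.text.decode('gb18030').encode('utf-8'))
        except Exception as e:
            # Fall back to the raw text if the conversion fails.
            print e
            soup = BeautifulSoup(doc.text)
        # Print the product title; skip pages without a product-intro block.
        intro = soup.find(id="product-intro")
        if intro is None:
            return
        print intro.div.h1.text
        # The SKU id is the file name in the URL, e.g. 982040 in /982040.html.
        url_id = urllib2.unquote(doc.url).decode('utf8').split('/')[-1].split('.')[0]
        # Query the price endpoint for this SKU and print the 'p' field.
        f = urllib2.urlopen('http://p.3.cn/prices/get?skuid=J_' + url_id, timeout=5)
        try:
            price = json.loads(f.read())
        finally:
            f.close()
        print price[0]['p']


crawler = MyCrawler()
# Only follow links on the same host as the start URL.
crawler.set_follow_mode(Crawler.F_SAME_HOST)
crawler.set_concurrency_level(16)
# Skip static assets (images, scripts, stylesheets, Flash).
crawler.add_url_filter(r'\.(jpg|jpeg|gif|png|js|css|swf)$')
crawler.crawl('http://item.jd.com/982040.html')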
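
The listing above does two small pieces of string handling inside process_document: it derives the SKU id from the product URL, and it reads the 'p' field from the JSON list returned by the price endpoint. The sketch below isolates those two steps in Python 3 so they can be checked without any network access. The function names (sku_from_url, price_url, price_from_payload) are hypothetical, and the sample payload in the usage note is made up for illustration; only the 'p' key and the list shape are taken from the original code.

```python
import json
from urllib.parse import unquote, urlsplit


def sku_from_url(url):
    """Return the SKU id from a product URL such as http://item.jd.com/982040.html."""
    path = urlsplit(unquote(url)).path   # e.g. '/982040.html'
    filename = path.rsplit('/', 1)[-1]   # e.g. '982040.html'
    return filename.split('.')[0]        # e.g. '982040'


def price_url(sku):
    """Build the price-endpoint URL used in the original listing."""
    return 'http://p.3.cn/prices/get?skuid=J_' + sku


def price_from_payload(payload):
    """Read the 'p' field of the first entry in the JSON list as a float.

    Assumes the payload is a JSON list of objects with a 'p' key,
    as the original code's price[0]['p'] implies.
    """
    entries = json.loads(payload)
    return float(entries[0]['p'])
```

For example, sku_from_url('http://item.jd.com/982040.html') yields '982040', and price_from_payload('[{"p": "3999.00"}]') yields 3999.0 for that invented payload. Unlike the original split on '/', urlsplit drops any query string before the file name is taken, which is a small design change rather than something the source does.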


© YiPeiWu.com 【宜配屋】粤ICP备17031333号

Powered By Z-BlogPHP. Theme by TOYEAN.
