Python 3.7: Scraping Images from a Web Page

Published: 2019-07-19 09:56:49 | Editor: auto

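#!/usr/bin/python

import re
import urllib.request  # In Python 3, urlopen and urlretrieve live in urllib.request, so this module must be imported

def htmlGet(url):
    # Fetch the page and return its body as raw bytes
    page = urllib.request.urlopen(url)
    html = page.read()
    return html

def imgGet(html):
    # Match double-quoted src attributes holding an https URL that ends in .jpg
    # (the dot is escaped, and [^"] keeps the match inside a single attribute)
    res = r'src="(https[^"]*\.jpg)"'
    imgre = re.compile(res)
    # read() returns bytes; without decode() a str pattern raises TypeError, so the encoding is given explicitly
    imglist = re.findall(imgre, html.decode("utf-8"))
    x = 0
    for i in imglist:
        # Save each image as 0.jpg, 1.jpg, ... in the current directory
        urllib.request.urlretrieve(i, "%s.jpg" % x)
        x += 1

html = htmlGet("http://***")
imgGet(html)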
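
The regular expression above only finds absolute https links ending in .jpg inside double quotes, so relative paths, single-quoted attributes, and other image formats are skipped. As a rough sketch of one alternative (not the approach used in this post), the standard library's html.parser can collect every img src, and urllib.parse.urljoin can resolve relative paths against the page address. The names ImgSrcParser and extract_image_urls, and the example.com addresses, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcParser(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def extract_image_urls(html_text, base_url):
    """Return absolute image URLs found in html_text (a str, already decoded)."""
    parser = ImgSrcParser()
    parser.feed(html_text)
    # urljoin leaves absolute URLs untouched and resolves relative ones
    return [urljoin(base_url, src) for src in parser.sources]


if __name__ == "__main__":
    sample = '<img src="/a.jpg"><img src=\'https://example.com/b.png\'/>'
    for url in extract_image_urls(sample, "https://example.com/page/"):
        print(url)
```

The returned list could then be passed to urllib.request.urlretrieve in the same loop as imgGet, taking the file extension from each URL rather than assuming .jpg.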
