How to Export WeChat Official Account Articles with Python: A Detailed Guide

1. Install wkhtmltopdf

Download: https://wkhtmltopdf.org/downloads.html

I tested on Windows. After downloading and running the installer, the wkhtmltopdf.exe executable sits in the bin folder of the install directory.
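To confirm that the install worked, one option is to ask the executable for its version from Python. This is a minimal sketch using only the standard library; the helper name `wkhtmltopdf_version` is made up for this example, and it assumes either that wkhtmltopdf is on PATH or that you pass the full path to the executable.

```python
import shutil
import subprocess


def wkhtmltopdf_version(executable="wkhtmltopdf"):
    """Return the version text printed by wkhtmltopdf, or None if it cannot be run."""
    # shutil.which resolves a bare name through PATH and also accepts a full path.
    resolved = shutil.which(executable)
    if resolved is None:
        return None
    try:
        result = subprocess.run(
            [resolved, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(wkhtmltopdf_version() or "wkhtmltopdf was not found")
```

If this prints that the executable was not found, pass the full path instead, for example the path to wkhtmltopdf.exe inside the bin folder.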

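2. Write Python code to export WeChat official account articles

wkhtmltopdf cannot be pointed directly at a WeChat official account article: the exported PDF is missing its images. Instead, use wechatsogou to fetch the article page, then convert the resulting HTML text to PDF.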

    pip install wechatsogou --upgrade
    pip install pdfkit
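A common explanation for the missing images is that WeChat article pages lazy-load them: the real image URL sits in a `data-src` attribute and is only copied into `src` by JavaScript in the browser. The sketch below illustrates that idea with a regular expression. It is not wechatsogou's actual implementation, and the function name `promote_lazy_images` is hypothetical.

```python
import re

# Matches a whole <img ...> tag so each tag can be inspected on its own.
_IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def promote_lazy_images(html):
    """Rewrite data-src to src on <img> tags that have no src of their own."""

    def fix(match):
        tag = match.group(0)
        if re.search(r"\ssrc\s*=", tag, re.IGNORECASE):
            return tag  # already has a real src; leave it alone
        return re.sub(r"\sdata-src\s*=", " src=", tag, count=1, flags=re.IGNORECASE)

    return _IMG_TAG.sub(fix, html)


if __name__ == "__main__":
    sample = '<p><img data-src="https://example.com/a.png" class="x"></p>'
    print(promote_lazy_images(sample))
```

A regular expression is enough for a demonstration; for real pages an HTML parser is the more robust choice.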

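A pitfall: much of the code circulating online is the same template copied from post to post, and it did not run for me. The cause may be updates to the dependency packages, or that I had not added wkhtmltopdf to my PATH environment variable. The code below therefore passes the full path of the executable to pdfkit explicitly.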
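One way to avoid depending on PATH is to resolve the executable yourself and fail early with a clear message. The following is a sketch under stated assumptions: the helper names and the `WKHTMLTOPDF_PATH` environment variable are invented for this example and are not part of pdfkit. Only `pdfkit.configuration(wkhtmltopdf=...)` is the library's own call.

```python
import os
import shutil


def find_wkhtmltopdf(explicit_path=None, env_var="WKHTMLTOPDF_PATH"):
    """Return a usable wkhtmltopdf path, or None.

    Lookup order: the explicit argument, then the environment variable
    named by env_var (a hypothetical name for this sketch), then PATH.
    """
    for candidate in (explicit_path, os.environ.get(env_var)):
        if candidate and os.path.isfile(candidate):
            return candidate
    return shutil.which("wkhtmltopdf")


def make_config(explicit_path=None):
    """Build a pdfkit configuration, raising early if the executable is missing."""
    import pdfkit  # imported here so find_wkhtmltopdf works without pdfkit installed

    path = find_wkhtmltopdf(explicit_path)
    if path is None:
        raise FileNotFoundError(
            "wkhtmltopdf executable not found; install it or pass its full path"
        )
    return pdfkit.configuration(wkhtmltopdf=path)
```

The object returned by `make_config("E:/softwareAPP/wkhtmltopdf/bin/wkhtmltopdf.exe")` can be passed as the `configuration` argument of `pdfkit.from_string`.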

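    import os
    import datetime

    import pdfkit
    import wechatsogou

    # Initialise the API
    ws_api = wechatsogou.WechatSogouAPI(captcha_break_time=3)

    # Full path to the wkhtmltopdf executable; change this to your own install location
    path_wk = "E:/softwareAPP/wkhtmltopdf/bin/wkhtmltopdf.exe"


    def url2pdf(url, title, targetPath):
        '''
        Generate a PDF file with pdfkit
        :param url: article URL
        :param title: article title
        :param targetPath: path of the PDF file to write
        :return: True on success, False if the article could not be fetched
        '''
        try:
            content_info = ws_api.get_article_content(url)
        except Exception:
            return False
        # Wrap the article in a minimal HTML page; the charset declaration
        # helps keep Chinese text from being garbled in the PDF
        html = f'''
        <!DOCTYPE html>
        <html>
        <head><meta charset="UTF-8"><title>{title}</title></head>
        <body>
        <h2>{title}</h2>
        {content_info['content_html']}
        </body>
        </html>
        '''
        config = pdfkit.configuration(wkhtmltopdf=path_wk)
        try:
            pdfkit.from_string(input=html, output_path=targetPath, configuration=config)
        except Exception:
            # Some article titles contain special characters that are not valid
            # in file names, so fall back to a timestamp in the same folder
            filename = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '.pdf'
            fallback = os.path.join(os.path.dirname(targetPath), filename)
            pdfkit.from_string(html, fallback, configuration=config)
        return True


    if __name__ == '__main__':
        url2pdf("https://mp.weixin.qq.com/s/wwT5n2JwEEAkrrmOhedziw", "HBase的系统架构全视角解读", "G:/test/hbase文档.pdf")
        # # Name of the official account to crawl
        # gzh_name = ''
        # # Create the target folder if it does not exist
        # if not os.path.exists(targetPath):
        #     os.makedirs(targetPath)
        # # Return the account's 10 most recent articles as a dict
        # data = ws_api.get_gzh_article_by_history(gzh_name)
        # article_list = data['article']
        # for article in article_list:
        #     url = article['content_url']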
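The code comment about titles with special characters suggests a complementary approach: clean the title before using it as a file name, and fall back to a timestamp only when nothing usable is left. This is a small sketch, assuming Windows file-name rules; the helper `safe_pdf_name` is hypothetical and not part of the original code.

```python
import re
from datetime import datetime

# Characters Windows forbids in file names, plus ASCII control characters.
_FORBIDDEN = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_pdf_name(title, max_length=100):
    """Turn an article title into a PDF file name, falling back to a timestamp."""
    # Replace forbidden characters, cap the length, then trim spaces and dots,
    # which Windows does not allow at the end of a name.
    cleaned = _FORBIDDEN.sub("_", title)[:max_length].strip(" .")
    if not cleaned:
        cleaned = datetime.now().strftime("%Y%m%d%H%M%S")
    return cleaned + ".pdf"


if __name__ == "__main__":
    print(safe_pdf_name("HBase的系统架构全视角解读"))
    print(safe_pdf_name('What is "HBase"? A/B test'))
```

The result can be joined with the output folder via `os.path.join` before it is passed to `pdfkit.from_string`. Reserved device names such as CON are not handled here.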




Original link: https://www.php.cn/python-tutorials-459324.html