美烦资源网


Scraping and Saving a Novel with Python

1. Install requests: pip install requests

2. Install lxml: pip install lxml

3. Scrape the 斗罗大陆 (Douluo Dalu) pages

  1. Code
import requests
from lxml import etree

# Browser-like User-Agent so the site serves the normal page
headers={
   'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0'
   }
url='https://www.85xs.cc/book/douluodalu1/1.html'

# Open the file once so each chapter is added instead of overwriting the last one
with open('斗罗大陆.txt','w',encoding='utf-8') as f:
   while True:
      resp=requests.get(url,headers=headers,timeout=10)
      resp.encoding='utf-8'
      e=etree.HTML(resp.text)
      # Chapter body: the text of every <p> inside div.m-post
      info='\n'.join(e.xpath('//div[@class="m-post"]/p/text()'))
      title=e.xpath('//h1/text()')[0]
      f.write(title+'\n\n'+info+'\n\n')
      # Link to the next chapter; stop when the page has none
      # (assumes the last chapter lacks this link)
      next_href=e.xpath('//tr/td[2]/a/@href')
      if not next_href:
         break
      url=f'https://www.85xs.cc{next_href[0]}'
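The three XPath expressions in the script can be tried offline before hitting the site. The following is only a sketch under assumptions: `SAMPLE_HTML` is a made-up page that mimics the structure the expressions expect (the real markup may differ), `parse_chapter` is a hypothetical helper name, and `urljoin` from the standard library is swapped in for the string concatenation so that both relative and absolute links resolve.

```python
from urllib.parse import urljoin

from lxml import etree

# Hypothetical sample page; the real markup on the site may differ.
SAMPLE_HTML = """
<html><body>
  <h1>Chapter 1</h1>
  <div class="m-post">
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </div>
  <table><tr>
    <td><a href="/book/douluodalu1/">Contents</a></td>
    <td><a href="/book/douluodalu1/2.html">Next</a></td>
  </tr></table>
</body></html>
"""


def parse_chapter(html, page_url):
    """Return (title, body, next_url); next_url is None when no link is found."""
    tree = etree.HTML(html)
    # Chapter title: text of the first <h1>
    title = tree.xpath('//h1/text()')[0]
    # Chapter body: text of every <p> directly inside div.m-post
    body = '\n'.join(tree.xpath('//div[@class="m-post"]/p/text()'))
    # Next-chapter link: the <a> in the second cell of a table row
    hrefs = tree.xpath('//tr/td[2]/a/@href')
    next_url = urljoin(page_url, hrefs[0]) if hrefs else None
    return title, body, next_url


title, body, next_url = parse_chapter(
    SAMPLE_HTML, 'https://www.85xs.cc/book/douluodalu1/1.html')
print(title)
print(body)
print(next_url)
```

Returning `None` when no link is found gives the crawl loop an explicit stop condition instead of relying on an `IndexError`.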
  2. Result