How do you run scheduled tasks with Celery in a Python crawler? - 中华考试网


Source: 中华考试网, 2020-11-27


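Because of how crawlers work, they often need to run incremental crawls on a schedule, or to repeat a simulated login periodically so that cookies do not expire. Celery provides scheduled (periodic) tasks for exactly this. Building on the earlier setup, we change `tasks.py` to the following: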

    from celery import Celery

    app = Celery(
        'add_tasks',
        broker='redis://223.129.0.190:6379/2',
        backend='redis://223.129.0.190:6379/3',
    )

    app.conf.update(
        # Time zone used for scheduling
        CELERY_TIMEZONE='Asia/Shanghai',
        CELERY_ENABLE_UTC=True,
        # The official docs recommend JSON for message serialization
        CELERY_ACCEPT_CONTENT=['json'],
        CELERY_TASK_SERIALIZER='json',
        CELERY_RESULT_SERIALIZER='json',
        # Periodic task schedule
        CELERYBEAT_SCHEDULE={
            'my_task': {
                'task': 'tasks.add',  # the add function in the tasks.py module
                'schedule': 60,       # run every 60 seconds
                'args': (23, 12),
            },
        },
    )


    @app.task
    def add(x, y):
        return x + y
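The `add` task is only a placeholder. For the two crawler use cases mentioned earlier (incremental crawling and refreshing a login before cookies expire), a schedule might look like the sketch below. The task names `tasks.incremental_crawl` and `tasks.refresh_login` and both intervals are hypothetical and do not exist in the `tasks.py` above. Besides an integer number of seconds, Celery also accepts a `datetime.timedelta` as the `schedule` value, which tends to read better for longer intervals.

```python
from datetime import timedelta

# Hypothetical task names and intervals, chosen only for illustration.
CRAWLER_SCHEDULE = {
    'incremental_crawl': {
        'task': 'tasks.incremental_crawl',
        'schedule': timedelta(minutes=30),  # crawl new items every 30 minutes
        'args': (),
    },
    'refresh_login': {
        'task': 'tasks.refresh_login',
        'schedule': timedelta(hours=6),     # log in again before cookies expire
        'args': (),
    },
}


def interval_seconds(entry):
    """Return an entry's interval in seconds, for ints and timedeltas alike."""
    schedule = entry['schedule']
    if isinstance(schedule, timedelta):
        return schedule.total_seconds()
    return float(schedule)
```

Assuming those two tasks were defined with `@app.task`, the dictionary could be passed as `CELERYBEAT_SCHEDULE=CRAWLER_SCHEDULE` in `app.conf.update(...)`. The helper `interval_seconds` is not part of Celery; it is only there to make the intervals easy to inspect.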

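Next, stop the previous worker with `Ctrl+C`: the code has changed, and the worker has to be restarted for the change to take effect. Then start the worker again with `celery -A tasks worker -l info`.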

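At this point only the worker is running. For the worker to actually execute the task, beat has to send it on schedule. Open another terminal, change to the project root, and run: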

    celery beat -A tasks -l info

    celery beat v3.1.25 (Cipater) is starting.
    __ - ... __ - _
    Configuration ->
        . broker -> redis://223.129.0.190:6379/2
        . loader -> celery.loaders.app.AppLoader
        . scheduler -> celery.beat.PersistentScheduler
        . db -> celerybeat-schedule
        . logfile -> [stderr]@%INFO
        . maxinterval -> now (0s)
    [2017-05-19 15:56:57,125: INFO/MainProcess] beat: Starting...

This output means the scheduled task is now running.
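To make the division of labor concrete: beat only decides when each entry is due and sends a message, while the worker does the actual work. The sketch below is a simplified model of an interval scheduler written with the standard library. It is not Celery's code, and it assumes that the first run of each entry happens one full interval after startup. The function name `due_times` is made up for this example.

```python
import heapq


def due_times(schedule, horizon):
    """Return (time, name) pairs for every run within `horizon` seconds.

    `schedule` maps an entry name to its interval in seconds.
    Assumption: each entry first fires one interval after startup.
    """
    # Min-heap ordered by the next due time of each entry.
    heap = [(interval, name) for name, interval in schedule.items()]
    heapq.heapify(heap)
    runs = []
    while heap and heap[0][0] <= horizon:
        when, name = heapq.heappop(heap)
        runs.append((when, name))  # this is where a real scheduler would send the task
        heapq.heappush(heap, (when + schedule[name], name))
    return runs
```

Under these assumptions, `due_times({'my_task': 60}, 180)` yields runs at 60, 120, and 180 seconds, which matches the 60-second interval configured for `my_task`.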
