
Python-rq Worker Closes Automatically

I am implementing python-rq to pass domains into a queue and scrape them using Beautiful Soup, so I am running multiple workers to get the job done. I started 22 workers as of now, and …
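For context, a minimal sketch of how domains might be pushed onto an rq queue is shown below; the queue name, the scrape_domain function, and the job_timeout value are illustrative assumptions, not details from the original question (older rq versions use timeout= instead of job_timeout=).

from redis import Redis
from rq import Queue

from scraper import scrape_domain  # hypothetical Beautiful Soup job function

q = Queue("domains", connection=Redis())

for domain in ["example.com", "example.org"]:
    # job_timeout bounds how long a single worker may spend on one domain
    q.enqueue(scrape_domain, domain, job_timeout=600)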

Solution 1:

Okay, I figured out the problem. It was caused by the worker timeout.

import sys

try:
    # --my code goes here--
    ...
except Exception as ex:
    self.error += 1
    with open("error.txt", "a") as myfile:
        myfile.write('\n%s' % sys.exc_info()[0] + "{}".format(self.url))

According to my code, the next domain is dequeued only after 200 URLs have been fetched from the current domain. But for some domains there were not enough URLs for that condition to be met (only 1 or 2 URLs).
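To make the failure mode concrete, here is a hypothetical sketch of what such a termination condition looks like; the names fetched_urls and next_url are made up for illustration and do not come from the original code:

fetched_urls = set()
while len(fetched_urls) < 200:
    # next_url() is a hypothetical helper returning the next link to scrape;
    # when a small domain runs out of links, this loop keeps waiting instead
    # of exiting, so only rq's job timeout can end the job.
    url = next_url()
    fetched_urls.add(url)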

Since the code catches every exception and appends it to the error.txt file, even rq's timeout exception, rq.timeouts.JobTimeoutException, was caught and written to the file instead of being allowed to propagate. This kept the worker waiting for x amount of time, which ultimately led to the termination of the worker.
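One way to fix this, sketched below under the assumption that the rest of the scraping code stays the same, is to catch rq.timeouts.JobTimeoutException separately and re-raise it so rq can fail the job cleanly and the worker can move on, while every other exception is still logged to error.txt:

import sys

from rq.timeouts import JobTimeoutException

try:
    # --my code goes here--
    ...
except JobTimeoutException:
    # let rq handle the timeout instead of swallowing it
    raise
except Exception:
    self.error += 1
    with open("error.txt", "a") as myfile:
        myfile.write('\n%s %s' % (sys.exc_info()[0], self.url))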
