
Getting TCP Connection Timed Out: 110: Connection Timed Out. On AWS While Using Scrapy?

This is my scrapy code. import scrapy from scrapy.contrib.spiders import CrawlSpider, Rule from scrapy.selector import Selector from scrapy.contrib.linkextractors.sgml import SgmlL

Solution 1:

There are a few checks you can run first.

  • Try opening the same URL from the same machine using the requests module or urllib.
  • Try fetching the page with "wget".
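As a rough sketch of the first check, the snippet below fetches a URL with the standard library's urllib, outside of Scrapy. The function name `probe` and the URL `http://example.com/` are placeholders chosen for illustration, not taken from the question; substitute the page your spider is requesting and run it on the same AWS machine.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """Fetch url with a plain GET and return (reachable, detail).

    `probe` is a hypothetical helper for this check, not part of Scrapy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, "HTTP %d" % resp.status
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, so the network path works.
        return True, "HTTP %d" % exc.code
    except (urllib.error.URLError, socket.timeout, OSError) as exc:
        # No answer at all: refused, timed out, DNS failure, and so on.
        return False, repr(exc)


if __name__ == "__main__":
    # Placeholder URL: replace with the page the spider is trying to fetch.
    print(probe("http://example.com/"))
```

If this also reports a timeout, the problem is probably not in the spider itself but somewhere between the machine and the website.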

If the steps above return a response, the site is reachable from your machine and the problem lies in how the spider makes its requests. There are a few things you can then try on the spider side:

  1. Increase DOWNLOAD_TIMEOUT in the settings file.
  2. Increase RETRY_TIMES, e.g. RETRY_TIMES = 10.
  3. Increase DOWNLOAD_DELAY.
  4. This is the last resort. Chances are that the website has recognized that we are a bot and is trying to get rid of us. In this case we need to use a proxy. Use this middleware: Scrapy Proxy Middleware (https://github.com/aivarsk/scrapy-proxies).
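A minimal sketch of items 1 to 3 as they might appear in the project's settings.py. Only RETRY_TIMES = 10 comes from the answer; the other two values are illustrative assumptions and should be tuned for the target site.

```python
# settings.py (fragment). Values other than RETRY_TIMES are illustrative
# assumptions, not recommendations from the original answer.

# Seconds the downloader waits before giving up on a request
# (Scrapy's default is 180).
DOWNLOAD_TIMEOUT = 300

# Extra attempts after the first failed request (Scrapy's default is 2).
RETRY_TIMES = 10

# Seconds to wait between consecutive requests to the same website.
DOWNLOAD_DELAY = 2
```

Longer timeouts and more retries only help if the server eventually answers; a larger delay makes the crawl slower but less likely to look like a bot.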

Solution 2:

Maybe your IP address is blocked by the website.

