I ran this command to install Scrapy:
conda install -c conda-forge scrapy pylint autopep8 -y
Then I ran scrapy bench and got the error below. The same thing happens with a global installation via pip. Please help; I can't understand the reason for this error.
scrapy bench
2025-01-25 13:52:30 [scrapy.utils.log] INFO: Scrapy 2.12.0 started (bot: scrapybot)
2025-01-25 13:52:30 [scrapy.utils.log] INFO: Versions: lxml 5.3.0.0, libxml2 2.13.5, cssselect 1.2.0, parsel 1.10.0, w3lib 2.2.1, Twisted 24.11.0, Python 3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:06:27) [MSC v.1942 64 bit (AMD64)], pyOpenSSL 25.0.0 (OpenSSL 3.4.0 22 Oct 2024), cryptography 44.0.0, Platform Windows-11-10.0.26100-SP0
2025-01-25 13:52:31 [scrapy.addons] INFO: Enabled addons:
[]
2025-01-25 13:52:31 [scrapy.extensions.telnet] INFO: Telnet Password: 1d038a25605956ac
2025-01-25 13:52:31 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.closespider.CloseSpider',
'scrapy.extensions.logstats.LogStats']
2025-01-25 13:52:31 [scrapy.crawler] INFO: Overridden settings:
{'CLOSESPIDER_TIMEOUT': 10, 'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO'}
2025-01-25 13:52:32 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.offsite.OffsiteMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2025-01-25 13:52:32 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2025-01-25 13:52:32 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2025-01-25 13:52:32 [scrapy.core.engine] INFO: Spider opened
2025-01-25 13:52:32 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2025-01-25 13:52:32 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2025-01-25 13:52:32 [scrapy.core.scraper] ERROR: Spider error processing <GET http://localhost:8998?total=100000&show=20> (referer: None)
Traceback (most recent call last):
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\utils\defer.py", line 327, in iter_errback
yield next(it)
^^^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\utils\python.py", line 368, in __next__
return next(self.data)
^^^^^^^^^^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\utils\python.py", line 368, in __next__
return next(self.data)
^^^^^^^^^^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
yield from iterable
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\spidermiddlewares\referer.py", line 379, in <genexpr>
return (self._set_referer(r, response) for r in result)
^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
yield from iterable
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 57, in <genexpr>
return (r for r in result if self._filter(r, spider))
^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
yield from iterable
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\spidermiddlewares\depth.py", line 54, in <genexpr>
return (r for r in result if self._filter(r, response, spider))
^^^^^^
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
yield from iterable
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\commands\bench.py", line 70, in parse
assert isinstance(Response, TextResponse)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
2025-01-25 13:52:32 [scrapy.core.engine] INFO: Closing spider (finished)
2025-01-25 13:52:32 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 241,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 1484,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'elapsed_time_seconds': 0.140934,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2025, 1, 25, 8, 22, 32, 389327, tzinfo=datetime.timezone.utc),
'items_per_minute': None,
'log_count/ERROR': 1,
'log_count/INFO': 10,
'response_received_count': 1,
'responses_per_minute': None,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'spider_exceptions/AssertionError': 1,
'start_time': datetime.datetime(2025, 1, 25, 8, 22, 32, 248393, tzinfo=datetime.timezone.utc)}
2025-01-25 13:52:32 [scrapy.core.engine] INFO: Spider closed (finished)
This is a bug in Scrapy, introduced in 2.12.0: it passes the wrong parameter to isinstance(). That function expects its first parameter to be the object being checked (see the docs), but the code passes the Response class itself, which leads to the AssertionError we can see in your logs:
File "C:\Users\Risha\anaconda3\envs\scrapy\Lib\site-packages\scrapy\commands\bench.py", line 70, in parse
assert isinstance(Response, TextResponse)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
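For context, isinstance(obj, cls) takes the object first and the class (or a tuple of classes) second, so passing the Response class itself can never succeed. A minimal sketch of the difference (HtmlResponse is just a convenient concrete subclass of TextResponse):

from scrapy.http import HtmlResponse, Response, TextResponse

# An actual response object, as a spider callback would receive it
resp = HtmlResponse(url="http://localhost:8998/", body=b"<html></html>")

print(isinstance(resp, TextResponse))      # True: the object is checked against the class
print(isinstance(Response, TextResponse))  # False: a class is not an instance of TextResponse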
I submitted a PR with a fix here, replacing the Response class passed as a parameter with the response object. The PR was merged, but a new version hasn't been released yet.
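In other words, the merged fix boils down to asserting on the response object the callback receives rather than on the Response class; a paraphrased sketch of the corrected callback (not the verbatim bench.py source):

from scrapy.http import TextResponse

def parse(self, response):
    # 2.12.0 (buggy):  assert isinstance(Response, TextResponse)  -> always raises
    # fixed: verify the actual object passed to the callback
    assert isinstance(response, TextResponse)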
Therefore, to move forward, you can choose one of the options below (example commands for (a) and (b) follow the list):
a) Clone the Scrapy repository and install it from the latest master
b) Downgrade your Scrapy version to 2.11.2
c) Wait until Scrapy officially releases the fix (likely in version 2.13)
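For example, assuming a standard pip or conda setup (adjust to your environment):

# (a) install directly from the latest master branch
pip install git+https://github.com/scrapy/scrapy.git@master

# (b) pin the previous release, with the tool matching your install
pip install scrapy==2.11.2
conda install -c conda-forge scrapy=2.11.2 -y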