I created a simple Python app on Heroku to launch scrapyd. The scrapyd service starts, but it listens on port 6800. Heroku requires the web process to bind to the port given in the $PORT variable. I was able to run the Heroku app locally; the logs from the Heroku process are included below. I looked at the scrapy-heroku package, but wasn't able to install it due to errors. The code in that package's app.py seems to provide some clues as to how it can be done. How can I write a Python command that starts scrapyd on the port provided by Heroku?
Procfile:
web: scrapyd
Heroku Logs:
2022-01-24T05:17:27.058721+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 21.7.0 (/app/.heroku/python/bin/python 3.10.2) starting up.
2022-01-24T05:17:27.058786+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
2022-01-24T05:17:27.059190+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [-] Site starting on 6800
2022-01-24T05:17:27.059301+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f1706e3eaa0>
2022-01-24T05:17:27.059649+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [Launcher] Scrapyd 1.3.0 started: max_proc=32, runner='scrapyd.runner'
2022-01-24T05:18:25.204305+00:00 heroku[web.1]: Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch
2022-01-24T05:18:25.231596+00:00 heroku[web.1]: Stopping process with SIGKILL
2022-01-24T05:18:25.402503+00:00 heroku[web.1]: Process exited with status 137
You just need to read the PORT environment variable and write it into your scrapyd config file as http_port before scrapyd starts. The following script does that:
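# init.py
import io
import os

# Heroku supplies the port to bind to in the PORT environment variable.
PORT = os.environ['PORT']

# Append http_port to the existing scrapyd.conf. Reading to the end first
# puts the file position after the current contents. This assumes [scrapyd]
# is the last (or only) section in the file.
with io.open("scrapyd.conf", 'r+', encoding='utf-8') as f:
    f.read()
    f.write(u'\nhttp_port = %s\n' % PORT)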
Source: https://github.com/scrapy/scrapyd/issues/367#issuecomment-591446036
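A slightly more defensive variant of the same idea, as a minimal sketch rather than a tested Heroku deployment: it sets http_port inside the [scrapyd] section using configparser, so it still works if scrapyd.conf is missing or has other sections after [scrapyd]. The function name write_scrapyd_conf is hypothetical. Setting bind_address to 0.0.0.0 is an assumption that goes beyond the code above; it is included because scrapyd may otherwise listen only on 127.0.0.1, which Heroku's router would not be able to reach.

```python
import configparser
import os


def write_scrapyd_conf(path, port, bind_address="0.0.0.0"):
    """Set http_port and bind_address in the [scrapyd] section of `path`."""
    # interpolation=None keeps any literal '%' in existing values untouched
    config = configparser.ConfigParser(interpolation=None)
    config.read(path, encoding="utf-8")  # a missing file is silently skipped
    if not config.has_section("scrapyd"):
        config.add_section("scrapyd")
    config.set("scrapyd", "http_port", str(port))
    # Assumption: listen on all interfaces so the platform router can connect
    config.set("scrapyd", "bind_address", bind_address)
    with open(path, "w", encoding="utf-8") as f:
        config.write(f)


if __name__ == "__main__":
    # Falls back to 6800 (the port seen in the logs) when PORT is unset,
    # e.g. when running locally
    write_scrapyd_conf("scrapyd.conf", os.environ.get("PORT", "6800"))
```

If this is saved as init.py, the Procfile entry could then become something like `web: python init.py && scrapyd`, so the config is rewritten on every dyno start before scrapyd reads it. Note that configparser drops comments when it rewrites the file.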