My Django 1.10 app provides search using Haystack + Elasticsearch. It works great for model data, but I need it to work for static content too (basically HTML files).
I was thinking of scraping the content from the HTML files (BeautifulSoup?) and saving it to the database, so that the template content could be indexed like any other model.
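To make the scraping idea concrete, here is a minimal sketch of the text-extraction step. It is only an illustration: it uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra dependencies, and the names `_TextExtractor` and `extract_text` are made up for this example. The returned string is what you would store in a model field for Haystack to index.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces and collapse runs of whitespace.
        # Note: this also puts a space between adjacent inline elements.
        return " ".join(" ".join(self._chunks).split())


def extract_text(html):
    """Return the visible text of an HTML document as a single string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = "<html><body><h1>About us</h1><p>We build &amp; ship.</p></body></html>"
    print(extract_text(page))
```

With BeautifulSoup installed, `BeautifulSoup(html, "html.parser").get_text()` would do roughly the same job, although you would still need to drop script and style elements yourself.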
I found this module, which does exactly what I need, but it seems deprecated:
https://github.com/trapeze/haystack-static-pages
So, what's the best way to let Haystack find the content of static HTML pages?
I forked the haystack-static-pages module and adapted it to my needs. It is now compatible with Django 1.10 + Haystack 2.5 and supports logging in, so it can scrape pages that require authentication :)
Updated version: https://github.com/pisapapiros/haystack-static-pages