ibm-cloud ibm-watson watson-discovery watson-assistant

IBM Watson Discovery crawling issue


We want to index our client's website and store all of its data in the IBM Watson Discovery service. We will connect Discovery to Watson Assistant so that when a user asks a question related to the client's data, the chatbot queries Discovery and uses the results to respond.

Problem: The client's website has many links, and each linked page contains further links. We want to crawl all the data from the website, then index and store it in the Watson Discovery service. We tried crawling the site, but Discovery is taking a very long time and had still not completed the crawl after one week. Please let us know how we can achieve this in a better and faster way.


Solution

  • Note that web crawling is currently in beta, and the Watson Discovery documentation for web crawl states that, depending on the website, it may not ingest all of the data.

    I used the web crawl in Discovery in a scenario similar to yours and queried my website using a chat built with Watson Assistant. What you should do: