python, keen-io

QueryIncompleteError: Your query did not finish in 300 seconds. Most likely something is wrong on our side


When making an extraction query using the Python Keen client, we're consistently encountering the same error:

Message: Your query did not finish in 300 seconds. Most likely something is wrong on our side. Please let us know at team@keen.io.

Code: QueryIncompleteError

The query parameters, in JSON format, are:

{
    "timezone": -18000,
    "event_collection": "Loaded a Page",
    "filters": [
        {
            "operator": "eq",
            "property_name": "reportType",
            "property_value": "Profile"
        }
    ],
    "timeframe": {
        "start": "2017-04-24",
        "end": "2017-06-19"
    }
}

My suspicion is that the requested date range is too large and the Keen API is choking on the size of that dataset, but the error message doesn't make that clear.


Solution

  • Your guess is correct! This 504 error happens when your query times out (runs longer than 5 minutes). Here are some ways to reduce the runtime of your query:

    1. Shorten the timeframe in your query

    The smaller the timeframe, the faster the query. A query on one week of data will be roughly 4x faster than a query on one month of data.

    A relatively easy fix is to split this query into two or more queries, each covering one part of the original timeframe.

    2. Reduce the number of properties extracted

    The extraction query type accepts a parameter called property_names, which takes an array of the properties you need in your extraction. Without this parameter, the API returns all properties on your events. Using property_names to extract only the properties you need can dramatically reduce your compute costs and the overhead of the query.

    3. Reduce your collection size

    This probably doesn't apply in this case because it looks like your data model is already established and reasonable, but a query analyzing 400M events will take about twice as long as a query analyzing 200M events. For this reason, we recommend that you don’t store all event types in a single mega-collection. Keen was designed to have multiple collections, one for each type of action (Signups, Opens, Messages, etc.). If you narrowed this collection to "Loaded a Profile Page", for example, your query would be a lot faster because it wouldn't have to sort through all of the other reportTypes.
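As a rough illustration of suggestion 1, the timeframe from the question can be cut into one-week windows and each window queried separately. This is only a sketch: split_timeframe and extract_in_chunks are hypothetical helper names, run_query stands in for whatever function actually calls Keen, and the code assumes the "end" of an absolute timeframe is exclusive so adjacent windows do not overlap (worth confirming against the Keen docs).

```python
from datetime import date, timedelta


def split_timeframe(start, end, days=7):
    """Yield {"start", "end"} dicts covering [start, end) in windows of at most `days` days."""
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield {"start": cursor.isoformat(), "end": window_end.isoformat()}
        cursor = window_end


def extract_in_chunks(run_query, start, end, days=7):
    """Call run_query once per window and concatenate the returned event lists."""
    events = []
    for timeframe in split_timeframe(start, end, days):
        events.extend(run_query(timeframe))
    return events


# The date range from the question, split into weekly windows.
chunks = list(split_timeframe(date(2017, 4, 24), date(2017, 6, 19)))

# With the keen package installed and configured, run_query could presumably
# look like this (assumption: keen.extraction accepts these keyword arguments):
#
#   import keen
#   def run_query(timeframe):
#       return keen.extraction("Loaded a Page", timeframe=timeframe,
#                              timezone=-18000, filters=[...])
```

Each window is a much smaller query than the original, so each one has a better chance of finishing inside the 300-second limit; the window size can be lowered further if a single week still times out.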

    For others encountering this error, caching can also reduce query response times to milliseconds. Caching works for all analysis types (count, sum, median, etc.) except extractions.
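For suggestion 2, a minimal sketch of the same query restricted to a couple of properties is shown below. extraction_kwargs is a hypothetical helper, and the property list is only an example: reportType comes from the question, keen.timestamp is the timestamp Keen attaches to every event, and your own events will have different names worth keeping.

```python
def extraction_kwargs(timeframe, property_names):
    """Build keyword arguments for an extraction limited to the given properties."""
    return {
        "event_collection": "Loaded a Page",
        "timezone": -18000,
        "filters": [
            {
                "operator": "eq",
                "property_name": "reportType",
                "property_value": "Profile",
            }
        ],
        "timeframe": timeframe,
        # Only these properties are requested instead of every property on each event.
        "property_names": list(property_names),
    }


kwargs = extraction_kwargs(
    {"start": "2017-04-24", "end": "2017-05-01"},
    ["keen.timestamp", "reportType"],  # example property list, adjust to your data
)

# With the keen package installed and configured, the call would presumably be
# (assumption: keen.extraction accepts these keyword arguments):
#
#   import keen
#   events = keen.extraction(**kwargs)
```

Combining a narrower timeframe with a short property_names list attacks both the number of events scanned and the amount of data returned per event.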