I'm using the Art Institute of Chicago API (https://api.artic.edu/docs/#introduction) and there is an element called subject_titles which is an array of strings. I want to query the API to show me all the results which contain the string "landscapes" in the subject_titles, rather than scrape the API and search for the string on my end.
Some failed examples of what I have tried:
https://api.artic.edu/api/v1/artworks/search?q=[subject_titles]=landscapes
https://api.artic.edu/api/v1/artworks/search?query[terms][subject_titles]=landscape
I reckon it would be replacing '[terms]' with a different specifier, but I can't find which. All my research comes up with results that use the Elasticsearch API, but I'm pretty new to this and that seems like a can of worms I don't want to open (why do I need one API to query another API? Also, the DSL looks like a headache to learn syntax-wise), but I will learn it if I have to. Is there a way to do this using the simple REST-style URL endpoint?
TL;DR: https://api.artic.edu/api/v1/artworks/search?query[match][subject_titles]=landscape
If you want a more detailed explanation: I think they tried to make the interface powerful and concise so you can do structured queries with just a URL, but I agree, it is a bit confusing.
It looks like the URL parameters are translated into top-level elements in the DSL, and a parameter name like foo[bar] is translated into foo with bar nested inside. So if you have foo[bar][baz]=10, it will be translated into:
{
  "foo": {
    "bar": {
      "baz": 10
    }
  }
}
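To make the translation concrete, here is a minimal Python sketch of the bracket-to-nesting rule described above. The function name nest_param is made up for illustration, and this is only a guess at the behaviour inferred from the URLs, not the API's actual server code.

```python
import re


def nest_param(key, value):
    """Turn a bracketed key such as 'foo[bar][baz]' into nested dicts.

    Hypothetical helper: it only mirrors the translation inferred above.
    """
    # Split 'foo[bar][baz]' into ['foo', 'bar', 'baz'].
    parts = re.findall(r"[^\[\]]+", key)
    result = value
    # Wrap the value from the innermost name outwards.
    for part in reversed(parts):
        result = {part: result}
    return result


print(nest_param("foo[bar][baz]", 10))
```

Calling nest_param("query[term][is_public_domain]", True) gives the same shape with query, term and is_public_domain as the nested names.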
With this information in mind, we can reverse engineer query[term][is_public_domain]=true into:
{
  "query": {
    "term": {
      "is_public_domain": true
    }
  }
}
If we now open the Elasticsearch documentation, we can figure out that term is the type of the query, and that this query will find all documents where the field is_public_domain contains true. We need to search for another field and another value, so we replace is_public_domain with subject_titles and true with landscape. Term looks for an exact value, which works well for boolean fields such as is_public_domain, but it is better to search strings with another query type, match, so we should also replace term with match. In the end we get the following query:
{
  "query": {
    "match": {
      "subject_titles": "landscape"
    }
  }
}
Now we can convert it back into its URL representation, query[match][subject_titles]=landscape, and if we stick it back on the URL we get:
https://api.artic.edu/api/v1/artworks/search?query[match][subject_titles]=landscape
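The conversion from the DSL back to URL form can also be sketched in Python. The helper name flatten is hypothetical, and the code assumes the server accepts literal square brackets in the query string (they are left unescaped here to match the URL above).

```python
from urllib.parse import urlencode


def flatten(obj, prefix=""):
    """Yield (key, value) pairs with bracketed keys for a nested dict."""
    for name, value in obj.items():
        # The top-level name stays bare; deeper names are wrapped in brackets.
        key = f"{prefix}[{name}]" if prefix else name
        if isinstance(value, dict):
            yield from flatten(value, key)
        else:
            yield key, value


dsl = {"query": {"match": {"subject_titles": "landscape"}}}
params = list(flatten(dsl))
# safe="[]" keeps the brackets readable instead of percent-encoding them.
url = "https://api.artic.edu/api/v1/artworks/search?" + urlencode(params, safe="[]")
print(url)
```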
This will give us the first 10 hits. If we want more, we can add limit:
https://api.artic.edu/api/v1/artworks/search?query[match][subject_titles]=landscape&limit=100
And if we want even more, we can start paging through the results using the page parameter:
https://api.artic.edu/api/v1/artworks/search?query[match][subject_titles]=landscape&limit=100&page=2
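If you end up calling this from a script, a small sketch like the following builds the same URLs with limit and page. The names search_url and fetch_page are made up for illustration; fetch_page is not called here, needs network access, and simply assumes the endpoint answers with JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://api.artic.edu/api/v1/artworks/search"


def search_url(field, text, limit=10, page=1):
    """Build a match-query URL for one page of results."""
    params = {f"query[match][{field}]": text, "limit": limit, "page": page}
    # safe="[]" leaves the brackets unescaped, as in the URLs above.
    return BASE + "?" + urlencode(params, safe="[]")


def fetch_page(field, text, limit=10, page=1):
    """Fetch one page and return the parsed JSON body (requires network)."""
    with urlopen(search_url(field, text, limit, page)) as response:
        return json.load(response)


print(search_url("subject_titles", "landscape", limit=100, page=2))
```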