I've been using "scroll" to obtain sequences of data from an ES index (v 8.6.2) in Python.
For various reasons I want to switch to making reqwest requests in Rust (PyO3) rather than requests requests in Python. My first request is thus:
let es_url = "https://localhost:9500"; // ES server is using this port
let data = json!({
    "size": 100,
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": [
                {
                    "term": {
                        "ldoc_type": "docx_doc"
                    }
                }
            ]
        }
    },
});
let data_string = serde_json::to_string(&data).unwrap();
let url = format!("{es_url}/{}/_search?scroll=1m", INDEX_NAME.read().unwrap());
let send_result = reqwest_client
    .post(url)
    .header("Content-type", "application/json")
    .body(data_string)
    .basic_auth("mike12", Some("mike12"))
    .send();
let response = match send_result {
    Ok(response) => response,
    Err(e) => {
        error!("SEND error was {}", e);
        return false
    },
};
let text = match response.text() {
    Ok(text) => text,
    Err(e) => {
        error!("TEXT error was {}", e);
        return false
    },
};
let json_hashmap: HashMap<String, Value> = serde_json::from_str(&text).unwrap();
let scroll_id = &json_hashmap["_scroll_id"];
let scroll_id_str = scroll_id.to_string();
let scroll_id_str = rem_first_and_last(&scroll_id_str); // strip "\"" either side
info!("|{:#?}| type {}", scroll_id_str, utilities::str_type_of(&scroll_id_str));
... this shows the returned &str "scroll ID". A typical example is:
"FGluY2x1ZGVfY29udGV4dF91dWlkDnF1ZXJ5VGhlbkZldGNoBxZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAMgWbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAMkWbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAMoWbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAMsWbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAMwWbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAM0WbkU3ck5YUTNSazZlazJtUXctSklWQRZQcGUxbjhnQ1RzU0M1bG1ZR05od1VBAAAAAAAAAM4WbkU3ck5YUTNSazZlazJtUXctSklWQQ=="
... this is strange, because this ID string is about 4 times as long as the ID string I get when I use Python. And although the above string looks somewhat repetitive, it isn't actually a repeat of a smaller string.
After that, to do repeated requests to get further batches of 100 hits, I do this:
let mut n_loops = 0;
loop {
    n_loops += 1;
    let data = json!({
        "scroll": "1m",
        "scroll_id": scroll_id_str,
    });
    let data_string = serde_json::to_string(&data).unwrap();
    let url = format!("{es_url}/_search?scroll");
    let send_result = reqwest_client
        .post(url)
        .header("Content-type", "application/json")
        .body(data_string)
        .basic_auth("mike12", Some("mike12"))
        .send();
    let response = match send_result {
        Ok(response) => response,
        Err(e) => {
            error!("SEND error was {}", e);
            return false
        },
    };
    info!("response {:?} type {}", response, utilities::str_type_of(&response));
    let response_status = response.status();
... this is where I get 400, "Bad Request". My initial suspicion is that the scroll ID is wrong. In Python the equivalent "follow-on" "scroll" requests work fine.
In your second call, the URL is wrong. Instead of this
let url = format!("{es_url}/_search?scroll");
it should be this
let url = format!("{es_url}/_search/scroll");
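That is, change the "?" after _search to a "/". Follow-up scroll requests go to the _search/scroll endpoint, whereas "_search?scroll" sends the scroll body to the ordinary _search endpoint.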
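To make the difference between the two endpoints harder to mistype, the URLs and the follow-up body can be built in one place. The following is only a minimal sketch using the standard library: the helper names (initial_search_url, scroll_url, scroll_body) and the index name in the usage note are hypothetical, and the body is assembled with format! on the assumption that the scroll ID and keep-alive value contain no characters that need JSON escaping (in real code, json! from serde_json, as in the question, is the safer choice).

```rust
/// First request: the scroll keep-alive is a query parameter on the
/// index's _search endpoint.
fn initial_search_url(es_url: &str, index: &str, keep_alive: &str) -> String {
    format!(
        "{}/{}/_search?scroll={}",
        es_url.trim_end_matches('/'),
        index,
        keep_alive
    )
}

/// Follow-up requests: note the path separator, `_search/scroll`,
/// not the query string `_search?scroll`.
fn scroll_url(es_url: &str) -> String {
    format!("{}/_search/scroll", es_url.trim_end_matches('/'))
}

/// JSON body for a follow-up request.
/// Assumption: neither argument contains quotes or backslashes, so no
/// JSON escaping is done here.
fn scroll_body(scroll_id: &str, keep_alive: &str) -> String {
    format!(r#"{{"scroll":"{keep_alive}","scroll_id":"{scroll_id}"}}"#)
}
```

For example, scroll_url("https://localhost:9500") yields "https://localhost:9500/_search/scroll", which is the URL to POST the scroll body to, while initial_search_url("https://localhost:9500", "my_index", "1m") yields the URL used for the first request.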