I'm drawing a blank trying to formulate an AQL query that counts the PASS/FAIL results for multiple attributes of a sizable dataset. I have a collection of test data that looks something like this:
[
  {
    workorder: "123456",
    device_id: "1",
    pull_force: "PASS",
    cut_force: "PASS",
    kpa_force: "FAIL",
    ...
  }
]
The output that I'd like to return from the collection would be something like:
{
  pull_force: {
    PASS: 300321,
    FAIL: 400
  },
  cut_force: {
    PASS: 300211,
    FAIL: 200
  },
  ...
}
I can collect the data I'm looking for by running the query below, but it feels wrong, and since I need quite a few pass/fail attributes, it's very slow to run:
LET pull_force_data = mydataset[*].pull_force
LET cut_force_data = mydataset[*].cut_force
LET kpa_force_data = mydataset[*].kpa_force
...
RETURN {
pull_force: {
PASS: pull_force_data[* FILTER CURRENT == "PASS"],
FAIL: pull_force_data[* FILTER CURRENT == "FAIL"]
},
cut_force: {
PASS: cut_force_data[* FILTER CURRENT == "PASS"],
FAIL: cut_force_data[* FILTER CURRENT == "FAIL"]
},
kpa_force: {
PASS: kpa_force_data[* FILTER CURRENT == "PASS"],
FAIL: kpa_force_data[* FILTER CURRENT == "FAIL"]
},
...
}
Could anyone provide me with some guidance?
Use COLLECT:
The COLLECT operation can group data by one or multiple grouping criteria, retrieve all distinct values, count how often values occur, and calculate statistical properties efficiently.
If you are aggregating over all documents, no specific group key is needed, and a simple AGGREGATE can be used:
FOR doc IN collection
  // No group key: all documents fall into one group, so the query
  // makes a single pass and returns a single result.
  COLLECT AGGREGATE
    pull_force_PASS = SUM(doc.pull_force == "PASS" ? 1 : 0),
    pull_force_FAIL = SUM(doc.pull_force == "FAIL" ? 1 : 0)
    //rest...
  RETURN {
    pull_force: {
      PASS: pull_force_PASS,
      FAIL: pull_force_FAIL
    }
    //rest...
  }
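If writing out two aggregates per attribute gets tedious, a variation is to list the attribute names once and count per (attribute, value) pair. This is only a sketch, not tested against your data: it assumes the collection is called mydataset as in the question, the attrs list is a hypothetical placeholder you would fill with your own attribute names, and documents whose value is not a string are skipped.

```aql
// Hypothetical list of pass/fail attributes; extend it as needed.
LET attrs = ["pull_force", "cut_force", "kpa_force"]

LET perAttr = (
  FOR doc IN mydataset
    FOR attr IN attrs
      // Skip documents where the attribute is missing or not a string.
      FILTER IS_STRING(doc[attr])
      // First pass: count how often each value occurs per attribute.
      COLLECT name = attr, result = doc[attr] WITH COUNT INTO n
      // Second pass: gather the (value, count) pairs of each attribute.
      COLLECT attribute = name INTO counts = { result: result, n: n }
      // Turn the pairs into an object such as { PASS: ..., FAIL: ... }.
      RETURN { [attribute]: ZIP(counts[*].result, counts[*].n) }
)

// Merge the per-attribute objects into one result document.
RETURN MERGE(perAttr)
```

Unlike the explicit AGGREGATE version, this one only reports values that actually occur, so an attribute with no failures would have no FAIL key rather than FAIL: 0. It also visits each document once per listed attribute, so it may well be slower than the single-pass query above; it trades speed for less typing.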