mongodbaggregation-frameworkstandard-deviation

How to do operations to a document other than the current one in MongoDB's setWindowFields


I have a MongoDB collection like this:

[
  {
    "_id": 1,
    "price": 1
  },
  {
    "_id": 2,
    "price": 2
  },
  {
    "_id": 3,
    "price": 3
  },
  {
    "_id": 4,
    "price": 4
  },
  {
    "_id": 5,
    "price": 5
  },
  {
    "_id": 6,
    "price": 6
  }
]

I want to calculate standard deviation myself (I know there's a built in operator but I want to change some parameters, so implementing it myself).

I calculated the running mean, but how do use the last mean in a setWindowFields stage:

const aggregation1 = [
    {
        $setWindowFields: {
            sortBy: {
                _id: 1
            },
            output: {
                mean: {
                    $avg: "$price",
                    window: {
                        documents: [-4, 0]
                    }
                }
            }
        }
    },
    {
        $setWindowFields: {
            sortBy: {
                _id: 1
            },
            output: {
                field_new: {
                    $sum: [
                        "$price",
                        { $last: "$mean" } //Gives error
                    ],
                    window: {
                        documents: [-4, 0]
                    }
                }
            }
        }
    }
];
db.collection.aggregate(aggregation);

I'm looking to perform an operation on each price field in a document (sum), with the last mean. e.g. x1 + mean at x5 , x2 + mean at x5, ... , x6 + mean at x10, x7 + mean at x10, ...

Like we do in a standard deviation formula: Summation of square of difference between price and average price.

Here's how the expected output should look like:

[
    {
      "_id": 1,
      "price": 1
    },
    {
      "_id": 2,
      "price": 2
    },
    {
      "_id": 3,
      "price": 3
    },
    {
      "_id": 4,
      "price": 4
    },
    {
      "_id": 5,
      "price": 5,
      "field_new": 8 // 5 + 3 (3=(1+2+3+4+5)/5 mean from last 5 docs)
    },
    {
      "_id": 6,
      "price": 6,
      "field_new": 10 // 6 + 4 (4=(2+3+4+5+6)/5 mean from last 5 docs)
    }
]

Solution

  • EDIT: for the updated question with the expected output:

    db.collection.aggregate([
      {$setWindowFields: {
          sortBy: {_id: 1},
          output: {
            mean: {
              $push: "$price",
              window: {documents: [-N, 0]}
            }
          }
      }},
      {$set: {
          mean: "$$REMOVE",
          field_new: {
            $cond: [
              {$gt: [{$size: "$mean"}, N]},
              {$add: ["$price", {$avg: "$mean"}]},
              "$$REMOVE"
            ]
          }
      }}
    ])
    

    See how it works on the playground example

    ** For the original question: ** One option is to add another $setWindowFields step with an opposite direction sorting:

    db.collection.aggregate([
      {$setWindowFields: {
          sortBy: {_id: 1},
          output: {
            mean: {
              $avg: "$price",
              window: {documents: [-4, 0]}
            }
          }
      }},
      {$setWindowFields: {
          sortBy: {_id: -1},
          output: {
            lastMean: {
              $first: "$mean",
              window: {documents: ["unbounded", "current" ]}
            }
          }
      }}
    ])
    

    See how it works on the playground example