marklogicmarklogic-optic-api

Optic API: how do I get the value of a column that goes with a max value?


My data set has two arrays holding objects with two properties, a date and a value. For each array, I need to get the object that has the most recent date. I'm trying to do this from indexes and am exploring using the Optic API for the query.

My view has three columns: "statusType", which indicates which array the values come from; "statusDate"; and "value". With the following query, I can get the most recent date for each type, but I don't see how I can get the value that goes with it.

const op = require('/MarkLogic/optic');

op.fromView('Parent', 'info')
  .where(cts.documentQuery('/test/doc1.json'))
  .groupBy([op.col('statusType')], [op.max('maxdate', op.col('statusDate'))])
  .result()

Produces:

{
  "statusType": "subtype1", 
  "maxdate": "2020-09-29T16:33:18.6301434-04:00"
},
{
  "statusType": "subtype2", 
  "maxdate": "2020-08-29T16:33:18.6301434-04:00"
}

If I add value to the first parameter to groupBy, I get all distinct combinations of type and value (with the maxdate). If I add value to the second parameter to groupBy, I get the last value, not the one associated with the maxdate.

Expected output:

{
  "statusType": "subtype1", 
  "value": "valueB",
  "maxdate": "2020-09-29T16:33:18.6301434-04:00"
},
{
  "statusType": "subtype2", 
  "value": "valueC",
  "maxdate": "2020-08-29T16:33:18.6301434-04:00"
}

Sample data:

'use strict';
declareUpdate();

xdmp.documentInsert(
  '/test/doc1.json',
  {
    "parent": {
      "subtype1": [
        {
          "value": "valueA", 
          "date": "2020-07-29T16:33:18.6301434-04:00"
        }, 
        {
          "value": "valueB", 
          "date": "2020-09-29T16:33:18.6301434-04:00"
        }
      ], 
      "subtype2": [
        {
          "value": "valueC", 
          "date": "2020-08-29T16:33:18.6301434-04:00"
        }, 
        {
          "value": "valueD", 
          "date": "2020-07-29T16:33:18.6301434-04:00"
        }
      ]
    }
  }
)

Template 1:

declareUpdate();
const tde = require("/MarkLogic/tde.xqy");

let template = 
  xdmp.toJSON(
    {
      "template": {
        "context": "/parent/subtype1",
        "rows": [
          {
            "schemaName": "Parent",
            "viewName": "info",
            "columns": [
              {
                "name": "statusType",
                "scalarType": "string",
                "val": "'subtype1'"
              },
              {
                "name": "value",
                "scalarType": "string",
                "val": "value"
              },
              {
                "name": "statusDate",
                "scalarType": "dateTime",
                "val": "date"
              }
            ]
          }
        ]
      }
    }
  );

// comment and uncomment based on which action you want to take
let action = 
  //'validate'
  //'extract'
  'insert'
;

if (action === 'validate') {
  tde.validate([template]);
} else if (action === 'extract') {
  tde.nodeDataExtract([cts.doc( "/test/doc1.json" )],  [template])
} else if (action === 'insert') {
  tde.templateInsert("/tde/subtype1.json", template, xdmp.defaultPermissions(), ["TDE"])
}

Template 2:

declareUpdate();
const tde = require("/MarkLogic/tde.xqy");

let template = 
  xdmp.toJSON(
    {
      "template": {
        "context": "/parent/subtype2",
        "rows": [
          {
            "schemaName": "Parent",
            "viewName": "info",
            "columns": [
              {
                "name": "statusType",
                "scalarType": "string",
                "val": "'subtype2'"
              },
              {
                "name": "value",
                "scalarType": "string",
                "val": "value"
              },
              {
                "name": "statusDate",
                "scalarType": "dateTime",
                "val": "date"
              }
            ]
          }
        ]
      }
    }
  );

// comment and uncomment based on which action you want to take
let action = 
  //'validate'
  //'extract'
  'insert'
;

if (action === 'validate') {
  tde.validate([template]);
} else if (action === 'extract') {
  tde.nodeDataExtract([cts.doc( "/test/doc1.json" )],  [template])
} else if (action === 'insert') {
  tde.templateInsert("/tde/subtype2.json", template, xdmp.defaultPermissions(), ["TDE"])
}

Solution

  • Salutations, Mr Cassel:

    If I understand the requirement correctly, the only approach that I know is to join the groupBy() result with the original view:

    1. The groupBy() emits rows with the grouping key and max() aggregate values, passing an alias / qualifier name on the fromView() accessor.
    2. Get the rest of the columns for the max row by joining with the same view (with join keys on maxInfo.statusType=info.statusType and maxInfo.maxdate=info.statusDate).

    The groupBy() operation samples any column in the aggregates argument.

    Hoping that helps,