javascriptmongodbmeteorminimongo

Aggregate query in MongoDB using javascript trying to convert UTC string date to Date object


I'm trying to group datetimes by the hour and I am uploading data from a json file. The datetimes are in UTC format as strings though. How do I convert them to Date objects using javascript in meteor?

var data = Subjects.aggregate([{$match: no_null},
                  { $project: 
                    { bin: { $hour: new Date("$metrics."+metric") } 
                  } },
                  {$group: { _id: "$bin", count: {$sum: 1} } } ])

Metric is already in UTC format.

Here is one document.

{
    "metrics": {
      "ACT_04": 3.733333333, 
      "ACT_05": 14.5, 
      "ACT_02": "1970-01-01T05:02:30Z", 
      "ACT_03": 7.0833333329999997, 
      "ACT_01": "1970-01-01T20:03:00Z"
    }, 
    "subject_id": "M10965219", 
    "name": "sub-M10965219_Act"
  }

Solution

  • If the "string" is in "lexical" ISO Format as in "2018-05-15T20:59:31.502Z", then you "could" get the $substr parts:

    Subjects.aggregate([
      { "$group": {
        "_id": { "$substr": ["$metrics."+metric,0,10] },
        "count": { "$sum": 1 }
      }}
    ])
    

    If you have MongoDB 3.6 available then you "could" use $dateFromString to actually get that into a BSON date format, but you probably want to similarly truncate that string in a similar way for any date rounding you actually want:

    Subjects.aggregate([
      { "$group": {
        "_id": {
          "$dateFromString": {
            "dateString": { "$substr": ["$metrics."+metric,0,10] }
          }            
        },
        "count": { "$sum": 1 }
      }}
    ])
    

    MongoDB 4.0 has $toDate for a little shorter and does the same BSON Date conversion:

    Subjects.aggregate([
      { "$group": {
        "_id": {
          "$toDate": { "$substr": ["$metrics."+metric,0,10] }
        },
        "count": { "$sum": 1 }
      }}
    ])
    

    But the basic fact this if your "string" does not actually look like that and cannot really be broken into parts of the string in a consistent way by supported operators, then you are generally far better off converting the dates stored in your collection(s) to be actual BSON Dates in the first place.

    The "recommended" thing to do here is instead to actually run the conversion of the data. The mongo shell should be sufficient to run "one-off" conversions:

    var batch = [];
    
    db.subjects.find().forEach(doc => {
    
      var fields = Object.keys(doc.metrics).map(k => ({
        ['metrics.'+k]: new Date(doc.metrics[k]),
      })).reduce((acc,curr) => Object.assign(acc,curr),{});
    
      batch.push({
        "updateOne": {
          "filter": doc._id",
          "update": { "$set": fields }
        }
      });
    
      if ( batch.length > 1000 ) {
         db.subjects.bulkWrite(batch);
         batch = []
      }
    })
    
    if ( batch.length > 0 ) {
      db.subjects.bulkWrite(batch);
      batch = [];
    }
    

    That's the basic process. Of course if your "strings" are not in an ISO Format or any suitable for passing to Date() for conversion, then you'll actually need to take other measures.

    At the very least a BSON Date is is an 8-btye data structure with an "internal" numeric representation, where as the equivalent "string" it at least 24-bytes and even longer with words naming days and months. It makes sense to keep data that is essentially "numeric" in nature in that form when working with a database.

    My advice is to convert the data, and possibly on import or if you cannot work that out then at least once loaded to the collection. It's better that trying to work with strings for something they are really not meant to do.

    Note $substr is considered deprecated and actually now aliases $substrBytes, or you can use $substrCP depending on the actual encoding of the data. Older MongoDB releases before 3.2 have the $substr expression only.