algorithmcoldfusionset-theory

Algorithm to aggregate key-value pairs where keys contain numbers


I will try my best to give a complete problem definition. To illustrate the problem I will give an example later on in my question. You might want to jump to the example first and read my problem definition afterwards.

The problem

I have got a map representing key value pairs. The keys may end with -NUMBER where NUMBER is an integer number. However there may also be keys not ending with a dash and a number.

The key prior to the leading -NUMBER may contain dashes as well.

There may be multiple keys starting with the same string and ending with different numbers.

There may also be multiple keys starting with different strings and also ending with numbers.

General conditions

  1. All keys are unique
  2. The map is not ordered
  3. The order of the keys is random
  4. It is safe to assume that all strings in the keys are upper case
  5. If there is a key ending with a dash and a number n greater one it is guaranteed that that there are all keys starting with the same string and ending with all numbers m with 1 < m < n are present in the map.
  6. It doesn't matter whether the original keys ending with numbers remain in the final set or not

Focus for the solution

The solution should not strictly focus on optimizing runtime or spacial complexity but rather readability and maintainability. There are about 200 entries in the map at most and there no high traffic is expected on the application.

Example

Input:

{
    "FIRST-KEY"   = "FOO",
    "SECOND-KEY-3"= "BAZ",
    "THIRD-KEY-2" = "BAR",
    "SECOND-KEY-1"= "FOO",
    "SECOND-KEY-2"= "BAR",
    "THIRD-KEY-1" = "FOO"
}

Expected Output:

{
    "FIRST-KEY" = "FOO",
    "SECOND-KEY"= ["FOO", "BAR", "BAZ"],
    "THIRD-KEY" = ["FOO", "BAR"]
}

or (if original keys remain in result):

{
    "FIRST-KEY"   = "FOO",
    "SECOND-KEY-3"= "BAZ",
    "THIRD-KEY-2" = "BAR",
    "SECOND-KEY-1"= "FOO",
    "SECOND-KEY-2"= "BAR",
    "THIRD-KEY-1" = "FOO",
    "FIRST-KEY"   = "FOO",
    "SECOND-KEY"  = ["FOO", "BAR", "BAZ"],
    "THIRD-KEY"   = ["FOO", "BAR"]
}

Final notes

My solution has to be implemented in ColdFusion. The input which I have to refered to as map in the beginning of my question is called struct in ColdFusion land.

You can formulate your answer in ColdFusion (script syntax prefered) but you can also choose any other language you prefer (including pseudo code) as long as you don't use another language's standard library that i can't use in ColdFusion.


Solution

  • Thanks to the ColdFusion community on Slack we came up with a solution that meets my requirements:

    data = data.reduce(function(acc, k, v) {
        var lastElement = listLast(k, '-');
        if(isNumeric(lastElement)) {
            var newKey = reReplace(k, '-\d+$', '');
            // init array if not initialized yet
            if(!acc.keyExists(newKey)) acc[newKey] = [];
            acc[newKey][lastElement] = v;
        }
    
        // may be put into an else block. This is only in here to attach the original keys in any case
        acc[k] = v;
    
        return acc;
    }, {});
    

    IMO it's very consice, compact and elegant. Moreover it's mostly self explanatory and can be understood intuitively.

    The idea is, that the whole problem can be boiled down to a single reduction. For each element we look at the last "segment" (seperated by -) and if it's numeric we can build the additoinal array for that key.

    If anyone disagrees or has a better solution I would be more than happy to see it.

    Lastly thanks for the effords Ageax, Credits for the solution go to rodel30