Linq - convert an ILookup into another ILookup


This should be simple, but I can't think of a good way to do it. How do you transform an ILookup into another ILookup? For example, how would you copy/clone an ILookup, producing another ILookup with the same keys and same groups?

Here's my lame attempt:

static ILookup<TKey, TValue> Copy<TKey, TValue>(ILookup<TKey, TValue> lookup)
{
    return lookup
        .ToDictionary(
            grouping => grouping.Key,
            grouping => grouping.ToArray())
        .SelectMany(pair =>
            pair
                .Value
                .Select(value =>
                    new KeyValuePair<TKey, TValue>(pair.Key, value)))
        .ToLookup(pair => pair.Key, pair => pair.Value);
}

Can anyone improve this?

-- Brian


Solution

  • Does this do what you want?

    static ILookup<TKey, TValue> Copy<TKey, TValue>(ILookup<TKey, TValue> lookup)
    {
        return lookup
            .SelectMany(g => g,
                        (g, v) => new KeyValuePair<TKey, TValue>(g.Key, v))
            .ToLookup(kvp => kvp.Key, kvp => kvp.Value);
    }
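
A quick way to check that the copy keeps the same keys and groups is a small console program like the sketch below. The class name `LookupCopyDemo` and the sample words are made up for illustration; `Copy` is the method from above, restated so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LookupCopyDemo
{
    // Same Copy method as in the answer, repeated so this file is self-contained.
    public static ILookup<TKey, TValue> Copy<TKey, TValue>(ILookup<TKey, TValue> lookup)
    {
        return lookup
            .SelectMany(g => g, (g, v) => new KeyValuePair<TKey, TValue>(g.Key, v))
            .ToLookup(kvp => kvp.Key, kvp => kvp.Value);
    }

    static void Main()
    {
        // Hypothetical sample data: words grouped by their first letter.
        string[] words = { "apple", "avocado", "banana", "cherry", "cranberry" };
        ILookup<char, string> original = words.ToLookup(w => w[0]);

        ILookup<char, string> copy = Copy(original);

        // The copy is a distinct object with the same keys and the same groups.
        Console.WriteLine(ReferenceEquals(original, copy));  // False
        Console.WriteLine(copy.Count);                       // 3
        Console.WriteLine(string.Join(", ", copy['a']));     // apple, avocado
        Console.WriteLine(copy['z'].Any());                  // False (a missing key yields an empty sequence)
    }
}
```

ToLookup keeps keys in first-seen order and keeps the elements of each group in source order, so the copy enumerates the same way as the original.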
    

    Of course, if you want to transform the values somehow, maybe you want something like this:

    static ILookup<TKey, TValueOut> Transform<TKey, TValue, TValueOut>(
           ILookup<TKey, TValue> lookup,
           Func<TValue, TValueOut> selector)
    {
        return lookup
            .SelectMany(g => g,
                        (g, v) => new KeyValuePair<TKey, TValueOut>(g.Key, selector(v)))
            .ToLookup(kvp => kvp.Key, kvp => kvp.Value);
    }
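
As a rough usage sketch, Transform could be called like this to turn a lookup of words into a lookup of word lengths. The class name `LookupTransformDemo` and the sample data are assumptions for illustration only; the method body is the one shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LookupTransformDemo
{
    // Same Transform method as in the answer, repeated so this file is self-contained.
    public static ILookup<TKey, TValueOut> Transform<TKey, TValue, TValueOut>(
        ILookup<TKey, TValue> lookup,
        Func<TValue, TValueOut> selector)
    {
        return lookup
            .SelectMany(g => g, (g, v) => new KeyValuePair<TKey, TValueOut>(g.Key, selector(v)))
            .ToLookup(kvp => kvp.Key, kvp => kvp.Value);
    }

    static void Main()
    {
        // Hypothetical sample data: words grouped by their first letter.
        string[] words = { "apple", "avocado", "banana" };
        ILookup<char, string> byInitial = words.ToLookup(w => w[0]);

        // Keys stay the same; every value is replaced by its length.
        ILookup<char, int> lengths = Transform(byInitial, w => w.Length);

        foreach (IGrouping<char, int> group in lengths)
        {
            Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
        }
        // Prints:
        // a: 5, 7
        // b: 6
    }
}
```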
    

    Note that this method holds each intermediate value in a KeyValuePair, which is a value type (a struct), so no separate heap object is allocated per element. I profiled a test that creates a Lookup<int,int> with 100 keys, each having 10,000 items (1,000,000 items in total).
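
A comparison along those lines could be set up roughly as in the sketch below. This is not the original test: it uses Stopwatch timing instead of a profiler, and the names `LookupCopyTiming`, `CopyViaDictionary`, `CopyViaSelectMany`, `BuildLookup`, and `Time` are all hypothetical. The two copy methods are the one from the question and the one from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class LookupCopyTiming
{
    // The question's approach: build a dictionary of arrays, then flatten it again.
    public static ILookup<TKey, TValue> CopyViaDictionary<TKey, TValue>(ILookup<TKey, TValue> lookup)
    {
        return lookup
            .ToDictionary(g => g.Key, g => g.ToArray())
            .SelectMany(pair => pair.Value.Select(v => new KeyValuePair<TKey, TValue>(pair.Key, v)))
            .ToLookup(pair => pair.Key, pair => pair.Value);
    }

    // The answer's approach: flatten the groupings directly.
    public static ILookup<TKey, TValue> CopyViaSelectMany<TKey, TValue>(ILookup<TKey, TValue> lookup)
    {
        return lookup
            .SelectMany(g => g, (g, v) => new KeyValuePair<TKey, TValue>(g.Key, v))
            .ToLookup(kvp => kvp.Key, kvp => kvp.Value);
    }

    // Hypothetical helper: a lookup with keyCount keys and itemsPerKey items under each key.
    public static ILookup<int, int> BuildLookup(int keyCount, int itemsPerKey)
    {
        return Enumerable.Range(0, keyCount * itemsPerKey).ToLookup(i => i % keyCount);
    }

    // Hypothetical helper: wall-clock time of a single run of the given action.
    static TimeSpan Time(Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        // 100 keys with 10,000 items each, as in the test described above.
        ILookup<int, int> source = BuildLookup(100, 10_000);

        // Run each method once first so JIT compilation is not part of the measurement.
        CopyViaDictionary(source);
        CopyViaSelectMany(source);

        Console.WriteLine($"dictionary: {Time(() => CopyViaDictionary(source)).TotalMilliseconds} ms");
        Console.WriteLine($"SelectMany: {Time(() => CopyViaSelectMany(source)).TotalMilliseconds} ms");
    }
}
```

A single Stopwatch run is noisy, so several repetitions or a dedicated benchmarking tool would give more trustworthy numbers.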

    CPU-wise, even with 100,000 elements per key in the Lookup, the two copying methods performed identically. With 1,000,000 elements per key, their performance differed: