kotlin, lambda, sequence, memory-footprint

How to use Kotlin's sequences and lambdas in a way that's memory efficient


So I'm writing some code that needs to be both memory efficient and fast. I already have a working reference implementation in Java, and I'm rewriting it in Kotlin.

I basically need to load a lot of CSV files into a tree once, and then traverse that tree repeatedly once it's loaded.

I originally wrote the whole thing using sequences, but found that it caused the GC to spike repeatedly.

I can't really share this code, but was wondering if yall know what would cause this to happen.

I'll be happy to add details as you need them, but here's my basic pattern.

step1: inputStream -> csvLines: List<String>

step2: csvLines.drop(x).fold(emptySequence()) -> callOtherFunctionWithFold -> callOtherFunctionWithFold -> Sequence<OutputObjects>

I keep csvLines as a separate list because I access specific rows based on the rules I need.

step3: Sequence<OutputObjects> -> nodes

The result is functional, but this code is much less memory efficient and slower than the Java equivalent, which only uses ArrayLists and modifies them in place.

After looking at the VisualVM output, I saw that I had created a ton of kotlin.*.ArrayIterators. It looks like I create one every time I use a lambda.

So what can I do to make this more efficient? I thought sequences were supposed to reduce object creation by evaluating lazily, but it looks like I'm doing things that break their ability to do so.

Do sequences reevaluate after every GC run, or on every run in general? If so, that would make them unsuitable to use in objects that are loaded at startup, right?

[Screenshot: VisualVM run]


Solution

  • To use Kotlin sequences, you need to start with asSequence()

    csvLines.asSequence()
        .drop(x)
        .fold(...)
        ...
    

    If you leave that out, the Collection (Iterable) extension functions are used instead, and each intermediate operation such as drop eagerly creates a new intermediate collection.
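
To make the difference concrete, here is a minimal sketch rather than the asker's actual code. The sample rows, the `eagerKeys` and `lazyKeys` helpers, and the "take the text before the first comma" step are all made up for illustration. It also touches on the re-evaluation question: a sequence built this way runs its lambdas again on every terminal operation, so for data that is loaded once at startup it probably makes sense to materialize the result a single time (for example with `toList()`) and keep that.

```kotlin
// Hypothetical stand-in for the real CSV rows (first row is a header).
val sampleLines: List<String> = listOf("header", "a,1", "b,2", "c,3")

// Eager: List.drop and List.map each allocate a new intermediate List.
fun eagerKeys(lines: List<String>): List<String> =
    lines
        .drop(1)
        .map { it.substringBefore(',') }

// Lazy: asSequence() wraps the list; drop and map only wrap the sequence,
// and no element is processed until a terminal operation is called.
fun lazyKeys(lines: List<String>): Sequence<String> =
    lines.asSequence()
        .drop(1)
        .map { it.substringBefore(',') }

fun main() {
    var mapCalls = 0
    val keys: Sequence<String> = sampleLines.asSequence()
        .drop(1)
        .map { mapCalls++; it.substringBefore(',') }

    println(mapCalls)      // 0: nothing has been evaluated yet
    val loaded = keys.toList()
    println(loaded)        // [a, b, c]
    println(mapCalls)      // 3: one call per remaining row

    // A second terminal operation walks the whole chain again.
    keys.toList()
    println(mapCalls)      // 6

    // The eager version yields the same values, at the cost of two temporary lists.
    println(eagerKeys(sampleLines) == loaded) // true
}
```

If the values are needed more than once, holding on to `loaded` (the materialized list) avoids re-running the chain on each traversal.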