arraysgopointerstype-conversionslice

How do you flatten a slice of arrays without extra allocation?


In Go, if I have a slice of arrays of a given type, how can I convert that to just a slice of that same type?

Specifically, is it possible to do this without allocating memory for a new backing array? (Meaning, I'm ignoring the obvious option of looping through the original slice and copying its values into a new slice.)

For example, what I want is:

func convert(sliceOfArrays [][4]byte) []byte {
    // solution goes here
}

The exact types and array lengths are unimportant.

I asked and answered my own question since I couldn't find anything like this on SO and wanted it recorded.


Solution

  • (Note, this answer and its code comments assume an understanding of how slices in Go are represented in memory, specifically "backing arrays". If unfamiliar, read here: https://go.dev/blog/slices-intro)

    Yes, this is possible without allocating memory for a new backing array. Using the unsafe package, this is fairly straightforward. Here is a heavily-commented example:

    import "unsafe"
    
    func convert(sliceOfArrays [][4]byte) []byte {
        // create an *unsafe.ArbitraryType, specifically *[4]byte, from the original slice's backing array
        sliceData := unsafe.SliceData(sliceOfArrays)
    
        // convert to an unsafe.Pointer; this is necessary to allow the next step to be possible
        pointer := unsafe.Pointer(sliceData)
    
        // convert to a *byte
        // NOTE this is a pointer to a single byte, not a slice of bytes
        // from the documentation: "unsafe.Pointer can be converted to a pointer value of any type"
        bytePointer := (*byte)(pointer)
    
        // calculate how many total elements (bytes, in this case) will be in the resulting slice
        // in this case, this is just 4 * len(sliceOfArrays) / 1
        length := int(unsafe.Sizeof([4]byte{})) * len(sliceOfArrays) / int(unsafe.Sizeof(byte(0)))
        // more generally, this formula is:
        // numberOfResultingElements = sizeOfOriginalElement * numberOfOriginalElements / sizeOfResultingElement
    
        // unsafe.Slice creates a new slice of bytes
        // the backing array of this slice starts at the memory location of `bytePointer`
        // the slice has both length and capacity of `length`
        return unsafe.Slice(bytePointer, length)
    }
    

    Note, the original slice sliceOfArrays and the return value of convert() are sharing the same backing array, so modification to one will affect the other:

    mySliceOfArrays := [][4]byte{{0, 1, 2, 3}, {4, 5, 6, 7}}
    sliceOfBytes := convert(mySliceOfArrays)
    
    fmt.Println(mySliceOfArrays) // [[0 1 2 3] [4 5 6 7]]
    fmt.Println(sliceOfBytes)    // [0 1 2 3 4 5 6 7]
    
    // changing the original changes the result
    mySliceOfArrays[0][2] = 99
    fmt.Println(sliceOfBytes)    // [0 1 99 3 4 5 6 7]
    
    // changing the result changes the original
    sliceOfBytes[7] = 13
    fmt.Println(mySliceOfArrays) // [[0 1 99 3] [4 5 6 13]]
    

    All this code in the Go Playground

    (Note, new allocations such as from sliceOfBytes = append(sliceOfBytes, value) will likely result in a new backing array being transparently allocated for sliceOfBytes and its "connection" to mySliceOfArrays being broken.)

    This works, in part, because:

    unsafe.Slice doesn't really care about what the original data was meant to represent. You could change convert() to return []float64 if you wanted to get weird, as long as you re-calculate the slice length appropriately. In the above example, you'd need two [4]byte for each float64 in the result. Here is convertToFloat64s() in the Go Playground

    (If anyone knows of non-obvious issues with this method, please leave a comment.)