string random go

What is the fastest way to generate a long random string in Go?


For example, an [a-zA-Z0-9] string:

na1dopW129T0anN28udaZ

or a hexadecimal string:

8c6f78ac23b4a7b8c0182d

By long I mean 2K characters or more.


Solution

  • This does about 200 MB/s on my box. There's obvious room for improvement.

    type randomDataMaker struct {
        src rand.Source
    }
    
    func (r *randomDataMaker) Read(p []byte) (n int, err error) {
        for i := range p {
            p[i] = byte(r.src.Int63() & 0xff)
        }
        return len(p), nil
    }
    

    You'd use io.CopyN to copy as many bytes as you need into a buffer and build the string from that. You can map the bytes onto whatever character set you want along the way.
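
    A minimal sketch of that io.CopyN step for the hexadecimal case. The helper names randomHex and newSeeded are hypothetical, and the fixed seed is only there to make the output repeatable:

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"io"
	"math/rand"
)

// randomDataMaker is the reader from the answer: it fills p with
// pseudo-random bytes taken from src.
type randomDataMaker struct {
	src rand.Source
}

func (r *randomDataMaker) Read(p []byte) (n int, err error) {
	for i := range p {
		p[i] = byte(r.src.Int63())
	}
	return len(p), nil
}

// newSeeded (hypothetical) returns a reader with a fixed seed.
func newSeeded(seed int64) io.Reader {
	return &randomDataMaker{rand.NewSource(seed)}
}

// randomHex (hypothetical) copies n/2 bytes, rounded up, from r and
// hex-encodes them, returning exactly n characters.
func randomHex(r io.Reader, n int) (string, error) {
	var buf bytes.Buffer
	if _, err := io.CopyN(&buf, r, int64((n+1)/2)); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf.Bytes())[:n], nil
}

func main() {
	s, err := randomHex(newSeeded(42), 2048)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(s), s[:16])
}
```

    An odd n is handled by encoding one extra byte and trimming the last hex digit.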

    The nice thing about this model is that it's just an io.Reader, so you can use it anywhere a reader is accepted.
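
    Because only io.Reader is required, the same kind of helper works with any byte source. The sketch below builds an [a-zA-Z0-9] string by rejection sampling. randomAlnum and newSeeded are hypothetical names, and *rand.Rand (which also implements io.Reader) stands in for randomDataMaker to keep the example short:

```go
package main

import (
	"fmt"
	"io"
	"math/rand"
)

// alphabet holds the 62 characters of [a-zA-Z0-9].
const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// newSeeded (hypothetical) returns a deterministic byte source.
func newSeeded(seed int64) io.Reader {
	return rand.New(rand.NewSource(seed))
}

// randomAlnum (hypothetical) reads bytes from r and maps them onto
// alphabet until n characters have been produced.
func randomAlnum(r io.Reader, n int) (string, error) {
	out := make([]byte, 0, n)
	buf := make([]byte, 512)
	for len(out) < n {
		if _, err := io.ReadFull(r, buf); err != nil {
			return "", err
		}
		for _, b := range buf {
			// Keep 6 bits (0..63) and reject 62 and 63 so that every
			// character stays equally likely.
			if idx := int(b & 63); idx < len(alphabet) && len(out) < n {
				out = append(out, alphabet[idx])
			}
		}
	}
	return string(out), nil
}

func main() {
	s, err := randomAlnum(newSeeded(42), 2048)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(s), s[:21])
}
```

    Rejecting the values 62 and 63 wastes about 3% of the input bytes but keeps the distribution even; a plain modulo over a byte would favor the first few characters.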

    The benchmark is below:

    func BenchmarkRandomDataMaker(b *testing.B) {
        randomSrc := randomDataMaker{rand.NewSource(1028890720402726901)}
        for i := 0; i < b.N; i++ {
            b.SetBytes(int64(i))
            _, err := io.CopyN(ioutil.Discard, &randomSrc, int64(i))
            if err != nil {
                b.Fatalf("Error copying at %v: %v", i, err)
            }
        }
    }
    

    On one core of my 2.2GHz i7:

    BenchmarkRandomDataMaker       50000        246512 ns/op     202.83 MB/s
    

    EDIT

    Since I'd written the benchmark, I figured I'd make the obvious improvement: call the random source less often. With 1/8 the calls to rand, it runs about 4x faster, though it's a bit uglier:

    New version:

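    func (r *randomDataMaker) Read(p []byte) (n int, err error) {
        todo := len(p)
        if todo == 0 {
            return 0, nil // nothing to fill; avoids indexing an empty slice
        }
        offset := 0
        for {
            // One Int63 call feeds up to eight bytes. Int63 yields only 63
            // random bits, so the top bit of every eighth byte is always 0.
            val := r.src.Int63()
            for i := 0; i < 8; i++ {
                p[offset] = byte(val & 0xff)
                todo--
                if todo == 0 {
                    return len(p), nil
                }
                offset++
                val >>= 8
            }
        }
    }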
    

    New benchmark:

    BenchmarkRandomDataMaker      200000        251148 ns/op     796.34 MB/s
    

    EDIT 2

    I took out the masking in the cast to byte since it was redundant (the conversion already keeps only the low 8 bits). That made it about 8% faster:

    BenchmarkRandomDataMaker      200000        231843 ns/op     862.64 MB/s
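
    The mask is redundant because a Go conversion to byte already truncates to the low 8 bits. The sketch below reconstructs Read without it (the answer does not show this version, so the loop shape is an assumption) and compares its output with a masked reference. maskedBytes and unmaskedBytes are hypothetical helpers:

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
)

type randomDataMaker struct {
	src rand.Source
}

// Read fills p, spending one Int63 call per eight bytes. byte(val)
// keeps only the low 8 bits, so no "& 0xff" is needed.
func (r *randomDataMaker) Read(p []byte) (n int, err error) {
	for offset := 0; offset < len(p); {
		val := r.src.Int63()
		for i := 0; i < 8 && offset < len(p); i++ {
			p[offset] = byte(val)
			offset++
			val >>= 8
		}
	}
	return len(p), nil
}

// unmaskedBytes reads n bytes through Read using the given seed.
func unmaskedBytes(seed int64, n int) []byte {
	p := make([]byte, n)
	r := &randomDataMaker{rand.NewSource(seed)}
	r.Read(p)
	return p
}

// maskedBytes produces the same stream with the explicit mask.
func maskedBytes(seed int64, n int) []byte {
	src := rand.NewSource(seed)
	out := make([]byte, 0, n)
	for len(out) < n {
		val := src.Int63()
		for i := 0; i < 8 && len(out) < n; i++ {
			out = append(out, byte(val&0xff))
			val >>= 8
		}
	}
	return out
}

func main() {
	// prints "true": both variants yield identical bytes
	fmt.Println(bytes.Equal(unmaskedBytes(1, 2048), maskedBytes(1, 2048)))
}
```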
    

    (this is so much easier than real work sigh)

    EDIT 3

    This came up on IRC today, so I released a library. Also, my actual benchmark tool, while useful for relative speed, isn't sufficiently accurate in its reporting.

    I created randbo, which you can reuse to produce random streams wherever you may need them.
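
    On the reporting accuracy: the benchmark above calls b.SetBytes with the loop index, so the MB/s figure is computed from the last and largest copy, while the average copy is about half that size. The reported throughput is therefore probably about twice the real rate. A sketch of a fixed-size variant, assuming Go 1.16+ for io.Discard. benchmarkFixed, runFixed, and the 64 KiB chunk size are hypothetical choices:

```go
package main

import (
	"fmt"
	"io"
	"math/rand"
	"testing"
)

type randomDataMaker struct {
	src rand.Source
}

func (r *randomDataMaker) Read(p []byte) (n int, err error) {
	for i := range p {
		p[i] = byte(r.src.Int63())
	}
	return len(p), nil
}

// chunk is the number of bytes copied per iteration (arbitrary choice).
const chunk = 64 << 10

// benchmarkFixed copies the same number of bytes on every iteration,
// so the value given to SetBytes matches the work actually done.
func benchmarkFixed(b *testing.B) {
	src := &randomDataMaker{rand.NewSource(1)}
	b.SetBytes(chunk)
	for i := 0; i < b.N; i++ {
		if _, err := io.CopyN(io.Discard, src, chunk); err != nil {
			b.Fatal(err)
		}
	}
}

// runFixed runs the benchmark outside "go test" and returns the
// iteration count and the bytes recorded per iteration.
func runFixed() (iterations int, bytesPerOp int64) {
	res := testing.Benchmark(benchmarkFixed)
	return res.N, res.Bytes
}

func main() {
	// Prints iterations, ns/op and MB/s for the fixed-size copy.
	fmt.Println(testing.Benchmark(benchmarkFixed))
}
```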