crandomdatasetkeypredictive

Producing a semi-RNG with a reproducible outcome in C (no random/urandom)


I wish to create a "randomness", or at least a large amount of entropy given certain data sets. The outcome must be predictable/constant, though, based on various factors, what datasets are used, etc. So I'm not looking to just read from random/urandom. I need this to be completely reproducible. I just want the spread to be varied enough across datasets using fairly limited key sizes. I suppose a rough idea:

[Key] --> [DatasetA] --> [Series of reproducible numbers]. Numbers that will significantly change based on a small modification to the key and any additional variable. Please note I'm trying to avoid just hashing as my requirement demands a lookup of datasets from the that will still simulate randomness without being random. It's procedurally-generated, but reproducible. I would prefer it be implementable in C as I want to use it for a (non-school) project. It's a hobby, but essentially I want to in the end be able to know certain criteria will produce very very varied results. Everything must be self-contained (no external dependence, just within the code and datasets, and keys).

If I'm looking at this from the wrong perspective, I'm open to other suggestions, but obviously it'd be preferable to not have to write the rest of the code-base from scratch.

I've already tried several "made-up" "algorithms" based on key size and contents and dataset contents. I'm not getting sufficient entropy, though.

[b]Update:[/b] Thanks to Severrin's answer, I was able to get a baseline for my key. After some tweaking by using the found piece of data and they using it to seed the whole process over, it's working. My "rooms" are very nicely varied and I can still reproduce them. I supposed it took just out of the box thinking on my part as well.

Thanks everyone,


Solution

  • Ok, here is simple Linear Congruential Generator for 64bit -> 64bit full period mapping, together with inverse function.

    #include <stdint.h>
    #include <stdio.h>
    
    // parameters from https://arxiv.org/pdf/2001.05304.pdf
    uint64_t m = 0xd1342543de82ef95ULL;
    uint64_t c = 0x1ULL;
    
    uint64_t im = 6281218453581128637ULL; // modular inverse from m using Mathematica ModularInverse[m, 2^64]
    
    uint64_t lcg(uint64_t xi) { // direct LCG
        return m*xi + c;
    }
    
     uint64_t ilcg(uint64_t xp) { // inverse LCG, such that ilcq(lcg(q)) == q
        return (xp - c)*im;
    }
    
    int main() {
        
        uint64_t idx = 987610342345234534ULL;
        
        printf( "LCG is %llu\n", lcg(idx) );
        printf( "INV is %llu\n", ilcg(lcg(135797531ULL)) );
    
        return 0;
    }
    

    And here is 128bit LCG

    typedef union {
        uint8_t     c[16];
        __uint128_t ui;
    } x128;
    
    const x128 zm = { .c = {0xdb,0x36,0x35,0x77,0x34,0xe3,0x4a,0xbb,0x00,0x50,0xd0,0x76,0x1f,0xcd,0xfc,0x15} }; // parameters from https://arxiv.org/pdf/2001.05304.pdf
    const x128 zc = { .c = {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01} };
    
    __uint128_t lcg(__uint128_t xi) {
        return zm.ui*xi + zc.ui;
    }
    
    void prn128(__uint128_t x) {
        printf("0x");
        for (int i = 0; i != 16; ++i) {
            printf("%02x", (int)((x >> 8*i)) & 0xff);
        }
        printf("\n");    
    }
    
    int main() {
        __uint128_t a = (__uint128_t)9129178291273918391ULL * 19284633241ULL * 3173197ULL;
        
        prn128(zm.ui);
        prn128(zc.ui);
        prn128(a);
        prn128(lcg(a));    
    
        return 0;
    }