kotlinbase-conversion

Converting a BigInteger to a string in some base: speed issue (Kotlin)


I want to convert large positive BigIntegers (up to many thousands of digits) to strings in Kotlin, using number bases up to 100 (and possibly higher).

For bases ≤36, there is of course toString(base), but for larger bases, I wrote this extension function:

fun BigInteger.toStringX(base: Int): String {
    if (base <= 36) return toString(base)  // improve speed if base <= 36
    val bigBase = base.toBigInteger()
    var n = this
    val stringBuilder = StringBuilder()
    while (n > ZERO) {
        val div = n.divideAndRemainder(bigBase)
        stringBuilder.append(DIGITS[div[1].toInt()])
        n = div[0]
    }
    return stringBuilder.reverse().toString()
}

where DIGITS is a string containing the list of digits.

Now the native toString is faster by about an order of magnitude than my function – e.g. 60 ms for about 10,000 digits vs. 500 ms. Why is my function so slow? Any help improving on its speed (while maintinaing the ability to convert to bases > 36) would be appreciated.

(By the way, replacing append() with insert() and losing reverse() in the last line doesn't change much.)


Solution

  • Looking at the source code for the built-in toString, it seems to call this private toString, which implements a divide-and-conquer algorithm.

    /**
     * Converts the specified BigInteger to a string and appends to
     * {@code sb}.  This implements the recursive Schoenhage algorithm
     * for base conversions. This method can only be called for non-negative
     * numbers.
     * <p>
     * See Knuth, Donald,  _The Art of Computer Programming_, Vol. 2,
     * Answers to Exercises (4.4) Question 14.
     *
     * @param u      The number to convert to a string.
     * @param sb     The StringBuilder that will be appended to in place.
     * @param radix  The base to convert to.
     * @param digits The minimum number of digits to pad to.
     */
    private static void toString(BigInteger u, StringBuilder sb,
                                 int radix, int digits) {
    
        ...
    
        results = u.divideAndRemainder(v);
    
        int expectedDigits = 1 << n;
    
        // Now recursively build the two halves of each number.
        toString(results[0], sb, radix, digits - expectedDigits);
        toString(results[1], sb, radix, expectedDigits);
    
    }
    

    This means that there is only O(log(N)) divisions, for an N-bit number. Compare this to your algorithm, which does O(N) divisions.

    So for large numbers, you can implement this algorithm too - "split" the number up into smaller ones, then use your O(N) algorithm when they are small enough, all the while passing the string builder along, so the digits are appended.