I'm studying fast (sub-quadratic) GCD computation algorithms and looking for any details of them. I would like to have a look at their implementation, to get a chance to understand them better.
The Euclid GCD and Binary GCD algorithms (with quadratic running times) are obviously quite simple and I have no problems with them.
The algorithms that I'm looking for are:
• Lehmer GCD algorithm,
• Accelerated GCD algorithm,
• k-ary algorithm,
• Knuth-Schönhage with FFT.
All of them are very difficult for me to understand. Can anyone please share the code (ideally in C++) with the implementation of any algorithm from the list or share any hints of how to implement them? I would appreciate any information, clues, or advice.
I could not find any information whatsoever, about the accelerated GCD algorithm. I've seen it was sometimes referred to as the most effective and fast gcd computation method on the medium inputs (~1000 bits).
I'm most interested in having a working implementation of Lehmer GCD algorithm (should be the easiest from the list).
Knuth explores the GCD at length in The Art of Computer Programming, Volume 2, Section 4.5.2. Knuth gives Algorithm E as the original method of Euclid, Algorithm A as the modern version of Euclid's algorithm, Algorithm B as the binary method and Algorithm L as Lehmer's method, as well as the extended Euclidean algorithm in Algorithm X. I describe (with code) the original Euclidean algorithm, the modern Euclidean algorithm, the binary algorithm, and the extended Euclidean algorithm at my blog.
This paper gives a good description of several versions of Schönhage's algorithms, including code.