Whenever I tried to get back the original value, I'll get "Hello World►/", "Hello Worlc", "Hello World►/", "Hello Worle♦S" or some other variants. I follow from this website for the implementation of splitting and reconstruction: https://www.geeksforgeeks.org/implementing-shamirs-secret-sharing-scheme-in-python/. This is my code:
import random
from math import ceil
from decimal import Decimal
FIELD_SIZE = 10**5
def reconstruct_secret(shares):
sums = Decimal(0)
for j, share_j in enumerate(shares):
xj, yj = share_j
prod = Decimal(1)
for i, share_i in enumerate(shares):
xi, _ = share_i
if i != j:
# Check if xi is equal to xj
if xi == xj:
continue
prod *= Decimal(xi) / (Decimal(xi) - Decimal(xj))
prod *= Decimal(yj)
sums += prod
return int(round(sums))
def polynom(x, coefficients):
point = 0
for coefficient_index, coefficient_value in enumerate(coefficients[::-1]):
point += x ** coefficient_index * coefficient_value
return point
def coeff(t, secret):
coeff = [random.randrange(0, FIELD_SIZE) for _ in range(t - 1)]
coeff.append(secret)
return coeff
def generate_shares(n, m, secret):
coefficients = coeff(m, secret)
shares = []
for i in range(1, n+1):
x = random.randrange(1, FIELD_SIZE)
shares.append((x, polynom(x, coefficients)))
return shares
if __name__ == '__main__':
# (3,5) sharing scheme
t, n = 3, 5
tt = "Hello World!!!"
secret = int.from_bytes(tt.encode(), byteorder='big')
# Phase I: Generation of shares
shares = generate_shares(n, t, secret)
# Phase II: Secret Reconstruction
pool = random.sample(shares, t)
# Reconstruct the secret
r_secret_int = reconstruct_secret(pool)
r_secret_string = r_secret_int.to_bytes(
(r_secret_int.bit_length() + 7) // 8, byteorder='big').decode('utf-8', 'ignore')
print("Reconstructed secret:", r_secret_string)
I consulted chatgpt, it suggested changing the algorithm for reconstruct_secret(shares) or increasing the FIELD_SIZE but the code it gave, the secret either returned blank or some gibberish.
TL;DR: reconstruct_secret
is buggy. It won't work for certain cases. It works for the approximation where secret
is small, but if secret
is large, it struggles to reconstruct the values from shares
. I'll explain why here.
Let's recap how this implementation of SSS works. Assuming that one has a polynomial of the form:
P(x, n-1) = S + a*x + b*x^2 + ... c*x^(n-1)
then, the coefficients a, b, ..., c
can be found if you have the value of P
and x
at exactly n
places on the curve. This is the Lagrange interpolation theorem. You can see more about this on the wikipedia page: https://en.wikipedia.org/wiki/Shamir%27s_secret_sharing
The code smell is the rounding + use of Decimal:
def reconstruct_secret(shares):
sums = Decimal(0)
for j, share_j in enumerate(shares):
xj, yj = share_j
prod = Decimal(1)
for i, share_i in enumerate(shares):
xi, _ = share_i
if i != j:
# Check if xi is equal to xj
if xi == xj:
continue
prod *= Decimal(xi) / (Decimal(xi) - Decimal(xj))
prod *= Decimal(yj)
sums += prod
return int(round(sums))
When we generate shares, we do that by picking a random polynomial. We do that by randomly picking coefficients between 0
and FIELD_SIZE
. The division here is not 100% accurate:
prod *= Decimal(xi) / (Decimal(xi) - Decimal(xj))
If FIELD_SIZE << secret
, the coefficients are very insignificant. Think of it this way:
P(x, 2) = 1000000000000000 + 2x + 3x^2
This polynomial is wholly dominated by the secret, 1000000000000000
. It's very hard to find the exact shape of the curve, because it is "mostly flat" in the vicinity of x = 0
. We use the x = 0
value to find the secret. You can see this as follows: the derivative of P(x, n)
is much much less than the value at P(0, n)
. That means you need a high resolution solution to find the answer. Small errors will "corrupt" the data.
When you reconstruct the secret, the algorithm you use has rounding error. That's because when you do a/b
, you don't get an exact answer. You get an answer that is approximately correct. Try it yourself:
>>> 1/3
0.3333333333333333
Those 3s actually keep going forever. See this bit which corresponds to the basis polynomials:
prod *= Decimal(xi) / (Decimal(xi) - Decimal(xj))
To do that accurately, (xi - xj)
shouldn't be "too big". If it is much much bigger than xi
, we'll exceed the accuracy for representing decimals.
So, if you have P(x) = 1000000000000000 + 2x + 3x^2
, these coefficients vary a lot, and therefore, the rounding error for division dominates your result, and corrupts the secret.
This problem is essentially catastrophic cancellation.
The easiest fix is to use a different algorithmic approach. The crux of the problem is that reconstruct_secret
is error prone for large values of secret
. The shares are fine, but that function won't accurately reconstruct the secret.
See here for alternative approaches: https://en.wikipedia.org/wiki/Shamir%27s_secret_sharing#Python_code