I have built an XGBoost model in R and exported it to PMML (with r2pmml).
I have scored the same dataset both in R and via the PMML file (with Java); the output probabilities are very close, but they all differ by a small amount, between 1e-10 and 1e-8.
These differences are too small to be caused by an issue with the input data.
Is this typical rounding behaviour between different languages/software, or did I make a mistake somewhere?
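
For reference, here is the kind of check I am running (a minimal sketch with made-up numbers; the real arrays come from R's predict() and from scoring the PMML file in Java):

```java
public class CompareProbabilities {

    public static void main(String[] args) {
        // Made-up stand-ins for the real outputs: rProbs from R's predict(),
        // pmmlProbs from scoring the same rows with the PMML file in Java.
        double[] rProbs    = {0.12345678, 0.87654321, 0.50000012};
        double[] pmmlProbs = {0.12345679, 0.87654322, 0.50000011};

        // Largest absolute difference across all rows.
        double maxDiff = 0.0;
        for (int i = 0; i < rProbs.length; i++) {
            maxDiff = Math.max(maxDiff, Math.abs(rProbs[i] - pmmlProbs[i]));
        }
        System.out.println("max abs difference: " + maxDiff); // on the order of 1e-8
    }
}
```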
> the output probabilities are very close, but they all differ by a small amount, between 1e-10 and 1e-8.
The XGBoost library uses the float32 data type (single-precision floating point), which has a "natural precision" of around 1e-7 to 1e-8 in this value range (probabilities between 0 and 1).
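
You can see this precision limit directly by asking for the spacing between adjacent float32 values (the ULP) in that range; a quick Java check, with the probe values chosen here being arbitrary:

```java
public class Float32Spacing {

    public static void main(String[] args) {
        // Math.ulp(x) returns the gap between x and the next representable
        // float32 value; no float32 computation can be more accurate than this.
        System.out.println(Math.ulp(0.01f)); // ~9.3e-10
        System.out.println(Math.ulp(0.5f));  // ~6.0e-8
        System.out.println(Math.ulp(0.99f)); // ~6.0e-8
    }
}
```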
So, your observed difference is smaller than this "natural precision", and should not be a cause for further concern.
The (J)PMML representation carries out exactly the same computations as the native XGBoost representation (summing the booster's float values, then applying a normalization function to the result).
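
To illustrate the magnitude involved (a sketch with a made-up margin value, not taken from any real model), nudging the accumulated float32 margin by a single ULP before the logistic normalization already moves the final probability by roughly 1e-8:

```java
public class UlpEffect {

    public static void main(String[] args) {
        // Made-up accumulated booster margin (the sum of the tree scores).
        float margin = 0.7321f;

        // The nearest representable float32 neighbour, one ULP away.
        float perturbed = Math.nextUp(margin);

        // Logistic normalization, as used for binary:logistic objectives.
        double p1 = 1.0 / (1.0 + Math.exp(-margin));
        double p2 = 1.0 / (1.0 + Math.exp(-perturbed));

        // The difference is on the order of 1e-8, the same magnitude as observed.
        System.out.println(Math.abs(p2 - p1));
    }
}
```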