I've seen examples of real time scoring used in credit card fraud detection, but I'm not seeing how scoring can achieve such a task. I think I'm fundamentally misunderstanding scoring.
My understanding is: "scoring a model" (in the case of classification models) means predicting on datasets for which we already know the answers, then evaluating those predictions by counting how many were wrong vs. correct. For example, if a model made 50 mistakes out of 100 predictions, the model is 50% accurate, and that is its score.
But I don't get how doing this in real time can detect fraud. If we don't know whether the transaction is fraudulent or not (since it's not historical data), how can scoring achieve fraud detection?
Or is scoring actually the "confidence" of the prediction? That is, when I make a prediction on unseen data, a classification model might tell me that it is 80% sure its prediction is correct. Is the score 80% in this case?
I've also seen scoring defined as applying a model to a new dataset. Isn't that the same as a prediction?
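First, what "score" means depends on the metric you have chosen to measure your model's output or performance. It can be a per-prediction confidence, an accuracy computed on labelled data, or any other model evaluation metric. You have to decide which metric to use and which works best for your problem, and its output is what gets called the score. A per-prediction score such as a confidence or probability does not require knowing the true label, which is why it can be computed for new transactions.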
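To make the two meanings concrete, here is a minimal sketch in plain Python. The weights, feature names (amount, foreign) and the 0.5 threshold are all made up for illustration and stand in for whatever trained classifier you actually have. accuracy() is a score of the model and needs known labels; fraud_probability() is a score of a single transaction and needs no label.

```python
import math

# Hypothetical, hand-picked weights standing in for a trained classifier.
WEIGHTS = {"amount": 0.004, "foreign": 1.5}
BIAS = -4.0


def fraud_probability(txn):
    """Per-transaction score: estimated probability that txn is fraud."""
    z = BIAS + sum(WEIGHTS[name] * txn[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(labelled, threshold=0.5):
    """Model-level score: fraction of labelled transactions predicted correctly."""
    correct = sum(
        (fraud_probability(txn) >= threshold) == bool(label)
        for txn, label in labelled
    )
    return correct / len(labelled)


# Historical data: the true label (1 = fraud) is known, so we can evaluate.
labelled = [
    ({"amount": 20, "foreign": 0}, 0),
    ({"amount": 1500, "foreign": 1}, 1),
    ({"amount": 900, "foreign": 0}, 1),
    ({"amount": 60, "foreign": 1}, 0),
]
model_score = accuracy(labelled)

# New data: no label exists yet, but the model still produces a score.
new_txn = {"amount": 1200, "foreign": 1}
txn_score = fraud_probability(new_txn)

print(f"accuracy on labelled data: {model_score:.2f}")
print(f"fraud score for new transaction: {txn_score:.2f}")
```

When people talk about real-time scoring for fraud detection, they usually mean the second kind: the value computed for each incoming transaction, which is then compared against a threshold.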
The difference between Real Time Scoring and Batch Scoring:
Let us say you are building a fraud detection model. You will have to assign a score to each transaction. There are two ways to do it.
Real Time Scoring - You receive the features in real time, do all the preprocessing, and pass them through the model to get the prediction. All of this happens as each transaction arrives, giving immediate results. The advantage is that users or systems do not have to wait for the results.
Batch Scoring - When the model makes its predictions periodically on accumulated data, it is called batch inference or batch scoring. For example, if you run your system for predictions every hour or every midnight, it is scoring in batches.
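The two modes can be sketched side by side. This is only an illustration under stated assumptions: the weights, the field names (id, amount, country, home_country), the threshold, and the CSV text standing in for an hourly export are all hypothetical, and a real system would load a trained model and read from a queue or a database instead.

```python
import csv
import io
import math

# Hypothetical weights standing in for a trained fraud model.
BIAS, W_AMOUNT, W_FOREIGN = -4.0, 0.004, 1.5


def fraud_score(amount, foreign):
    """Model output for one transaction: estimated fraud probability."""
    z = BIAS + W_AMOUNT * amount + W_FOREIGN * foreign
    return 1.0 / (1.0 + math.exp(-z))


def score_realtime(event, threshold=0.5):
    """Real-time path: preprocess and score a single event as it arrives."""
    amount = float(event["amount"])
    foreign = 1 if event["country"] != event["home_country"] else 0
    score = fraud_score(amount, foreign)
    return {"id": event["id"], "score": score, "flag": score >= threshold}


def score_batch(rows, threshold=0.5):
    """Batch path: score every accumulated row in one scheduled run."""
    return [score_realtime(row, threshold) for row in rows]


# Real time: one call per incoming transaction, answer returned immediately.
decision = score_realtime(
    {"id": "t9", "amount": "1200", "country": "FR", "home_country": "US"}
)

# Batch: a scheduler (hourly, nightly, ...) hands over stored transactions.
HOURLY_EXPORT = """id,amount,country,home_country
t1,20,US,US
t2,1500,BR,US
t3,900,US,US
"""
batch_results = score_batch(csv.DictReader(io.StringIO(HOURLY_EXPORT)))
flagged = [r["id"] for r in batch_results if r["flag"]]

print(decision)
print(flagged)
```

In this sketch the model and the preprocessing are identical in both paths. What changes is when the scoring function is called and how many transactions it sees per call.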
Both approaches have their pros and cons, but in general the choice depends on business stakeholders and business requirements.