I am trying to implement a QA system using models from huggingface. One thing I do not understand is, when I don't specify which pre-trained model I am using for question-answering, is the model chosen at random?
from transformers import pipeline
# Allocate a pipeline for question-answering
question_answerer = pipeline('question-answering')
question_answerer({
    'question': 'What is the name of the repository ?',
    'context': 'Pipeline have been included in the huggingface/transformers repository'
})
Output:
{'score': 0.5135612454720828, 'start': 35, 'end': 59, 'answer': 'huggingface/transformers'}
I know how to specify a model by adding the name of the model (bert-base-uncased for example) as a model parameter, but which one is it using when you are not specifying anything? Does it use a combination of all models on huggingface? I could not find the answer.
The model is not chosen randomly. Every task in the pipeline has a registered default: a model that was fine-tuned on an objective and dataset close to that task. For example, the sentiment-analysis pipeline picks a model trained on the SST task. Likewise, for question-answering it uses the AutoModelForQuestionAnswering class with distilbert-base-cased-distilled-squad as the default model, since the SQuAD dataset is associated with the question-answering task.
To get the full list, you can look at the SUPPORTED_TASKS variable here.
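You can also inspect that table at runtime. Note that the exact dictionary layout of SUPPORTED_TASKS can differ between transformers versions, so treat this as a sketch rather than a stable API:

```python
# Look up the registered default for a task directly from the
# pipelines module (internal structure may vary by version).
from transformers.pipelines import SUPPORTED_TASKS

print(sorted(SUPPORTED_TASKS.keys()))
print(SUPPORTED_TASKS["question-answering"]["default"])
```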