python-3.x, pytorch, encoder-decoder

Prediction for a pretrained model on handwritten text (images) - PyTorch


I have a problem making a prediction with a pre-trained model that contains an encoder and a decoder for handwritten text recognition. This is what I did:

checkpoint = torch.load("Model/SPAN/SPAN-PT-RA_rimes.pt",map_location=torch.device('cpu'))
encoder_state_dict = checkpoint['encoder_state_dict']
decoder_state_dict = checkpoint['decoder_state_dict']

img = torch.LongTensor(img).unsqueeze(1).to(torch.device('cpu'))
global_pred = decoder_state_dict(encoder_state_dict(img))

This generates this error:

TypeError: 'collections.OrderedDict' object is not callable

I would highly appreciate your help! ^_^


Solution

  • encoder_state_dict and decoder_state_dict are not PyTorch models themselves; they are OrderedDicts (dictionaries) that map parameter names to tensors. Those tensors are the pre-trained parameters stored in the checkpoint you loaded.

    Calling such a collection of tensors as if it were a function (i.e. feeding your transformed input image to it) cannot work, which is exactly what the 'collections.OrderedDict' object is not callable error tells you. Instead, you should load these state_dicts into model objects whose architecture matches the network the checkpoint was trained with (classes derived from torch.nn.Module), and then call those models on the image, as sketched below.
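Here is a minimal sketch of the usual loading pattern. The Encoder and Decoder classes below are dummy placeholders for illustration; for the SPAN checkpoint you would import the actual encoder and decoder classes from the repository the checkpoint came from, so that their parameter names match the keys in the state dicts.

import torch
from torch import nn

# Placeholder architectures for illustration only. Replace them with the real
# encoder/decoder classes used to train the checkpoint, otherwise
# load_state_dict() will complain about mismatched parameter names.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

class Decoder(nn.Module):
    def __init__(self, num_classes=100):
        super().__init__()
        self.conv = nn.Conv2d(16, num_classes, kernel_size=1)

    def forward(self, x):
        return self.conv(x)

device = torch.device('cpu')

# 1. Build the model objects (the architecture, with randomly initialised weights).
encoder = Encoder().to(device)
decoder = Decoder().to(device)

# 2. Copy the pre-trained tensors from the checkpoint into those models.
checkpoint = torch.load("Model/SPAN/SPAN-PT-RA_rimes.pt", map_location=device)
encoder.load_state_dict(checkpoint['encoder_state_dict'])
decoder.load_state_dict(checkpoint['decoder_state_dict'])

# 3. Run inference. The image should be a float tensor of shape
#    (batch, channels, height, width), not a LongTensor.
encoder.eval()
decoder.eval()
with torch.no_grad():
    img = torch.rand(1, 1, 64, 256, device=device)  # dummy image for illustration
    global_pred = decoder(encoder(img))

Calling eval() switches layers such as dropout and batch normalisation to inference behaviour, and torch.no_grad() disables gradient tracking, which is what you want at prediction time.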