I'm working through the Hands-On Large Language Models book to learn more about LLMs. I'm trying to generate text with the "microsoft/Phi-3-mini-4k-instruct" model used in the book, but I get an error when running the example code:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    # device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Create a pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False
)

# The prompt
messages = [
    {"role": "user",
     "content": "Create a funny joke about chickens."}
]

# Generate output
output = generator(messages)
print(output[0]["generated_text"])
Which returns the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipython-input-1474234034.py in <cell line: 0>()
6
7 # Generate output
----> 8 output = generator(messages)
9 print(output[0]["generated_text"])
~/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-mini-4k-instruct/0a67737cc96d2554230f90338b163bc6380a2a85/modeling_phi3.py in prepare_inputs_for_generation(self, input_ids, past_key_values, attention_mask, inputs_embeds, **kwargs)
1289 if isinstance(past_key_values, Cache):
1290 cache_length = past_key_values.get_seq_length()
-> 1291 past_length = past_key_values.seen_tokens
1292 max_cache_length = past_key_values.get_max_length()
1293 else:
AttributeError: 'DynamicCache' object has no attribute 'seen_tokens'
The model loads fine, but I don't understand why this error occurs.
This issue is caused by the trust_remote_code=True parameter. Change it as follows:
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    # device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=False  # Change to False
)
trust_remote_code=True causes transformers to download and run outdated custom modeling code from the Hugging Face Hub. That legacy code accesses the past_key_values.seen_tokens attribute, which no longer exists on the DynamicCache class in current versions of the transformers library. Since Phi-3 support is now built into transformers, the custom code is no longer required, and the built-in implementation queries the cache length via get_seq_length() instead of seen_tokens.
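
As a quick sanity check (a minimal sketch, assuming a recent transformers release with built-in Phi-3 support), you can confirm that the native implementation is being used and that DynamicCache exposes get_seq_length() rather than seen_tokens:

from transformers import AutoModelForCausalLM, DynamicCache

# Load with the built-in implementation (no remote code).
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype="auto",
    trust_remote_code=False,
)

# The class should come from the installed transformers package
# (a module path like transformers.models.phi3...) rather than from a
# transformers_modules directory downloaded from the Hub.
print(type(model).__module__)

# DynamicCache no longer carries seen_tokens; get_seq_length() is the
# supported way to ask how many tokens the cache currently holds.
cache = DynamicCache()
print(hasattr(cache, "seen_tokens"))  # expected: False on recent versions
print(cache.get_seq_length())         # expected: 0 for an empty cache

If both checks look right, the pipeline code from the question should run unchanged apart from the trust_remote_code flag.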