Tags: keras, import, artificial-intelligence, tokenize

Keras tokenizer not appearing in import


I'm trying to generate captions with a model I trained (saved as .keras), loosely following these instructions: Link. I didn't follow them exactly: I built and trained the model with the Keras Image Captioning code example and saved it with a function that GPT-4 gave me, and that part works fine.

For caption generation, I asked GPT-4 for code examples, but they didn't work and I didn't understand them. After some research I found this Link, jumped to the section on generating captions, and the code there says I need to load a tokenizer:

tokenizer = load(open('tokenizer.pkl', 'rb'))

When I ran it, it told me (of course) that the file doesn't exist, so I'm trying to create a tokenizer.pkl. But I can't create it, because from keras.preprocessing.text import Tokenizer fails.

While trying to import Tokenizer, I found that it can live in two places: from keras.preprocessing.text import Tokenizer or from keras.legacy.preprocessing.text import Tokenizer. I have neither of them. My TensorFlow version is 2.17.0-dev20240410.

I've also tried copying the Tokenizer source code into my own code, but then I can't find the API export it depends on:

from keras.api_export import keras_export

Thanks in advance.


Solution

  • Most older documentation (which is most of the documentation out there) says to use from keras.preprocessing.text import Tokenizer, but Keras 3 replaced that tokenizer with the TextVectorization layer.

    If you look at the code example, you will see that it uses from keras.layers import TextVectorization. That layer does most of what Tokenizer did: it standardizes the text, splits it into tokens, and maps the tokens to integer indices. There is no separate Tokenizer class inside TextVectorization; the layer itself plays that role.

    So instead of importing Tokenizer, import TextVectorization from keras.layers and use the layer as your tokenizer, like this:

    from keras.layers import TextVectorization
    

    That is for Keras 3.

    If you want to use

    from keras.preprocessing.text import Tokenizer
    

    you can downgrade the version you're using to Keras 2.
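
If neither import path is available and downgrading is not an option, one workaround for the missing tokenizer.pkl is a small stand-in that covers only the pieces caption-generation code typically touches. The following is a minimal sketch, not the Keras implementation: the class name SimpleTokenizer and the helpers save_tokenizer and load_tokenizer are hypothetical, and it assumes the generation code only needs word_index, index_word, fit_on_texts and texts_to_sequences. For the indices to match the trained model, the vocabulary would have to be built from the same captions, with the same preprocessing, as during training.

```python
import pickle
import re
from collections import Counter


class SimpleTokenizer:
    """Hypothetical stand-in for a word-level tokenizer (not a Keras class)."""

    def __init__(self, oov_token=None):
        self.oov_token = oov_token
        self.word_index = {}  # word -> integer id, starting at 1
        self.index_word = {}  # integer id -> word

    @staticmethod
    def _split(text):
        # Lowercase and keep runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def fit_on_texts(self, texts):
        # Most frequent words get the lowest ids; the OOV token, if set, is id 1.
        counts = Counter(w for t in texts for w in self._split(t))
        vocab = [self.oov_token] if self.oov_token else []
        vocab += [w for w, _ in counts.most_common()]
        self.word_index = {w: i for i, w in enumerate(vocab, start=1)}
        self.index_word = {i: w for w, i in self.word_index.items()}

    def texts_to_sequences(self, texts):
        # Unknown words map to the OOV id, or are dropped if no OOV token is set.
        oov = self.word_index.get(self.oov_token)
        sequences = []
        for t in texts:
            ids = [self.word_index.get(w, oov) for w in self._split(t)]
            sequences.append([i for i in ids if i is not None])
        return sequences


def save_tokenizer(tok, path):
    # Store plain data only, so loading does not depend on where the class lives.
    state = {"oov_token": tok.oov_token, "word_index": tok.word_index}
    with open(path, "wb") as f:
        pickle.dump(state, f)


def load_tokenizer(path):
    with open(path, "rb") as f:
        state = pickle.load(f)
    tok = SimpleTokenizer(oov_token=state["oov_token"])
    tok.word_index = state["word_index"]
    tok.index_word = {i: w for w, i in tok.word_index.items()}
    return tok
```

With this sketch, save_tokenizer(tok, 'tokenizer.pkl') writes the file and load_tokenizer('tokenizer.pkl') would take the place of load(open('tokenizer.pkl', 'rb')) in the generation script. Pickling only the vocabulary, rather than the object itself, is a deliberate choice: the file then loads even if the class is defined in a different module.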