I am working on a project that analyses consumer behaviour on websites and predicts malicious user activity in real time. Click data is collected for every click a user makes.
I am using several AWS services: Kinesis Data Streams, Lambda and SageMaker. I have built an autoencoder model and deployed it as a SageMaker endpoint, which a Lambda function invokes whenever it receives new click data from the website through the Kinesis stream.
The SageMaker endpoint contains only the model, but the click data the Lambda function receives is raw data containing URLs, text and dates. How can I run the raw data through the required preprocessing steps and then send the processed data to the SageMaker endpoint in the expected format?
Example of raw data:
{'URL':'www.amazon.com.au/ref=nav_logo', 'Text':'Home', 'Information':'Computers'}
You can use a SageMaker inference pipeline. Write a preprocessing script that contains your preprocessing steps, package it as a preprocessing model, and chain it with your autoencoder model in a pipeline. Deploy the pipeline to a single endpoint for real-time inference; the endpoint runs the preprocessing container first and feeds its output to the model, so your Lambda can send the raw click data as-is.
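A minimal sketch of how the pipeline could be assembled with the SageMaker Python SDK. The script name `preprocess.py`, the S3 paths, image URI, role ARN and endpoint name are placeholders you would replace with your own; the preprocessing script is assumed to implement the `input_fn`/`predict_fn`/`output_fn` hooks that turn raw click JSON into model features.

import sagemaker
from sagemaker.sklearn import SKLearnModel
from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# Step 1: preprocessing container. The entry-point script converts the raw
# click JSON (URL, Text, Information, date) into the numeric features the
# autoencoder expects.
preprocess_model = SKLearnModel(
    model_data="s3://my-bucket/preprocess/model.tar.gz",  # placeholder: fitted transformers
    role=role,
    entry_point="preprocess.py",                          # placeholder script name
    framework_version="1.2-1",
    sagemaker_session=session,
)

# Step 2: the autoencoder you already trained, packaged as a SageMaker Model.
autoencoder_model = Model(
    image_uri="<autoencoder-inference-image-uri>",         # placeholder image URI
    model_data="s3://my-bucket/autoencoder/model.tar.gz",   # placeholder artifact path
    role=role,
    sagemaker_session=session,
)

# Chain the two containers: the output of preprocessing is passed to the model.
pipeline_model = PipelineModel(
    name="click-anomaly-pipeline",
    role=role,
    models=[preprocess_model, autoencoder_model],
    sagemaker_session=session,
)

# One endpoint serves the whole pipeline.
pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="click-anomaly-endpoint",  # placeholder endpoint name
)

On the Lambda side, a sketch of the handler under the same assumptions (endpoint name is the placeholder above): it decodes each Kinesis record and forwards the raw click JSON straight to the pipeline endpoint, since preprocessing now happens inside the endpoint.

import base64
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def handler(event, context):
    results = []
    for record in event["Records"]:
        # Kinesis delivers the raw click payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        response = runtime.invoke_endpoint(
            EndpointName="click-anomaly-endpoint",  # placeholder endpoint name
            ContentType="application/json",
            Body=payload,
        )
        results.append(json.loads(response["Body"].read()))
    return results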