I am currently training a model using an Azure ML pipeline that I built with the SDK. I am trying to add cross-validation to my ML step. I have noticed that you can set this in the parameters when you configure AutoML. My dataset consists of 30% label 0 and 70% label 1.
My question is: does Azure AutoML stratify the data when performing cross-validation? If not, I would have to do the split/stratification myself before passing the data to AutoML.
AutoML can stratify the data when performing cross-validation. Follow the procedure below to set up cross-validation:
Create the workspace resource.
After filling in all the details, click Create.
Launch the Studio, go to Automated ML, and click New Automated ML job.
Upload the dataset from here and provide the basic details required.
Dataset uploaded with some basic categories
After uploading the dataset, use it to train the model and evaluate its prediction performance.
Here, for the validation type, choose k-fold cross-validation and set the number of cross-validations to 5. We are not performing any split ourselves; the model is validated according to these validation settings.
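If you ever do want to control the folds yourself, as the question considers, this is roughly what a stratified k-fold split looks like. It is only a minimal sketch in plain Python, not what AutoML runs internally: the function name `stratified_kfold` and the toy label list are assumptions made up for illustration, and nothing here calls the Azure ML SDK.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Return k lists of row indices, each with roughly the overall label ratio.

    Hypothetical helper for illustration; not part of Azure ML.
    """
    # Group row indices by their class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy data matching the question: 30% label 0, 70% label 1.
labels = [0] * 30 + [1] * 70
folds = stratified_kfold(labels, k=5)

for i, fold in enumerate(folds):
    positives = sum(labels[j] for j in fold)
    print(f"fold {i}: size={len(fold)}, label 1 share={positives / len(fold):.2f}")
```

Each fold serves once as the validation set while the other four are used for training. With the 30/70 toy data, every fold keeps the same 30/70 ratio, which is the point of stratifying.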