I am currently training some models with Google's AutoML feature within their Vertex AI product.
My usual pipeline is to create a dataset, which I do by creating a table in BigQuery, and then start the training process.
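For reference, the pipeline is roughly equivalent to the following sketch using the google-cloud-aiplatform Python SDK (project, table, and column names here are placeholders, not my real values):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Create a Vertex AI tabular dataset directly from a BigQuery table.
dataset = aiplatform.TabularDataset.create(
    display_name="my-dataset",
    bq_source="bq://my-project.my_dataset.my_table",  # placeholder table
)

# Start AutoML training on that dataset.
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="my-training-job",
    optimization_prediction_type="regression",  # or "classification"
)
model = job.run(dataset=dataset, target_column="label")  # placeholder target column
```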
This has normally worked before but for my latest dataset I get the following error message:
Training pipeline failed with error message: The size of source BigQuery table is larger than 107374182400 bytes.
While it seemed unlikely to me that the table is actually too large for AutoML, I tried re-training on a new dataset that is a 50% sample of the original table, but the same error occurred.
Is my dataset really too large for AutoML to handle, or is there another issue?
AutoML Tables has limits along several dimensions -- not only size in bytes (100 GB is the maximum supported size), but also number of rows (up to roughly 200 million) and number of columns (up to 1,000).
You can find more details in the AutoML Tables limits documentation.
Is your source data within those limits?
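If you want to check quickly, a sketch like the one below (using the google-cloud-bigquery client; the table reference is a placeholder) reports the byte size, row count, and column count of the source table so you can compare against those limits:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table reference -- replace with your actual source table.
table = client.get_table("my-project.my_dataset.my_table")

print(f"Size in bytes:     {table.num_bytes}")    # limit from the error: 107374182400 (100 GiB)
print(f"Number of rows:    {table.num_rows}")
print(f"Number of columns: {len(table.schema)}")  # must not exceed 1,000
```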