Does anyone have experience specifying buckets for the Google Prediction API? I created a bucket under Storage --> Cloud Storage --> Browser by clicking the Create bucket button, and gave it a unique name, say "UNIQUE001". I uploaded my data following their specifications: a CSV file, with strings quoted. I now want to train the model, so I am calling Prediction API v1.6 > prediction.trainedmodels.insert. Here is my POST request:
POST https://www.googleapis.com/prediction/v1.6/projects/XXXXXXXXX/trainedmodels?fields=created%2Cid%2Ckind%2CmodelInfo%2CmodelType%2CselfLink%2CstorageDataLocation%2CstoragePMMLLocation%2CstoragePMMLModelLocation%2CtrainingComplete%2CtrainingStatus
{
  "id": "CodePrediction",
  "storageDataLocation": "dataset.csv",
  "modelType": "CLASSIFICATION"
}
The response is the error "Training data file not found." with status 400 OK:
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "invalid",
        "message": "Training data file not found."
      }
    ],
    "code": 400,
    "message": "Training data file not found."
  }
}
I guess I am not sure how to specify storageDataLocation. I also tried 1) gs://UNIQUE001 and 2) ProjectNumber/UNIQUE001.
Here is what happens when I give ProjectNumber/bucketName.
Request:
POST https://www.googleapis.com/prediction/v1.6/projects/xxxx/trainedmodels?fields=created%2Cid%2Ckind%2CmodelInfo%2CmodelType%2CselfLink%2CstorageDataLocation%2CstoragePMMLLocation%2CstoragePMMLModelLocation%2CtrainingComplete%2CtrainingStatus
{
  "id": "CodePrediction",
  "storageDataLocation": "xxxxxxx/UNIQUE001",
  "modelType": "CLASSIFICATION"
}
Error
400 OK
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "invalid",
        "message": "Training data file is empty.",
        "locationType": "other",
        "location": "id"
      }
    ],
    "code": 400,
    "message": "Training data file is empty."
  }
}
The data set is not empty; it has a million lines.
The storageDataLocation value has to be bucketname/filename: the bucket name, a slash, then the name of the file uploaded to that bucket.
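As a minimal sketch of that bucketname/filename format, the request body could be assembled as follows. The helper name build_training_request is hypothetical, and the sketch assumes the uploaded file is named dataset.csv inside the UNIQUE001 bucket, as in the first request above. It only builds and prints the JSON body; it does not send anything to the API.

```python
import json


def build_training_request(model_id, bucket, object_name, model_type="CLASSIFICATION"):
    """Build a JSON body for prediction.trainedmodels.insert (hypothetical helper).

    storageDataLocation is written as "bucketname/filename".
    """
    bucket = bucket.strip("/")
    object_name = object_name.strip("/")
    if not bucket or not object_name:
        # A bucket alone (or a file name alone) is not a complete location.
        raise ValueError("both a bucket name and a file name are required")
    return {
        "id": model_id,
        "storageDataLocation": f"{bucket}/{object_name}",
        "modelType": model_type,
    }


# Assumed values: bucket "UNIQUE001" and file "dataset.csv" from the question.
body = build_training_request("CodePrediction", "UNIQUE001", "dataset.csv")
print(json.dumps(body, indent=2))
```

Both earlier attempts left out one half of the pair: "dataset.csv" had no bucket, and "ProjectNumber/UNIQUE001" had no file name.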