azure machine-learning cortana-intelligence azure-machine-learning-service

AzureML: "Train Matchbox Recommender" is not working and does not describe the error


I tried to create my own experiment using the module, but could not make it work. Here is the exception I got:

Error 0018: Training dataset of user-item-rating triples contains invalid data. [Critical] {"InputParameters":{"DataTable":[{"Rows":14,"Columns":3,"estimatedSize":12668928,"ColumnTypes":{"System.String":1,"System.Int32":1,"System.Double":1},"IsComplete":true,"Statistics":{"0":[10,0],"1":[5422.0,5999.0,873.0,6616.0,1758.0582820478173,7.0,0.0],"2":[1.0,1.0,1.0,1.0,0.0,1.0,0.0]}},{"Rows":2338,"Columns":3,"estimatedSize":1404928,"ColumnTypes":{"System.String":1,"System.Int32":1,"System.Double":1},"IsComplete":true,"Statistics":{"0":[2338,0],"1":[7.5367835757057318,3.0,0.0,704.0,17.738259318519511,64.0,0.0],"2":[3.3737234816082085,1.5,0.0,352.0,8.3956874404883841,122.0,0.0]}},{"Rows":2532,"Columns":22,"estimatedSize":4648960,"ColumnTypes":{"System.Int32":10,"System.String":5,"System.Double":6,"System.Boolean":1},"IsComplete":true,"Statistics":{"0":[4575.7263033175359,5326.5,539.0,6871.0,1987.9561375024909,2532.0,0.0],"1":[4575.7263033175359,5326.5,539.0,6871.0,1987.9561375024909,2532.0,0.0],"2":[613.0,613.0,613.0,613.0,0.0,1.0,0.0],"3":[0,2532],"4":[0,2532],"5":[4575.7263033175359,5326.5,539.0,6871.0,1987.9561375024909,2532.0,0.0],"6":[23.647231437598673,19.99,1.99,149.99,17.237723488320938,90.0,0.0],"7":[0.043827014218009476,0.0,0.0,45.99,1.3460680431173562,3.0,0.0],"8":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"9":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"10":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"11":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"12":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"13":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"14":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"15":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"16":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"17":[0.0,0.0,0.0,0.0,0.0,1.0,0.0],"18":[2524,0],"19":[242,18],"20":[1,0],"21":[2524,0]}}],"Generic":{"traitCount":10,"iterationCount":5,"batchCount":4}},"OutputParameters":[],"ModuleType":"Microsoft.Analytics.Modules.MatchboxRecommender.Dll","ModuleVersion":" Version=6.0.0.0","AdditionalModuleInfo":"Microsoft.Analytics.Modules.MatchboxRecommender.Dll, Version=6.0.0.0, Culture=neutral, 
PublicKeyToken=69c3241e6f0468ca;Microsoft.Analytics.Modules.MatchboxRecommender.Dll.MatchboxRecommender;Train","Errors":"Microsoft.Analytics.Exceptions.ErrorMapping+ModuleException: Error 0018: Training dataset of user-item-rating triples contains invalid data.\r\n at Microsoft.Analytics.Modules.MatchboxRecommender.Dll.Utilities.UpdateRatingMetadata(DataTable dataset, String datasetName) in d:\_Bld\8833\7669\Sources\Product\Source\Modules\MatchboxRecommender.Dll\Utilities.cs:line 179\r\n at Microsoft.Analytics.Modules.MatchboxRecommender.Dll.MatchboxRecommender.TrainImpl(DataTable userItemRatingTriples, DataTable userFeatures, DataTable itemFeatures, Int32 traitCount, Int32 iterationCount, Int32 batchCount) in d:\_Bld\8833\7669\Sources\Product\Source\Modules\MatchboxRecommender.Dll\MatchboxRecommender.cs:line 62","Warnings":[],"Duration":"00:00:00.6722068"} Module finished after a runtime of 00:00:01.1250071 with exit code -2 Module failed due to negative exit code of -2

I've checked the data I'm passing in as the user-place-rating table, record by record (no worries, it's only 14 records). Here it is:

[image: the input data]

Here is a screenshot of the experiment: [image: the experiment]

Since the error message is not very informative, I don't know where to start. If anybody has an idea, I would be happy to hear about it.

Update: A friend of mine suggested adding an "Edit Metadata" module to change the "rating" feature to an "int" or "float" type, and the two others (placeID and userID) to string features. That didn't help either.


Solution

  • The Matchbox recommender requires ratings to be numerical or categorical. Also, when training, your ratings cannot all be the same.

    You need to use a metadata editor (https://msdn.microsoft.com/en-us/library/azure/dn905986.aspx) to convert the ratings into numerical features, and you need to make sure the training data contains a range of rating values.

    Then this should work!
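
    The two conditions above can be checked on the data before it reaches the module. In the error output in the question, the statistics logged for column 2 of the 14-row table read 1.0, 1.0, 1.0, 1.0, 0.0, ..., which looks consistent with every rating being 1. The following is only a rough sketch in plain Python, not the module's actual validation: the function name check_rating_triples and the sample rows are made up for illustration, and it assumes the data has already been loaded as (user, item, rating) tuples.

```python
from numbers import Real


def check_rating_triples(triples):
    """Return a list of problems found in (user, item, rating) triples."""
    problems = []
    ratings = []
    for index, (user, item, rating) in enumerate(triples):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(rating, bool) or not isinstance(rating, Real):
            problems.append(f"row {index}: rating {rating!r} is not numeric")
        else:
            ratings.append(rating)
    # A training set where every rating is identical carries no signal.
    if ratings and len(set(ratings)) < 2:
        problems.append("all ratings have the same value")
    return problems


# Hypothetical sample rows; the IDs are invented for illustration only.
constant = [("U1", 101, 1.0), ("U2", 102, 1.0), ("U3", 103, 1.0)]
varied = [("U1", 101, 1.0), ("U2", 102, 3.0), ("U3", 103, 5.0)]

print(check_rating_triples(constant))
print(check_rating_triples(varied))
```

    If the first call reports that all ratings have the same value, the training table needs rows with other rating values before the recommender is trained on it.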