stock indicator ml.net

ML.NET C#: find out which indicators best predict a trend


I have a file that has a stock price and many indicators' values (for example MACD, RSI, EMA, and so on). I want to use ML.NET to find out which indicators are most important to identify a trend (up/down trend).

Can anyone help me?


Solution

  • I'm going to assume your data is organized in the following CSV format:

    Price,MACD,RSI,EMA,Trend
    100,1.23,70.5,98.5,Up
    105,1.45,75.2,101.2,Up
    95,1.01,60.3,94.2,Down
    110,1.67,80.1,104.5,Up
    92,0.85,55.2,90.5,Down
    

    Furthermore, I will assume your trend labels were produced by a simple heuristic such as an EMA 200 crossover, and optionally validated by hand.
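
    If you still need to generate the Trend column, here is a minimal sketch of the kind of heuristic I mean. It is plain C# with no ML.NET dependency. The TrendLabeler class and its method names are made up for illustration, and the rule itself (price above its EMA means Up) is only one assumption about how a crossover label could be defined.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical helper: labels each row by comparing the price with its EMA.
    public static class TrendLabeler
    {
        // Exponential moving average seeded with the first price,
        // using the common smoothing factor alpha = 2 / (period + 1).
        public static double[] Ema(IReadOnlyList<double> prices, int period)
        {
            if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));
            var ema = new double[prices.Count];
            double alpha = 2.0 / (period + 1);
            for (int i = 0; i < prices.Count; i++)
            {
                ema[i] = i == 0
                    ? prices[0]
                    : alpha * prices[i] + (1 - alpha) * ema[i - 1];
            }
            return ema;
        }

        // "Up" when the price is strictly above its EMA, otherwise "Down".
        public static string[] Label(IReadOnlyList<double> prices, int period = 200)
        {
            var ema = Ema(prices, period);
            var labels = new string[prices.Count];
            for (int i = 0; i < prices.Count; i++)
            {
                labels[i] = prices[i] > ema[i] ? "Up" : "Down";
            }
            return labels;
        }
    }
    ```

    For example, TrendLabeler.Label(new double[] { 100, 105, 95 }, period: 3) returns Down, Up, Down, because the EMA values are 100, 102.5 and 98.75. With a period of 200 the early rows are dominated by the seed value, so you may want to discard them before training.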

    With this established, you can use feature selection techniques in ML.NET to help you identify the most important features in your data that impact your target variable (in this case, whether the trend is up or down).

    The following is an example of how you could rank the indicators in C# using ML.NET:

    using System;
    using System.Collections.Generic;
    using Microsoft.ML;
    using Microsoft.ML.Data;
    
    // Load your data; hasHeader skips the "Price,MACD,RSI,EMA,Trend" row
    var mlContext = new MLContext(seed: 0);
    var data = mlContext.Data.LoadFromTextFile<StockData>(
        "path/to/your/data.csv", separatorChar: ',', hasHeader: true);
    
    // Split your data into training and testing sets
    var trainTestSplit = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
    
    // Indicators to rank; add any other indicator columns here
    var featureNames = new[] { "Price", "MACD", "RSI", "EMA" };
    
    // Map the "Up"/"Down" text to a Boolean label
    var trendToLabel = new Dictionary<string, bool> { ["Up"] = true, ["Down"] = false };
    
    // Define your pipeline: label mapping, feature vector, scaling, linear classifier
    var pipeline = mlContext.Transforms.Conversion.MapValue("Label", trendToLabel, "Trend")
        .Append(mlContext.Transforms.Concatenate("Features", featureNames))
        .Append(mlContext.Transforms.NormalizeMinMax("Features"))
        .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());
    
    // Train your model
    var model = pipeline.Fit(trainTestSplit.TrainSet);
    
    // Evaluate your model on the test set
    var metrics = mlContext.BinaryClassification.Evaluate(model.Transform(trainTestSplit.TestSet));
    Console.WriteLine($"Accuracy: {metrics.Accuracy:F3}, AUC: {metrics.AreaUnderRocCurve:F3}");
    
    // Print the feature weights to see which indicators are most important
    // (weights are in the same order as the columns passed to Concatenate)
    var weights = model.LastTransformer.Model.SubModel.Weights;
    Console.WriteLine("Feature weights:");
    for (var i = 0; i < featureNames.Length; i++)
    {
        Console.WriteLine($"{featureNames[i]}: {weights[i]}");
    }
    
    // Define your data class (type declarations must come after top-level statements)
    public class StockData
    {
        [LoadColumn(0)]
        public float Price { get; set; }
        [LoadColumn(1)]
        public float MACD { get; set; }
        [LoadColumn(2)]
        public float RSI { get; set; }
        [LoadColumn(3)]
        public float EMA { get; set; }
        // Add any other indicators as needed
        [LoadColumn(4)]
        public string Trend { get; set; } = "";
    }
    

    This example uses the Concatenate transform to combine all of the indicator columns into a single Features column, scales them to a common range, and trains a logistic regression model on the Up/Down label. Because the features share a scale, the absolute value of each weight gives a rough ranking of the indicators: the larger the magnitude, the more that indicator moves the predicted trend. Treat it as a first pass, since strongly correlated indicators (Price and EMA, for example) can trade weight between them. If you would rather keep only the top N columns than rank them, ML.NET also provides the SelectFeaturesBasedOnMutualInformation transform, whose slotsInOutput parameter sets how many features to keep.
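
    Another way to rank the indicators is permutation feature importance (PFI), which ML.NET exposes through PermutationFeatureImportance. The sketch below is an assumption-laden starting point rather than a finished solution: the IndicatorRow class is a hypothetical row type for the CSV layout shown earlier, the file path is the same placeholder, and the SDCA logistic regression trainer is just one choice of model. It shuffles one feature at a time on the held-out data and reports how much the AUC changes.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.ML;
    using Microsoft.ML.Data;

    var mlContext = new MLContext(seed: 0);

    // Placeholder path; hasHeader skips the column-name row.
    var data = mlContext.Data.LoadFromTextFile<IndicatorRow>(
        "path/to/your/data.csv", separatorChar: ',', hasHeader: true);
    var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

    var featureNames = new[] { "Price", "MACD", "RSI", "EMA" };
    var trendToLabel = new Dictionary<string, bool> { ["Up"] = true, ["Down"] = false };

    // Label mapping, feature vector, scaling, then a binary classifier.
    var pipeline = mlContext.Transforms.Conversion.MapValue("Label", trendToLabel, "Trend")
        .Append(mlContext.Transforms.Concatenate("Features", featureNames))
        .Append(mlContext.Transforms.NormalizeMinMax("Features"))
        .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

    var model = pipeline.Fit(split.TrainSet);

    // Shuffle one feature at a time on held-out data and measure the metric change.
    var testData = model.Transform(split.TestSet);
    var pfi = mlContext.BinaryClassification.PermutationFeatureImportance(
        model.LastTransformer, testData, permutationCount: 10);

    // Results are indexed by feature slot, in the order passed to Concatenate.
    // A more negative mean AUC change means the model relied more on that indicator.
    var ranked = featureNames
        .Select((name, i) => (Name: name, AucChange: pfi[i].AreaUnderRocCurve.Mean))
        .OrderBy(x => x.AucChange);

    foreach (var (name, aucChange) in ranked)
    {
        Console.WriteLine($"{name}: {aucChange:F4}");
    }

    // Hypothetical row type matching the CSV layout (Price,MACD,RSI,EMA,Trend).
    public class IndicatorRow
    {
        [LoadColumn(0)] public float Price { get; set; }
        [LoadColumn(1)] public float MACD { get; set; }
        [LoadColumn(2)] public float RSI { get; set; }
        [LoadColumn(3)] public float EMA { get; set; }
        [LoadColumn(4)] public string Trend { get; set; } = "";
    }
    ```

    PFI measures importance on data the model has not seen and works with any trainer, so it is usually a better check than reading raw weights. It does need a test set that contains both Up and Down rows, otherwise the AUC cannot be computed.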

    Hopefully this gives you a good starting point.