
How to use LinearRegression to predict the total number of victims per month


I have a large dataset which looks like the picture below, and which also contains the columns "Month" and "Year". I am trying to use a Linear Regression model to predict the total number of victims per month, but I don't know how to get the total number of victims.

[Image: Dataframe]

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(df_pre[[]],df_pre["Year"]) #don't know how to fit the data in here.

Appreciate help!

I tried to fit Vict Age and Month, but I got the wrong answer. I also tried to create a new dataframe which contains only the month and the total number of victims, but then the inputs to fit have different sizes.


Solution

  • The concept behind fitting data to the model is:

    reg.fit([all_inputs], [outputs])
    
    In Machine learning terms:
    reg.fit([features], [target])
    

    Since I couldn't preview your dataset properly, here is a simple example of how to fit data and predict with LinearRegression.

    Let's say we have a small dataset of x_1, x_2, y, where x_1 and x_2 are the features (inputs to the model) and y is the target (what we want to predict).

    Our dataset:

    x_1 = [1, 1, 2, 2]
    x_2 = [1, 2, 2, 3]
    y = [6, 8, 9, 11]
    data = [[1, 1, 6], [1, 2, 8], [2, 2, 9], [2, 3, 11]]
    # Each nested list is one row, so data has 4 rows and 3 columns.
    

    Full code

    # Import the packages and libraries
    import pandas as pd
    from sklearn.linear_model import LinearRegression
    
    
    # Convert our data into a DataFrame
    data = [[1, 1, 6], [1, 2, 8], [2, 2, 9], [2, 3, 11]]
    columns = ["x_1", "x_2", "y"] # columns of the dataframe
    df = pd.DataFrame(data, columns=columns) # This will turn the data into a table like your data.
    
    # Split the data into features and label
    X_train = df[["x_1", "x_2"]] # These are the features (inputs) only
    y_train = df["y"] # This is the target/label/output
    
    # To answer your question of how to fit and predict with LinearRegression
    model = LinearRegression() # Instantiate the class
    
    # Fit the features (X_train: "x_1", "x_2") and the target (y_train: "y") to the model.
    model.fit(X_train, y_train)
    
    # Use the fitted model to predict y for a new set of features, x_1 = 3 and x_2 = 5.
    # The new input uses the same column names as the training data.
    X_new = pd.DataFrame([[3, 5]], columns=["x_1", "x_2"])
    result = model.predict(X_new)
    
    print(result) # so we should get result = [16.] (up to floating-point rounding)
    

    Note that the dataset was generated from the simple equation y = (1 * x_1) + (2 * x_2) + 3. Substituting x_1 = 3 and x_2 = 5 into that equation gives y = 3 + 10 + 3 = 16, which matches the model's prediction, so the model has learned the relationship correctly.
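
Applied to the original question, the same pattern might look like the sketch below. Since I couldn't see the real data, the small df_pre here is a made-up stand-in, and it assumes that each row is one victim record with "Year" and "Month" columns. The names total_victims and month_index are hypothetical helpers introduced only for this illustration. The idea is to aggregate first, then take both the features and the target from the same aggregated table, so they always have the same number of rows.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for df_pre: one row per victim record (assumption).
df_pre = pd.DataFrame({
    "Year":     [2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020],
    "Month":    [1, 1, 2, 2, 2, 3, 3, 3, 3],
    "Vict Age": [25, 31, 40, 19, 52, 33, 27, 61, 45],
})

# Count the rows in each (Year, Month) group to get one total per month.
monthly = (
    df_pre.groupby(["Year", "Month"])
    .size()
    .reset_index(name="total_victims")
    .sort_values(["Year", "Month"])
    .reset_index(drop=True)
)

# Use a running month index as the single feature
# (assumption: the totals follow a roughly linear trend over time).
monthly["month_index"] = range(len(monthly))

# Features and target come from the same aggregated frame,
# so they have the same number of rows.
X = monthly[["month_index"]]
y = monthly["total_victims"]

reg = LinearRegression()
reg.fit(X, y)

# Predict the total for the month that follows the last one in the data.
next_month = pd.DataFrame({"month_index": [len(monthly)]})
prediction = reg.predict(next_month)

print(monthly)
print(prediction)
```

With this toy data the monthly totals are 2, 3 and 4, so the fitted line predicts about 5 for the next month. On real data a straight line over a month index may be too simple (for example, it ignores seasonality), so treat this only as a starting point.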