python pandas loops iteration portfolio

Iteratively testing the performance of portfolio weights in pandas


I have a portfolio of x (let's say x = 8) stock factors with weights w. I can test the performance of these factors in an equally weighted manner by using function F(w) = p, where the output p is a performance metric. The df of portfolio weights for the first test looks like this:

    weight
0   12.57
1   12.49
2   12.49
3   12.49
4   12.49
5   12.49
6   12.49
7   12.49

Here I used 100/8 rounded down to two decimals and gave the first factor the remainder on top of that, as below:

weights = pd.DataFrame(columns = ['weight'], data = [(100/factorsdf.shape[0] // 0.01 / 100)]*factorsdf.shape[0])
weights.loc[0, 'weight'] += round(100 - (100/factorsdf.shape[0] // 0.01 / 100) * factorsdf.shape[0],2)
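
A side note on the arithmetic: 100/8 is exactly 12.5, yet the table shows 12.49. That comes from float floor division by 0.01, which cannot be represented exactly in binary. The sketch below reproduces the effect and shows one possible alternative that works in integer hundredths; the variable names are illustrative only and not part of the original code.

```python
n = 8  # number of factors, as in the example above

# Float floor division, as in the snippet above: 12.5 // 0.01 evaluates to 1249.0
floored = (100 / n) // 0.01 / 100

# Alternative (an assumption, not the original approach): count in hundredths
# of a percent so that no binary rounding of 0.01 is involved.
base_cents, remainder_cents = divmod(100 * 100, n)
exact = base_cents / 100

print(floored, exact, remainder_cents)
```

With the integer version, any remainder (in hundredths) can still be added to the first factor, as in the original code.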

What I now want to do is iteratively improve upon these weights. The way I think of doing this is by incrementally subtracting half of the weight (here: 12.49/2 = 6.245) of the bottom weight (with index 7) and adding it to the weight above (index 6).

incremental_weight = (100/factorsdf.shape[0] // 0.01 / 100)/2

Hence, in the first iteration I would test the equally weighted portfolio.

In the next iteration I would test:

    weight
0   12.57
1   12.49
2   12.49
3   12.49
4   12.49
5   12.49
6   18.735
7   6.245

In case this does not improve metric p, I will try adding it to the next weight element in the dataframe like so:

    weight
0   12.57
1   12.49
2   12.49
3   12.49
4   12.49
5   18.735
6   12.49
7   6.245
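
For reference, the two candidate tables above can be reproduced with a small helper. This is only a sketch: `move_weight` is a hypothetical name, and the starting weights are typed in by hand rather than derived from `factorsdf`.

```python
import pandas as pd

def move_weight(weights, src, dst, step):
    """Return a copy of `weights` with `step` moved from row `src` to row `dst`."""
    out = weights.copy()
    out.loc[src, "weight"] -= step
    out.loc[dst, "weight"] += step
    return out

# Starting weights from the first table
weights = pd.DataFrame({"weight": [12.57] + [12.49] * 7})
step = 12.49 / 2  # 6.245

first_try = move_weight(weights, src=7, dst=6, step=step)   # second table
second_try = move_weight(weights, src=7, dst=5, step=step)  # third table
print(first_try)
print(second_try)
```

Because the helper returns a copy, the original weights stay untouched if a candidate does not improve p.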

In case it does improve, I will subtract another 6.245 of the bottom element and try adding it to the one above. If that doesn't improve p, I will try adding it to the one above that, and so on, until I've tried adding it to all the other elements.

After that I will do the same procedure for the element above the bottom element (with index 6), subtracting 6.245 and iteratively adding it to the other elements, and so on, until metric p can no longer be improved upon.

What would be a good way to program this in pandas in a relatively efficient manner?


Solution

  • The following should work if I have understood fully (I have added comments in the code):

    import pandas as pd
    import numpy as np
    # my random generated factor scores
    factorsdf = pd.DataFrame({"Factor": np.random.randint(0, 100, 8)})
    # weights evenly allocated (from your code)
    weights = pd.DataFrame(columns = ['weight'], data = [(100/factorsdf.shape[0] // 0.01 / 100)]*factorsdf.shape[0])
    # use .loc instead of chained indexing so the assignment reliably modifies weights
    weights.loc[0, 'weight'] += round(100 - (100/factorsdf.shape[0] // 0.01 / 100) * factorsdf.shape[0],2)
    
    # weight to increment by (from your code)
    incremental_weight = (100/factorsdf.shape[0] // 0.01 / 100)/2
    
    # the F(w) function: weights must be first, from np.apply_along_axis below
    def F_w(weights, factorsdf):
        # here it is just the dot product of the random factors and the weights.
        p = np.dot(factorsdf, weights.T/100)
        return p
    
    # initial value of p
    p = F_w(np.array(weights["weight"]), np.array(factorsdf["Factor"]))
    
    print("p: ", p)
    
    # rows above the weight that is being decreased
    i = len(weights) - 1
    
    # copy of weights to change
    weights_c = weights.copy(deep=True)
    
    # can't decrease the top weight, because none above to change
    while i >= 1:
        print("i: ", i)
        
        # series with 0s everywhere except on the row whose weight will be decreased;
        # the shift uses len(weights_c) so the row stays aligned after bottom rows are dropped
        dec = pd.Series([0]*(len(weights_c)-1) + [incremental_weight]).shift(-len(weights_c)+1+i).fillna(0)
        # increment array has all possible options for increasing weights above
        inc = np.identity(i)*incremental_weight
        # row that is decreasing should not increase
        inc = np.vstack([inc, [0]*inc.shape[1]])
        # if other rows below have a weight (are not 0) then these won't increase either
        while inc.shape[0] < len(weights_c):
            inc = np.vstack([inc, [0]*inc.shape[1]])
        # if weight will be below 0 then should move up, also if there is only one row left
        if ((weights_c["weight"].sub(dec)<0).sum() > 0) or (inc.shape[0] == 1):
            i -= 1
            continue
        # subtract the increment from the weight
        weights_c["weight"] = weights_c["weight"].sub(dec)
        
        # array of possible weightings when incrementing rows above
        weights_arr = np.repeat(np.array(weights_c), inc.shape[1], axis=1) + inc
        
        print("weights_arr: ", weights_arr)
        print("weights_c: ", weights_c)
        print("factorsdf: ", np.array(factorsdf).T[:, :weights_arr.shape[0]])
        
        # return the p values for F(w) for all the possible variations
        res = np.apply_along_axis(F_w, 0, weights_arr, np.array(factorsdf).T[:, :weights_arr.shape[0]])
        # if any of these are greater than current p...
        if res.max() > p:
            # set new value of p
            p = res.max()
            # find which row the increment should be added to
            x = res.argmax()
        
            # add the increment to the correct row
            weights_c.loc[x, "weight"] += incremental_weight
            
            print("x: ", x)
        # if not, move on
        else:
            weights_c["weight"] = weights_c["weight"].add(dec)
            i -= 1
        
        # if the bottom weight is 0, remove this row from calculations
        if weights_c["weight"].iloc[-1] == 0:
            weights_c = weights_c.iloc[:-1].copy()
            i -= 1
        
        print("weights_c: ", weights_c)
    
    # print final weightings and max value of p
    print("weights_c: ", weights_c)
    print("p: ", p)
    

    It can definitely be cleaned up a bit, with some of the while loops removed, but this might give you a good starting point.

    As you can see, I have made use of numpy a lot more than pandas, and there would be less code if the weights and factors were arrays to start.
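
To illustrate the point about arrays, here is a rough sketch of the same idea written against plain NumPy arrays. It is not a drop-in replacement for the code above: `greedy_reweight` and `score` are hypothetical names, the factor values are made up, and, like the code above, it gives the increment to whichever row above improves p the most rather than trying the rows one at a time.

```python
import numpy as np

def greedy_reweight(weights, score, step):
    """Move `step` from lower rows to rows above them while `score` improves."""
    w = np.asarray(weights, dtype=float).copy()
    best = score(w)
    # Donor rows, from the bottom row up to row 1 (row 0 has nothing above it).
    for src in range(len(w) - 1, 0, -1):
        improved = True
        # Keep taking from this donor while it helps and the weight stays >= 0.
        while improved and w[src] - step >= -1e-12:
            improved = False
            # One candidate per receiving row above the donor.
            candidates = np.tile(w, (src, 1))
            candidates[:, src] -= step
            candidates[np.arange(src), np.arange(src)] += step
            scores = np.array([score(c) for c in candidates])
            k = int(scores.argmax())
            if scores[k] > best:
                w, best = candidates[k].copy(), float(scores[k])
                improved = True
    return w, best

# Toy data (assumed for illustration): the first factor scores highest.
factors = np.array([5.0, 1.0, 1.0, 1.0])
final_w, final_p = greedy_reweight(
    weights=[25.0, 25.0, 25.0, 25.0],
    score=lambda w: float(factors @ w / 100),
    step=12.5,
)
print(final_w, final_p)
```

With a linear score like this one, the search simply piles all the weight onto the best factor; a real performance metric would normally be passed in as `score` instead.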

    Let me know if you have any questions.