
Programmatically compare two lines (stock pattern matching)


What I want to do is take a certain stock pattern (defined as a series of x and y coordinates) and compare it against historical stock prices. If I find anything in the historical prices similar to that pattern I defined, I would like to return it as a match.

I'm not sure how to measure how similar two curved lines are. From what I've read, you can compare two straight lines (with linear regression), but I haven't yet come across a good way to compare two curves.

My best approach right now is to take several high and low points from the historical data range I'm looking at, compute the slopes of the lines between them, and compare those to the slopes of the pattern I'm trying to match to see if they're roughly the same.
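To make the slope idea concrete, here is a minimal sketch in plain Python. It assumes both the pattern and the historical window are lists of evenly spaced prices, and the helper names (`turning_points`, `slopes`, `slopes_match`) and the tolerance are hypothetical, chosen only for illustration.

```python
def turning_points(values):
    """Return (index, value) for both endpoints and every strict local high or low.

    Flat plateaus are not treated as turning points in this sketch.
    """
    if len(values) < 2:
        raise ValueError("need at least two points")
    points = [(0, values[0])]
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            points.append((i, cur))
    points.append((len(values) - 1, values[-1]))
    return points


def slopes(points):
    """Slope of each straight segment joining consecutive turning points."""
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


def slopes_match(a, b, tolerance):
    """True if both slope lists have the same length and agree within tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


history = [1, 2, 3, 2, 1, 2]
pattern = [10, 11.1, 12.2, 11.1, 10.0, 11.0]
same_shape = slopes_match(slopes(turning_points(history)),
                          slopes(turning_points(pattern)),
                          tolerance=0.2)
```

Note that raw slopes depend on the price scale and on how many turning points each series happens to have, so a comparison like this is only meaningful once both series are expressed in comparable units.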

Any better ideas? I'd love to hear them!

Edit: Thanks for the input! I had considered the least-squares approach before, but I wasn't sure where to go with it. After the input I received, though, I think computing a least-squares fit of each line first to smooth out the data a little, then scaling and stretching the pattern as James suggested, should get me what I'm looking for.

I plan on using this to identify certain technical flags in the stock to determine buy and sell signals. There are already sites out there that do this to some degree (such as stockfetcher), but of course I'd like to try it myself and see if I can do any better.


Solution

  • One problem is that curve fitting with non-linear functions will not work for every pattern, depending on how complex it is. You could use quadratic, cubic, or higher-order polynomials to get a more accurate fit, but that will not work in all situations, particularly when the data changes sharply over time.

    Honestly, I think a reasonable and relatively simple solution is to 'scale' and 'stretch' your pattern so that it covers the same range as the historical data. You can use interpolation along the x-axis and multiplication plus an offset along the y-axis. After that, take the mean of the squared differences at each point; if it is below a threshold value, you can consider it a match. The threshold will need some tweaking to give predictable results, but I think it's a nice approach that lets you define any sort of pattern without relying on regression producing a nicely fitted curve. Essentially it's just an application of statistics. You could also look at standard deviations or variance for a more comprehensive approach.
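The scale-and-stretch idea above can be sketched roughly as follows, in plain Python. This is one possible reading, not a definitive implementation: it assumes the pattern and the prices are lists of evenly spaced values, it rescales both the pattern and each price window to the range 0..1 (rather than mapping the pattern onto each window's own range), and the function names, window size, and threshold are all hypothetical.

```python
def resample(values, n):
    """Stretch along the x-axis: linearly interpolate onto n evenly spaced points."""
    if n < 2 or len(values) < 2:
        raise ValueError("need at least two points")
    last = len(values) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)      # fractional index into the original series
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def normalize(values):
    """Scale along the y-axis: an offset plus a multiplication, mapping to 0..1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0] * len(values)    # a flat series has no shape to compare
    return [(v - lo) / span for v in values]


def mean_squared_difference(a, b):
    """Mean of the squared point-by-point differences of two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def find_matches(prices, pattern, window, threshold):
    """Slide a window over prices; return (start_index, error) for each match."""
    shape = normalize(resample(pattern, window))
    matches = []
    for start in range(len(prices) - window + 1):
        segment = normalize(prices[start:start + window])
        error = mean_squared_difference(segment, shape)
        if error < threshold:
            matches.append((start, error))
    return matches


prices = [5, 5, 5, 10, 20, 30, 20, 10, 5, 5]
peak = [0, 1, 0]                      # a simple "up then down" pattern
matches = find_matches(prices, peak, window=5, threshold=0.01)
```

To look for the pattern at several durations, the same search could be repeated with different `window` values; the threshold is the part that needs tuning against real data.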