In my code I have two lines (line1, line2) that clearly intersect when plotted. They are moving averages of stock data scraped with yfinance, but when I run line1.intersection(line2), it returns no intersection points. I have attached some plots for clarity (note that line1 = SMA-50 and line2 = SMA-200 in the graph legend).
This is the current section of code:
new50 = sma50.to_frame().reset_index() #set series to dataframe & preserve date as column
new200 = sma200.to_frame().reset_index()
new50.Date = pd.to_numeric(pd.to_datetime(new50.Date.dt.date)) #removing time from datetime, converting to numeric
new200.Date = pd.to_numeric(pd.to_datetime(new200.Date.dt.date))
new50.Close = round(new50.Close,1) #round to see if I can obtain intersections...
new200.Close= round(new200.Close,1)
line1 = LineString(np.column_stack([new50.Close, new50.Date]))
line2 = LineString(np.column_stack([new200.Close, new200.Date]))
print(line1.intersection(line2))
This is what one of the dataframes (new200) looks like, followed by the output I get when printing line1.intersection(line2):
Date Close
0 1644796800000000000 NaN
1 1644883200000000000 NaN
2 1644969600000000000 NaN
3 1645056000000000000 NaN
4 1645142400000000000 NaN
.. ... ...
495 1707091200000000000 442.7
496 1707177600000000000 444.7
497 1707264000000000000 446.9
498 1707350400000000000 449.0
499 1707436800000000000 451.3
[500 rows x 2 columns]
LINESTRING Z EMPTY
I tried rounding the values to see whether that would produce an intersection, but it did not help. I also stripped the time component from the datetimes in case the data was too precise, with the same result.
I have seen a potential solution using numpy np.diff(), which I will test right now. However, I would like to understand why Shapely fails to recognise this intersection, or whether the mistake is mine.
I have searched online quite a bit without luck, so I would appreciate any help. Thanks all!
NaN values are invalid for coordinates, and calling functions like intersection on invalid geometries results in undefined behaviour, which is probably what you are seeing.
Normally, the following warning should be printed by the lines that create the LineStrings, because of the NaN values:
RuntimeWarning: invalid value encountered in linestrings
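If the warning is easy to miss, one way to confirm that NaN is the culprit is to inspect the coordinate array before any geometry is built. The sketch below uses only NumPy. `finite_rows` is a hypothetical helper name (not part of Shapely or pandas), and the sample values are made up for illustration.

```python
import numpy as np

def finite_rows(coords):
    """Return only the rows of a 2-D coordinate array with no NaN or inf.

    Hypothetical helper for illustration; not a Shapely or pandas function.
    """
    coords = np.asarray(coords, dtype=float)
    keep = np.isfinite(coords).all(axis=1)  # True where every column is finite
    return coords[keep]

# Made-up sample: the first row has a NaN in the first column
coords = np.column_stack([[np.nan, 410.0, 420.0], [1.0, 2.0, 3.0]])

has_nan = bool(np.isnan(coords).any())  # quick check before building a LineString
clean = finite_rows(coords)             # rows that are safe to use as coordinates
```

Here `has_nan` flags the problem, and `clean` keeps only the two fully finite rows.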
If you use dropna() to remove the rows with NaN coordinate values before creating the LineStrings, you should get the expected result:
import numpy as np
import pandas as pd
from shapely import LineString

# Small sample with a leading NaN, mimicking the rows before the SMA is defined
dates = [
    1644796800000000000,
    1644883200000000000,
    1644969600000000000,
    1645056000000000000,
]
new50 = pd.DataFrame({"Date": dates, "Close": [np.nan, 410.0, 420.0, 500.0]})
new200 = pd.DataFrame({"Date": dates, "Close": [np.nan, 550.0, 500.0, 420.0]})

# LineStrings built from coordinates that still contain NaN
line1 = LineString(np.column_stack([new50.Close, new50.Date]))
line2 = LineString(np.column_stack([new200.Close, new200.Date]))
print(f"original: {line1.intersection(line2)}")

# Drop the rows containing NaN, then rebuild the LineStrings
new50 = new50.dropna()
new200 = new200.dropna()
line1 = LineString(np.column_stack([new50.Close, new50.Date]))
line2 = LineString(np.column_stack([new200.Close, new200.Date]))
print(f"after dropna: {line1.intersection(line2)}")
Result:
original: LINESTRING EMPTY
after dropna: POINT (460 1645012800000000000)
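For comparison, the np.diff() idea mentioned in the question can be sketched without Shapely: look for sign changes in the difference between the two series and interpolate linearly between the neighbouring samples. This is only a rough sketch under some assumptions: `crossover_points` is a hypothetical helper name, both series share the same dates, and a crossing that lands exactly on a sample may be reported twice.

```python
import numpy as np

def crossover_points(x, a, b):
    """Return (x, y) points where series a and b cross, by linear interpolation.

    Hypothetical helper for illustration; assumes a and b are sampled at the same x.
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Drop positions where either series is NaN (e.g. before the SMA is defined)
    valid = ~(np.isnan(a) | np.isnan(b))
    x, a, b = x[valid], a[valid], b[valid]

    d = a - b
    # np.diff of the sign is nonzero where the difference changes sign
    idx = np.nonzero(np.diff(np.sign(d)))[0]

    # Fraction of the way from sample idx to idx + 1 at which d reaches zero
    t = d[idx] / (d[idx] - d[idx + 1])
    xs = x[idx] + t * (x[idx + 1] - x[idx])
    ys = a[idx] + t * (a[idx + 1] - a[idx])
    return list(zip(xs.tolist(), ys.tolist()))

# Same sample data as in the example above
dates = [
    1644796800000000000,
    1644883200000000000,
    1644969600000000000,
    1645056000000000000,
]
sma50 = [np.nan, 410.0, 420.0, 500.0]
sma200 = [np.nan, 550.0, 500.0, 420.0]

points = crossover_points(dates, sma50, sma200)
```

With this sample data, `points` holds a single crossing at date 1645012800000000000 and value 460, which agrees with the Shapely result after dropna(). Note that the tuples here are ordered (date, value), whereas the LineStrings above use (Close, Date).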