I have millions of records in a database, and I want to read them with Python and store them in a pandas DataFrame. The problem is that the SELECT query takes a very long time to process. To reduce that time, I tried multithreading: I created 3 threads and gave each thread its own query, like this:
select * from (select *, row_number() over (order by col1) rn from table) t where rn % 3 = 0
select * from (select *, row_number() over (order by col1) rn from table) t where rn % 3 = 1
select * from (select *, row_number() over (order by col1) rn from table) t where rn % 3 = 2
Then I run each query in its own thread using Python's threading package.
But this does not reduce the time much.
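For reference, the three-thread setup described above can be sketched roughly as follows. This is only a minimal illustration under several assumptions: an on-disk SQLite database stands in for the real database and its JDBC/ODBC connection, the table name `records`, the column `payload`, and the helper names `make_demo_db`, `fetch_partition`, and `read_all` are all hypothetical, and `concurrent.futures.ThreadPoolExecutor` is used in place of the `threading` package directly.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

N_PARTS = 3  # one partition per thread, as in the question

# Same shape as the three queries above; `records` and `payload` are placeholder names.
PARTITION_SQL = """
SELECT col1, payload FROM (
    SELECT col1, payload, row_number() OVER (ORDER BY col1) AS rn
    FROM records
) t
WHERE rn % ? = ?
"""


def make_demo_db(path, n_rows=3000):
    """Create a small demo table (assumption: stands in for the real table)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE records (col1 INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(n_rows)),
    )
    conn.commit()
    conn.close()


def fetch_partition(path, part):
    """Run one partition query on a connection owned by the calling thread."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(PARTITION_SQL, (N_PARTS, part)).fetchall()
    finally:
        conn.close()


def read_all(path):
    """Fetch all partitions concurrently and concatenate the rows."""
    with ThreadPoolExecutor(max_workers=N_PARTS) as pool:
        parts = pool.map(lambda p: fetch_partition(path, p), range(N_PARTS))
        return [row for part in parts for row in part]


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
make_demo_db(db_path)
rows = read_all(db_path)
```

The combined `rows` list can then be passed to `pandas.DataFrame(rows, columns=["col1", "payload"])`. Note that each of the three queries still has to number every row in the table before filtering on `rn`, which may be one reason the gain from the threads is small.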
Is there any other approach I can take to reduce the query read time? Note: I have tried both JDBC and ODBC connections.
The link below helped me: "Multiprocessing with JDBC connection and pooling". With it I get around a 25% gain on my local machine.