djangodatabaseoracle-databaseprefetch

Django's Very slow initial execution of select query on remote Oracle 11 database when table contain big blobs


I actually understand what is causing the problem but can't figure out a solution that works with Django.

suppose you are trying to read 10,000 records from oracle database over network, normally that would result in 10,000 network connections being open and closed which causes bad performance.

Oracle's answer to that is to send say 1000 records at once per connection, that significantly improve the speed.

However in my case, each row contains blobs that are 1-2MBs, so trying to prefetch 1000 records means nothing happens until I've downloaded (and kept in memory) 1GB-2GB worth of data.

oracle's sqlplus provides option to control that, either using the fast parameter or issuing set rowprefetch 10 but I can't find solution that works with Django.

I only want to prefetch 10 records or so


Solution

  • The settings you appear to be referring to are usually related to the database driver and not, typically, to the frameworks above the database driver.

    For example, python-oracledb has documentation on Tuning python-oracledb which includes lots of advice ranging from:

    If you want to change how many rows are loaded at in each batch from the database then change cursor.arraysize and cursor.prefetchrows. This answer gives an example of setting the cursor.arraysize for Django and Oracle 11g. The value for prefetchrows can be set in an Optional Oracle Client Configuration File. However, it will not change the total amount of data that needs to be loaded just the size of each batch that is loaded.

    If you want to load less total data then paginate your queries (the Django pagination documentation is here).