Oracle Database 12c Enterprise Edition, release 12.1.0.2.0.
Current process.
Database 1: I have two cursor SQL queries (each joining a number of tables) that write to flat files (both files have a similar format) using a PL/SQL FOR loop. A number of flat files are created and written to a destination directory.
Database 2 picks up the flat files from the destination directory and processes each flat file into its system.
Writing a number of large files to a directory from one database and then processing them into a second database is, I'm sure, time consuming, and the company is looking at ways to improve this performance. This process happens once a month, and between 200 and 1500 files are created. Each file can be anywhere from 100 KB to 5 GB in size.
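For context, the current extract loop is presumably something like the following minimal sketch, where `OUT_DIR` is an assumed directory object and the cursor query is a placeholder for the real multi-table join:

```sql
DECLARE
  l_file UTL_FILE.FILE_TYPE;
BEGIN
  -- OUT_DIR and the file name are assumptions; the cursor stands in
  -- for the real join queries described above.
  l_file := UTL_FILE.FOPEN('OUT_DIR', 'extract_001.txt', 'w', 32767);
  FOR rec IN (SELECT t1.key_col, t2.val_col
              FROM   table_one t1
              JOIN   table_two t2 ON t2.key_col = t1.key_col) LOOP
    UTL_FILE.PUT_LINE(l_file, rec.key_col || '|' || rec.val_col);
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
END;
/
```

Row-by-row `PUT_LINE` calls like this are a big part of why the flat-file approach is slow for multi-gigabyte extracts.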
New Process.
I have been asked to look into creating a new solution to make this process quicker.
The questions I am faced with for any solution, as a developer, are the following: a) Is it quicker? b) Could it be done in a PL/SQL script? c) What problems could I face if I tried it? d) Is there a better solution? e) Are there any performance/system issues with the approach?
1. Transportable tablespaces - could a staging table be created in database 1, where I bulk collect all the data from both SQL queries into one staging table? I would then take the tablespace where the staging table lives and transport that tablespace to database 2, to be processed into database 2. The tablespace would be dropped from database 2 after a week, and I would clear out the staging table on database 1 after a week too.
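If you go this route, the transport itself boils down to a few steps. This is a rough sketch only; `stage_ts`, the directory object `dp_dir`, and the datafile path are all placeholder names:

```sql
-- On database 1: the tablespace must be self-contained and read only
-- before its metadata can be exported.
ALTER TABLESPACE stage_ts READ ONLY;

-- Then from the OS, export the tablespace metadata with Data Pump:
--   expdp system DIRECTORY=dp_dir DUMPFILE=stage_ts.dmp TRANSPORT_TABLESPACES=stage_ts

-- Copy the dump file and the tablespace's datafile(s) to database 2, then plug in:
--   impdp system DIRECTORY=dp_dir DUMPFILE=stage_ts.dmp TRANSPORT_DATAFILES='/u02/oradata/db2/stage_ts01.dbf'

-- Finally, on database 2, make it usable (and drop it after processing):
ALTER TABLESPACE stage_ts READ WRITE;
```

Note the tablespace has to stay read only on the source while the datafiles are being copied, and both databases need compatible character sets and endianness (or an RMAN conversion step).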
2. Data Pump - I'm pretty unsure about Data Pump. As I understand it, you're exporting a dump (DMP) file (maybe using a query to select just the data needed) out to a directory, then picking up that DMP file to be imported into the new database. I'm assuming the import would create a staging table in the new system, ready to be processed into the main tables. This could be a large dump file - would that be a problem?
3. GoldenGate - I'm not sure about GoldenGate; isn't it just a replication tool? Not sure where to go with this one.
4. A view - create a view on database 1 (could this be a materialized view?) which would hold both SQL queries (UNION ALL). The second database would then read this view over a database link to process the data into the second database. Would there be any problems with reading this much data over a network?
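In outline, option 4 would look something like this. All the object names here (`extract_v`, `db1_link`, the column list, the TNS alias) are placeholders:

```sql
-- On database 1: one view wrapping both extract queries.
CREATE OR REPLACE VIEW extract_v AS
SELECT key_col, val_col FROM query_one_tables
UNION ALL
SELECT key_col, val_col FROM query_two_tables;

-- On database 2: a database link back to database 1, then query the view.
CREATE DATABASE LINK db1_link
  CONNECT TO app_user IDENTIFIED BY app_pw USING 'DB1_TNS';

SELECT * FROM extract_v@db1_link;
```

The main concerns would be network throughput between the two databases and making sure the fetch is done in bulk (large array size or a direct-path insert) rather than row by row.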
Any ideas would be great. Has anyone had any experience with the above? Is there a better solution than the above that I should look at?
thanks Shaun
I would definitely go for option # 4 - getting all the data via a DB link. I would almost guarantee that it will be the fastest. Create a view in the source DB (it could be an MVIEW if you need to run the query many times), and then do either a DROP TABLE and CREATE TABLE AS SELECT, or a TRUNCATE TABLE and INSERT INTO .. SELECT statement, depending on your needs. Both the CTAS and IAS can utilise parallel capabilities.
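Both variants can be sketched as follows, assuming the source view is `extract_v` and the link is `db1_link` (placeholder names); the degree of parallelism is just an example:

```sql
-- Variant A: drop and recreate the staging table each month (CTAS).
CREATE TABLE stage_local PARALLEL 8 NOLOGGING AS
SELECT * FROM extract_v@db1_link;

-- Variant B: keep the table definition and reload it (IAS).
-- Parallel DML must be enabled in the session for the hint to take effect.
ALTER SESSION ENABLE PARALLEL DML;
TRUNCATE TABLE stage_local;
INSERT /*+ APPEND PARALLEL(stage_local, 8) */ INTO stage_local
SELECT * FROM extract_v@db1_link;
COMMIT;
```

The APPEND hint gives a direct-path load, which avoids most undo and (with NOLOGGING) most redo; just be aware of the recovery implications of NOLOGGING operations.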
A Data Pump import (option # 2) could be an option if option # 4 for some reason is not doable. In this case you should look into doing a Data Pump import over a database link. It makes the process much simpler.
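A network-mode import pulls the rows straight across the link, so no dump file is ever written to disk. A parameter file for it might look like this (every name below - the link, schema, table, and directory object - is an assumption for illustration):

```
# network_import.par -- run with: impdp load_user PARFILE=network_import.par
NETWORK_LINK=db1_link
TABLES=app_user.staging_table
REMAP_SCHEMA=app_user:load_user
DIRECTORY=dp_dir
LOGFILE=net_imp.log
```

With NETWORK_LINK set there is no DUMPFILE parameter at all, which sidesteps the "large dump file" worry from the original question.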
If transferring the data between the two databases becomes a bottleneck you could look into using compression (check your licenses in that case).