I have two buckets in different regions in OCI object store. I want to compare files in the two buckets using Oracle Autonomous Database, to detect missing files and copy them over and synchronize the two buckets. I use the two buckets in different regions as replicating some configuration metadata between the services running in the two regions.
Eg:
Oracle Autonomous Database provides the DBMS_CLOUD package for accessing object store files.
DBMS_CLOUD package has a function LIST_OBJECTS, which provides a list of files in an object store location (bucket or subfolder). This function provides the object name , size in bytes and checksum, which can be used to compare files in the 2 buckets with the help of Oracle SQL FULL OUTER JOIN and WITH Clause.
SQL Script:
define bucket1_uri='https://objectstorage.us-phoenix-1.oraclecloud.com/n/mynamespace/b/mybucket/o/folder1/'
define bucket2_uri='https://objectstorage.us-ashburn-1.oraclecloud.com/n/mynamespace/b/mybucket/o/folder2/'
define credname1='OCI$RESOURCE_PRINCIPAL'
define credname2='OCI$RESOURCE_PRINCIPAL'
set linesize 120
set pages 1000
col object_name_A format a30
col object_name_B format a30
col object_checksum_A format a10
col object_checksum_B format a10
-- Use SQL FULL OUTER JOIN and compare the checksum exclude matching rows
WITH bucket1 AS
(SELECT object_name, bytes, checksum FROM DBMS_CLOUD.LIST_OBJECTS('&credname1', '&bucket1_uri')
ORDER BY object_name, bytes, checksum),
bucket2 AS
(SELECT object_name, bytes, checksum FROM DBMS_CLOUD.LIST_OBJECTS('&credname2', '&bucket2_uri')
ORDER BY object_name, bytes, checksum)
SELECT rownum id, diff.* FROM
(SELECT a.object_name object_name_A, a.bytes object_size_A, a.checksum object_checksum_A,
b.object_name object_name_B, b.bytes object_size_B, b.checksum object_checksum_B
FROM bucket1 a FULL OUTER JOIN bucket2 b
ON a.object_name = b.object_name) diff
WHERE diff.object_name_A IS NULL OR
diff.object_name_B IS NULL OR
(diff.object_name_A = diff.object_name_B AND
diff.object_checksum_A != diff.object_checksum_B)
/
Explanation:
In the above script:
Use Cases:
The DBMS_CLOUD package works for multiple object stores such as Oracle OCI, AWS S3, Azure BLOB, Google Cloud, Wasabi, so the above script can be used to compare buckets in multiple scenarios such as -
References: