postgresqljoinhashamazon-rdsbytea

Using bytea as table index and join condition


I recently took a position where i need to work with an AWS RDS Aurora Postgre database. I have several years of experience working with both MySQL, SQL Server & RDS but this is my first experience with postgre. My initial reaction is how different postgre seems to be and how application/developer friendly it seems to be out of the box. One thing i've noticed is that i need to be careful and not take for granted some of the pollicies/procedures I have implemented in the past will be the best fit for postge. This leads me to the issue, when i design tables & databases i have always implemented SHA or MD5 hash function to handle composite keys to simplify and make joins to other tables faster. I'ved used this same strategy (SHA512) in my current postgre project & in postgre this datatype is called "bytea". The problem is we seem to be getting terrible performance when using this bytea to join to other tables & makes me wonder if there is a better or standard way to handle this in postge? Should i be looking at something else besides the bytea as the cause of the performance issues?


Solution

  • I've never seen that approach for postgres in this forum. In postgres, we can define primary keys and foreign keys, both refering to one column or a group of columns. Then, JOIN usually refers to a group of columns in the joined tables, see the manual. I don't see what could be the benefits of using SHA or MD5 hash for improving the performances of JOIN, I would rather think that performances will be degraded because of the hash process. In postgres, the typical solution to accelerate queries is to define indexes on the tables. Index is a powerful solution in postgres including HASH indexes which may sounds like what you use to do in MySQL ?