I want to create a database for my blog website where I have many posts and categories. A post can have many categories and a category can have many posts. I couldn't figure out how to implement it in PostgreSQL.
This is the category
table:
CREATE TABLE category (
category_name VARCHAR(255) PRIMARY KEY
);
And this is the Post
table:
CREATE TABLE post (
post_id INTEGER NOT NULL PRIMARY KEY,
title VARCHAR(255),
category_name VARCHAR(255) REFERENCES category (category_name)
ON UPDATE CASCADE ON DELETE CASCADE,
date DATE
);
I find it difficult to INSERT
a record.
The following doesn't work:
INSERT INTO post(post_id, title, category_name, date)
VALUES (1, 'How to create a website',['technology','website'], '2025-06-04');
I could've inserted some data into the category
table, but only 1 value works at a time. How do I add multiple values?
PostgreSQL supports arrays of values. This will be the simplest way to implement your needs:
You'll just have to declare your column as category_name VARCHAR(255)[]
,
and insert as ARRAY['technology','website']
(instead of the ['technology','website']
you put in your question).
However, even if PostgreSQL provides a bunch of functions that make arrays easier to handle, quicker, and cleaner than comma-separating your values, this is still a shaky model that won't be helped by the RDBMS, and especially, as of 2025, you can't get them be FOREIGN KEYS
.
Maintaining integrity will rely on defining some triggers on both tables (INSERT OR UPDATE OF category_name ON post
, UPDATE OR DELETE ON category
). Hint: using serial
s instead of the name could remove the need for UPDATE ON category
, and make it lighter on storage, on the other hand SELECT
ing the categories will then of course require one additional indirection.
You can get a glimpse at it in a small fiddle to play with.
n..n
link tableThen of course you've got the all-terrain join table, that you could call post_category
, to remove that multivalued column and at least satisfy 1NF.
This will require a two-steps INSERT
, or using PostgreSQL's ability to include INSERT … RETURNING
in CTES:
WITH
i AS
(
INSERT INTO post(post_id, title, date)
VALUES (1, 'How to create a website', '2025-06-04')
RETURNING post_id
)
INSERT INTO post_category
SELECT post_id, category_name
FROM i, category
WHERE category_name in ('technology','website');
This is shown in its own fiddle.
Of course with CTEs you can add a lot of other things, for example:
VALUES
each with its own categoriescategory
before using them in the just inserted post
.SELECT
s in an 1NF modelYou perhaps already noticed that having a clean model forces to think of a clean implementation, even for SELECT
s.
For example, if you do not pay attention, with a blog containing 2 posts (post 1 with categories A and B, and post 2 with categories A, B and C), a summary SELECT 'Over the whole blog, you posted '||COUNT(post.post_id)||' posts over '||COUNT(category)||' different categories' FROM post LEFT JOIN post_category USING (post_id);
, would return "you posted 5 posts over 5 categories" instead of "2 posts over 3 categories".
DISTINCT
is a quick-and-really-dirty way (that will not work with joins of more that two 1-to-n or n-to-n tables),
so I'd advize you to always think in sets of rows that each represent 1 physical entity (while DISTINCT
is just hiding that you got 2 or 3 rows to represent each "physical unit" which is a post).
Better strategies would be based on grouped subqueries:
SELECT … FROM (SELECT … GROUP BY post_id) AS agg_cats_of_post
which is equivalent to CTE WITH agg_cats_of_post AS (SELECT … GROUP BY post_id) … FROM agg_cats_of_post
, possibly correlated (LATERAL
join))SELECT (SELECT COUNT(*) FROM post_cat pc WHERE pc.post_id = p.post_id) AS count_of_cats FROM post p
)EXISTS
filtersFROM post p WHERE EXISTS (SELECT 1 FROM post_cat pc WHERE pc.post_id = p.post_id AND p.category_name = 'website')
)