One user can have many jobs and one job can have many users.
users
+----+------+--------+
| ID | Name | gender |
+----+------+--------+
| 1 | Bob | male |
| 2 | Sara | female |
+----+------+--------+
jobs
+----+----------+
| id | job_id |
+----+----------+
| 1 | Engineer |
| 2 | Plumber |
| 3 | Doctor |
+----+----------+
users_jobs
+---------+--------+
| user_id | job_id |
+---------+--------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
+---------+--------+
I want to select all males and if they have at least 1 job return their info. If they have no jobs, then I don't select/return that user.
SELECT *
FROM users
INNER JOIN users_jobs
ON users_jobs.user_id = users.id
WHERE users.gender = 'male'
That returns Bob's info three times since he has 3 jobs. I don't want duplicates.
How can I get rid of duplicates without using DISTINCT
or GROUP BY
, since performance is very important?
To follow on from the comments, for performance, it's necessary to use a distinct
in your query, try:
SELECT DISTINCT Name FROM users
INNER JOIN users_jobs
ON users_jobs.user_id = users.id
WHERE users.gender = 'male'
If you're looking to get all the columns but keep the id's distinct you can use a GROUP BY
, try:
SELECT * FROM users
INNER JOIN users_jobs
ON users_jobs.user_id = users.id
WHERE users.gender = 'male'
GROUP BY users.id
Although this will also effect performance, it depends on what you prioritize the most.