Let's say I have these two tables. Where I insert employees to employee
table coming from the staging
table.
staging table:
id | employee_id | name |
---|---|---|
1 | 12 | Paul |
2 | 13 | Kyle |
employee table
id | employee_id | name |
---|---|---|
5 | 4 | Will |
6 | 13 | Kyle |
Now, on the employee table let's say I'd like to copy what's on my staging table currently, using the INSERT SELECT INTO
statement, Paul will be inserted but I don't want Kyle to be inserted since he's on the employee table already(employee.employee_id is the defining column).
I know this could be done by just setting a unique or primary key, on employee_id
and just using the statement ON DUPLICATE KEY UPDATE
then do nothing by just setting them back to their original values.
I'm new to SQL, and I'm stuck with the solution setting a UNIQUE
key and ON DUPLICATE KEY UPDATE
statement, but I'd like to know how to do this without that solution I mentioned?
First of all, you should keep in mind that the decision whether to create unique or primary keys or not does not depend on how to create insert statements or such. It's a matter of what your table should do and what not. In order to achieve your goal, you can add a where to your insert statement which excludes the already existing entries, as example:
INSERT INTO employees (id, employee_id, name)
SELECT id, employee_id, name
FROM staging
WHERE employee_id NOT IN (SELECT employee_id FROM employees)