mysqlsql-updatemysql-error-1062

MySQL: Strange behavior of UPDATE query (ERROR 1062 Duplicate entry)


I have a MySQL database the stores news articles with the publications date (just day information), the source, and category. Based on these I want to generate a table that holds the article counts w.r.t. to these 3 parameters.

Since for some combinations of these 3 parameters there might be no article, a simple GROUP BY won't do. I therefore first generate a table news_article_counts with all possible combinations of the 3 parameters, and an default article_count of 0 -- like this:

SELECT * FROM news_article_counts;
+--------------+------------+----------+---------------+
| published_at | source     | category | article_count |
+------------- +------------+----------+---------------+
| 2016-08-05   | 1826089206 |        0 |             0 |
| 2016-08-05   | 1826089206 |        1 |             0 |
| 2016-08-05   | 1826089206 |        2 |             0 |
| 2016-08-05   | 1826089206 |        3 |             0 |
| 2016-08-05   | 1826089206 |        4 |             0 |
| ...          | ...        |      ... |           ... |
+--------------+------------+----------+---------------+

For testing, I now created a temporary table tmp as the GROUP BY result from the original news article table:

SELECT * FROM tmp LIMIT 6;
+--------------+------------+----------+-----+
| published_at | source     | category | cnt |
+--------------+------------+----------+-----+
| 2016-08-05   | 1826089206 |        3 |   1 |
| 2003-09-19   | 1826089206 |        4 |   1 |
| 2005-08-08   | 1826089206 |        3 |   1 |
| 2008-07-22   | 1826089206 |        4 |   1 |
| 2008-11-26   | 1826089206 |        8 |   1 |
| ...          | ...        |      ... | ... |
+--------------+------------+----------+-----+

Given these two tables, the following query works as expected:

SELECT * FROM news_article_counts c, tmp t
WHERE c.published_at = t.published_at AND c.source = t.source AND c.category = t.category;

But now I need to update the article_count of table news_article_counts with the values in table tmp where the 3 parameters match up. For this I'm using the following query (I've tried different ways but with the same results):

UPDATE 
  news_article_counts c
INNER JOIN
  tmp t
ON
  c.published_at = t.published_at AND
  c.source = t.source AND
  c.category = t.category
SET
  c.article_count = t.cnt;

Executing this query yields this error:

ERROR 1062 (23000): Duplicate entry '2018-04-07 14:46:17-1826089206-1' for key 'uniqueIndex'

uniqueIndex is a joint index over published_at, source, category of table news_article_counts. But this shouldn't be a problem since I do not -- as far as I can tell -- update any of those 3 values, only article_count.

What confuses me most is that in the error it mentions the timestamp I executed the query (here: 2018-04-07 14:46:17). I have no absolutely idea where this comes into play. In fact, some rows in news_article_counts now have 2018-04-07 14:46:17 as value for published_at. While this explains the error, I cannot see why published_at gets overwritten with the current timestamp. There is no ON UPDATE CURRENT_TIMESTAMP on this column; see:

CREATE TABLE IF NOT EXISTS `test`.`news_article_counts` (
  `published_at` TIMESTAMP NOT NULL,
  `source` INT UNSIGNED NOT NULL,
  `category` INT UNSIGNED NOT NULL,
  `article_count` INT UNSIGNED NOT NULL DEFAULT 0,
  UNIQUE INDEX `uniqueIndex` (`published_at` ASC, `source` ASC, `category`  ASC))
ENGINE = MyISAM
DEFAULT CHARACTER SET = utf8mb4;

What am I missing here?

UPDATE 1: I actually checked the table definition of news_article_counts in the database. And there's indeed the following:

mysql> SHOW COLUMNS FROM news_article_counts;
+---------------+------------------+------+-----+-------------------+-----------------------------+
| Field         | Type             | Null | Key | Default           | Extra                       |
+---------------+------------------+------+-----+-------------------+-----------------------------+
| published_at  | timestamp        | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| source        | int(10) unsigned | NO   |     | NULL              |                             |
| category      | int(10) unsigned | NO   |     | NULL              |                             |
| article_count | int(10) unsigned | NO   |     | 0                 |                             |
+---------------+------------------+------+-----+-------------------+-----------------------------+

But why is on update CURRENT_TIMESTAMP set. I double and triple-checked my CREATE TABLE statement. I removed the joint index, I added an artificial primary key (auto_increment). Nothing help. I've even tried to explicitly remove these attributes from published_at with:

ALTER TABLE `news_article_counts` CHANGE `published_at` `published_at` TIMESTAMP NOT NULL;

Nothing seems to work for me.


Solution

  • It looks like you have the explicit_defaults_for_timestamp system variable disabled. One of the effects of this is:

    The first TIMESTAMP column in a table, if not explicitly declared with the NULL attribute or an explicit DEFAULT or ON UPDATE attribute, is automatically declared with the DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP attributes.

    You could try enabling this system variable, but that could potentially impact other applications. I think it only takes effect when you're actually creating a table, so it shouldn't affect any existing tables.

    If you don't to make a system-level change like this, you could add an explicit DEFAULT attribute to the published_at column of this table, then it won't automatically add ON UPDATE.