mysqldatabasesnomed-ct

Slow *initial* mysql query


Running Mysql V5.6.22, running a complex query on a properly indexed database, the response time initially is very slow (10's of seconds). Subsequent queries (for different items) are very responsive (50 msec). so I guess the query caches are doing their job -- but how do i minimise the initial slow response?

The database is an externally maintained Medical database (SNOMED), and I use the recommended views for the current snapshots - I think these views are the limitation on speed.

Strangely restarting mysql does not make a difference - It seems to be a time thing - if the database isn't used for some time it just takes ages to start up.

So my question is, is there a mysql setting that is used for how long these caches are kept or should I use a different approach and not use views (the data for SNOMED is updated 2x year, and in another similar database for medicines there are monthly releases.?

Some of you would like to see the query so here goes. Warning it does get a little tricky and the row count for the base queries are given in comments ... ;-)

SELECT DISTINCT concat(c.id, '::', c.effectiveTime) as id, `d1`.`term` as label, `d2`.`term`
FROM (`snomed`.`rf2_ss_refset` as refset)
JOIN `snomed`.`rf2_ss_concepts` as c ON `c`.`id` = `refset`.`referencedCOmponentId`
JOIN `snomed`.`rf2_ss_descriptions` as d1 ON `d1`.`conceptId` = `refset`.`referencedComponentId`
JOIN `snomed`.`rf2_ss_descriptions` as d2 ON `d2`.`conceptId` = `d1`.`conceptId`
JOIN `snomed`.`rf2_ss_language_refset` as lang ON `lang`.`referencedComponentId` = `d1`.`id`
WHERE `refset`.`refSetId` =  32570071000036102
AND `refset`.`active` =  1
AND `d2`.`typeId` =  900000000000013009
AND `d1`.`active` =  1
AND `d2`.`active` =  1
AND `d1`.`moduleId` =  900000000000207008
AND `d2`.`moduleId` =  900000000000207008
AND `lang`.`active` =  1
AND `lang`.`acceptabilityId` =  900000000000548007
AND `d2`.`term` like "hypertension%"
ORDER BY `d1`.`term`, `d2`.`term`

WHERE:

CREATE TABLE `rf2_ss_refset` (
  `id` char(36) COLLATE utf8_unicode_ci NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` smallint(1) NOT NULL,
  `moduleId` bigint(20) unsigned NOT NULL,
  `refSetId` bigint(20) unsigned NOT NULL,
  `referencedComponentId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId_idx` (`moduleId`),
  KEY `refSetId_idx` (`refSetId`),
  KEY `referencedComponentId_idx` (`referencedComponentId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE ALGORITHM=UNDEFINED DEFINER=`snomed`@`localhost` SQL SECURITY DEFINER VIEW `rf2_ss_concepts`
AS SELECT
   `t1`.`id` AS `id`,
   `t1`.`effectiveTime` AS `effectiveTime`,
   `t1`.`active` AS `active`,
   `t1`.`moduleId` AS `moduleId`,
   `t1`.`definitionStatusId` AS `definitionStatusId`
FROM `rf2_full_concepts` `t1` where (`t1`.`effectiveTime` = (select max(`t2`.`effectiveTime`) from `rf2_full_concepts` `t2` where (`t1`.`id` = `t2`.`id`)));

CREATE ALGORITHM=UNDEFINED DEFINER=`snomed`@`localhost` SQL SECURITY DEFINER VIEW `rf2_ss_descriptions`
AS SELECT
   `t1`.`id` AS `id`,
   `t1`.`effectiveTime` AS `effectiveTime`,
   `t1`.`active` AS `active`,
   `t1`.`moduleID` AS `moduleID`,
   `t1`.`conceptId` AS `conceptId`,
   `t1`.`languageCode` AS `languageCode`,
   `t1`.`typeID` AS `typeID`,
   `t1`.`term` AS `term`,
   `t1`.`caseSignificanceId` AS `caseSignificanceId`
FROM `rf2_full_descriptions` `t1` where (`t1`.`effectiveTime` = (select max(`t2`.`effectiveTime`) from `rf2_full_descriptions` `t2` where (`t1`.`id` = `t2`.`id`)));

CREATE ALGORITHM=UNDEFINED DEFINER=`snomed`@`localhost` SQL SECURITY DEFINER VIEW `rf2_ss_language_refset`
AS SELECT
   `t1`.`id` AS `id`,
   `t1`.`effectiveTime` AS `effectiveTime`,
   `t1`.`active` AS `active`,
   `t1`.`moduleId` AS `moduleId`,
   `t1`.`refSetId` AS `refSetId`,
   `t1`.`referencedComponentId` AS `referencedComponentId`,
   `t1`.`acceptabilityId` AS `acceptabilityId`
FROM `rf2_full_language_refset` `t1` where (`t1`.`effectiveTime` = (select max(`t2`.`effectiveTime`) from `rf2_full_language_refset` `t2` where (`t1`.`id` = `t2`.`id`)));

CREATE ALGORITHM=UNDEFINED DEFINER=`snomed`@`localhost` SQL SECURITY DEFINER VIEW `rf2_ss_relationships`
AS SELECT
   `t1`.`id` AS `id`,
   `t1`.`effectiveTime` AS `effectiveTime`,
   `t1`.`active` AS `active`,
   `t1`.`moduleId` AS `moduleId`,
   `t1`.`sourceId` AS `sourceId`,
   `t1`.`destinationId` AS `destinationId`,
   `t1`.`relationshipGroup` AS `relationshipGroup`,
   `t1`.`typeId` AS `typeId`,
   `t1`.`characteristicTypeId` AS `characteristicTypeId`,
   `t1`.`modifierId` AS `modifierId`
FROM `rf2_full_relationships` `t1` where (`t1`.`effectiveTime` = (select max(`t2`.`effectiveTime`) from `rf2_full_relationships` `t2` where (`t1`.`id` = `t2`.`id`)));

#select count(*) from rf2_full_concepts # 507046
CREATE TABLE `rf2_full_concepts` (
  `id` bigint(20) unsigned NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` tinyint(4) DEFAULT NULL,
  `moduleId` bigint(20) unsigned NOT NULL,
  `definitionStatusId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId_idx` (`moduleId`),
  KEY `definitionStatusId_idx` (`definitionStatusId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

#select count(*) from rf2_full_descriptions # 1486373
CREATE TABLE `rf2_full_descriptions` (
  `id` bigint(20) unsigned NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` tinyint(4) DEFAULT NULL,
  `moduleID` bigint(20) unsigned NOT NULL,
  `conceptId` bigint(20) unsigned NOT NULL,
  `languageCode` char(2) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'en',
  `typeID` bigint(20) unsigned NOT NULL,
  `term` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `caseSignificanceId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleID_idx` (`moduleID`),
  KEY `conceptId_idx` (`conceptId`),
  KEY `typeID_idx` (`typeID`),
  KEY `caseSignificanceId_idx` (`caseSignificanceId`),
  KEY `term_idx` (`term`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

#select count(*) from rf2_full_relationships = 4582286 
CREATE TABLE `rf2_full_relationships` (
  `id` bigint(20) unsigned NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` tinyint(4) DEFAULT '1',
  `moduleId` bigint(20) unsigned NOT NULL,
  `sourceId` bigint(20) unsigned NOT NULL,
  `destinationId` bigint(20) unsigned NOT NULL,
  `relationshipGroup` bigint(20) unsigned NOT NULL,
  `typeId` bigint(20) unsigned NOT NULL,
  `characteristicTypeId` bigint(20) unsigned NOT NULL,
  `modifierId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId_idx` (`moduleId`),
  KEY `sourceId_idx` (`sourceId`),
  KEY `destinationId_idx` (`destinationId`),
  KEY `relationshipGroup_idx` (`relationshipGroup`),
  KEY `typeId_idx` (`typeId`),
  KEY `characteristicTypeId_idx` (`characteristicTypeId`),
  KEY `modifierId_idx` (`modifierId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

#select count(*) from rf2_full_language_refset # 624467
CREATE TABLE `rf2_full_language_refset` (
  `id` char(36) COLLATE utf8_unicode_ci NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` smallint(1) NOT NULL,
  `moduleId` bigint(20) unsigned NOT NULL,
  `refSetId` bigint(20) unsigned NOT NULL,
  `referencedComponentId` bigint(20) unsigned NOT NULL,
  `acceptabilityId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId` (`moduleId`),
  KEY `refSetId` (`refSetId`),
  KEY `referencedComponentId` (`referencedComponentId`),
  KEY `acceptabilityId` (`acceptabilityId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Solution

  • Diagnosing someone else's server without all parts of the puzzle is difficult, so I will make the following assumptions (correct me if I'm wrong):

    1. This isn't running on a DB-grade server
    2. The MySQL config hasn't been especially configured to this task

    For the examples you have shown your tables look over-indexed, and due to the large dataset I can imagine that the largest index file is too big to fit into MySQL's index cache.

    Without more information about the environment, it seems like the behaviour you describe is down to the configured buffers. At this point, the OS steps in and buffers the DB in memory.

    When you restart MySQL, it queries nice and fast because the OS is still holding the files in its buffers. When you stop accessing the DB, the OS will eventually unbuffer the files and you will be back to the slow initial query.

    I have seen the same behaviour on large indexes which exceed the configured buffer size, but this is a much larger set of data. Configuring the DB to have larger buffers for indexes and table data solved my particular problem. In my case, query speed came down from 10-15 seconds to milliseconds.

    http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size

    If you have some free RAM, try upping the Buffer Pool Size somewhat. The default size is 128Mb, and your largest index file will be roughly 279Mb (64 bytes * 4,582,286 rows). To start, try setting value in your config to 512Mb. Restart, retest. If it's still not great, add another 128Mb and repeat until it works. If this DB is on a dedicated machine, set the value quite high (50-75% total RAM) should be safe depending on your setup.

    A quick Google came up with a nice guide on what config values to fiddle with. http://www.tocker.ca/2013/09/17/what-to-tune-in-mysql-56-after-installation.html