The following shortened query selects all rows (entrys
) inside a given distance (calculated from user e.altloc = 0: location
or e.altloc = 1: altlocation
).
I have indexes on e.uid, al.eid, e.country, e.tmstmp and the id
's are primary keys.
The problem according to explain all rows are need to process the query, instead of 2 rows which I like with limit 2.
I have read this question, but I'm not able to do the Limit before using a join, because I need to join the location tables before I can do the limit 2
else the return would be wrong.
https://dba.stackexchange.com/questions/52079/does-using-limit-improve-the-performance-and-is-it-noticeable
The query:
SELECT
e.id, e.uid, e.title, e.description, l.place, l.placenonce, al.altplace, al.altplacenonce,
IF(e.altloc=0,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(lat, UNHEX('###'), latnonce) ) ) * cos( radians( AES_DECRYPT(lng, UNHEX('###'), lngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(lat, UNHEX('###'), latnonce))) ) ,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(altlat, UNHEX('###'), altlatnonce) ) ) * cos( radians( AES_DECRYPT(altlng, UNHEX('###'), altlngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(altlat, UNHEX('###'), altlatnonce))) )
) AS distance
FROM
entrys e
INNER JOIN
location l
ON l.id = e.uid
LEFT JOIN
altlocation al
ON al.eid = e.id
WHERE
IF(:border = 0, e.country = :countryid, e.country != 0 )
HAVING
distance <= 50
ORDER BY
e.tmstmp
DESC
LIMIT 2
A second example with a fixed location:
SELECT
s.id, s.image, s.description, s.title,
(
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(l.lat, :key, l.latnonce) ) ) * cos( radians( AES_DECRYPT(l.lng, :key, l.lngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(l.lat, :key, l.latnonce))) )
) AS distance
FROM
sponsors s
INNER JOIN
location l
ON l.id = s.id
WHERE
s.comp = 1 OR s.comp = 3 AND s.active = 1
HAVING
distance <= 50
ORDER BY
s.rotate
ASC
LIMIT 2
How to improve this location based querys, if there are million of rows in my database? I need to output only 2 rows of each query.
Create table for the first example:
CREATE TABLE `entrys` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uid` int(5) NOT NULL,
`tmstmp` bigint(11) NOT NULL,
`approx_lat` mediumint(9) NOT NULL,
`approx_lng` mediumint(9) NOT NULL,
`altloc` tinyint(4) NOT NULL,
`title` varchar(70) COLLATE latin1_general_ci NOT NULL,
`description` text COLLATE latin1_general_ci NOT NULL,
`country` tinyint(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `uid` (`uid`),
KEY `tmstmp` (`tmstmp`),
KEY `country` (`country`),
) ENGINE=MyISAM CHARSET=latin1 COLLATE=latin1_general_ci
CREATE TABLE `location` (
`id` int(5) NOT NULL,
`lat` varbinary(50) NOT NULL,
`latnonce` varbinary(25) NOT NULL,
`lng` varbinary(50) NOT NULL,
`lngnonce` varbinary(25) NOT NULL,
`place` tinyblob NOT NULL,
`placenonce` tinyblob NOT NULL,
UNIQUE KEY `id` (`id`),
KEY `lat` (`lat`),
KEY `lng` (`lng`)
)
CREATE TABLE `altlocation` (
`id` int(5) NOT NULL,
`eid` int(5) NOT NULL,
`altlat` varbinary(50) NOT NULL,
`altlatnonce` varbinary(25) NOT NULL,
`altlng` varbinary(50) NOT NULL,
`altlngnonce` varbinary(25) NOT NULL,
`altplace` tinyblob NOT NULL,
`altplacenonce` tinyblob NOT NULL,
UNIQUE KEY `eid` (`eid`),
KEY `altlat` (`altlat`),
KEY `altlng` (`altlng`)
)
sidenote: the engine for entrys should probably be innodb, having ~70% read. The location tables both running with innodb.
EDIT question to Willem Renzema for his answer:
Would it be more efficient like that?
SELECT
e.id, e.uid, e.title, e.description, l.place, l.placenonce, al.altplace, al.altplacenonce,
IF(e.altloc=0,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(lat, UNHEX('###'), latnonce) ) ) * cos( radians( AES_DECRYPT(lng, UNHEX('###'), lngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(lat, UNHEX('###'), latnonce))) ) ,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(altlat, UNHEX('###'), altlatnonce) ) ) * cos( radians( AES_DECRYPT(altlng, UNHEX('###'), altlngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(altlat, UNHEX('###'), altlatnonce))) )
) AS distance
FROM
(
SELECT id, uid, title, description
FROM
entrys
WHERE
approx_lat > :min_lat
AND approx_lat < :max_lat
AND approx_lng > :min_lng
AND approx_lng < :min_lng
ORDER BY
e.tmstmp
DESC
LIMIT 2
) AS e
INNER JOIN
location l
ON l.id = uid
LEFT JOIN
altlocation al
ON al.eid = e.id
HAVING
distance <= 50
If I would add approx_lat and approx_lng to the entry table.
The clue would be moving approx_lat and approx_lng to the entry table, that I could insert altlocation OR location only, so I could get rid off IF
inside the query.
Is HAVING distance <= 50
still necessary?
Use a bounding box in your query.
Example (only changes are in the WHERE clause):
SELECT
e.id, e.uid, e.title, e.description, l.place, l.placenonce, al.altplace, al.altplacenonce,
IF(e.altloc=0,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(lat, UNHEX('###'), latnonce) ) ) * cos( radians( AES_DECRYPT(lng, UNHEX('###'), lngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(lat, UNHEX('###'), latnonce))) ) ,
6371 * acos( cos( radians(:lat) ) * cos( radians( AES_DECRYPT(altlat, UNHEX('###'), altlatnonce) ) ) * cos( radians( AES_DECRYPT(altlng, UNHEX('###'), altlngnonce) ) - radians(:lng) ) + sin( radians(:lat) ) * sin(radians(AES_DECRYPT(altlat, UNHEX('###'), altlatnonce))) )
) AS distance
FROM
entrys e
INNER JOIN
location l
ON l.id = e.uid
LEFT JOIN
altlocation al
ON al.eid = e.id
WHERE
e.country = :countryid
AND l.approx_lat > :min_lat
AND l.approx_lat < :max_lat
AND l.approx_lng > :min_lng
AND l.approx_lng < :min_long
HAVING
distance <= 50
ORDER BY
e.tmstmp
DESC
LIMIT 2
You would compute the :min_lat
, :max_lat
, :min_lng
, and :max_lng
before you execute the query. Those values would be generated from the desired radius from your :lat
and :lng
values (in this case, 50).
How exactly to do that I suggest reading one of the many other answers, such as this one, that are all over the internet. Just search for geolocation bounding box to get started.
Then, you can further improve performance by adding an index on the approx_lat
and approx_lng
columns. You can also try adding a couple composite indexes, of (approx_lat,approx_lng)
and/or (approx_lng,approx_lat)
, as the optimizer MAY be able to make use of those. However, those are things I would strongly advise benchmarking to see if they provide any improvement or not. Additional columns to make these covering indexes may also help, but I'm focusing on the most basic issues at the moment.
Note that what you are trying to optimize is already a difficult optimization problem. The fact that you need to encrypt your data, makes it even harder. However, so long as you can store these approximate values, we can bypass most of that extra difficulty.
I would also strongly advice you to keep IF
logic out of your WHERE
clause. By including that, you force the optimizer to look up every single record to see if it matches that condition or not.
In general, to get good performance, you need to limit the number of records that need to be checked. The IF
statement cannot be optimized (it is not sargable). This is also why my answer requires you to store approximate values in order to be effective. If the data has to be decrypted first, that means that every single record has to be looked up and checked. That is what is going to kill your performance.
Also note that in my example query I ignored the altlocation
table in the WHERE
clause. Ideally, if location
and altlocation
are identical, you should have only a single table for the data, and then JOIN to that table on the from a record that records a location id
as being a primary or an "alternate".
I hope this can at least help you on the right direction.