I'm trying to select first names from a lookup table at random in MySQL to build a test dataset. I have a table with 200 first names, genders and a row id going from 1 to 200. Something like this:
id firstname gender
1 Aaron m
2 Adam m
3 Alan m
etc...
I'm selecting from this table using a random generator with the following query:
SELECT id, firstname FROM firstname WHERE id = round(1 + (rand() * 199));
I am expecting the random number to tally up with exactly one id from the lookup table, thus producing a single results like
id firstname
43 Jason
Running the code again and again instead gives me a selection of
id firstname
29 Ethan
147 Jean
If I run the random generator on its own, it will always generate a number between 1 and 200. As you can see below, the id field is INT, and the query behaves the same way if I cast the result as SIGNED. I have also tried to use FLOOR instead of ROUND, just to see if that worked any differently - alas, no.
Can anyone tell my why the anomaly? What am I missing?
Here is some code to create the first 20 rows of the original table for testing purposes:
-- First Name --
drop table if exists firstname;
CREATE TABLE firstname (
id INT NOT NULL,
firstname VARCHAR(20) NOT NULL,
gender VARCHAR(1) NOT NULL,
PRIMARY KEY (id),
UNIQUE (firstname)
);
INSERT INTO firstname
(id,firstname,gender)
VALUES
(1,"Aaron","m"),
(2,"Adam","m"),
(3,"Alan","m"),
(4,"Albert","m"),
(5,"Alexander","m"),
(6,"Andrew","m"),
(7,"Anthony","m"),
(8,"Arthur","m"),
(9,"Austin","m"),
(10,"Benjamin","m"),
(11,"Billy","m"),
(12,"Bobby","m"),
(13,"Brandon","m"),
(14,"Brian","m"),
(15,"Bruce","m"),
(16,"Bryan","m"),
(17,"Carl","m"),
(18,"Charles","m"),
(19,"Christian","m"),
(20,"Christopher","m");
Since RAND()
is not deterministic, the WHERE condition is evaluated/executed once per each row. Thus each row has a chance of 1/199 to be selected. You can use a subquery in the FROM clause (derived table) instead to generate exactly one random number:
SELECT f.id, f.firstname
FROM firstname f
JOIN (SELECT floor(rand()*200)+1 as rnd) r ON r.rnd = f.id