I'm making one custom app that already has over 10.000.000 users.
I had to create one custom query that extracts users from a specific member type.
The problem is if I want to see only 20 users and do pagination, it needs a lot of time to get the data out of the database and often there a critical error occurs because the query is very heavy.
The same happens if I use BP_User_Query
(Buddypress class) because my code is based on it.
The biggest problem is that Buddypress member_type
has been recorded within WP terms and is reached via a complicated relation.
Here is my query:
SELECT
wp_users.*,
t_ex.name AS member_type,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='first_name' AND wp_usermeta.user_id=wp_users.ID ) AS first_name,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='last_name' AND wp_usermeta.user_id=wp_users.ID ) AS last_name,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='nickname' AND wp_usermeta.user_id=wp_users.ID ) AS nickname,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='description' AND wp_usermeta.user_id=wp_users.ID ) AS description,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='last_update' AND wp_usermeta.user_id=wp_users.ID ) AS last_update,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='wp_capabilities' AND wp_usermeta.user_id=wp_users.ID ) AS caps,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='rich_editing' AND wp_usermeta.user_id=wp_users.ID ) AS rich_editing
FROM
wp_users
LEFT JOIN wp_term_relationships tr_ex ON tr_ex.object_id = wp_users.ID
LEFT JOIN wp_term_taxonomy tt_ex ON tt_ex.term_taxonomy_id = tr_ex.term_taxonomy_id
LEFT JOIN wp_terms t_ex ON t_ex.term_id = tt_ex.term_id
WHERE
1=1
AND
tt_ex.taxonomy = 'bp_member_type'
AND
t_ex.name = 'platformuser'
GROUP BY wp_users.ID
ORDER BY wp_users.display_name ASC
LIMIT 0, 20
Is there any better solution how I can get the desired results without it being so difficult for the database.
The parameter I am looking for is located in t_ex.name
, in my case 'platformuser'
I think the problem is in group and order:
...
GROUP BY wp_users.ID
ORDER BY wp_users.display_name ASC
...
But don't know why. I need that also.
Some things to work with.
First of all, ten million users is a very large number for WordPress. You already know that.
Second, please consider rewriting your query to do a so-called "deferred join." You have the notorious performance antipattern SELECT many_long_rows ... ORDER BY something LIMIT small_number
. This forces MySQL to sort a whole mess of data, just to discard all but a tiny fraction of it.
Suggested rewrite: Start with a subquery to get the wp_users.ID
values you're interested in. Working from your example, the subquery is this.
SELECT wp_users.ID
FROM wp_users
LEFT JOIN wp_term_relationships tr_ex ON tr_ex.object_id = wp_users.ID
LEFT JOIN wp_term_taxonomy tt_ex ON tt_ex.term_taxonomy_id = tr_ex.term_taxonomy_id
LEFT JOIN wp_terms t_ex ON t_ex.term_id = tt_ex.term_id
WHERE 1=1
AND tt_ex.taxonomy = 'bp_member_type'
AND t_ex.name = 'platformuser'
GROUP BY wp_users.ID
ORDER BY wp_users.display_name ASC
LIMIT 0, 20
This has to sort a whole lot less data ... just the ID values. So it should be faster and take a whole lot less RAM in MySQL.
Then use that subquery in your main query to process the twenty users you care about.
SELECT
wp_users.*,
'platformuser' AS member_type,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='first_name' AND wp_usermeta.user_id=wp_users.ID ) AS first_name,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='last_name' AND wp_usermeta.user_id=wp_users.ID ) AS last_name,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='nickname' AND wp_usermeta.user_id=wp_users.ID ) AS nickname,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='description' AND wp_usermeta.user_id=wp_users.ID ) AS description,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='last_update' AND wp_usermeta.user_id=wp_users.ID ) AS last_update,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='wp_capabilities' AND wp_usermeta.user_id=wp_users.ID ) AS caps,
( SELECT meta_value FROM wp_usermeta WHERE wp_usermeta.meta_key='rich_editing' AND wp_usermeta.user_id=wp_users.ID ) AS rich_editing
FROM
wp_users
WHERE wp_users.ID IN (
SELECT wp_users.ID
FROM wp_users
LEFT JOIN wp_term_relationships tr_ex ON tr_ex.object_id = wp_users.ID
LEFT JOIN wp_term_taxonomy tt_ex ON tt_ex.term_taxonomy_id = tr_ex.term_taxonomy_id
LEFT JOIN wp_terms t_ex ON t_ex.term_id = tt_ex.term_id
WHERE 1=1
AND tt_ex.taxonomy = 'bp_member_type'
AND t_ex.name = 'platformuser'
GROUP BY wp_users.ID
ORDER BY wp_users.display_name ASC
LIMIT 0, 20
)
GROUP BY wp_users.ID
ORDER BY wp_users.display_name ASC
LIMIT 0, 20
It won't take long to look up the metadata for just twenty users.
Third, WordPress's wp_users
table, as defined by https://ma.tt and company, has no index on the display_name
column. So MySQL's index trickery for avoiding a gigantic sort doesn't work. You can remediate this
ORDER BY wp_users.user_nicename
in place of display_name
, or bydisplay_name
column like this.
ALTER TABLE wp_users ADD KEY display_name (display_name);
Fourth, if you're running an old version of MySQL (before 8) or MariaDB (before 10.3), upgrade it. And convert your tables to use InnoDB and DYNAMIC row format. You're pushing the envelope hard with ten million users. Why use obsolete software when you do that?
Fifth (much less important if you do the above things) the indexes on wp_usermeta are not optimal for the kind of lookup you do. This sequence of MySQL statements will reindex the table to give you much faster lookup. This may take a while.
ALTER TABLE wp_usermeta ADD UNIQUE KEY umeta_id (umeta_id);
ALTER TABLE wp_usermeta DROP PRIMARY KEY;
ALTER TABLE wp_usermeta ADD PRIMARY KEY (user_id, meta_key, umeta_id);
ALTER TABLE wp_usermeta DROP KEY user_id;
ALTER TABLE wp_usermeta DROP KEY meta_key;
ALTER TABLE wp_usermeta ADD KEY meta_key (meta_key, user_id);
@RickJames and I have published a (free open source) WordPress plugin to handle the InnoDB conversion and reindexing for you, or you can certainly do it yourself. You can use it with WP-CLI to avoid timeouts.
I think the next version of our plugin should also add the display_name
index to wp_users
to cover your case. (I've heard of others with a similar problem.)