performance search indexing ldap opendj

Does limiting an LDAP search by baseDN provide any benefit when the attribute being searched on has an index?


We are designing an LDAP schema (specifically for OpenDJ) and we primarily need to be able to search on the mail attribute. We don't need to do a substring search as the user would provide the whole email address when they log in.

We already have an index on the mail attribute. However, we are also considering subdividing the user directory by the first letter of the email address, so that all users whose email address starts with the letter A would be in an ou=A subdirectory under ou=users. The only value I can see in doing this is that when we search for a user by email, we can limit the baseDN of the search, reducing its scope to approximately 1/26 of the entire directory.

My primary question is: does limiting the baseDN of an LDAP search like this improve performance if the attribute already has an index? Do indexes take the baseDN into account, or do they cover the whole directory?
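To make the comparison concrete, here is a minimal Python sketch of the two search requests being compared. It only builds the request parameters and does not talk to a server. The dc=example,dc=com suffix, the SearchRequest type, and the function names are assumptions for illustration; the filter escaping follows RFC 4515.

```python
from dataclasses import dataclass

# Assumed suffix for illustration; the real directory suffix is not given.
USERS_BASE = "ou=users,dc=example,dc=com"


def escape_filter_value(value: str) -> str:
    """Escape the characters that are special in an LDAP filter (RFC 4515)."""
    replacements = {
        "\\": "\\5c",
        "*": "\\2a",
        "(": "\\28",
        ")": "\\29",
        "\0": "\\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical container for the three parameters that matter here."""
    base_dn: str
    scope: str
    filter: str


def flat_search(mail: str) -> SearchRequest:
    # One search over the whole users subtree, equality match on mail.
    return SearchRequest(USERS_BASE, "sub", f"(mail={escape_filter_value(mail)})")


def partitioned_search(mail: str) -> SearchRequest:
    # Same filter, but the base DN is narrowed to the first-letter container.
    # Assumes the address starts with an ASCII letter; anything else would
    # need its own container and proper DN escaping.
    if not mail or not (mail[0].isascii() and mail[0].isalpha()):
        raise ValueError("address must start with an ASCII letter")
    base = f"ou={mail[0].upper()},{USERS_BASE}"
    return SearchRequest(base, "sub", f"(mail={escape_filter_value(mail)})")
```

The two requests carry the same filter and differ only in base DN, which is exactly the difference the question is about.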

A secondary question, if I'm allowed: is there any other use for splitting the users directory by first letter (or any other arrangement), apart from providing a more specific baseDN when searching?


Solution

  • What you are describing looks like premature optimization, given that you don't yet know whether you have a performance issue. Also, indexing and query processing are not standardized parts of LDAP; they are implementation details of the server you are using.

    In OpenDJ, an index is configured and maintained for a whole database backend, so a narrower baseDN does not make the index lookup any cheaper. The cost of a lookup in the mail equality index that returns a single entry is essentially the same whether you have 1 entry or 1 billion entries.

    I have more than 20 years of experience with LDAP and directory services, and I have never seen a directory structured by splitting entries on the first letter of an attribute.
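To illustrate the point about a backend-wide index, here is a toy Python model. It is not OpenDJ's implementation: the ToyBackend class, its method names, the lowercase normalization, and the string-suffix DN check are all simplifying assumptions. The idea it sketches is that the equality index maps a mail value to candidate entries across the whole backend, and the baseDN is only checked against those few candidates afterwards.

```python
from collections import defaultdict


class ToyBackend:
    """Toy model of one backend with a single backend-wide equality index."""

    def __init__(self):
        self.entries = {}                    # dn -> attributes
        self.mail_index = defaultdict(set)   # normalized mail -> set of DNs

    def add(self, dn, mail):
        self.entries[dn] = {"mail": mail}
        # Lowercasing stands in for the attribute's case-insensitive matching.
        self.mail_index[mail.lower()].add(dn)

    def candidates(self, mail):
        # Index lookup: one keyed access, independent of any base DN.
        return set(self.mail_index.get(mail.lower(), set()))

    def search_by_mail(self, base_dn, mail):
        # The base DN is applied to the candidates only, not to the index.
        # Suffix comparison on strings is a simplification of DN matching.
        base = base_dn.lower()
        return sorted(
            dn for dn in self.candidates(mail)
            if dn.lower() == base or dn.lower().endswith("," + base)
        )
```

In this model, searching under ou=users and searching under ou=A,ou=users do the same index lookup and examine the same single candidate, so the narrower base DN saves nothing.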