databaseoracleindexingdatabase-fragmentation

What is Index fragmentation in Oracle SQL?


Can anyone explain what index fragmentation is? I googled Index Fragmentation but couldn't find satisfactory answer.


Solution

  • Leaving all unrelated technical aspects aside; Fragmentation in context of a database is ordered data being stored in a non-ordered way. This causes unwanted performance overhead causing slowdowns for the database.

    Say you have a table with employees. The index contains sorted data for accessing employees by their identification number. An index contains data stored in sequence. For simplicity reasons we have a table containing four employees:

    1   Anne
    3   Charly
    4   James
    5   William
    

    A database engine stores a few employees in a page. This is typically a fixed sized sorted bucket. So let's put the employees in a page. Let's assume that we can only put two employees in a page. We'd end up with:

    [ Page 1, next page is page 2, there is no previous page ]
        1   Anne
        2   Charly
    
    [ Page 2, there is no next page, but there is a previous page 1 ]
        4   James
        5   William
    

    Now the problem rises when we want to add Bert with identification number 3. It doesn't fit in either page. Not at the end of page 1, or at the start of page 2. We need to create a new page for Bert, and fix the references of (previous and next) pages so that they're still ordered.

    [ Page 1, next page is page 3, there is no previous page ]
        1   Anne
        2   Charly
    
    [ Page 2, there is no next page, but there is a previous page 3 ]
        4   James
        5   William
    
    [ Page 3, next page is page 2, previous page 1 ]
        3   Bert
    

    Note that page 3 is at the end of the list. The database engine can still start at page 1 and go through the pages in an ordered fashion; namely by going to the next page 3, and then to the next page 2. This isn't optimal however. The engine needs to jump back and forth in order to find it's data, instead of just walking from page 1 until the last page. This is exactly what index fragmentation is.

    We can defragment the index by sorting (and rebuilding) the pages again. I'll spare you the specific steps, but the result of this, is that the pages are in-order and that the data it contains is ordered as well.

    [ Page 1, next page is page 2, there is no previous page ]
        1   Anne
        2   Charly
    
    [ Page 2, next page is page 3, previous page 1  ]
        3   Bert
        4   James
    
    [ Page 3, there is no next page, previous page 2 ]
        5   William
    

    You might ask, why not do this right away? There's always a trade-off. Changing the least amount of data (in this case the pages), bothers the least amount of other users (in this case other queries or changes to the database). In the scenario where the new page is placed at the end, we only need to change a few pages. If we would update the index to be fully ordered, it would require changing most of the pages, if not all of them. While changing a page, other changes to the same page (or worse) have to wait for the former change to be committed.