<h3>Introduction</h3> In this tutorial we are going to review indexes. I never cease to be amazed at how many people do not know what indexes are or how to use them. An index is simply a data structure that provides a fast access path to rows in a table based on the values in one or more columns (the index key). This allows for fast search techniques to be used to find the values ,rather than having to scan the entire table row by row, which results in much faster data retrieval. This can be a big performance booster in your environment, lowering the amount of time it takes to run a SELECT query and get data back.
The concept of an index is to make it easier for the DBMS to relate data to an index to provide fast access to the rows in the table. Lets say we have a table called customers with the following columns: name, address, city, state, phone number. The index would be applied to the name column. This would result in an ordered list of keys (the customers name) along with the locations of the associated rows in the customers table. The rows in the table would be in a relatively random order, but because the index is in alphabetical order by name, it can be searched quickly to locate a specific customer. Once the customer is found, the DBMS can use that information to go directly to the correct row(s) in the table and avoid having to perform a slow sequential scan.
Once the index is created, the DBMS will use the index when it determines that the index would speed up data retrieval. When you create a primary key for a table, the DBMS will automatically create an index using the key or require you to create a unique index for the primary key. That means that the first step in inserting a new row into a table is to verify that the primary key (index) is unique to the index. Uniqueness is therefore enforced on index entries, rather than the base table, as verifying uniqueness directly on the base table would be much slower than on the ordered index. When it comes to primary keys, the DBMS will create an index on them automatically. Additionally, you may create indexes to provide fast access to any column, or combination of columns, if you choose.
Now you may be wondering why we don't index every column, and the reasons for that are pretty simple. First, and most importantly if you are on a smaller server, an index takes up disk space in the database. If you have the money, which doesn't require much since disk space can be bought for such a cheap price, this isn't a huge drawback. It all comes down to the size of your server and the size of your already existing database. The second reason comes down to modifications of the indexed column. Any time data is INSERTED, UPDATED or DELETED against the indexed column, the DBMS must update the index as well as the base table. This can slow down data modification, especially in larger tables, so knowing ahead of time how many reads an indexed column will have is good to know. It will all come down to reads VS. writes on that column, and if the benefits outweigh the cons. You can typically be pretty safe creating indexes on foreign keys, columns frequently used in WHERE clauses and columns whos table has a high volume of reads and very little writes. Remember that you can always delete, and create, indexes at a later time. <h3>That's great, but does it really help?</h3> The answer to this question is YES. Lets take the customer table example we used earlier. Say we are a large credit company with thousands of customers...we'll say 100,000 customers (this is not unrealistic). Now if we were to do a search against this table with no indexes, you'd be looking at sequentially searching through about 50,000 names before you find the one you want. If there are duplicate names in the table, you're looking at searching all 100,000 names. With an index this list of names becomes alphabetically ordered. The DBMS is going to take the name, start somewhere in the middle of the list, and determine if the name occurs before or after the randomly selected one (or if it got lucky and selected it right off). From there we have cut the list in half and the DBMS will take the half that contains the name and cut it in half. It will do this a few times until it finds the name we want. If the name does not exist we've still only performed a handful of searches, as opposed to searching the entire table of 100,000 entries.