All About Indexes

All About Indexes in Oracle

What is an Index?

An index is used to increase read access performance. A book, having an index, allows rapid access to a particular subject area within that book. Indexing a database table provides rapid location of specific rows within that table, where indexes are used to optimize the speed of access to rows. When indexes are not used or are not matched by SQL statements submitted to that database then a full table scan is executed. A full table scan will read all the data in a table to find a specific row or set of rows, this is extremely inefficient when there are many rows in the table.

i  It is often more efficient to full table scan small tables. The optimizer will often assess full table scan on small tables as being more efficient than reading both index and data space, particularly where a range scan rather than an exact match would be used against the index.

An index of columns on a table contains a one-to-one ratio of rows between index and indexed table, excluding binary key groupings, more on this later. An index is effectively a separate table to that of the data table. Tables and indexes are often referred to as data and index spaces. An index contains the indexed columns plus a ROWID value for each of those column combination rows. When an index is searched through the indexed columns rather than all the data in the row of a table is scanned. The index space ROWID is then used to access the table row directly in the data space. An index row is generally much smaller than a table row, thus more index rows are stored in the same physical space, a block. As a result less of the database is accessed when using indexes as opposed to tables to search for data. This is the reason why indexes enhance performance.

The Basic “How to” of Indexing

There are a number of important factors with respect to efficient and effective creation and use of indexing.

â  The number of indexes per table.

â  The number of table columns to be indexed.

â  What datatypes are sensible for use in columns to be indexed?

â  Types of indexes from numerous forms of indexes available.

â  How does SQL behave with indexes?

â  What should be indexed?

â  What should not be indexed?

Number of Indexes per Table

Whenever a table is inserted into, updated or deleted from, all indexes plus the table must be updated. Thus if one places ten indexes onto a single table then every change to that table requires an effective change to a single table and ten indexes. The result is that performance will be substantially degraded since one insert requires eleven inserts to insert the new row into both data and index spaces. Be frugal with indexing and be conscious of the potential ill as well as the good effects produced by indexing. The general rule is that the more dynamic a table is the fewer indexes it should have.

i  A dynamic table is a table changes constantly, such as a transactions table. Catalog tables on the other hand store information such as customer details; customers change a lot less often than invoices. Customer details are thus static in nature and over-indexing may be advantageous to performance.

Number of Columns to Index

Composite indexes are indexes made up of multiple columns. Minimize on the number of columns in a composite key. Create indexes with single columns. Composite indexes are often a requirement of traditional relational database table structures.

i  With the advent of object-oriented application programming languages such as Java, sequence identifiers tend to be used to identify every row in every table uniquely. The result is single column indexes for every table. The only exceptions are generally many-to-many join resolution entities.

It may sometimes be better to exclude some of the lower-level or less relevant columns from the index since at that level there may not be much data, if there are not many rows to index it can be more efficient to read a group of rows from the data space. For instance, a composite index comprised of five columns could be reduced to the first three columns based on a limited number of rows traversed as a result of ignoring the last two columns. Look at your data carefully when constructing indexes. The more columns you add to a composite index the slower the search will be since there is a more complex requirement for that search and the indexes get physically larger. The benefit of indexes is that an index occupies less physical space than the data. If the index gets so large that it is as large as the data then it will become less efficient to read both the index and data spaces rather than just the data space.

i  Most database experts recommend a maximum of three columns for composite keys.

Datatypes of Index Columns

Integers make the most efficient indexes. Try to always create indexes on columns with fixed length values. Avoid using VARCHAR2 and any object data types. Use integers if possible or fixed length, short strings. Also try to avaoid indexing on dates and floating-point values. If using dates be sure to use the internal representation or just the date, not the date and the time. Use integer generating sequences wherever possible to create consistently sequential values.

Types of Indexes

There are different types of indexes available in different databases. These different indexes are applicable under specific circumstances, generally for specific search patterns, for instance exact matches or range matches.

The simplest form of indexing is no index at all, a heap structure. A heap structure is effectively a collection of data units, rows, which is completely unordered. The most commonly used indexed structure is a Btree (Binary Tree). A Btree index is best used for exact matches and range searches. Other methods of indexing exist.

1. Hashing algorithms produce a pre-calculated best guess on general row location and are best used for exact matches.

2. ISAM or Indexed Sequential Access Method indexes are not used in Oracle.

3. Bitmaps contain maps of zero’s and 1’s and can be highly efficient access methods for read-only data.

4. There are other types of indexing which involve clustering of data with indexes.

In general every index type other than a Btree involves overflow. When an index is required to overflow it means that the index itself cannot be changed when rows are added, changed or removed. The result is inefficiency because a search to find overflowing data involves a search through originally indexed rows plus overflowing rows. Overflow index space is normally not ordered. A Btree index can be altered by changes to data. The only exception to a Btree index coping with data changes in Oracle is deletion of rows. When rows are deleted from a table, physical space previously used by the index for the deleted row is never reclaimed unless the index is rebuilt. Rebuilding of Btree indexes is far less common than that for other types of indexes since non-Btree indexes simply overflow when row changes are applied to them.

Oracle uses has the following types of indexing available.

â  Btree index. A Btree is a binary tree. General all-round index and common in OLTP systems. An Oracle Btree index has three layers, the first two are branch node layers and the third, the lowest, contains leaf nodes. The branch nodes contain pointers to the lower level branch or leaf node. Leaf nodes contain index column values plus a ROWID pointer to the table row. The branch and leaf nodes are optimally arranged in the tree such that each branch will contain an equal number of branch or leaf nodes.

â  Bitmap index. Bitmap containing binary representations for each row. A zero implies that a row does not have a specified value and a 1 denotes that row having that value. Bitmaps are very susceptible to overflow in OLTP systems and should only be used for read-only data such as in Data Warehouses.

â  Function-Based index. Contains the result of an expression pre-calculated on each row in a table.

â  Index Organized Tables. Clusters index and data spaces together physically for a single table and orders the merged physical space in the order of the index, usually the primary key. An index organized table is a table as well as an index, the two are merged.

â  Clusters. Partial merge of index and data spaces, ordered by an index, not necessarily the primary key. A cluster is similar to an index organized table except that it can be built on a join (more than a single table). Clusters can be ordered using binary tree structures or hashing algorithms. A cluster could also be viewed as a table as well as an index since clustering partially merges index and data spaces.

â  Bitmap Join index. Creates a single bitmap for one table in a join.

â  Domain index. Specific to certain application types using contextual or spatial data, amongst others.

Indexing Attributes

Various types of indexes can have specific attributes or behaviors applied to them. These behaviors are listed below, some are Oracle specific and some are not.

â  Ascending or Descending. Indexes can be order in either way.

â  Uniqueness. Indexes can be unique or non-unique. Primary keys must be unique since a primary key uniquely identifies a row in a table referentially. Other columns such as names sometimes have unique constraints or indexes, or both, added to them.

â  Composites. A composite index is an index made up of more than one column in a table.

â  Compression. Applies to Btree indexes where duplicated prefix values are removed. Compression speeds up data retrieval but can slow down table changes.

â  Reverse keys. Bytes for all columns in the index are reversed, retaining the order of the columns. Reverse keys can help performance in clustered server environments (Oracle8i Parallel Server / RAC Oracle9i) by ensuring that changes to similar key values will be better physically spread. Reverse key indexing can apply to rows inserted into OLTP tables using sequence integer generators, where each number is very close to the previous number. When searching for and updating rows with sequence identifiers, where rows are searched for

â  Null values. Null values are generally not included in indexes.

â  Sorting (NOSORT). This option is Oracle specific and does not sort an index. This assumes that data space is physically ordered in the desired manner.

What SQL does with Indexes

In general a SQL statement will attempt to match the structure of itself to an index, the where clause ordering will attempt to match available indexes and use them if possible. If no index is matched then a full table scan will be executed. A table scan is extremely inefficient for anything but the smallest of tables. Obviously if a table is read sequentially, in physical order then an index is not required. A table does not always need an index.

What to Index

Use indexes where frequent queries are performed with where and order by clause matching the ordering of columns in those indexes. Use indexing generally on larger tables or multi-table, complex joins. Indexes are best created in the situations listed below.

â  Columns used in joins.

â  Columns used in where clauses.

â  Columns used in order by clauses.

â  In most relational databases the order-by clause is generally executed on the subset retrieved by the where clause, not the entire data space. This is not always unfortunately the case for Oracle.

i  Traditionally the order-by clause should never include the columns contained in the where cause. The only case where the order-by clause will include columns contained in the where clause is the case of the where clause not matching any index in the database or a requirement for the order by clause to override the sort order of the where, typically in highly complex, multi-table joins.

â  The group-by clause can be enhanced by indexing when the range of values being grouped is small in relation to the number of rows in the table selected.

What not to Index

Indexes will degrade performance of inserts, updates and deletes, sometimes substantially.

â  Tables with a small number of rows.

â  Static tables.

â  Columns with a wide range of values.

â  Tables changed frequently and with a low amount of data retrieval.

â  Columns not used in data access query select statements.

Tuning Oracle SQL Code and Using Indexes

What is SQL Tuning?

Tune SQL based on the nature of your application, OLTP or read-only Data Warehouse. OLTP applications have high volumes of concurrent transactions and are better served with exact match SQL where many transactions compete for small amounts of data. Read-only Data Warehouses require rapid access to large amounts of information at once and thus many records are accessed at once, either by many or a small number of sessions.

The EXPLAIN PLAN command can be used to compare different versions of SQL statements, and tune your application SQL code as required. When tuning OLTP applications utilize sharing of SQL code in PL/SQL procedures and do not use triggers unless absolutely necessary. Triggers can cause problems such as self-mutating transactions where a table can expect a lock on a row already locked by the same transaction. This is because triggers do not allow transaction termination commands such as COMMIT and ROLLBACK. In short, do not use triggers unless absolutely necessary.