Mysql学习系列之四:Mysql索引
时间:2008-09-19 来源:lincolnrainbow
MySQL索引简介... 1
一、索引分单列索引和组合索引……………………………………………………………………………………… 1
二、介绍一下索引的类型…………………………………………………………………………………………………. 1
三、单列索引和组合索引…………………………………………………………………………………………………. 2
四、使用索引... 4
CREATE INDEX Syntax. 4
Note. 7
MySQL索引简介
在数据库表中,使用索引可以大大提高查询速度。
All storage engines support at least 16 indexes per table and a total index length of at least 256 bytes. Most storage engines have higher limits.
假如我们创建了一个testIndex表:
CREATE TABLE testIndex(i_testID INT NOT NULL,vc_Name VARCHAR(16) NOT NULL);
我们随机向里面插入了1000条记录,其中有一条
i_testID vc_Name
555 erquan
在查找vc_Name="erquan"的记录
SELECT * FROM testIndex WHERE vc_Name='erquan';
时,如果在vc_Name上已经建立了索引,MySql无须任何扫描,即准确可找到该记录!相反,MySql会扫描所有记录,即要查询1000次啊~~可以索引将查询速度提高100倍。
一、索引分单列索引和组合索引
单列索引:即一个索引只包含单个列,一个表可以有多个单列索引,但这不是组合索引。
组合索引:即一个索包含多个列。
二、介绍一下索引的类型
1.普通索引。
这是最基本的索引,它没有任何限制。它有以下几种创建方式:
(1)创建索引:CREATE INDEX indexName ON tableName(tableColumns(length));如果是CHAR,VARCHAR类型,length可以小于字段实际长度;如果是 BLOB 和 TEXT 类型,必须指定length,下同。
(2)修改表结构:ALTER tableName ADD INDEX [indexName] ON (tableColumns(length))
(3)创建表的时候直接指定:CREATE TABLE tableName ( [...], INDEX [indexName] (tableColumns(length)) ;
2.唯一索引。
它与前面的"普通索引"类似,不同的就是:索引列的值必须唯一,但允许有空值。如果是组合索引,则列值的组合必须唯一。它有以下几种创建方式:
(1)创建索引:CREATE UNIQUE INDEX indexName ON tableName(tableColumns(length))
(2)修改表结构:ALTER tableName ADD UNIQUE [indexName] ON (tableColumns(length))
(3)创建表的时候直接指定:CREATE TABLE tableName ( [...], UNIQUE [indexName] (tableColumns(length));
3.主键索引
它是一种特殊的唯一索引,不允许有空值。一般是在建表的时候同时创建主键索引:CREATE TABLE testIndex(i_testID INT NOT NULL AUTO_INCREMENT,vc_Name VARCHAR(16) NOT NULL,PRIMARY KEY(i_testID)); 当然也可以用ALTER命令。
记住:一个表只能有一个主键。
4.全文索引
MySQL从3.23.23版开始支持全文索引和全文检索。这里不作讨论,呵呵~~
删除索引的语法:DROP INDEX index_name ON tableName
三、单列索引和组合索引
为了形象地对比两者,再建一个表:
CREATE TABLE myIndex ( i_testID INT NOT NULL AUTO_INCREMENT, vc_Name VARCHAR(50) NOT NULL, vc_City VARCHAR(50) NOT NULL, i_Age INT NOT NULL, i_SchoolID INT NOT NULL, PRIMARY KEY (i_testID) );
在这10000条记录里面7上8下地分布了5条vc_Name="erquan"的记录,只不过city,age,school的组合各不相同。
来看这条T-SQL:
SELECT i_testID FROM myIndex WHERE vc_Name='erquan' AND vc_City='郑州' AND i_Age=25;
首先考虑建单列索引:
在vc_Name列上建立了索引。执行T-SQL时,MYSQL很快将目标锁定在了vc_Name=erquan的5条记录上,取出来放到一中间 结果集。在这个结果集里,先排除掉vc_City不等于"郑州"的记录,再排除i_Age不等于25的记录,最后筛选出唯一的符合条件的记录。
虽然在vc_Name上建立了索引,查询时MYSQL不用扫描整张表,效率有所提高,但离我们的要求还有一定的距离。同样的,在vc_City和i_Age分别建立的单列索引的效率相似。
为了进一步榨取MySQL的效率,就要考虑建立组合索引。就是将vc_Name,vc_City,i_Age建到一个索引里:
ALTER TABLE myIndex ADD INDEX name_city_age (vc_Name(10),vc_City,i_Age);--注意了,建表时,vc_Name长度为50,这里为什么用10呢?因为一般情况下名字的长 度不会超过10,这样会加速索引查询速度,还会减少索引文件的大小,提高INSERT的更新速度。
执行T-SQL时,MySQL无须扫描任何记录就到找到唯一的记录!!
肯定有人要问了,如果分别在vc_Name,vc_City,i_Age上建立单列索引,让该表有3个单列索引,查询时和上述的组合索引效率一样 吧?嘿嘿,大不一样,远远低于我们的组合索引~~虽然此时有了三个索引,但MySQL只能用到其中的那个它认为似乎是最有效率的单列索引。
建立这样的组合索引,其实是相当于分别建立了
vc_Name,vc_City,i_Age
vc_Name,vc_City
vc_Name
这样的三个组合索引!为什么没有vc_City,i_Age等这样的组合索引呢?这是因为mysql组合索引"最左前缀"的结果。简单的理解就是只从最左面的开始组合。并不是只要包含这三列的查询都会用到该组合索引,下面的几个T-SQL会用到:
SELECT * FROM myIndex WHREE vc_Name="erquan" AND vc_City="郑州"
SELECT * FROM myIndex WHREE vc_Name="erquan"
而下面几个则不会用到:
SELECT * FROM myIndex WHREE i_Age=20 AND vc_City="郑州"
SELECT * FROM myIndex WHREE vc_City="郑州"
Another example:
Suppose that a table has the following specification:
CREATE TABLE test (
id INT NOT NULL,
last_name CHAR(30) NOT NULL,
first_name CHAR(30) NOT NULL,
PRIMARY KEY (id),
INDEX name (last_name,first_name)
);
The name index is an index over the last_name and first_name columns. The index can be used for queries that specify values in a known range for last_name, or for both last_name and first_name. Therefore, the name index is used in the following queries:
SELECT * FROM test WHERE last_name='Widenius';
SELECT * FROM test
WHERE last_name='Widenius' AND first_name='Michael';
SELECT * FROM test
WHERE last_name='Widenius'
AND (first_name='Michael' OR first_name='Monty');
SELECT * FROM test
WHERE last_name='Widenius'
AND first_name >='M' AND first_name < 'N';
However, the name index is not used in the following queries:
SELECT * FROM test WHERE first_name='Michael';
SELECT * FROM test
WHERE last_name='Widenius' OR first_name='Michael';
四、使用索引
到此你应该会建立、使用索引了吧?但什么情况下需要建立索引呢?一般来说,在WHERE和JOIN中出现的列需要建立索引,但也不完全如此,因为 MySQL只对 <,<=,=,>,>=,BETWEEN,IN,以及某些时候的LIKE(后面有说明)才会使用索引。
SELECT t.vc_Name FROM testIndex t LEFT JOIN myIndex m ON t.vc_Name=m.vc_Name WHERE m.i_Age=20 AND m.vc_City='郑州' 时,有对myIndex表的vc_City和i_Age建立索引的需要,由于testIndex表的vc_Name开出 现在了JOIN子句中,也有对它建立索引的必要。
刚才提到了,只有某些时候的LIKE才需建立索引?是的。因为在以通配符 % 和 _ 开头作查询时,MySQL不会使用索引,如
SELECT * FROM myIndex WHERE vc_Name like'erquan%'
会使用索引,而
SELECT * FROM myIndex WHEREt vc_Name like'%erquan'
就不会使用索引了。
五、索引的不足之处
上面说了那么多索引的好话,它真的有像传说中那么优秀么?当然会有缺点了。
1.虽然索引大大提高了查询速度,同时却会降低更新表的速度,如对表进行INSERT、UPDATE和DELETE。因为更新表时,MySQL不仅要保存数据,还要保存一下索引文件
2.建立索引会占用磁盘空间的索引文件。一般情况这个问题不太严重,但如果你在一个大表上创建了多种组合索引,索引文件的会膨胀很快。
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_type]
|
index_col_name: col_name [(length)] [ASC | DESC]
|
index_type: USING {BTREE | HASH | RTREE}
|
CREATE INDEX is mapped to an ALTER TABLE statement to create indexes. See Section 12.1.2, “ALTER TABLE Syntax”. CREATE INDEX cannot be used to create a PRIMARY KEY; use ALTER TABLE instead. For more information about indexes, see Section 7.4.5, “How MySQL Uses Indexes”.
Normally, you create all indexes on a table at the time the table itself is created with CREATE TABLE. See Section 12.1.5, “CREATE TABLE Syntax”. CREATE INDEX enables you to add indexes to existing tables.
A column list of the form (col1,col2,...) creates a multiple-column index. Index values are formed by concatenating the values of the given columns.
Indexes can be created that use only the leading part of column values, using col_name(length) syntax to specify an index prefix length:
· Prefixes can be specified for CHAR, VARCHAR, BINARY, and VARBINARY columns.
· BLOB and TEXT columns also can be indexed, but a prefix length must be given.
· Prefix lengths are given in characters for non-binary string types and in bytes for binary string types. That is, index entries consist of the first length characters of each column value for CHAR, VARCHAR, and TEXT columns, and the first length bytes of each column value for BINARY, VARBINARY, and BLOB columns.
· For spatial columns, prefix values can be given as described later in this section.
The statement shown here creates an index using the first 10 characters of the name column:
CREATE INDEX part_of_name ON customer (name(10));
If names in the column usually differ in the first 10 characters, this index should not be much slower than an index created from the entire name column. Also, using column prefixes for indexes can make the index file much smaller, which could save a lot of disk space and might also speed up INSERT operations.
Prefix lengths are storage engine-dependent (for example, a prefix can be up to 1000 bytes long for MyISAM tables, 767 bytes for InnoDB tables). Note that prefix limits are measured in bytes, whereas the prefix length in CREATE INDEX statements is interpreted as number of characters for non-binary data types (CHAR, VARCHAR, TEXT). Take this into account when specifying a prefix length for a column that uses a multi-byte character set. For example, utf8 columns require up to three index bytes per character.
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL. If you specify a prefix value for a column in a UNIQUE index, the column values must be unique within the prefix.
FULLTEXT indexes are supported only for MyISAM tables and can include only CHAR, VARCHAR, and TEXT columns. Indexing always happens over the entire column; column prefix indexing is not supported and any prefix length is ignored if specified. See Section 11.8, “Full-Text Search Functions”, for details of operation.
The MyISAM, InnoDB, NDB, BDB, and ARCHIVE storage engines support spatial columns such as (POINT and GEOMETRY. (Chapter 20, Spatial Extensions, describes the spatial data types.) However, support for spatial column indexing varies among engines. Spatial and non-spatial indexes are available according to the following rules.
Spatial indexes (created using SPATIAL INDEX):
· Available only for MyISAM tables. Specifying a SPATIAL INDEX for other storage engines results in an error.
· Indexed columns must be NOT NULL.
· In MySQL 5.0, the full width of each column is indexed by default, but column prefix lengths are allowed. However, as of MySQL 5.0.40, the length is not displayed in SHOW CREATE TABLE output. mysqldump uses that statement. As of that version, if a table with SPATIAL indexes containing prefixed columns is dumped and reloaded, the index is created with no prefixes. (The full column width of each column is indexed.)
Non-spatial indexes (created with INDEX, UNIQUE, or PRIMARY KEY):
· Allowed for any storage engine that supports spatial columns except ARCHIVE.
· Columns can be NULL unless the index is a primary key.
· For each spatial column in a non-SPATIAL index except POINT columns, a column prefix length must be specified. (This is the same requirement as for indexed BLOB columns.) The prefix length is given in bytes.
· The index type for a non-SPATIAL index depends on the storage engine. Currently, B-tree is used.
In MySQL 5.0:
· You can add an index on a column that can have NULL values only if you are using the MyISAM, InnoDB, BDB, or MEMORY storage engine.
· You can add an index on a BLOB or TEXT column only if you are using the MyISAM, BDB, or InnoDB storage engine.
An index_col_name specification can end with ASC or DESC. These keywords are allowed for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.
Some storage engines allow you to specify an index type when creating an index. The allowable index type values supported by different storage engines are shown in the following table. Where multiple index types are listed, the first one is the default when no index type specifier is given.
Storage Engine |
Allowable Index Types |
MyISAM |
BTREE, RTREE |
InnoDB |
BTREE |
MEMORY/HEAP |
HASH, BTREE |
NDB |
HASH, BTREE (see note in text) |
Note
BTREE indexes are implemented by the NDBCLUSTER storage engine as T-tree indexes.
For indexes on NDBCLUSTER table columns, the USING clause can be specified only for a unique index or primary key. In such cases, the USING HASH clause prevents the creation of an implicit ordered index. Without USING HASH, a statement defining a unique index or primary key automatically results in the creation of a HASH index in addition to the ordered index, both of which index the same set of columns.
The RTREE index type is allowable only for SPATIAL indexes.
If you specify an index type that is not legal for a given storage engine, but there is another index type available that the engine can use without affecting query results, the engine uses the available type.
Examples:
CREATE TABLE lookup (id INT) ENGINE = MEMORY;
CREATE INDEX id_index USING BTREE ON lookup (id);
TYPE type_name is recognized as a synonym for USING type_name. However, USING is the preferred form.
Before MySQL 5.0.60, the index_type option can be given only before the ON tbl_name clause. Use of the option in this position is deprecated as of 5.0.60; support for it is to be dropped in a future MySQL release. As of 5.0.60, the option should be given following the index column list. If an index_type option is given in both the earlier and later positions, the final option applies.
Previous / Next / Up / Table of Contents