Create SAI index
Use CREATE INDEX
to create secondary indexes on one or more columns after creating a keyspace and defining a table.
Secondary indexes created with SAI can be used to query a table using any column other than the table’s partition key. If a composite partition key is defined, a SAI index can be created on any of the partition key columns.
If you decide to create an index, always create either a secondary index (2i) or a storage-attached index (SAI) on a table column, but not both. CQL will be harder to tune and understanding performance will be more difficult if you create both types of indexes on the same table. See the difference between these index types in the overview. |
Syntax
To create an SAI index, define the table name and column name for the column to be indexed.
The full syntax for creating an SAI index in CQL is as follows:
CREATE CUSTOM INDEX index_name ON keyspace_name.table_name (column_name) USING 'StorageAttachedIndex' ;
The index name is optional, but the default index name might not be meaningful. Index names are unique per keyspace because indexes are created at the keyspace level and not at the table level.
The index name must be a unique identifier for the index for each table within a keyspace. Enclose in quotes to use special characters or preserve capitalization. If you do not specify an index name, CQL generates one for you, with the pattern <table_name>_<column_name>_idx. This requirement is true for all indexes.
If you use IF NOT EXISTS
in a CREATE [CUSTOM] INDEX
command, the command fails silently if an index with the same name already exists in the keyspace.
If you want the command to return an error when an index with the same name already exists, don’t use IF NOT EXISTS
.
For details about the options and syntax, see the reference for CREATE INDEX
.
Syntactical sugar
Some products have added syntactical sugar to make it easier to create SAI indexes. For example, you can create an SAI index using the following syntax in Hyper-Converged Database and Apache Cassandra 5.0:
CREATE INDEX index_name ON keyspace_name.table_name (column_name) USING 'SAI' ;
Examples
Create a simple SAI index:
CREATE CUSTOM INDEX id_idx ON cycling.cyclist_alt_stats (id)
USING 'StorageAttachedIndex';
Create SAI indexes with options:
CREATE CUSTOM INDEX lastname_idx ON cycling.cyclist_semi_pro (lastname)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX age_idx ON cycling.cyclist_semi_pro (age)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX country_idx ON cycling.cyclist_semi_pro (country)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX registration_idx ON cycling.cyclist_semi_pro (registration)
USING 'StorageAttachedIndex';
For more examples, see SAI examples.
Vector index options
Vector indexes support source_model
and similarity_function
options for the CREATE INDEX
command:
Parameter | Description | Default |
---|---|---|
|
Configures the index for optimal performance for your vectors.
Options are: |
|
|
Specified similarity function for your index.
If you selected a |
|
For example:
-
Create the following table:
CREATE TABLE IF NOT EXISTS cycling.comments_vs ( record_id timeuuid, id uuid, commenter text, comment text, comment_vector VECTOR <FLOAT, 5>, created_at timestamp, PRIMARY KEY (id, created_at) ) WITH CLUSTERING ORDER BY (created_at DESC);
-
Create an index on the
comment_vector
column with the similarity function set todot_product
:CREATE CUSTOM INDEX comment_sim_function_idx ON cycling.comments_vs (comment_vector) USING 'StorageAttachedIndex' WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT'};
-
Create an index on the
comment_vector
column with the source model set tobert
:
CREATE CUSTOM INDEX comment_source_model_idx
ON cycling.comments_vs (comment_vector) USING 'StorageAttachedIndex'
WITH OPTIONS = { 'source_model': 'bert'};