Understand Indexing in MongoDB

MongoDB indexes
Indexes are special data structures that store a small portion of a collection's data in a form that is easy to traverse. An index stores the value of a specific field or set of fields, ordered by the value of that field. This ordering supports efficient equality matches and range-based query operations.

What is indexing
MongoDB uses indexes to make query processing more efficient. An index works like the index of a book: to find a topic, we first look it up in the index, read its page number, and go straight to that page. MongoDB follows the same flow when a collection has a suitable index: it first checks the index, which points to the matching documents, and then returns them.

How do Indexes work?
Collection Scan
When we query a collection without an index, the database has to examine every document. As the collection grows, each query has to look through an ever larger number of documents.
COLLSCAN scans the entire collection to find the documents that match the query criteria.

Index Scan
Rather than searching through every document, MongoDB can search the ordered index first. Each index entry is a key-value pair: the key is the value of the indexed field, and the value is a reference to the document that holds it.
* _id is automatically indexed
In one line: IXSCAN examines only the index entries that match the query, then fetches just the documents those entries point to.
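
The difference between the two scans can be sketched in plain JavaScript. This is a toy model under stated assumptions, not how MongoDB is implemented: an array stands in for the collection, a sorted array of entries stands in for the index, and the sample students and helper names are made up for illustration.

```javascript
// Toy model only: a plain array stands in for a collection, and a sorted
// array of { key, pos } entries stands in for the index.
const students = [
  { studentsId: 42, name: "Asha" },
  { studentsId: 7, name: "Ben" },
  { studentsId: 19, name: "Chen" },
  { studentsId: 3, name: "Dara" },
  { studentsId: 28, name: "Eli" },
];

// Collection scan: compare every document against the query value.
function collectionScan(docs, id) {
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (doc.studentsId === id) return { doc, examined };
  }
  return { doc: null, examined };
}

// Build the "index": entries sorted by key, each pointing at a document position.
function buildIndex(docs) {
  return docs
    .map((doc, pos) => ({ key: doc.studentsId, pos }))
    .sort((a, b) => a.key - b.key);
}

// Index scan: binary-search the ordered entries, then fetch one document.
function indexScan(docs, index, id) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid].key === id) return { doc: docs[index[mid].pos], examined };
    if (index[mid].key < id) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

const studentIndex = buildIndex(students);
const viaScan = collectionScan(students, 28);
const viaIndex = indexScan(students, studentIndex, 28);
console.log(viaScan.examined, viaIndex.examined);
```

Both lookups return the same student, but the scan touches every document while the index lookup touches only a couple of ordered entries. The gap widens as the collection grows.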

A collection can have many indexes. If different queries filter or sort on different fields, you can create a separate index for each of them.
MongoDB stores its indexes in a data structure called a B-tree.
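
To check whether a query used COLLSCAN or IXSCAN, you can inspect the result of explain(). The helper below is a minimal sketch: planStages is a hypothetical name, the sample object is hand-written rather than captured from a server, and the layout assumed is the unsharded one (some server versions nest the plan under winningPlan.queryPlan, which the helper also accepts).

```javascript
// Collect the stage names from an explain() result by walking the winning plan.
// Assumes the unsharded explain layout.
function planStages(explainOutput) {
  const winning = explainOutput.queryPlanner.winningPlan;
  const root = winning.queryPlan || winning;
  const stages = [];
  const visit = (node) => {
    if (!node) return;
    stages.push(node.stage);
    visit(node.inputStage);
    (node.inputStages || []).forEach(visit);
  };
  visit(root);
  return stages;
}

// Hand-written sample shaped like explain output (not captured from a server).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "studentsId_1" },
    },
  },
};

console.log(planStages(sampleExplain));
```

In mongosh, the same helper could be called as planStages(db.students.find({ studentsId: 28 }).explain()). A result containing IXSCAN means an index was used; a lone COLLSCAN means the whole collection was read.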

Index Types
MongoDB provides a number of different index types to support specific types of data and queries.
1. Single field Index: A single field index is an index on one field of a document. It supports sorting on that field in both ascending and descending order, because MongoDB can traverse the index in either direction.

db.students.createIndex({studentsId:1})

2. Compound Index: A compound index combines multiple fields in a single index structure, which helps queries that search, filter, or sort on those fields together. The entries are ordered by the first field, then by the second field within each value of the first, and so on.
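
As a rough illustration of that ordering, the sketch below sorts a few sample keys the way a compound index specification describes. The className and marks fields are hypothetical, and the comparator is a toy model, not MongoDB's own code.

```javascript
// Hypothetical fields. In mongosh the index would be created with:
//   db.students.createIndex({ className: 1, marks: -1 })
const compoundSpec = { className: 1, marks: -1 };

// Toy model: order index keys the way the spec describes, field by field.
// 1 means ascending, -1 means descending.
function compareBySpec(spec) {
  const fields = Object.entries(spec);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  };
}

const compoundEntries = [
  { className: "B", marks: 71 },
  { className: "A", marks: 64 },
  { className: "B", marks: 90 },
  { className: "A", marks: 88 },
].sort(compareBySpec(compoundSpec));

console.log(compoundEntries.map((e) => `${e.className}:${e.marks}`).join(" "));
```

Keys for class A come before class B, and within each class the higher marks come first, which is the order a query sorting by className ascending and marks descending would want.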

3. Multikey Index: MongoDB uses multikey indexes to index values stored in arrays. When we index a field that holds an array, MongoDB creates a separate index entry for each element of that array. With a multikey index we can efficiently find documents whose array contains a given item. You don't need to request a multikey index explicitly: MongoDB makes the index multikey automatically when the indexed field contains an array value.
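
The "one entry per array element" idea can be sketched as follows. This is a toy model with hypothetical documents and a hypothetical subjects field; it only shows the shape of the entries, not MongoDB's storage format.

```javascript
// Toy model of multikey indexing: one index entry per array element.
// In mongosh the command is the same as for a scalar field:
//   db.students.createIndex({ subjects: 1 })
const enrolled = [
  { _id: 1, subjects: ["math", "art"] },
  { _id: 2, subjects: ["physics", "math"] },
  { _id: 3, subjects: "history" }, // scalar value: a single entry
];

function buildMultikeyEntries(documents, field) {
  const entries = [];
  for (const doc of documents) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) entries.push({ key: value, id: doc._id });
  }
  // Keep the entries ordered by key, then by document id.
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.id - b.id
  );
}

const subjectEntries = buildMultikeyEntries(enrolled, "subjects");
const mathIds = subjectEntries.filter((e) => e.key === "math").map((e) => e.id);
console.log(subjectEntries.length, mathIds);
```

Three documents produce five entries, and looking up "math" leads directly to documents 1 and 2 without reading document 3.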

4. Geospatial Indexes: MongoDB provides two geospatial indexes, 2d indexes and 2dsphere indexes, for querying geospatial data. 2d indexes support queries on data stored as points on a two-dimensional plane, and they only support data stored as legacy coordinate pairs. 2dsphere indexes support queries on data in spherical (earth-like) geometry, and they support data stored as legacy coordinate pairs as well as GeoJSON objects. 2dsphere indexes support queries for inclusion, intersection, and proximity.
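
A proximity query against a 2dsphere index might be set up like this. It is a sketch under assumptions: the places collection, the location field, the nearQuery helper, and the coordinates are all made up for illustration.

```javascript
// Hypothetical collection "places" with a GeoJSON "location" field.
// In mongosh:
//   db.places.createIndex(geoIndexSpec)
//   db.places.find(nearQuery(-73.9857, 40.7484, 500))
const geoIndexSpec = { location: "2dsphere" };

// Build a proximity filter: documents within maxMeters of a point.
// GeoJSON coordinate order is [longitude, latitude].
function nearQuery(longitude, latitude, maxMeters) {
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [longitude, latitude] },
        $maxDistance: maxMeters,
      },
    },
  };
}

const geoFilter = nearQuery(-73.9857, 40.7484, 500);
console.log(JSON.stringify(geoFilter));
```

The helper only builds the filter document, so it can be inspected or reused without a database connection.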

5. Text Index: MongoDB supports query operations that perform a text search of string content. A text index lets us search for words in the string content of a collection. It can include any field whose value is a string or an array of strings. A collection can have at most one text index, but that index can cover multiple fields. A text index can also be part of a compound index.
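
A text index and a matching search filter could look like the sketch below. The articles collection, its title and body fields, and the textQuery helper are assumptions made for illustration.

```javascript
// Hypothetical collection "articles" with string fields "title" and "body".
// In mongosh:
//   db.articles.createIndex(textIndexSpec)
//   db.articles.find(textQuery("mongodb index"))
const textIndexSpec = { title: "text", body: "text" };

// Build a $text filter; the search string is a list of space-separated terms.
function textQuery(terms) {
  return { $text: { $search: terms } };
}

const textFilter = textQuery("mongodb index");
console.log(JSON.stringify(textFilter));
```

Because both fields are declared in one specification, this still counts as the single text index that a collection is allowed.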

6. Hashed Index: A hashed index stores hashes of the values of the indexed field (often the _id field) rather than the values themselves. It is mainly used for hashed sharding, where the hashed keys spread the data evenly across the shards of a cluster. Because hashing discards the order of the values, a hashed index supports equality matches but not range queries.
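
Why hashing helps with even distribution can be seen in a small simulation. Everything here is a simplifying assumption: the hash is FNV-1a standing in for MongoDB's own hash function, the cluster has four shards, and the range boundaries are fixed rather than rebalanced.

```javascript
// In mongosh a hashed index is declared like this (field choice is an assumption):
//   db.students.createIndex({ studentsId: "hashed" })
// The toy hash below (FNV-1a) is a stand-in, not the function MongoDB uses.
function toyHash(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force an unsigned 32-bit result
}

const SHARDS = 4;
const byHash = new Array(SHARDS).fill(0);
const byRange = new Array(SHARDS).fill(0);

// 1000 monotonically increasing ids, as an auto-incrementing key would produce.
for (let id = 1000; id < 2000; id++) {
  byHash[toyHash(String(id)) % SHARDS]++;
  // Fixed range boundaries at 250, 500, 750: every new id lands on the last shard.
  const rangeShard = id < 250 ? 0 : id < 500 ? 1 : id < 750 ? 2 : 3;
  byRange[rangeShard]++;
}

console.log("hashed:", byHash, "ranged:", byRange);
```

With range-based placement all the new ids pile onto one shard, while the hashed placement spreads them over every shard, which is the behavior hashed sharding is meant to provide.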