| 1 | +--- |
| 2 | +title: Optimize indexing performance by analyzing batch statistics |
| 3 | +description: Learn how to analyze the `progressTrace` to identify and resolve indexing bottlenecks in Meilisearch. |
| 4 | +--- |
| 5 | + |
| 6 | +# Optimize indexing performance by analyzing batch statistics |
| 7 | + |
| 8 | +Indexing performance can vary significantly depending on your dataset, index settings, and hardware. The [batch object](/reference/api/batches) provides information about the progress of asynchronous indexing operations. |
| 9 | + |
| 10 | +The `progressTrace` field within the batch object offers a detailed breakdown of where time is spent during the indexing process. By analyzing this data, you can identify bottlenecks and adjust configuration settings to improve indexing speed. |
| 11 | + |
| 12 | +## Understanding the `progressTrace` |
| 13 | + |
| 14 | +The `progressTrace` is a hierarchical trace showing each phase of indexing and how long it took. |
| 15 | +Each entry follows the structure: |
| 16 | + |
| 17 | +```json |
| 18 | +"processing tasks > indexing > extracting word proximity": "33.71s" |
| 19 | +``` |
| 20 | + |
| 21 | +This means: |
| 22 | + |
| 23 | +- The step occurred during **indexing**. |
| 24 | +- The subtask was **extracting word proximity**. |
| 25 | +- It took **33.71 seconds**. |
| 26 | + |
| 27 | +Your goal is to focus on the **longest-running steps** and understand which index settings or data characteristics influence them. |
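As a minimal sketch of that triage, the snippet below ranks the entries of a `progressTrace` object by duration. The helper names `parseDuration` and `rankSteps` are hypothetical, the sample values other than `33.71s` are illustrative, and support for units other than seconds (`ms`, `µs`, `ns`) is an assumption, since this guide only shows the `"<number>s"` form. Parent entries are skipped so that only the most specific steps are compared.

```javascript
// Hypothetical helpers, not part of any Meilisearch SDK.
// Assumed unit suffixes; only "s" appears in this guide.
const UNIT_TO_SECONDS = { ns: 1e-9, "µs": 1e-6, us: 1e-6, ms: 1e-3, s: 1 };

// Convert a duration string such as "33.71s" to seconds (NaN if unrecognized).
function parseDuration(text) {
  const match = /^(\d+(?:\.\d+)?)(ns|µs|us|ms|s)$/.exec(String(text).trim());
  return match ? Number(match[1]) * UNIT_TO_SECONDS[match[2]] : NaN;
}

// Return the leaf steps of a trace, longest first.
function rankSteps(progressTrace) {
  const keys = Object.keys(progressTrace);
  return keys
    // Keep only entries that have no child entry ("key > ...").
    .filter((key) => !keys.some((other) => other.startsWith(key + " > ")))
    .map((key) => ({
      step: key.split(" > ").pop(),
      key,
      seconds: parseDuration(progressTrace[key]),
    }))
    .filter((entry) => Number.isFinite(entry.seconds))
    .sort((a, b) => b.seconds - a.seconds);
}

// Illustrative trace; only the word proximity value comes from this guide.
const sampleTrace = {
  "processing tasks": "40.02s",
  "processing tasks > indexing": "39.50s",
  "processing tasks > indexing > extracting word proximity": "33.71s",
  "processing tasks > indexing > extracting words": "4.20s",
};

const ranked = rankSteps(sampleTrace);
console.log(ranked.map((entry) => `${entry.step}: ${entry.seconds}s`));
```

The first element of `ranked` is the step to investigate first in the tables below.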
| 28 | + |
| 29 | +## Key phases and how to optimize them |
| 30 | + |
| 31 | +### Document processing |
| 32 | + |
| 33 | +| Trace key | Description | Optimization | |
| 34 | +|------------|--------------|--------------| |
| 35 | +| `computing document changes`, `extracting documents` | Meilisearch compares incoming documents to existing ones. | No direct optimization possible. The duration scales with the number and size of incoming documents.| |
| 36 | + |
| 37 | +### Filterable attributes |
| 38 | + |
| 39 | +| Trace key | Description | Optimization | |
| 40 | +|------------|--------------|--------------| |
| 41 | +| `extracting facets`, `merging facet caches` | Extracts and merges filterable attributes. | Keep the number of [**filterable attributes**](/reference/api/settings#filterable-attributes) to a minimum. | |
| 42 | + |
| 43 | +### Searchable attributes |
| 44 | + |
| 45 | +| Trace key | Description | Optimization | |
| 46 | +|------------|--------------|--------------| |
| 47 | +| `extracting words`, `merging word caches` | Tokenizes text and builds the inverted index. | Ensure the [**searchable attributes**](/reference/api/settings#searchable-attributes) list includes only the fields that should be checked for query word matches. |
| 48 | + |
| 49 | +### Proximity precision |
| 50 | + |
| 51 | +| Trace key | Description | Optimization | |
| 52 | +|------------|--------------|--------------| |
| 53 | +| `extracting word proximity`, `merging word proximity` | Builds the data structures for phrase and attribute ranking. | Lower the precision of this operation by setting [proximity precision](/reference/api/settings#proximity-precision) to `byAttribute` instead of the default `byWord`. |
| 54 | + |
| 55 | +### Disk I/O and hardware bottlenecks |
| 56 | + |
| 57 | +| Trace key | Description | Optimization | |
| 58 | +|------------|--------------|--------------| |
| 59 | +| `waiting for database writes` | Time spent writing data to disk. | No direct optimization possible. Either the disk is slow or the amount of data to write is large. Avoid HDDs (Hard Disk Drives). |
| 60 | +| `waiting for extractors` | Time spent waiting for CPU-bound extraction. | No direct optimization possible. Indicates a CPU bottleneck. Use more cores or scale horizontally with [sharding](/learn/advanced/sharding). | |
| 61 | + |
| 62 | +### Facets and filterable attributes |
| 63 | + |
| 64 | +| Trace key | Description | Optimization | |
| 65 | +|------------|--------------|--------------| |
| 66 | +| `post processing facets > strings bulk` / `numbers bulk` | Processes equality or comparison filters. | - Disable unused [**filter features**](/reference/api/settings#features), such as comparison operators on string values. <br/>- Keep [**sortable attributes**](/reference/api/settings#sortable-attributes) to the minimum required. |
| 67 | +| `post processing facets > facet search` | Builds structures for the [facet search API](/reference/api/facet_search). | If you don’t use the facet search API, [disable it](/reference/api/settings#update-facet-search-settings).| |
| 68 | + |
| 69 | +### Embeddings |
| 70 | + |
| 71 | +| Trace key | Description | Optimization | |
| 72 | +|------------|--------------|--------------| |
| 73 | +| `writing embeddings to database` | Time spent saving vector embeddings. | - Use smaller embedding vectors when possible. <br/>- You can avoid recomputing embeddings on document update by [disabling embedding regeneration](/reference/api/documents#vectors). <br/>- Consider enabling [binary quantization](/reference/api/settings#binaryquantized) for your embedders. | |
| 74 | + |
| 75 | +### Word prefixes and post-processing |
| 76 | + |
| 77 | +| Trace key | Description | Optimization | |
| 78 | +|------------|--------------|--------------| |
| 79 | +| `post processing words > word prefix *` | Builds prefix data for autocomplete. This lets Meilisearch match documents containing words that begin with a query term, instead of only exact matches. | Disable [**prefix search**](/reference/api/settings#prefix-search) (`prefixSearch: disabled`) if not required. <br/> Note that this can severely impact search result relevancy. |
| 80 | +| `post processing words > word fst` | Builds the word FST (finite state transducer). | No direct action possible, as it depends on the number of different words in the database. Fewer searchable words can improve speed. | |
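Several of the optimizations in the tables above are index settings, so they can be sent together in one settings update. The sketch below only builds the request and does not send it. The helper name `buildSettingsRequest`, the host `http://localhost:7700`, the key `MASTER_KEY`, and the choice of which settings to change are assumptions for illustration. Check the setting names (`proximityPrecision`, `facetSearch`, `prefixSearch`) against the settings API reference for your Meilisearch version before applying them.

```javascript
// Hypothetical helper: build (but do not send) a settings update request.
function buildSettingsRequest(host, indexUid, apiKey, settings) {
  const base = host.replace(/\/+$/, ""); // drop any trailing slash
  return {
    url: `${base}/indexes/${encodeURIComponent(indexUid)}/settings`,
    options: {
      method: "PATCH",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(settings),
    },
  };
}

// Example trade-offs taken from the tables above; apply only those you accept.
const indexingFriendlySettings = {
  proximityPrecision: "byAttribute", // less precise proximity ranking
  facetSearch: false,                // only if the facet search API is unused
  prefixSearch: "disabled",          // can severely impact relevancy
};

const request = buildSettingsRequest(
  "http://localhost:7700", // placeholder host
  "INDEX_NAME",
  "MASTER_KEY",            // placeholder API key
  indexingFriendlySettings
);

// To apply it: await fetch(request.url, request.options);
console.log(request.url);
```

Building the request separately from sending it makes it easy to review the exact payload before changing a production index.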
| 81 | + |
| 82 | +## Example analysis |
| 83 | + |
| 84 | +If you see: |
| 85 | + |
| 86 | +```json |
| 87 | +"processing tasks > indexing > post processing facets > facet search": "1763.06s" |
| 88 | +``` |
| 89 | + |
| 90 | +This entry shows that the [facet search feature](/learn/filtering_and_sorting/search_with_facet_filters#searching-facet-values) alone took about 29 minutes. If your application doesn’t use it, disable it:
| 91 | + |
| 92 | +```javascript
| 93 | +client.index('INDEX_NAME').updateFacetSearch(false); |
| 94 | +``` |
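The same reasoning can be applied to any trace key. The following sketch condenses the tables in this guide into a lookup that returns a short hint for a given key. The names `HINTS` and `hintFor` are hypothetical, the hint texts are paraphrases of the tables rather than official wording, and matching by substring is a simplifying assumption.

```javascript
// Hypothetical lookup table: [substring of the trace key, hint].
const HINTS = [
  ["extracting facets", "Keep filterable attributes to a minimum."],
  ["merging facet caches", "Keep filterable attributes to a minimum."],
  ["extracting words", "Limit searchable attributes to the fields you need."],
  ["merging word caches", "Limit searchable attributes to the fields you need."],
  ["word proximity", "Consider setting proximity precision to byAttribute."],
  ["facet search", "Disable facet search if you do not use the facet search API."],
  ["word prefix", "Disable prefix search if not required (affects relevancy)."],
  ["writing embeddings", "Use smaller vectors or consider binary quantization."],
];

// Return the first matching hint, or null when the guide lists no direct action.
function hintFor(traceKey) {
  const found = HINTS.find(([needle]) => traceKey.includes(needle));
  return found ? found[1] : null;
}

console.log(
  hintFor("processing tasks > indexing > post processing facets > facet search")
);
```

Keys without a match, such as `waiting for database writes`, return `null` because the tables list no direct setting to change for them.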
| 95 | + |
| 96 | +## Learn more |
| 97 | + |
| 98 | +- [Indexing best practices](/learn/indexing/indexing_best_practices) |
| 99 | +- [Impact of RAM and multi-threading on indexing performance](/learn/indexing/ram_multithreading_performance)
| 101 | +- [Configuring index settings](/learn/configuration/configuring_index_settings) |