This article sets out to describe how VTEX's search system works.
Warning: VTEX has two search options - VTEX search and VTEX Intelligent Search. This article refers to the VTEX search. To learn more about the VTEX Intelligent Search application, see this track.
VTEX search engine product display prioritization
The VTEX search engine is an algorithm that uses the search term to decide which result is best to show the user.
In addition, the result displayed is meant to be the one most likely to generate sales conversion. The success of this result depends solely on the catalog master file (brand, department, category, product, specification, etc.).
Depending on the search term, the system renders one of the following result pages, in order of priority:
- Landing Page
- Brand
- Department
- Search by term (keyword)
1. Landing Page
If the search term corresponds to the name of a folder set up in Portal Manager (the landing page), this folder will be rendered.
Notice: a folder needs to have a layout in order to be searchable. Even if the search is done in a subfolder, the parent folder also requires a layout.
2. Brand
If the search term corresponds exactly to the name of, or a substitute word for, a brand registered in the master file, the system will render only products of this brand. If the system identifies two or more brands with the same substitute word (which is considered an inconsistency in the master file), it will render only the first one found (and its products).
The result will be a brand page, with the following in its source code: ``
3. Department
If the search term corresponds exactly to the name of, or a substitute word for, a registered department, the system will render only the products belonging to this department. If the system identifies two or more departments with the same substitute word (which is considered an inconsistency in the master file), it will render only the first one found (and its products).
The result of this page will be the Department page. Checking the source code will allow you to identify which result was displayed. The following comment should be found in the page code: ``
4. Search by keyword
If the system does not identify a Landing Page, Brand or Department that matches the search term, it applies a search by term (keyword).
The result of this page will be a search page. Checking the source code will allow you to identify which result was displayed. The following comment should be found in the page code: ``
When the search is by keyword, the search engine queries the indexer, which is responsible for the keyword search algorithm.
Ranking System: This algorithm uses the concept of ranking (Score) to prioritize and order products. For each search, the indexer assigns a score to each product based on the search term. Some basic fields, each with a different weight, are considered when calculating this ranking. The shop window display is built according to this ranking, from highest to lowest, meaning that the product with the highest score is displayed first and the one with the lowest score is displayed last.
For more information, see the article How does the Score field work?.
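The priority order described in items 1 to 4 can be sketched roughly as follows. This is only an illustration, not VTEX's implementation: the function names, the dictionary keys (`name`, `substitute_words`, `has_layout`) and the case-insensitive comparison are all assumptions made for the example.

```python
def _find_by_name_or_substitute(term, entries):
    """Return the first entry whose name or substitute word equals the term."""
    wanted = term.strip().lower()  # case-insensitive matching is an assumption
    for entry in entries:
        names = [entry["name"], *entry.get("substitute_words", [])]
        if any(wanted == n.strip().lower() for n in names):
            return entry  # first one found wins, as the article describes
    return None


def resolve_result_page(term, landing_pages, brands, departments):
    """Pick the result page type following the article's priority order."""
    wanted = term.strip().lower()

    # 1. Landing page: the folder name matches and the folder has a layout.
    for folder in landing_pages:
        if folder["name"].strip().lower() == wanted and folder.get("has_layout"):
            return ("landing_page", folder["name"])

    # 2. Brand: exact match on name or substitute word.
    brand = _find_by_name_or_substitute(term, brands)
    if brand is not None:
        return ("brand", brand["name"])

    # 3. Department: exact match on name or substitute word.
    department = _find_by_name_or_substitute(term, departments)
    if department is not None:
        return ("department", department["name"])

    # 4. Nothing matched: fall back to the keyword search.
    return ("keyword_search", term)
```

For instance, if two hypothetical brands share the substitute word "acme sports", only the first one in the list is returned, which mirrors the inconsistency case mentioned above.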
How the search is done
To understand VTEX search, we must first become familiar with the indexer and its update process.
Catalog Indexer
The catalog indexer is a scalable, fast-access database with configurable algorithms for result prioritization. It sits between the conventional database and the user. See the outline below:
The indexer contains all the information referring to the product catalog (products, SKUs, brands, departments, categories). The search engine uses this information to locate products and to display these in shop windows and search results. However, only already indexed products can be found by the search.
Updating process (Indexing)
Any changes made to the product (main data, price, inventory, collection, etc.) will generate an update in the indexer, meaning that whenever changes are applied to a product, it is sent to the end of the indexing queue.
When this indexing queue is consumed, the changes applied become available for display on the website. The indexing process is extremely safe and has “repêchage” (retry) rules: whenever an item is not indexed on the first attempt, the system makes further attempts.
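The queue-and-retry behavior described above can be pictured with a small sketch. Everything here is hypothetical: the article does not state how many attempts are made or where a failed item re-enters the queue, so `MAX_ATTEMPTS`, the `index_product` callback and the choice to re-queue at the end are assumptions for illustration only.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical limit; the article does not give a number


def consume_indexing_queue(queue, index_product, max_attempts=MAX_ATTEMPTS):
    """Process queued product IDs in FIFO order, retrying failed items.

    `queue` is a deque of product IDs; `index_product` is a callable that
    raises an exception when indexing fails.
    """
    attempts = {}
    indexed, failed = [], []
    while queue:
        product_id = queue.popleft()
        attempts[product_id] = attempts.get(product_id, 0) + 1
        try:
            index_product(product_id)
            indexed.append(product_id)
        except Exception:
            if attempts[product_id] < max_attempts:
                # Assumed behavior: a failed item goes back to the end of the queue.
                queue.append(product_id)
            else:
                failed.append(product_id)
    return indexed, failed
```

In this sketch a changed product is simply appended to the deque, which matches the statement that changes send the product to the end of the indexing queue.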
Fields and their weights
The following fields and their respective weights are used by the search algorithm when ranking a product:
- Product name: 2.8
- First product name: 2.5
- First and second product name: 1.2
- Full product name: 1.0
- Substitute words (product and brand): 0.7
- Product specifications (only for indexed text and indexed long text fields): 0.5
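To make the role of the weights concrete, here is a minimal sketch that ranks products by a weighted score. The weights are the ones listed above; how the indexer actually combines them is not described in this article, so summing the weights of the matched fields is an assumption, as are the field keys and the function names.

```python
# Weights taken from the list above; the keys are illustrative labels.
FIELD_WEIGHTS = {
    "product_name": 2.8,
    "first_product_name": 2.5,
    "first_and_second_product_name": 1.2,
    "full_product_name": 1.0,
    "substitute_words": 0.7,
    "product_specifications": 0.5,
}


def score(matched_fields):
    """Sum the weights of the fields that matched (additive model is assumed)."""
    return sum(FIELD_WEIGHTS[field] for field in matched_fields)


def rank(products):
    """Order product names from highest to lowest score.

    `products` maps a product name to the set of fields that matched the term.
    """
    return sorted(products, key=lambda name: score(products[name]), reverse=True)
```

With this model, a product matched only through a specification (0.5) sorts below one matched through a substitute word (0.7), and both sort below a product-name match (2.8).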
Example
Consider the following indexer contents:
Product name | Complementary product name | Substitute words | Product specifications |
---|---|---|---|
Soccer ball | Football 7 | soccer ball, football ball | White |
Ball | Football 7 | society | Soccer ball |
Soccer boot | Field | Soccer boot, Soccer boot | White |
Result:
- Soccer ball (Highest ranked, since the term corresponds exactly to the product name)
- Ball (Second best ranked, since the term corresponds to a specification value)
- Soccer boot (Third best ranked, since part of the term corresponds to part of the product name)
Result:
- Ball (Best ranked, since part of the term corresponds exactly to the product name)
- Soccer ball (Second best ranked, since part of the term searched for corresponds to part of the product name)
- Soccer boot (Third best ranked, since part of the term corresponds to a specification)
Hint: Despite their low weight in the indexer score, substitute words are an extremely important feature. They make it possible to reach users who search for misspelled or grammatically incorrect terms that carry the same meaning.