Organizing a search on a website: choosing between Bitrix, Sphinx and Elasticsearch

The average user should feel that searching a site is easy. They are used to the intelligent search of Google and Yandex, and by default expect the same from any other search bar. You type, for example, “teliskp” or “porocytomol” and get a list of optical devices and pharmaceuticals, along with the catalog sections in which they are located.

How does a website search engine understand what the user actually meant? Magic or science? Let’s try to figure out how neglecting internal search threatens a business, and how good search shortens the user’s path and improves conversion.

The Impact of Search Bar on Conversion

The search bar on your website is part of your sales funnel.

Zero results for a query = a lost client. If the user did not find what they were looking for, you will not sell it to them.

 

Customers who use site search can generate up to 50% of revenue for a commercial Internet resource.

Let’s compare two audiences of an online store: one uses internal search, the other does not. The audience that uses internal search is only 12% of the total, yet it brings in 43% of the store’s income.
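To make these two percentages easier to compare, here is a small worked calculation. It only uses the 12% and 43% figures quoted above; the function name is made up for illustration. Under those figures, an average visitor who searches brings roughly 5.5 times more revenue than one who does not.

```python
# Figures quoted in the article.
SEARCH_AUDIENCE_SHARE = 0.12  # share of visitors who use internal search
SEARCH_REVENUE_SHARE = 0.43   # share of revenue those visitors generate


def revenue_per_visitor_ratio(audience_share: float, revenue_share: float) -> float:
    """Revenue per searching visitor divided by revenue per non-searching visitor."""
    searchers = revenue_share / audience_share
    others = (1 - revenue_share) / (1 - audience_share)
    return searchers / others


ratio = revenue_per_visitor_ratio(SEARCH_AUDIENCE_SHARE, SEARCH_REVENUE_SHARE)
print(f"{ratio:.1f}x")  # prints "5.5x"
```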

The basic minimum for modern “smart” search.

 

Will Bitrix cope with such a task? By default – no.

How does Bitrix’s built-in search work?

The standard search module in “1C-Bitrix: Site Management” does its job well when you just need to find something without complex context or conditions. Using the exact wording of the query, it searches the full text, headings, fields and tags in various sections of the site (catalog, news, knowledge base, blog, about the company, etc.). Such full-text search is quite suitable if the site has a small assortment, does not require filtering, does not need to include promotional items in the results and does not need to perform any complex conversion actions.

However, if you need to run a product aggregator or an online store with a large catalog, you will have to tweak the search. For example, limit it to the information blocks with products and SKUs, so that Bitrix does not also return a hodgepodge of news feeds, articles and “useful” information from service pages. Otherwise, it is easier and more convenient for the client to go back to Yandex results than to keep wasting time on your site.

 

Why does this happen?

First, the standard Bitrix search splits a sentence into parts and discards prepositions, conjunctions, and particles. Then it converts all words to their initial form and saves them in database tables. One table stores all initial forms of words, another stores all texts, and a third links each word with the texts that contain it. N-grams are not built, so Bitrix cannot correct errors or search by part of a word or article number, but it can distinguish word endings. There is a dedicated setting for this: whether or not to take morphology into account. If morphology is on, the search is by word forms; if not, by exact match.
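The three-table scheme described above can be sketched in a few lines. This is a simplified illustration, not Bitrix code: the class, the tiny stop-word list and the "strip a trailing s" normalizer are assumptions standing in for the real morphology dictionaries.

```python
import re
from collections import defaultdict

# Assumption: a tiny stop-word list instead of real language dictionaries.
STOP_WORDS = {"a", "an", "the", "for", "and", "or", "of", "in", "on", "with"}


def normalize(word):
    """Crude stand-in for morphology: strip a trailing plural 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


class WordIndex:
    """Hypothetical model of the word / text / link tables."""

    def __init__(self, use_morphology=True):
        self.use_morphology = use_morphology
        self.stems = {}                # table 1: initial form -> word id
        self.texts = {}                # table 2: text id -> original text
        self.links = defaultdict(set)  # table 3: word id -> ids of texts containing it

    def tokenize(self, text):
        words = re.findall(r"[a-z0-9]+", text.lower())
        words = [w for w in words if w not in STOP_WORDS]
        return [normalize(w) for w in words] if self.use_morphology else words

    def add(self, text_id, text):
        self.texts[text_id] = text
        for token in self.tokenize(text):
            word_id = self.stems.setdefault(token, len(self.stems))
            self.links[word_id].add(text_id)

    def search(self, query):
        """Return ids of texts that contain every word of the query."""
        result = None
        for token in self.tokenize(query):
            word_id = self.stems.get(token)
            ids = self.links[word_id] if word_id is not None else set()
            result = ids if result is None else result & ids
        return sorted(result or [])


DOCS = {1: "Telescopes for beginners", 2: "A telescope with a tripod", 3: "Tripods and mounts"}

index = WordIndex(use_morphology=True)
exact = WordIndex(use_morphology=False)
for doc_id, doc in DOCS.items():
    index.add(doc_id, doc)
    exact.add(doc_id, doc)

print(index.search("telescope"))  # [1, 2]: word forms match
print(exact.search("telescope"))  # [2]: exact match only
print(index.search("teliskp"))    # []: a typo finds nothing
print(index.search("tele"))       # []: no search by part of a word
```

Because only whole normalized words are stored, a misspelled or partial query has nothing to match against, which is the limitation the paragraph above describes.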

Since all word forms can be included in the response to the query, there will be many matches with the texts in the database, and the results will contain many similar items. This is a drawback for those who want quick and accurate answers.

There are other nuances that can hurt the conversion of a site with the default search. For example, a query with a spelling error or a wrong digit in the product code returns zero results, and the conversion from search to a product card view is zero as well.

To offset the shortcomings of full-text search, it is supplemented with faceted filtering, but that adds a few extra steps on the way to a purchase. Meanwhile, visitors who search directly on the site convert better than the rest of the audience: they already know what they want to buy.
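For contrast, here is a minimal sketch of what n-grams make possible: matching a misspelled query to the closest known word by shared character trigrams. It is an illustration of the general technique, not the algorithm of any particular engine. The vocabulary, the threshold, and the guess that the typos from the introduction stand for "telescope" and "paracetamol" are assumptions.

```python
def trigrams(word):
    """Set of 3-character substrings of the word, padded with one space on each side."""
    padded = f" {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard similarity of two words' trigram sets (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def best_match(query, vocabulary, threshold=0.05):
    """Return the most similar vocabulary word, or None if nothing is close enough."""
    score, word = max((similarity(query, w), w) for w in vocabulary)
    return word if score >= threshold else None


# Hypothetical catalog vocabulary.
VOCABULARY = ["telescope", "microscope", "paracetamol"]

print(best_match("teliskp", VOCABULARY))
print(best_match("porocytomol", VOCABULARY))
```

With this vocabulary the two typos resolve to "telescope" and "paracetamol", while a query that shares no trigrams with any word returns None. A real engine would combine such a score with edit distance and ranking rather than rely on it alone.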

How to boost Bitrix search

Since version 14.0.0, 1C-Bitrix products have supported the Sphinx full-text search engine. It searches faster and better, reduces the load on the server, and is fully integrated with the components of the Search module. Unfortunately, the current version 3.x.x is still not supported by the Bitrix core.

Pros of Sphinx:

Fast indexing makes Sphinx ideal for projects with low requirements for search functionality. But what if the project has grown, you want to scale, or you need flexible management of search results?

Migrating search to Elasticsearch

It usually comes up when the site’s current search engine is unsatisfactory for some reason. Combined with the developer’s other products (the so-called Elastic Stack, or ELK), this tool provides even more options for managing search and presenting data.

Elasticsearch (ES) is a non-relational document store with its own REST API that works with JSON data.
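To give a feel for the REST and JSON interface, here is a sketch of a search request with typo tolerance. The `multi_match` query and its `fuzziness` option are part of the Elasticsearch query DSL; the host, the index name `products` and the field names `name` and `description` are assumptions for illustration. The code only builds the request and does not send it.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, text):
    """Build (but do not send) a POST /<index>/_search request."""
    body = {
        "query": {
            "multi_match": {
                "query": text,
                # Hypothetical fields; "^3" boosts matches in the product name.
                "fields": ["name^3", "description"],
                # Let ES pick the allowed edit distance from the term length.
                "fuzziness": "AUTO",
            }
        },
        "size": 10,
    }
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:9200", "products", "teliskp")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running cluster; the response is also JSON, with matching documents under `hits.hits`.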

 

Elasticsearch is a consistent leader in the DB-Engines rankings.

There are many differences between Elasticsearch and Sphinx. Here are the most important ones.

Scalability. Like Sphinx, ES offers a high degree of horizontal scalability, allowing you to distribute and replicate data across multiple nodes and shards. However, Sphinx requires manual management of the index structure, while Elasticsearch allows you to add new nodes to an existing system on the fly and automatically distributes the load across them.

 

Elasticsearch for Search

Elasticsearch uses indices to organize and store data. An index is a logical store in which documents of the same type are organized. Each index consists of one or more shards (parts), which allows data to be distributed across different cluster nodes to ensure fault tolerance and scalability.
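As a rough sketch, the shard layout is set in the body of the index creation request (for example `PUT /products`, where the index name is an assumption). `number_of_shards` and `number_of_replicas` are real index settings; the helper functions are made up for illustration.

```python
import json


def index_settings(primary_shards, replicas):
    """Request body that sets the shard layout of a new index."""
    return {
        "settings": {
            "index": {
                # Fixed when the index is created.
                "number_of_shards": primary_shards,
                # Copies of each primary shard; can be changed later.
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primary_shards, replicas):
    """Primaries plus replicas: how many shard copies the cluster has to place on nodes."""
    return primary_shards * (1 + replicas)


settings_body = index_settings(primary_shards=3, replicas=1)
print(json.dumps(settings_body, indent=2))
print(total_shard_copies(3, 1))  # 6
```

With three primaries and one replica each, the cluster has six shard copies to place, and a replica is never allocated to the same node as its primary, which is where the fault tolerance comes from.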

Data cannot simply be pulled from the site database “as is” into ES indices. This “raw” data needs to be indexed. An index is created through the system’s own API, which needs to be called from somewhere: from event handlers in Bitrix, from an agent that periodically updates the data, or from a queue server. Kibana from the ELK stack is very convenient to work with.
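Whichever component makes the call (an event handler, an agent or a queue worker), it ends up sending documents to the API. Below is a sketch of preparing a payload for the Elasticsearch `_bulk` endpoint, which takes newline-delimited JSON: an action line followed by the document itself. The product fields and the index name are assumptions. The code only builds the body and does not contact a cluster.

```python
import json


def build_bulk_body(index, products):
    """Build an NDJSON body for POST /_bulk that indexes each product by its id."""
    lines = []
    for product in products:
        action = {"index": {"_index": index, "_id": str(product["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(product, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, as they might be read from the site database.
PRODUCTS = [
    {"id": 101, "name": "Telescope 70/700", "section": "Optical devices"},
    {"id": 102, "name": "Paracetamol 500 mg", "section": "Pharmaceuticals"},
]

bulk_body = build_bulk_body("products", PRODUCTS)
print(bulk_body)
```

Such a body is sent with the `Content-Type: application/x-ndjson` header. Reusing the product id as `_id` means a repeated run overwrites documents instead of duplicating them, which suits a periodic agent.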

Below is an example of an index with manual field mapping. The index can be created using the REST API or a ready-made library for working with it. Mapping can also be dynamic, but it usually cannot be fully trusted. The best result is obtained by combining dynamic and explicit mapping; you can create your own tricky field mapping rules.
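As one possible illustration of combining the two approaches, the sketch below declares a few fields explicitly and adds a dynamic template for everything under a `props` object. The field names are assumptions, and the typeless mapping format assumes Elasticsearch 7 or later; the field types and `dynamic_templates` themselves are standard mapping features.

```python
import json

# Body for creating an index, e.g. PUT /products (index name is an assumption).
index_mapping = {
    "mappings": {
        # Unknown fields are still accepted and mapped dynamically.
        "dynamic": True,
        "dynamic_templates": [
            {
                # Custom rule: any string under "props" is stored as an exact-match keyword.
                "props_as_keywords": {
                    "path_match": "props.*",
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            }
        ],
        # Explicit mapping for the fields that matter to search and filtering.
        "properties": {
            "name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
            "article": {"type": "keyword"},
            "price": {"type": "scaled_float", "scaling_factor": 100},
            "section_id": {"type": "integer"},
            "in_stock": {"type": "boolean"},
        },
    }
}

mapping_json = json.dumps(index_mapping, indent=2)
print(mapping_json)
```

The explicit part keeps control over fields where a wrong guess is costly, such as an article number that dynamic mapping might otherwise analyze as free text, while the template covers product properties that are not known in advance.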