Faceted Search with Elasticsearch for Large Catalogs
Author: WebGoodPeople
The Problem at Scale
Standard 1C-Bitrix search works acceptably up to 300–500 SKUs. Beyond that threshold, systemic problems emerge that cannot be fixed through configuration — only by replacing the search subsystem entirely.
Specific symptoms:
- Searching by a full phrase finds the product, but a partial-word search does not.
- There is no synonym support: "rebar" and "rod" are unrelated in the index.
- Filters are built through SQL LIKE queries against iblock tables. With 20,000 products and 8 filters active simultaneously, response time exceeds 800ms, and under high load the database hits deadlocks.
- Relevance ranking is not configurable: all matches are treated equally, and a product's position in the results is unpredictable.
The zero-results rate on the UralMetall project before optimization was 22% of all search queries.
Why Elasticsearch Fits
Elasticsearch is built on an inverted index — a data structure designed from the ground up for full-text search. This provides several fundamental advantages over a relational database.
- Aggregations for facets. ES counts products for each attribute value in a single query, without additional JOINs. This is exactly how the counters next to each filter value work ("Diameter: 12mm (47)").
- Configurable relevance. You can specify that a match in a product title carries 3× the weight of a match in the description. You can boost in-stock products in results.
- Synonyms and stemming. A synonym dictionary attaches at the analyzer level: "angle iron" and "angle profile" become interchangeable without modifying product data.
- Autocomplete. A `completion` field type delivers search suggestions as the user types, in 5–10ms.
- Zero-downtime reindexing. A new index builds in parallel with the live one, then an alias switches atomically, so users see no interruption.
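As a rough illustration of the synonym, autocomplete, and alias points above, the sketch below builds the request bodies as plain Python dicts; it does not talk to a cluster. The filter name `catalog_synonyms`, the analyzer name `catalog_search`, the `suggest` field, and the index and alias names are all hypothetical and not taken from the project. The first body would be sent when creating the index, the second to the `_aliases` endpoint.

```python
def build_index_body(synonyms):
    """Index settings and mappings: a search-time synonym analyzer plus a completion field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; each entry is a comma-separated synonym group
                    "catalog_synonyms": {"type": "synonym_graph", "synonyms": synonyms}
                },
                "analyzer": {
                    # Hypothetical analyzer name, applied to the query text only
                    "catalog_search": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "catalog_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "standard",
                    "search_analyzer": "catalog_search",
                },
                # Field that backs the autocomplete suggestions
                "suggest": {"type": "completion"},
            }
        },
    }


def build_alias_swap(alias, old_index, new_index):
    """Body for the _aliases endpoint: both actions are applied as one atomic step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

Because synonyms are attached to the search analyzer in this sketch, changing the dictionary does not require rewriting product documents, which matches the point made in the list above.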
Architecture: Bitrix as the Data Source
Bitrix remains the single source of truth for products — we do not duplicate catalog management logic. The sync flow:
- When a product is saved or a price changes in Bitrix, an `OnAfterIBlockElementUpdate` event handler fires and queues the product ID.
- A Bitrix agent (or external cron) processes the queue every 2 minutes: it fetches product data via `CIBlockElement::GetByID`, builds a JSON document, and sends it to ES via the bulk API.
- Next.js queries Elasticsearch directly for catalog pages, search, and filtering, without calling Bitrix.
- Add-to-cart, checkout, and authentication remain on thin Bitrix PHP endpoints.
This architecture means the lag between a product change in Bitrix and its appearance on the site is at most 2–3 minutes — acceptable for a B2B steel distribution catalog.
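The queue-processing step in the flow above can be sketched roughly as follows. This is an illustration in Python rather than the project's Bitrix agent code; the function names, the index name, and the product fields are assumptions. It shows only the two pure steps: collapsing repeated product IDs in the queue, and building the newline-delimited body that the bulk API expects (one action line followed by one document line per product).

```python
import json


def dedupe_queue(product_ids):
    """Collapse repeated IDs (a product saved several times) while keeping first-seen order."""
    return list(dict.fromkeys(product_ids))


def build_bulk_payload(index, products):
    """Build an NDJSON body for the bulk API from a list of product dicts.

    Each product contributes an action line and a source line.
    Every line, including the last one, ends with a newline.
    """
    lines = []
    for product in products:
        action = {"index": {"_index": index, "_id": str(product["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(product, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)
```

Using the product ID as the document `_id` makes the sync idempotent: reprocessing the same queue entry overwrites the existing document instead of creating a duplicate.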
Key Technical Techniques
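Filter-aware aggregations. This is the core complexity of faceted search: when a user selects the "Diameter: 12mm" filter, the counts for all other filters must update to reflect that selection, while the Diameter facet itself must keep showing its other values so the user can switch or extend the selection. In ES this is implemented via post_filter combined with separate per-attribute aggregations, each wrapped in its own filter that applies every selected filter except the one for that attribute. The implementation is non-trivial, but it produces correct counts in a single request.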
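A minimal sketch of that request shape, built as a Python dict. The function names are hypothetical, and the sketch assumes each facet is a keyword field queried with a `terms` clause; the project's actual field names and aggregation sizes are not given in this article.

```python
def term_filters(selected, skip=None):
    """One `terms` clause per selected attribute, optionally leaving one attribute out."""
    return [
        {"terms": {field: values}}
        for field, values in selected.items()
        if field != skip and values
    ]


def as_filter(clauses):
    """Wrap clauses in a bool filter; with no clauses, match everything."""
    return {"bool": {"filter": clauses}} if clauses else {"match_all": {}}


def build_faceted_query(text_query, facet_fields, selected):
    """Search body whose facet counts respect every selected filter except the facet's own."""
    aggs = {}
    for field in facet_fields:
        aggs[field] = {
            # Apply the other facets' selections, but not this facet's own
            "filter": as_filter(term_filters(selected, skip=field)),
            "aggs": {"values": {"terms": {"field": field, "size": 50}}},
        }
    return {
        "query": text_query,
        # post_filter narrows the hits without affecting the aggregations
        "post_filter": as_filter(term_filters(selected)),
        "aggs": aggs,
    }
```

For example, `build_faceted_query({"match_all": {}}, ["diameter", "grade"], {"diameter": ["12mm"]})` filters the hits and the `grade` counts by diameter, while the `diameter` aggregation stays unfiltered so its other values remain visible with their counts.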
Field boosting. We set field weights at query time, in the fields list of a multi_match query:
{
  "query": {
    "multi_match": {
      "query": "rebar 12",
      "fields": ["title^3", "article^2", "description", "tags^1.5"]
    }
  }
}
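The same field weights can be combined with the fuzzy matching and in-stock boosting mentioned elsewhere in this article. The sketch below is one possible way to do that, not necessarily how the project does it: the boolean field `in_stock` and the boost value of 2.0 are assumptions made for illustration.

```python
def build_search_query(text, in_stock_boost=2.0):
    """Weighted multi_match with fuzzy matching, plus an optional in-stock boost."""
    return {
        "bool": {
            # Documents must match the text; the field weights shape the score
            "must": [
                {
                    "multi_match": {
                        "query": text,
                        "fields": ["title^3", "article^2", "description", "tags^1.5"],
                        "fuzziness": "AUTO",
                    }
                }
            ],
            # Out-of-stock products still match, but in-stock ones score higher.
            # `in_stock` is a hypothetical boolean field.
            "should": [
                {"term": {"in_stock": {"value": True, "boost": in_stock_boost}}}
            ],
        }
    }
```

Putting the stock condition in `should` rather than `filter` keeps out-of-stock items in the results, which is usually what a B2B buyer wants when checking whether a product exists at all.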
Nested aggregations for hierarchical categories. Subcategories are counted via nested aggregation — one query returns the full category tree with product counts at each level.
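One way to read this is a `terms` aggregation with a second `terms` aggregation inside it, which is what the sketch below assumes. It is not the Elasticsearch `nested` aggregation type, which applies to fields mapped as nested objects. The two-level layout and the keyword fields `category_l1` and `category_l2` are hypothetical.

```python
def build_category_aggs(size=100):
    """Two-level category counts: top-level categories, each with its subcategories."""
    return {
        "categories": {
            "terms": {"field": "category_l1", "size": size},
            "aggs": {
                "subcategories": {"terms": {"field": "category_l2", "size": size}}
            },
        }
    }


def parse_category_tree(aggregations):
    """Turn the `aggregations` part of a search response into a simple tree."""
    tree = []
    for top in aggregations["categories"]["buckets"]:
        tree.append(
            {
                "name": top["key"],
                "count": top["doc_count"],
                "children": [
                    {"name": sub["key"], "count": sub["doc_count"]}
                    for sub in top["subcategories"]["buckets"]
                ],
            }
        )
    return tree
```

The parser expects the standard bucket layout of a terms aggregation response, where each bucket carries `key`, `doc_count`, and the results of its sub-aggregations under their own names.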
Real Numbers: The UralMetall Project
After replacing Bitrix search with Elasticsearch:
- Search response time: from 800ms to 45ms (median from production logs)
- Simultaneous active filter combinations: from 3 to 47 — now technically feasible without performance degradation
- Zero-results rate: from 22% to 4%, thanks to synonyms and fuzzy search (`fuzziness: AUTO`)
- Search-to-product-page conversion: +18%, as users find what they need on the first attempt more often
When You Do Not Need Elasticsearch
ES is additional infrastructure: a separate service, indexes, synchronization. It makes sense for catalogs of 1,000–2,000+ SKUs with complex attributes and significant search load.
If your catalog has fewer than 300 SKUs with 3–4 simple attributes and moderate traffic, standard Bitrix search is fine. Do not add complexity without a reason.
Want to check whether your project needs Elasticsearch? As part of the Catalog Probe, we deploy a proof-of-concept on your real catalog data in 3 days and show the difference in search speed and quality.