Faceted Search with Elasticsearch for Large Catalogs

Author: WebGoodPeople

The Problem at Scale

Standard 1C-Bitrix search works acceptably up to 300–500 SKUs. Beyond that threshold, systemic problems emerge that cannot be fixed through configuration — only by replacing the search subsystem entirely.

Specific symptoms:

  • Searching by a full phrase finds the product, but a partial word search does not.
  • No synonym support: "rebar" and "rod" are unrelated in the index.
  • Filters are built through SQL LIKE queries against iblock tables. With 20,000 products and 8 filters active simultaneously, response time exceeds 800ms, and under high load the database hits deadlocks.
  • Relevance ranking is not configurable: all matches are treated equally, and product position in results is unpredictable.

The zero-results rate on the UralMetall project before optimization was 22% of all search queries.

Why Elasticsearch Fits

Elasticsearch is built on an inverted index — a data structure designed from the ground up for full-text search. This provides several fundamental advantages over a relational database.

  • Aggregations for facets. ES counts products for each attribute value in a single query, without additional JOINs. This is exactly how the counters next to each filter value work ("Diameter: 12mm (47)").
  • Configurable relevance. You can specify that a match in a product title carries 3× the weight of a match in the description. You can boost in-stock products in results.
  • Synonyms and stemming. A synonym dictionary attaches at the analyzer level: "angle iron" and "angle profile" become interchangeable without modifying product data.
  • Autocomplete. A completion field type delivers search suggestions as the user types, in 5–10ms.
  • Zero-downtime reindexing. A new index builds in parallel with the live one, then an alias switches atomically — users see no interruption.
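As a minimal sketch of two of the points above (synonyms at the analyzer level and the atomic alias switch), the request bodies could be built roughly like this. TypeScript is used only because the storefront is Next.js. The index names, the "catalog" alias, the analyzer and filter names, and the title and suggest fields are all hypothetical and not taken from the project.

```typescript
// All names below (catalog_synonyms, catalog_search, title, suggest) are
// illustrative assumptions, not the project's actual configuration.

type AliasAction =
  | { remove: { index: string; alias: string } }
  | { add: { index: string; alias: string } };

// Body for index creation: a search-time analyzer with a synonym_graph
// filter. Each entry in `synonyms` uses the comma-separated format,
// e.g. "angle iron, angle profile".
function buildIndexSettings(synonyms: string[]) {
  return {
    settings: {
      analysis: {
        filter: {
          catalog_synonyms: { type: "synonym_graph", synonyms },
        },
        analyzer: {
          catalog_search: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "catalog_synonyms"],
          },
        },
      },
    },
    mappings: {
      properties: {
        // Synonyms are expanded at search time only; indexing stays standard.
        title: {
          type: "text",
          analyzer: "standard",
          search_analyzer: "catalog_search",
        },
        // Completion field for search-as-you-type suggestions.
        suggest: { type: "completion" },
      },
    },
  };
}

// Body for POST /_aliases: both actions are applied in one atomic step,
// so readers of the alias never see a missing or half-switched index.
function buildAliasSwap(
  alias: string,
  oldIndex: string,
  newIndex: string
): { actions: AliasAction[] } {
  return {
    actions: [
      { remove: { index: oldIndex, alias } },
      { add: { index: newIndex, alias } },
    ],
  };
}
```

Because synonyms live in the search analyzer here, changing the dictionary does not require touching product data, which matches the point made in the list.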

Architecture: Bitrix as the Data Source

Bitrix remains the single source of truth for products — we do not duplicate catalog management logic. The sync flow:

  • When a product is saved or a price changes in Bitrix, an OnAfterIBlockElementUpdate event handler fires and queues the product ID.
  • A Bitrix agent (or external cron) processes the queue every 2 minutes: it fetches product data via CIBlockElement::GetByID, builds a JSON document, and sends it to ES via the bulk API.
  • Next.js queries Elasticsearch directly for catalog pages, search, and filtering — without calling Bitrix.
  • Add-to-cart, checkout, and authentication remain on thin Bitrix PHP endpoints.
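The queue-and-bulk step can be illustrated with a short sketch. In the flow described above the worker is a Bitrix agent, so it would be written in PHP; TypeScript is used here only to show the shape of the payload. The ProductDoc fields, the function names, and the batching helper are assumptions made for illustration.

```typescript
// Hypothetical document shape; the real field set depends on the catalog.
interface ProductDoc {
  id: number;
  title: string;
  article: string;
  price: number;
  in_stock: boolean;
}

// Takes up to `batchSize` distinct product IDs from the queue and removes
// every occurrence of them, so a product saved several times between two
// runs is indexed once.
function takeBatch(queue: number[], batchSize: number): number[] {
  const unique = queue.filter((id, i) => queue.indexOf(id) === i);
  const batch = unique.slice(0, batchSize);
  const rest = queue.filter((id) => batch.indexOf(id) === -1);
  queue.length = 0;
  queue.push(...rest);
  return batch;
}

// Builds the newline-delimited body the _bulk API expects: an action line
// followed by a document line for each product, ending with a newline.
function toBulkBody(index: string, docs: ProductDoc[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: index, _id: String(doc.id) } }));
    lines.push(JSON.stringify(doc));
  }
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

Using the product ID as the document _id makes repeated indexing of the same product an overwrite rather than a duplicate.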

This architecture means the lag between a product change in Bitrix and its appearance on the site is at most 2–3 minutes — acceptable for a B2B steel distribution catalog.

Key Technical Techniques

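Filter-aware aggregations. This is the core complexity of faceted search: when a user selects the "Diameter: 12mm" filter, the counts for all other filters must update to reflect that selection, while the counts under Diameter itself must ignore it so the user can still see and switch to other diameters. In ES this is implemented via post_filter combined with separate per-attribute aggregations, each wrapped in a filter aggregation that applies every selected filter except the one on its own attribute. The implementation is non-trivial, but it returns the hits and all correct counters in a single request.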
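A minimal sketch of such a request builder, assuming every facet attribute is indexed as a keyword field under its own name. The function names, the attribute names, the bucket size of 50, and the reduced field list are hypothetical.

```typescript
// Selected values per attribute, e.g. { diameter: ["12"] }.
type Selection = Record<string, string[]>;

interface FacetAgg {
  filter: { bool: { filter: object[] } };
  aggs: { values: { terms: { field: string; size: number } } };
}

// One terms clause per attribute with selected values, optionally
// skipping the attribute named in `except`.
function selectionClauses(selected: Selection, except?: string): object[] {
  const clauses: object[] = [];
  for (const attr of Object.keys(selected)) {
    const values = selected[attr] || [];
    if (attr !== except && values.length > 0) {
      clauses.push({ terms: { [attr]: values } });
    }
  }
  return clauses;
}

function buildFacetedSearch(text: string, facets: string[], selected: Selection) {
  const aggs: Record<string, FacetAgg> = {};
  for (const attr of facets) {
    aggs[attr] = {
      // Counts for this facet respect all selections except its own.
      filter: { bool: { filter: selectionClauses(selected, attr) } },
      aggs: { values: { terms: { field: attr, size: 50 } } },
    };
  }
  return {
    // The main query holds only the text match, so aggregations see the
    // full text-matched set before any facet selection is applied.
    query: { multi_match: { query: text, fields: ["title^3", "description"] } },
    // post_filter narrows the returned hits after aggregations are computed.
    post_filter: { bool: { filter: selectionClauses(selected) } },
    aggs,
  };
}
```

For example, buildFacetedSearch("rebar", ["diameter", "grade"], { diameter: ["12"] }) filters the grade counts by diameter 12 while leaving the diameter counts unfiltered, so the other diameters stay visible with their counts.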

Field boosting. Field weights are set at query time, in the fields list of the multi_match query:

{
  "query": {
    "multi_match": {
      "query": "rebar 12",
      "fields": ["title^3", "article^2", "description", "tags^1.5"]
    }
  }
}
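The same query can be extended with the fuzzy matching and in-stock boosting mentioned elsewhere in this article. The sketch below is one possible combination, not the project's actual query: the in_stock field name, the boost value of 2, and the function name are assumptions.

```typescript
// Builds a search body that weights fields as in the snippet above,
// tolerates typos, and ranks in-stock products higher.
function buildSearchQuery(text: string) {
  return {
    query: {
      bool: {
        // Every hit must match the text in at least one weighted field.
        must: [
          {
            multi_match: {
              query: text,
              fields: ["title^3", "article^2", "description", "tags^1.5"],
              fuzziness: "AUTO",
            },
          },
        ],
        // Optional clause: does not filter anything out, only adds score
        // to products that are in stock (hypothetical boolean field).
        should: [{ term: { in_stock: { value: true, boost: 2 } } }],
      },
    },
  };
}
```

Putting the stock condition in should rather than filter keeps out-of-stock products findable while pushing them down the list.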

Nested aggregations for hierarchical categories. Subcategories are counted by nesting one aggregation inside another, one per category level, so a single query returns the full category tree with product counts at each level.
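A minimal sketch for a two-level tree, assuming each product document stores its category path in two keyword fields, here called category_l1 and category_l2. These field names, the aggregation names, and the bucket sizes are hypothetical; a deeper tree would add one more level of sub-aggregation per category level.

```typescript
// Request body: a terms aggregation per category level, the second
// nested inside the first. size: 0 skips hits and returns counts only.
const categoryTreeRequest = {
  size: 0,
  aggs: {
    level1: {
      terms: { field: "category_l1", size: 100 },
      aggs: {
        level2: { terms: { field: "category_l2", size: 100 } },
      },
    },
  },
};

// Shape of one terms bucket in the response, limited to what is used here.
interface CategoryBucket {
  key: string;
  doc_count: number;
  level2?: { buckets: CategoryBucket[] };
}

interface CategoryNode {
  name: string;
  count: number;
  children: CategoryNode[];
}

// Converts response buckets into a plain tree for rendering the menu.
function toCategoryTree(buckets: CategoryBucket[]): CategoryNode[] {
  return buckets.map((bucket) => ({
    name: bucket.key,
    count: bucket.doc_count,
    children: bucket.level2 ? toCategoryTree(bucket.level2.buckets) : [],
  }));
}
```

The buckets passed to toCategoryTree would come from aggregations.level1.buckets in the search response.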

Real Numbers: The UralMetall Project

After replacing Bitrix search with Elasticsearch:

  • Search response time: from 800ms to 45ms (median from production logs)
  • Simultaneous active filter combinations: from 3 to 47 — now technically feasible without performance degradation
  • Zero-results rate: from 22% to 4% — thanks to synonyms and fuzzy search (fuzziness: AUTO)
  • Search-to-product-page conversion: +18% — users find what they need on the first attempt more often

When You Do Not Need Elasticsearch

ES is additional infrastructure: a separate service, indexes, synchronization. It makes sense for catalogs of 1,000–2,000+ SKUs with complex attributes and significant search load.

If your catalog has fewer than 300 SKUs with 3–4 simple attributes and moderate traffic, standard Bitrix search is fine. Do not add complexity without a reason.

Want to check whether your project needs Elasticsearch? As part of the Catalog Probe, we deploy a proof-of-concept on your real catalog data in 3 days and show the difference in search speed and quality.
