{"id":42063,"date":"2024-06-27T21:35:17","date_gmt":"2024-06-27T21:35:17","guid":{"rendered":"https:\/\/www.bloomreach.com\/?post_type=library&#038;p=42063"},"modified":"2024-11-20T01:20:54","modified_gmt":"2024-11-20T01:20:54","slug":"the-power-of-hybrid-vector-search-in-ecommerce","status":"publish","type":"library","link":"https:\/\/www.bloomreach.com\/en\/blog\/the-power-of-hybrid-vector-search-in-ecommerce","title":{"rendered":"Understanding the Power of Hybrid Vector Search in Ecommerce"},"content":{"rendered":"\n<p>In the realm of ecommerce, site search solutions often cater to two distinct types of shoppers: the &#8220;searcher,&#8221; who knows exactly what they want, and the &#8220;seeker,&#8221; who browses to find what they need. Historically, optimizing for one type of shopper has often come at the expense of the other.<br><br>However, recent <a href=\"https:\/\/www.bloomreach.com\/en\/blog\/how-ai-is-revolutionizing-ecommerce-at-an-unprecedented-pace\">advancements in artificial intelligence<\/a>, in particular language models, are enabling us to strike a perfect balance, meeting the needs of every consumer. This post explores the evolution of ecommerce search and how Bloomreach is leveraging cutting-edge technology to enhance search functionality.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-precision-vs-recall-in-search\">Precision vs. Recall in Search<\/h2>\n\n\n\n<p>Ecommerce search engines typically focus on either precision or recall breadth. If a brand prioritizes precision, the search engine will only show results that exactly match the search query in the product corpus, often at the expense of a larger results set. For instance, if someone searches for a \u201cred leather jacket,\u201d a precision-focused search won\u2019t show any results that don\u2019t include the combination of \u201cred,\u201d \u201cleather,\u201d and \u201cjacket\u201d in the description.<\/p>\n\n\n\n<p>Precise search also encounters issues with long-tail queries. 
If you search for \u201cred leather jacket for an outdoor event,\u201d you\u2019ll likely return very few (or even zero) results. This is due to not finding exact matches for all the terms in the product corpus (which can be just a title + a short description of a product).&nbsp; Even with algorithms like query relaxation, this leads to a non-graceful degradation of precision versus recall breadth.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Semantic-Search-1024x683.jpg\" alt=\"Chart Displaying Search for Red Leather Jacket Using Semantic\/Keyword Search\" class=\"wp-image-42068\" srcset=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Semantic-Search-1024x683.jpg 1024w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Semantic-Search-300x200.jpg 300w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Semantic-Search-768x512.jpg 768w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Semantic-Search.jpg 1470w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Conversely, if a brand prioritizes recall breadth using technology like embedding-powered vector search, you\u2019ll have more results to show, but there may be more noise in the recall set with irrelevant products. 
So, with the same \u201cred leather jacket\u201d example, this may return results for jackets that are not red, or not the right material, or even items that aren\u2019t exactly jackets but similar to jackets (like a shirt).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Vector-Search-1024x683.jpg\" alt=\"Chart Displaying Search for Red Leather Jacket Using Vector Search\" class=\"wp-image-42071\" srcset=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Vector-Search-1024x683.jpg 1024w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Vector-Search-300x200.jpg 300w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Vector-Search-768x512.jpg 768w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Chart-Displaying-Search-for-Red-Jacket-Using-Vector-Search.jpg 1470w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Neither of these search options is ideal, as they each heavily favor one type of shopper over another. Instead, the \u201choly grail\u201d for search engines is to excel in both precision and recall. 
You need to be precise but also intuitive enough to understand what people are looking for, even if they didn\u2019t specifically search for it.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Hybrid-Search-Using-Vector-and-Semantic-Diagramjpg-1024x683.jpg\" alt=\"Hybrid Search With Semantic\/Keyword and Vector Search - Diagram\" class=\"wp-image-42077\" srcset=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Hybrid-Search-Using-Vector-and-Semantic-Diagramjpg-1024x683.jpg 1024w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Hybrid-Search-Using-Vector-and-Semantic-Diagramjpg-300x200.jpg 300w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Hybrid-Search-Using-Vector-and-Semantic-Diagramjpg-768x512.jpg 768w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Hybrid-Search-Using-Vector-and-Semantic-Diagramjpg.jpg 1470w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-where-does-bloomreach-rank-nbsp\">Where Does Bloomreach Rank?&nbsp;<\/h3>\n\n\n\n<p>At Bloomreach, our search engine has tried to balance this through a combination of precision and recall methods. Our <a href=\"https:\/\/www.bloomreach.com\/en\/blog\/best-semantic-search-engine\">semantic understanding<\/a> capabilities allowed us to parse search queries and deliver relevant results. When the recall set was too low, we would turn to algorithms like <a href=\"https:\/\/documentation.bloomreach.com\/discovery\/docs\/query-relaxation\" target=\"_blank\" rel=\"noopener\">query relaxation<\/a> to show more results.<\/p>\n\n\n\n<p>For example, with the \u201cred leather jacket\u201d query, our semantic search can break down the terms to understand that \u201cred\u201d is the attribute, \u201cleather\u201d is the material, and \u201cjacket\u201d is the object. 
There may also be synonym rules in place, so that \u201ccrimson leather jackets\u201d show up as well.<\/p>\n\n\n\n<p>However, once we turn to query relaxation, the results may contain more noise. This is because query relaxation relies on heuristic rules, such as dropping words until the query matches the corpus. If \u201cred leather jacket\u201d doesn\u2019t produce results, for example, the search engine would simply drop \u201cred\u201d and show \u201cleather jackets\u201d instead.<\/p>\n\n\n\n<p>One problem with this approach is that if there were a similar product (e.g., a \u201cburgundy leather jacket\u201d) that didn\u2019t have a synonym rule attached to it, it might end up appearing on page three of the search results after query relaxation. We knew we needed a more sophisticated approach to search, which is where Google Vertex AI and hybrid vector search come into play.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-hybrid-vector-search-is-the-answer\">Why Hybrid Vector Search Is the Answer<\/h2>\n\n\n\n<p>Thanks to our <a href=\"https:\/\/www.bloomreach.com\/en\/news\/2024\/bloomreach-amplifies-the-power-of-its-e-commerce-search-and-merchandising-with-google-cloud-ai\">partnership with Google<\/a>, we can now leverage the power of their Vertex AI platform and Gemini language models to achieve the best of both worlds. We get a broad recall set capable of understanding both short and long queries, layered on top of our powerful semantic intelligence for highly precise and relevant results, without relying on older methods like query relaxation. Let\u2019s take a closer look at how it all works.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-hybrid-vector-search-explained\">Hybrid Vector Search, Explained<\/h3>\n\n\n\n<p>With a hybrid search engine, large language models (LLMs) do the heavy lifting, eliminating the need for predefined synonyms or query relaxation rules. 
Instead, the AI uses embedding models and vector search to perform similarity matching of the query with the product corpus. The embedding model is a neural network that has an inherent understanding of human concepts, pre-trained on text, and then fine-tuned by Bloomreach to work in the product domain. As a result, queries are matched to products more accurately at the level of human concepts.<\/p>\n\n\n\n<p>Returning to our \u201cred leather jacket\u201d example, the LLMs have an innate understanding of this query as a concept, recognizing that \u201cred\u201d and \u201cburgundy\u201d are similar, while \u201cjacket\u201d and \u201ccouch\u201d are not. Attributes are no longer binary (e.g., is \u201cred\u201d in the name\/description or is it not?). Instead, the neural network assigns a score between 0 and 1 that indicates the closeness of a match.<\/p>\n\n\n\n<p>This means \u201cred leather jacket\u201d and \u201cburgundy leather jacket\u201d might have a 0.98 match score, while a \u201cred leather couch\u201d might score 0.5, which falls below the threshold for our recall set and is therefore excluded.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"585\" src=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/How-Vector-Search-Works-1024x585.jpg\" alt=\"How Vector Search Works - Diagram\" class=\"wp-image-42074\" srcset=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/How-Vector-Search-Works-1024x585.jpg 1024w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/How-Vector-Search-Works-300x171.jpg 300w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/How-Vector-Search-Works-768x439.jpg 768w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/How-Vector-Search-Works.jpg 1470w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Diving deeper, the key differentiator in embedding language AI models is not needing heuristic 
algorithm-based NLP. In a sense, there are no \u201cif\u201d statements in the algorithm. The embedding is generated by the neural network through a series of matrix operations. Each part of a search query is mapped into a high-dimensional space, with each word considered in relation to every other word, and the result is expressed as a vector embedding. The resulting embedding is a numerical representation of the concept of \u201cred leather jacket.\u201d<\/p>\n\n\n\n<p>The end result <em>is that we have unlocked limitless possibilities<\/em> for delivering relevancy. This is especially useful for long-tail queries. For example, given a search for \u201cred leather jacket that\u2019s good for fall weather,\u201d the AI can better understand the query and match it with the right product in the catalog corpus, which may be a lighter or thinner red leather jacket whose description doesn\u2019t mention fall weather at all.<\/p>\n\n\n\n<p>We also recognize that embedding models, and matching by cosine similarity, have limitations. An embedding is a form of compression: it squeezes concepts into 768, 1024, or some higher number of vector dimensions, which may introduce precision errors in cosine similarity scores. Hybrid vector search addresses this by using classical lexical matching as an additional scoring signal alongside vector and embedding models, achieving the holy grail of both precision and recall breadth.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-the-bloomreach-advantage\">The Bloomreach Advantage<\/h3>\n\n\n\n<p>It\u2019s important to note that while we\u2019re not the only ones leveraging Google\u2019s technology, we have a distinct advantage at Bloomreach over anyone else using these models. 
That\u2019s because we\u2019ve been fine-tuning the models with over 15 years&#8217; worth of ecommerce-specific data, culminating in the launch of <a href=\"https:\/\/www.bloomreach.com\/en\/news\/2024\/bloomreach-offers-an-unprecedented-new-way-to-maximize-ecommerce-search-revenue-with-the-launch-of-loomi-search\">Loomi Search+<\/a>.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Merchandiser-Working-With-Loomi-Search-1024x683.jpg\" alt=\"Merchandiser Working With Loomi Search+\" class=\"wp-image-42080\" srcset=\"https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Merchandiser-Working-With-Loomi-Search-1024x683.jpg 1024w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Merchandiser-Working-With-Loomi-Search-300x200.jpg 300w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Merchandiser-Working-With-Loomi-Search-768x512.jpg 768w, https:\/\/www.bloomreach.com\/wp-content\/uploads\/2024\/06\/Merchandiser-Working-With-Loomi-Search.jpg 1470w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>When search engines only rely on vanilla embedding models, the model parameters might be optimized for generic document search. So, the AI might understand all the differences between \u201cairplanes\u201d and \u201cjets,\u201d but may not understand the similarities between products like \u201cred leather jacket\u201d vs. \u201cburgundy leather jacket.\u201d<\/p>\n\n\n\n<p>Loomi Search+ taps into our extensive ecommerce data, and we\u2019ve fine-tuned our LLMs to more accurately understand a wide range of commerce products. We have combined these fine-tuned models with classic lexical search as an added signal to give us the best precision. 
We\u2019re not just delivering hybrid vector search \u2014 we\u2019re delivering hybrid vector search that\u2019s highly optimized for commerce brands and their customers.<\/p>\n\n\n\n<p>I\u2019m very excited to see the \u201choly grail\u201d of search come to fruition. But this isn\u2019t the only exciting thing we\u2019re working on \u2014 be sure to check out all of Bloomreach Discovery\u2019s <a href=\"https:\/\/www.bloomreach.com\/en\/products\/discovery\/whats-new\">AI-driven innovations.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the realm of ecommerce, site search solutions often cater to two distinct types of shoppers: the &#8220;searcher,&#8221; who knows exactly what they want, and the &#8220;seeker,&#8221; who browses to find what they need. Historically, optimizing for one type of shopper has often come at the expense of the other. However, recent advancements in artificial [&hellip;]<\/p>\n","protected":false},"author":127,"featured_media":42064,"template":"","ew-regions":[],"ew-solutions":[],"library_type":[513],"library_blog_tag":[359,362,455],"industry":[],"channel":[278,276,277,333],"topic":[283,286,285,291],"class_list":["post-42063","library","type-library","status-publish","has-post-thumbnail","hentry","library_type-blog","library_blog_tag-executive-insights","library_blog_tag-ai-and-innovation","library_blog_tag-commerce-experience","channel-results-pages","channel-category-pages","channel-product-pages","channel-website","topic-ai","topic-commerce-experience","topic-grow-aov","topic-team-efficiency"],"acf":{"library_blog_banner_content":"","library_blog_banner_cta1_text":"","library_blog_banner_cta1_href":"","library_blog_banner_cta1_new_tab":false,"library_blog_banner_cta2_text":"","library_blog_banner_cta2_href":"","library_blog_banner_cta2_new_tab":false,"library_blog_banner_bg_color":"#EAF7FE","library_blog_banner_cta_text_color":"#FFF","library_blog_banner_cta_bg_color":"#019ACE","library_blog_banner_cta2_text_color":"#0
00","library_blog_banner_cta2_bg_color":"#FFF","library_blog_chatgpt_content":"","library_blog_chatgpt_cta_href":"","library_blog_chatgpt_cta_text":"Ask ChatGPT"},"_links":{"self":[{"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library\/42063","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library"}],"about":[{"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/types\/library"}],"author":[{"embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/users\/127"}],"version-history":[{"count":2,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library\/42063\/revisions"}],"predecessor-version":[{"id":54112,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library\/42063\/revisions\/54112"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/media\/42064"}],"wp:attachment":[{"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/media?parent=42063"}],"wp:term":[{"taxonomy":"ew_regions","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/ew-regions?post=42063"},{"taxonomy":"ew_solutions","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/ew-solutions?post=42063"},{"taxonomy":"library_type","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library_type?post=42063"},{"taxonomy":"library_blog_tag","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/library_blog_tag?post=42063"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/industry?post=42063"},{"taxonomy":"channel","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/channel?post=42063"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.bloomreach.com\/en\/wp-json\/wp\/v2\/topic?post=42063"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}