How AI Is Revolutionizing Ecommerce

Artificial intelligence, as we’re all aware, has permeated every facet of our lives. However, the sheer velocity at which AI has revolutionized our world is frequently underestimated. Take, for instance, the meteoric rise of ChatGPT. It astoundingly amassed 100 million monthly active users in a mere two months. To put this into perspective, Facebook, one of the world’s most influential social media platforms, took four years to achieve the same feat.

The momentum of AI is unrelenting, with projections indicating a staggering global economic contribution of $15.7 trillion by 2030. Particularly in the realm of ecommerce, the phenomenal surge in AI advancements is propelling the entire sector into uncharted territories that were inconceivable just a few years prior. In this post, I will delve into the history of AI usage at Bloomreach and why the recent breakthroughs are so impactful for the ecommerce industry.

The “Traditional” AI Approach in Search

AI isn’t new to ecommerce. Bloomreach, for example, has used AI since our first products. We started by applying natural language processing algorithms to power our semantic search capabilities, which improved the retrieval step of our system by increasing both recall and precision. Using ontology dictionaries, we extracted information from queries and products with algorithms such as the Aho-Corasick string-search algorithm. Then, in the ranking step, we used big data processing to merge aggregate usage data from our customers with advanced ranking algorithms, delivering a much better consumer experience.
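To make the dictionary-based extraction step concrete, here is a minimal sketch of Aho-Corasick matching in pure Python. It is an illustration only, not Bloomreach's implementation: it works on whole tokens rather than characters (a token-level variant of the algorithm), and the dictionary terms and function names are hypothetical.

```python
from collections import deque

# Hypothetical ontology terms; a real dictionary would be far larger.
ONTOLOGY_TERMS = ["oled", "tv", "smart tv", "wall mount"]


def build_automaton(terms):
    """Build a token-level trie with Aho-Corasick failure links."""
    goto = [{}]   # goto[state][token] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> dictionary terms ending at this state
    for term in terms:
        state = 0
        for tok in term.split():
            if tok not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][tok] = len(goto) - 1
            state = goto[state][tok]
        out[state].append(term)
    # Breadth-first pass to compute failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for tok, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and tok not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(tok, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_terms(query, automaton):
    """Return (start_token_index, term) for every dictionary term in the query."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, tok in enumerate(query.lower().split()):
        while state and tok not in goto[state]:
            state = fail[state]
        state = goto[state].get(tok, 0)
        for term in out[state]:
            hits.append((i - len(term.split()) + 1, term))
    return hits


AUTOMATON = build_automaton(ONTOLOGY_TERMS)
print(find_terms("55 inch OLED smart TV wall mount", AUTOMATON))
# [(2, 'oled'), (3, 'smart tv'), (4, 'tv'), (5, 'wall mount')]
```

The appeal of this approach is that the query is scanned once, no matter how many terms the dictionary holds, which is why it suits large ontology dictionaries.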

Ultimately, these algorithms resolve into a set of if-then-else rules that run well on a CPU. We applied these heuristics to reduce the amount of computing the AI needs to do. However, heuristics have limitations. For example, when one word precedes another in English, we generally know that the first word is the descriptor and the second is the product, and we can apply that heuristic to help the search engine parse queries and serve more relevant results. But this doesn’t hold for every term, and languages other than English need different heuristics.
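A word-order heuristic of this kind can be sketched in a few lines. The function below is a hypothetical illustration rather than our production parser: it simply assumes the last token is the product and everything before it is a descriptor.

```python
def parse_query(query):
    """Hypothetical English word-order heuristic: last token is the product."""
    tokens = query.lower().split()
    if not tokens:
        return {"descriptors": [], "product": None}
    return {"descriptors": tokens[:-1], "product": tokens[-1]}


print(parse_query("red dress"))
# {'descriptors': ['red'], 'product': 'dress'}

# French usually places the color after the noun, so the same rule
# mislabels the color "rouge" as the product.
print(parse_query("robe rouge"))
# {'descriptors': ['robe'], 'product': 'rouge'}
```

The rule is cheap to run, but each language, and each exception within a language, needs its own branch.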

To achieve both recall and precision, we needed extensive dictionaries that specified ontology and synonyms. However, the curated data sets our algorithms relied on could lead to errors, because they cannot cover every corner case. For instance, we added the synonym “OLED” for “TV” because many people used the term OLED when looking for TVs. This worked for queries that were just “OLED,” but this simple algorithm would sometimes interpret a search for “OLED TV” as “TV TV,” which appeared in a lot of battery descriptions. As a result, without additional conditions in the algorithm to handle this particular case, the search would yield irrelevant results.
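The failure mode is easy to reproduce. The sketch below uses a hypothetical one-entry synonym table and two made-up functions: a naive token-by-token rewrite, and a version with one extra condition of the kind described above.

```python
# Hypothetical synonym table with a single entry.
SYNONYMS = {"oled": "tv"}


def expand_naive(query):
    """Replace every token that has a synonym, with no other checks."""
    return " ".join(SYNONYMS.get(tok, tok) for tok in query.lower().split())


def expand_guarded(query):
    """Skip the rewrite when the synonym is already present in the query."""
    tokens = query.lower().split()
    expanded = []
    for tok in tokens:
        synonym = SYNONYMS.get(tok)
        if synonym is None or synonym in tokens:
            expanded.append(tok)
        else:
            expanded.append(synonym)
    return " ".join(expanded)


print(expand_naive("OLED"))       # tv
print(expand_naive("OLED TV"))    # tv tv
print(expand_guarded("OLED TV"))  # oled tv
```

The guard fixes this one case, but every new corner case calls for another condition like it.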

Building a generic algorithm that applies to all corner cases is challenging, so a lot of exceptions are needed. This ends up requiring so many conditional statements that it becomes too difficult for human programmers to build an algorithm that covers every corner case, let alone one that runs performantly on a CPU.

This is where the recent advancements in AI come into the picture.

The Fundamental Shift in AI

In recent years, generative AI has brought about a seismic shift in our approach to AI. While many people now associate generative AI with ChatGPT, its applications extend far beyond content generation.

The advent of deep learning with neural networks has been instrumental in the development of generative AI such as large language models (LLMs). These techniques apply deep learning methods, unlocked by parallel computing on GPUs, and let us handle complex language understanding that scales with the amount of data fed to the models. By analogy, instead of needing humans to craft the perfect heuristic algorithm, we can now create a model-building algorithm that essentially produces a “much better heuristic algorithm” by training on data and examples. We call this model training, and the resulting “much better heuristic algorithm” can solve these natural language challenges quickly and cost-effectively through inference on a GPU.

So, in the previous example, instead of needing a synonym and an algorithm to insert the right synonym into the correct terms, the term “OLED TV” is transformed by a language encoder model into a vector embedding. This embedding, a multi-dimensional list of numbers, represents the concept of an OLED TV and lies close to the embedding for “TV” in terms of Euclidean distance: no synonyms or dictionaries needed.
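As a rough illustration of the idea, the sketch below compares embeddings by Euclidean distance. The three-dimensional vectors are made up by hand for this example; a real encoder model would produce vectors with hundreds of dimensions, and the names used here are hypothetical.

```python
import math

# Hand-made toy vectors, NOT the output of a real encoder model.
EMBEDDINGS = {
    "oled tv": [0.9, 0.8, 0.1],
    "tv": [0.8, 0.7, 0.2],
    "aa battery": [0.1, 0.2, 0.9],
}


def nearest(query, candidates):
    """Return the candidate whose embedding is closest to the query's."""
    return min(
        candidates,
        key=lambda name: math.dist(EMBEDDINGS[query], EMBEDDINGS[name]),
    )


print(nearest("oled tv", ["tv", "aa battery"]))  # tv
```

Because similarity comes from the geometry of the vectors, nothing in this lookup depends on a synonym entry linking “OLED” to “TV.”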

We already use the language model BERT to help scale our product discovery solution to French and German, for parts of our algorithm such as attribute extraction. These advancements are only the beginning of what’s possible with AI as we begin to leverage the power of vector embeddings in the product domain, which will allow us to push the boundaries of our semantic capabilities.

The Exceptional Pace of AI Innovation

ChatGPT seemingly exploded overnight, and that’s honestly how it feels for generative AI technology as a whole. This is the fastest technology innovation I’ve ever seen: in just a matter of months, we’ve gone from a 200K context window to 10 million, a 50-fold increase and nearly two orders of magnitude. Improved models for every domain are now being released every week. This is an unprecedented pace of innovation, and while there are certainly challenges for us to navigate, the technology also presents a wealth of opportunities. Just look at what we’ve managed to achieve with conversational commerce.

As we continue to discover more ways to innovate with AI, it’s crucial for ecommerce technology companies, Bloomreach included, to consider how to be strategic in light of this rapid evolution. Take the classic concept of “garbage in, garbage out,” for example. Can we address this problem by using AI to augment low-quality inputs, so that garbage in still produces good outputs? How can we leverage the images in product catalogs to enhance product understanding? Can we optimize open-source models to deliver higher quality for specific verticals with low cost and low latency?

These are the questions my team and I have been grappling with at Bloomreach, and we are eager to share the results of our research and development. To follow along, take a look at the latest features in Bloomreach Discovery and the roadmap for Bloomreach Engagement.