In the past year, at 904Labs A.I. for Search, we’ve put a lot of effort into optimizing and advancing the technology behind query intent/understanding engines. If you are in the search engine/information retrieval community, you know the term and how challenging the problem is. But if you are not familiar with the problem, it is hard to grasp its importance and challenges. Here’s my attempt to explain it in simple words.
A Query Intent/Understanding engine aims at identifying the intent of a user’s query and ultimately at translating it to a set of search directives.
Consider a user coming to your e-shop and typing in the search box: “red shoes”. The query may look pretty straightforward to you when it comes to what products to show to the user but, for a machine, it’s pretty hard to figure out what the person meant.
To get into the machine’s shoes (pun intended), imagine for a moment that you were born and raised in a warehouse, which is isolated from the rest of the world but holds all the inventory of an e-shop. You’ve never left the warehouse, so you don’t know anything about the world. It’s quite a grim world, but bear with me. One day someone slips a message under the door of the warehouse with the words “red shoes”. Now your task is to select a set of products that satisfies that person’s information need. Obviously you know nothing about that mysterious person, let alone their preferences, and you barely understand the language of the message. Perhaps you are able to recognize characters and words and match them to words found on the labels of products, but that’s pretty much it.
In such a surreal world, you can imagine that it is quite difficult, even for you, a human, to select a set of products that relate to “red shoes”. The challenge is that much of the important contextual information we pick up from constantly interacting with the real world is missing in this artificial setting: we don’t know whether the user is a man or a woman (and therefore whether to pick men’s or women’s shoes), we don’t know the “hot” color of the season or the most popular brand, and we don’t know whether the person is looking for sneakers, boots, or high heels. There are lots of unknowns.
A Query Intent/Understanding engine is an algorithm that tries to make sense of the world for machines, or other entities, that are locked up in a warehouse like the one described above and have no access to contextual information. It aims at mapping free text (a user query) to a series of directives (rules) that, when applied, yield a useful set of products for the user. In our “red shoes” example, we are looking for directives like the following: “filter on products that have attribute:red and category:shoes”. At first glance the mapping looks simple, but as you’ve seen from our warehouse setting, it can be quite daunting.
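To make the “red shoes” mapping concrete, here is a minimal sketch of the most naive approach imaginable: match each query word against the values printed on the product labels. This is not how the 904Labs engine works; the toy catalog, the field names (`attribute`, `category`, `brand`), and all function names are assumptions made up for illustration. The last function shows how such directives could be expressed as a `bool` query with `term` filters in Elasticsearch’s query DSL, again assuming those field names exist in the index.

```python
from typing import Dict, List, Set

# Hypothetical toy catalog; field names and values are illustrative only.
CATALOG: List[Dict[str, object]] = [
    {"id": 1, "category": "shoes", "attribute": "red", "brand": "acme"},
    {"id": 2, "category": "shoes", "attribute": "blue", "brand": "acme"},
    {"id": 3, "category": "shirts", "attribute": "red", "brand": "zenith"},
]


def build_vocabulary(catalog: List[Dict[str, object]]) -> Dict[str, Set[str]]:
    """Index every label value by the field(s) it appears in."""
    vocab: Dict[str, Set[str]] = {}
    for product in catalog:
        for field, value in product.items():
            if field == "id":
                continue
            vocab.setdefault(str(value).lower(), set()).add(field)
    return vocab


def query_to_directives(query: str, vocab: Dict[str, Set[str]]) -> List[Dict[str, str]]:
    """Turn each query word that exactly matches a label value into a filter directive."""
    directives = []
    for token in query.lower().split():
        for field in sorted(vocab.get(token, ())):
            directives.append({"field": field, "value": token})
    return directives


def apply_directives(catalog, directives):
    """Keep only the products that satisfy every directive."""
    return [
        product
        for product in catalog
        if all(product.get(d["field"]) == d["value"] for d in directives)
    ]


def to_bool_filter(directives):
    """Express the directives as an Elasticsearch bool query with term filters."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {d["field"]: d["value"]}} for d in directives]
            }
        }
    }


vocab = build_vocabulary(CATALOG)
directives = query_to_directives("red shoes", vocab)
matches = apply_directives(CATALOG, directives)
```

Exact word matching is precisely the “recognize words on labels” ability from the warehouse story, and it breaks down quickly: a query such as “crimson sneakers” produces no directives at all, and a word that appears under several fields produces conflicting filters. Closing that gap is what makes the problem hard.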
Researchers and practitioners have been working on this problem for quite some time and progress has been made; however, the community still has some way to go before solving it. At the core of current solutions there is a lot of complex technology, such as neural networks that power NLP (Natural Language Processing) tools and other pipelines that require lots of human annotations (read: expensive). These approaches yield relatively good accuracy, but putting them into production can still be a challenging engineering problem, from setting up the data pipelines to scaling up, retraining, and monitoring system effectiveness and efficiency.
At 904Labs we’ve developed a Query Intent/Understanding engine that hooks into an e-shop’s data, bootstraps a knowledge graph, and learns the mapping to directives automatically, i.e., without supervision (read: without human annotations). It requires neither retraining nor external dependencies, and you can use it straight away, today, on your own Apache Solr or Elasticsearch index (from which we only read and never write, and we never lock you in). If that sounds appealing, get in touch for a one-month free trial!