Recommendations with Neo4j and Graph-Aided Search
by Michal Bachman
For the last couple of years, Neo4j has been increasingly popular as the technology of choice for people building real-time recommendation engines. Having been at the forefront of the graph movement through client engagements and open-source software development, we have identified the next step in the natural evolution of graph-based recommendation engines. We call it Graph-Aided Search.
At first glance, it may seem that graph databases are only good for social networks, but it has been shown time and again that a very wide variety of domains and industries need a graph database to store, analyse, and query connected data.
Similarly, recommendation engines go far beyond retail, the most obvious industry. We’ve seen real-time recommendations with Neo4j applied to finding:
- matches on dating sites (Dating, Social)
- people one may know in professional networks (Social)
- ideal candidates for clinical trials (Pharma)
- fraudsters (Banking, Insurance, Retail)
- criminals (Law Enforcement)
- events of interest (Event Planning)
- and many more…
The reasons for wanting to implement a system that serves recommendations in real-time and for choosing a native graph database to do that have been well understood and written about. Once the technology choice has been made, there are three main challenges to building such a recommender. The first one is to discover the items to recommend. The second is to choose the most relevant ones to present to the user. Finally, the third challenge is to find relevant recommendations as quickly as possible.
Typically, the input to the recommendation engine is an object (e.g. a user) for which we would like to determine the recommendations. Such an object is represented in the graph as a node, so the whole process is effectively a traversal through the network, finding paths from the input node to other nodes, some of which will be deemed as the most relevant ones and served as recommendations.
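To make the traversal idea concrete, here is a minimal sketch in plain Python rather than Neo4j. The `KNOWS` graph and the `people_you_may_know` function are hypothetical and exist only for illustration. The sketch walks two hops out from the input node and treats the number of connecting paths as a rough measure of relevance; a real engine would run an equivalent traversal inside the database.

```python
from collections import Counter

# Hypothetical toy graph: each person maps to the set of people they know.
KNOWS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}


def people_you_may_know(graph, start):
    """Count two-hop paths from `start` to every node it does not already know."""
    paths = Counter()
    for friend in graph.get(start, ()):
        for candidate in graph.get(friend, ()):
            if candidate != start and candidate not in graph[start]:
                paths[candidate] += 1
    # More connecting paths means a more relevant candidate; ties are broken by name.
    return sorted(paths.items(), key=lambda kv: (-kv[1], kv[0]))


print(people_you_may_know(KNOWS, "alice"))
```

For "alice", "dave" is reachable through both "bob" and "carol", so he ranks above "erin", who is reachable through "bob" only.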
Last year, GraphAware built an open-source recommendation engine skeleton that runs as a Neo4j extension and provides a foundation to address the three challenges outlined above. It does so by allowing developers to plug their (path-finding) business logic into a best-practice architecture, resulting in a fast, flexible, yet simple and maintainable piece of software. The architecture imposes the separation of concerns between the plug-in components that:
- discover all possible recommendations
- apply a score to the identified recommendations
- filter out irrelevant/blacklisted recommendations
- optionally record why and how fast the recommendations were served
The skeleton is responsible for sorting by relevance, performance optimisations, thread-safety, and other “frameworky” features.
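The separation of concerns described above can be sketched roughly as follows. This is not the GraphAware API (the actual engine runs as a Neo4j extension); every name here, including `PURCHASES`, `Recommendation`, and `recommend`, is a hypothetical stand-in. The point is only that discovery, scoring, and filtering are independent plug-ins, while the skeleton takes care of sorting and of recording why each item scored.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical toy data: who bought what.
PURCHASES = {
    "alice": {"book"},
    "bob": {"book", "lamp", "pen"},
    "carol": {"book", "lamp"},
    "dan": {"mug"},
}


@dataclass
class Recommendation:
    item: str
    score: float = 0.0
    reasons: Dict[str, float] = field(default_factory=dict)  # records why an item scored


def peers(user):
    """Other users who share at least one purchase with `user`."""
    return [u for u, items in PURCHASES.items()
            if u != user and items & PURCHASES[user]]


def discover(user):
    """Discovery plug-in: everything bought by the user's peers."""
    return {item for peer in peers(user) for item in PURCHASES[peer]}


def co_buyer_score(user, item):
    """Scoring plug-in: one point per peer who bought the item."""
    return float(sum(item in PURCHASES[peer] for peer in peers(user)))


def already_owned(user, item):
    """Filter plug-in: never recommend what the user already has."""
    return item in PURCHASES[user]


def recommend(user, discover, scorers, filters, limit=3):
    """The 'skeleton': runs the plug-ins, then sorts by decreasing score."""
    candidates = [Recommendation(item) for item in discover(user)]
    for rec in candidates:
        for name, scorer in scorers.items():
            points = scorer(user, rec.item)
            rec.score += points
            rec.reasons[name] = points
    kept = [r for r in candidates if not any(f(user, r.item) for f in filters)]
    kept.sort(key=lambda r: (-r.score, r.item))
    return kept[:limit]


top = recommend("alice", discover, {"co_buyers": co_buyer_score}, [already_owned])
print([(r.item, r.score) for r in top])
```

Adding a second scorer or filter means passing one more function; the `recommend` function itself does not change.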
Since its first release, the GraphAware Recommendation Engine has been used by teams all around the world to build production-ready recommendation functionality into their applications.
The vast majority of websites and other systems today provide some sort of search capability, allowing users to find what they are looking for very quickly. Lucene-based search engines, such as Elasticsearch and Apache Solr, are the leading technologies in this space.
Like recommendation engines, search engines also serve results in real-time, sorted by decreasing relevance. However, the input to these systems is typically a string of characters and the results are matching documents (items). Without adding extra complexity, the user performing the search is not taken into account. Hence, two users searching for the same thing will get the same results.
For the same reasons people are interested in personalising recommendations, they also want to personalise search results. To see an example of such personalisation in practice, just head to LinkedIn and type the first name of one of your connections into the search box. Your connections will appear at the top of the results, not because they are the most important people with that first name on LinkedIn, but because they are most likely the people you are looking for.
One can treat such functionality as a recommendation engine with all candidate recommendations provided by an external system (a search engine in this case), rather than discovered by the recommendation engine itself. Applying the “right tool for the job” philosophy, we can use the search (S) and recommendation (R) engines together to achieve what we call Graph-Aided Search:
- discover all matching recommendations (S)
- apply a score to the recommendations based on textual match (S)
- apply a score to the recommendations based on the user’s graph (R)
- filter out irrelevant/blacklisted recommendations (R)
This way, the power of both systems can be used to build personalised search functionality.
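As a rough illustration of the four steps, the sketch below uses in-memory stand-ins for both systems. It does not call Elasticsearch or Neo4j, and all names (`PROFILES`, `CONNECTIONS`, `BLOCKED`, `graph_aided_search`) and the scoring weights are assumptions made for the example. The search side discovers and scores matches on text alone; the graph side then adds a boost for direct connections and removes blocked profiles.

```python
# Hypothetical toy data standing in for a search index and a social graph.
PROFILES = ["Ann Lee", "John Blocked", "John Doe", "John Smith"]
CONNECTIONS = {"me": {"John Smith", "Ann Lee"}, "you": {"John Doe"}}
BLOCKED = {"me": {"John Blocked"}}


def text_search(query):
    """(S) Discover matching profiles and score them on textual match alone."""
    q = query.lower()
    return {name: 1.0 for name in PROFILES if q in name.lower().split()}


def graph_score(user, name):
    """(R) Boost profiles the searching user is directly connected to."""
    return 1.0 if name in CONNECTIONS.get(user, set()) else 0.0


def graph_aided_search(query, user):
    hits = text_search(query)                                   # steps 1 and 2 (S)
    scored = {name: text + graph_score(user, name)              # step 3 (R)
              for name, text in hits.items()}
    blocked = BLOCKED.get(user, set())
    visible = [(n, s) for n, s in scored.items() if n not in blocked]  # step 4 (R)
    # Highest combined score first; ties are broken by name.
    return sorted(visible, key=lambda pair: (-pair[1], pair[0]))


print(graph_aided_search("john", "me"))
print(graph_aided_search("john", "you"))
```

The same query now returns a different ordering for "me" and for "you", because each user's own connections are ranked first.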
Learn More at GraphConnect
At GraphAware, we are currently finalising the development of enterprise-ready extensions to Neo4j and Elasticsearch for bi-directional integration of the two systems, so that they can be easily combined to provide Graph-Aided Search. We will launch and open-source both extensions at this year’s GraphConnect in San Francisco.
If you are interested in real-time recommendations, personalising search results, or integrating Neo4j with a search engine such as Elasticsearch, come see my presentation at GraphConnect, starting at 2.20pm.
You can now read the blog post about the graph-aided-search plugin.