Welcome to the first post in the business series of the GraphAware blog! This series is designed for us non-techies out there. Personally, I was shocked when I found out how big and common knowledge graphs are and how often graph databases are used in today's world - and I had first heard of them just a couple of months ago. So, for people like me, for marketers and non-tech people in business, I'll try to open the door to the world of graphs and their potential, and take you through it step by step. It seems only appropriate that we start with the top 5 things you need to know about knowledge graphs.
What is a knowledge graph?
It sounds like a pretty straightforward question. Believe it or not, the industry, to this day, does not have a standard, united, simple definition of a knowledge graph. Knowledge graphs in the form we know them today have been around for approximately 50 years - give or take. However, as you might have guessed, or as you already knew, knowledge graphs are based on graph theory, the foundations of which were laid by Leonhard Euler's solution to the problem of the Seven Bridges of Königsberg - in 1736! Not so recent now, is it? Anyway, in very simple terms, the great-grandpa of knowledge graphs is the Seven Bridges problem: Once upon a time, there was a city - Königsberg, lying on both banks of the river Pregel. There were two islands on the river, and these islands were connected to both banks and each other by seven bridges.
The problem of the Seven Bridges of Königsberg was posed by Carl Ehler, who wanted to find a path, if you will, that would cross every bridge, but cross it only once. Quite a pickle, isn't it? Well, yes. A huge pickle. It's impossible. Leonhard Euler proved it to be impossible and found that such a path - an Eulerian path - that crosses each bridge exactly once is possible only in one of two scenarios:
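- When exactly two nodes (land masses) are of an odd degree - each being touched by an odd number of bridges - and all the other nodes are of an even degree. In this case, the path starts at one of the two odd nodes and finishes at the other.
- When all the nodes are of an even degree - each being touched by an even number of bridges. In this case, the path starts and ends at the same place. Such a path is also called an Eulerian circuit.
In Königsberg, all four land masses (the two banks and the two islands) were touched by an odd number of bridges, so neither scenario applied.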
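For the curious, Euler's rule is simple enough to check in a few lines of Python. This is only a small sketch: the names of the land masses and the helper functions `degrees` and `classify` are made up for illustration, and the rule assumes that every land mass can be reached from every other one.

```python
from collections import Counter

# The four land masses of Königsberg: two river banks and two islands.
# Each tuple is one bridge. The labels are invented for this sketch.
BRIDGES = [
    ("north_bank", "island_a"),
    ("north_bank", "island_a"),
    ("south_bank", "island_a"),
    ("south_bank", "island_a"),
    ("north_bank", "island_b"),
    ("south_bank", "island_b"),
    ("island_a", "island_b"),
]


def degrees(bridges):
    """Count how many bridge ends touch each land mass."""
    counts = Counter()
    for a, b in bridges:
        counts[a] += 1
        counts[b] += 1
    return counts


def classify(bridges):
    """Apply Euler's degree rule (assumes the graph is connected)."""
    odd = [node for node, degree in degrees(bridges).items() if degree % 2 == 1]
    if len(odd) == 0:
        return "circuit"  # start and end at the same place
    if len(odd) == 2:
        return "path"  # start at one odd node, end at the other
    return "impossible"


print(dict(degrees(BRIDGES)))
print(classify(BRIDGES))
```

Running this shows that every land mass has an odd degree, which is why no such walk through Königsberg exists. Take away the bridge between the two islands and only the two banks stay odd, so a path (but not a circuit) becomes possible.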
That was a bit of background. Now, let's go back to the picture of our islands and bridges connecting them. Simplifying the picture, we get a bunch of circles connected by lines. This is basically a graph. Knowledge graphs are composed of nodes and relationships among them, both of which can have different attributes. For example, a node can be a person node, with attributes such as age, gender, address, phone number, etc. Now, two nodes can be connected with a relationship - for example, a person node can be connected to a movie node with an "ACTED IN" relationship. In the pictures below, you can see that Tom Cruise acted in Top Gun, as well as attributes of Tom Cruise (such as his name and the year he was born), and of the "ACTED IN" relationship (the role he played).
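If it helps to see this written down, here is a minimal Python sketch of the same idea. It is not how a graph database stores data internally. The `Node` and `Relationship` classes and the `describe` helper are hypothetical, and the property values (birth year, role) are filled in for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str  # the kind of thing, e.g. "Person" or "Movie"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str  # written with an underscore here, e.g. "ACTED_IN"
    end: Node
    properties: dict = field(default_factory=dict)


# Two nodes, each with its own attributes.
tom = Node("Person", {"name": "Tom Cruise", "born": 1962})
top_gun = Node("Movie", {"title": "Top Gun"})

# The relationship carries attributes too, here the role that was played.
acted_in = Relationship(tom, "ACTED_IN", top_gun, {"role": "Maverick"})


def describe(rel):
    """Turn one relationship into a readable sentence."""
    return (
        f"{rel.start.properties['name']} {rel.rel_type} "
        f"{rel.end.properties['title']} as {rel.properties['role']}"
    )


print(describe(acted_in))
```

Notice that the person node and the movie node do not share the same set of attributes, and nothing forces them to.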
The possibility of having different kinds and different numbers of attributes on both nodes and relationships is one of the advantages of graph databases over traditional ones. This has to do with their lack of a fixed schema. We will explore different databases and their characteristics later in the series.
If you'd like to get more technical, the next step would be to look into the two perspectives from which one can approach knowledge graphs - the Resource Description Framework (RDF) and the labeled property graph (LPG). My colleague Giuseppe explored these perspectives in depth in his article Knowledge Graph Perspectives: building bridges from RDF to LPG.
So let's bring it all together with the definition of knowledge graphs used by Deloitte:
A Knowledge Graph is a close-to-reality model that represents a company's business logic and serves as its central knowledge platform. You could say it is the brain of a company, as its architecture is very similar to that of our brains - unlike traditional databases and data lakes. Both connect objects directly with each other rather than storing data in different tables and then connecting them via JOIN-tables.
Knowledge graphs are everywhere.
The second very essential thing to know about KGs is that they are commonly used. While knowledge graphs have a vast number of use cases, they serve some areas better than others, or at least they are more commonly used in certain areas.
One of the most commonly cited examples of graph use cases is (optimized) search. Each time you type a search query into a search engine like Google or Yahoo!, the engine has to comb through what is for me an unimaginable amount of data before providing search results. And this needs to be done within milliseconds. To do so, search engines use knowledge graphs.
This use case clearly illustrates that knowledge graphs are great for dealing with vast amounts of data and quick at retrieving the information one is looking for (this is thanks to the way graph databases are structured and how they retrieve information). Furthermore, this example also points to one other use case for which KGs are ideal - natural language processing (NLP). While this is another chapter, the important thing to understand is that knowledge graphs and their algorithms, especially in combination with machine learning, can process natural language, look for patterns, assess similarities, identify keywords, group content together, and more - all of which improves search.
Another common use case for knowledge graphs is recommendation engines. The sections of websites reading "You may also like" and "People also bought" are the most common examples. However, social networks like Facebook or LinkedIn also use knowledge graphs to provide friend recommendations. These can be based on your profile and activity, like where you went to school, where you work, where you live, and so on, and/or they can be based on your friends and who their friends are. Knowledge graphs are ideal for providing these kinds of recommendations.
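To make the "friends of friends" idea a little more concrete, here is a toy Python sketch. The people, the `FRIENDS` data, and the `recommend_friends` function are all invented for illustration. Real social networks use far richer signals, so treat this as one possible way to picture the logic rather than how any particular site does it.

```python
from collections import Counter

# A tiny, made-up friendship graph: each person maps to their friends.
FRIENDS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def recommend_friends(person, friends):
    """Suggest friends of friends, ranked by number of mutual friends."""
    mutual = Counter()
    for friend in friends[person]:
        for candidate in friends[friend]:
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in friends[person]:
                mutual[candidate] += 1
    # Most mutual friends first, ties broken alphabetically.
    return sorted(mutual.items(), key=lambda item: (-item[1], item[0]))


print(recommend_friends("alice", FRIENDS))
```

In this made-up network, Dave comes out on top for Alice because two of her friends already know him, while Erin is known to only one.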
Once you see what they are about, you won't believe anyone ever used anything else
Knowledge graphs are very intuitive. This point is very straightforward - just look at the picture below and tell me what you don't understand. You can't, can you? You see which grapes were used to make which wines, where these wines were produced, and what's more, you can see which wines are likely to be somewhat similar because they were produced from the same grape and/or in the same location! Knowledge graphs (especially labeled property graphs) are very easy to read and very user-friendly. Their power lies in the simplicity and ease with which they can represent and store large, complex data.
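The "similar because they share a grape or a location" reasoning can also be sketched in a few lines of Python. The wines below are placeholders (Wine A to Wine D) and do not come from the picture. The relationship names `MADE_FROM` and `PRODUCED_IN` and the `similar_wines` function are assumptions made for this example.

```python
# Each tuple is one relationship: (wine, relationship type, grape or region).
# The wines are placeholders invented for this sketch.
EDGES = [
    ("Wine A", "MADE_FROM", "Pinot Noir"),
    ("Wine A", "PRODUCED_IN", "Burgundy"),
    ("Wine B", "MADE_FROM", "Pinot Noir"),
    ("Wine B", "PRODUCED_IN", "Oregon"),
    ("Wine C", "MADE_FROM", "Chardonnay"),
    ("Wine C", "PRODUCED_IN", "Burgundy"),
    ("Wine D", "MADE_FROM", "Riesling"),
    ("Wine D", "PRODUCED_IN", "Mosel"),
]


def similar_wines(wine, edges):
    """Find other wines that share a grape or a region with `wine`."""
    mine = {(rel, target) for source, rel, target in edges if source == wine}
    shared = {}
    for source, rel, target in edges:
        if source != wine and (rel, target) in mine:
            shared.setdefault(source, []).append(target)
    return shared


print(similar_wines("Wine A", EDGES))
```

Here Wine B is linked to Wine A through the grape and Wine C through the region, while Wine D shares nothing with it and is left out.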
Despite all that, KGs are not a good fit for everything
I spent most of this article explaining why KGs are amazing, and I started pointing out what kind of value they can provide. So far we have covered only a fraction of what there is to say about knowledge graphs, and if you knew nothing about KGs before reading this article, you are still amazingly unaware of their true potential. Trust me, there is much more. Yet, one thing that is good to understand is that knowledge graphs are not the magical solution to your every question or problem. Indeed, there are cases for which knowledge graphs are simply not the best solution. Check out Neo4j's How Do You Know If a Graph Database Solves the Problem? if you want more detail than the quick recap I am about to give you. So, when are KGs not the best solution?
When relationships don't matter
Graphs are excellent when you are looking to understand the context of your data, the relationships, and the connections among it. However, there are cases where this is not important. If this is the case and the context of your data is irrelevant, there are probably other solutions that can serve you better than graphs.
When you're not interested in retrieving, reading, and analyzing the data
Connected to the above, if you need a place to store the data, but not read it, the advantages of graphs such as quick and easy retrieval of information, intuitiveness, and making relationships easily understandable are wasted. This does not necessarily mean that you CAN'T store data in a graph database, but frankly - why would you?
When the structure of your data is tabular
Pretty obvious: if your data has a tabular structure, why would you not use tabular databases to store it? As mentioned before, graph databases have a loose structure - they are schema-flexible - meaning they are great for instances when you do not have the same data about every entity. Let's say you know two people, but you know the address of only one of them. This is not a problem in graph databases, but tabular databases are not fans of such instances, to say the very least.
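The two-people example can be pictured with a short Python sketch. The names, the address, and the helpers `property_names` and `as_table_row` are made up for illustration, and plain dictionaries stand in for graph nodes here, so this is an analogy rather than a real database.

```python
# Two person nodes. Only Ann has an address, and that is perfectly fine.
ann = {"label": "Person", "properties": {"name": "Ann", "address": "1 Main Street"}}
ben = {"label": "Person", "properties": {"name": "Ben"}}

# A table, by contrast, fixes its columns up front for every row.
TABLE_COLUMNS = ("name", "address")


def property_names(node):
    """List the attributes this particular node actually has."""
    return sorted(node["properties"])


def as_table_row(node):
    """Squeeze a node into the fixed columns, padding gaps with None."""
    return tuple(node["properties"].get(column) for column in TABLE_COLUMNS)


print(property_names(ann))
print(property_names(ben))
print(as_table_row(ben))
```

The node for Ben simply has no address attribute, whereas the table row has to carry an empty placeholder for it.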
Bulk scans and unknown starting points
With graph databases you should always know where to start from - the starting point of your analysis. If you don't have this starting point, graph databases will spend too much time looking through the massive amounts of data they store before they find an answer to your loose query, simply making them a slow solution in this case.
When you store large amounts of text in properties
Similar to the above, graph traversal and information retrieval will be much slower if you store large amounts of text in properties of a single node.
How does Hume make KGs better?
And finally, we would like you to understand the relationship between GraphAware and KGs. Our product, Hume, is the missing piece, the bridge, between the graph database Neo4j and customers. It allows people like you and me to query their datasets without any coding. Hume makes data analysis simple thanks to its powerful visualization, pre-canned actions (queries pre-programmed for you so you do not have to code to query the database yourself), alerts, temporal and geospatial analysis, and more. Want to know more? Contact us at email@example.com or request a demo to find out what Hume can do for you.
Knowledge graphs are powerful tools. In this article, we went over five key things I believe everyone should know about them. I hope you know just a little bit more about KGs now and that you are excited to learn more about the world of graphs.