Google and the Knowledge Graph
I came across a really interesting article the other day on Mashable (Google Knowledge Graph Could Change Search Forever). Google SVP Amit Singhal lays out their efforts around a more semantic understanding of the web, building on their purchase of Freebase a couple of years ago. The gist is that by leveraging a proprietary Knowledge Graph, Google will be able to return search results based on the meaning of documents rather than simply the presence of particular text strings. It’s a really compelling vision and well worth reading. Personally, I’m terribly excited about the prospect of not only a truly semantic search, but also the proliferation of data systems that are backed by large-scale ontologies. The power of ontology-based semantics is a basic tenet of everything we do at Gravity, and it always feels good to see folks like Google moving in the same direction. For those of you not thoroughly enmeshed in this sort of tech (which is just about everyone), a bit of explanation is probably in order.
What is an ontology?
The simplest way to imagine an ontology is as a graph that shows how things are connected to each other (if you’re already familiar with the nuances of graph theory, RDF, and convergence algos, feel free to skip ahead). Take the example below from our ontology:
This is a small subset of the many things Kobe Bryant is actually connected to. An ontology allows you to not only crawl a page and recognize that “Kobe Bryant” appears in the text and is an entity of note, but also imbue that article with additional meaning. Kobe’s presence in a document may indicate that the web page is conceptually about famous people, basketball, the Lakers, or celebrities who cheat. We can now move past simply understanding what’s on a web page and grasp more concretely what it’s about.
Now that was a single entity in the ontology. Google’s ontology and our own have millions of entities and abstract concepts, all interconnected by hundreds of millions of edges. Topics run the gamut from every person of note throughout history to every song ever recorded to diseases of every flavor. I can’t speak for Google’s system, but we maintain various weights on those interconnections (Kobe is more tightly bound to “Los Angeles Lakers Players” than “American expatriates in Italy”). In this way we are able to more easily infer document aboutness.
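To make the weighted-edge idea a little more concrete, here is a minimal sketch in Python. It is not how our system (or Google’s) is actually built. The `ONTOLOGY` dictionary, the `infer_aboutness` function, and every weight in it are made up for illustration, and the scoring rule (just summing edge weights per concept) is an assumption chosen for simplicity.

```python
from collections import defaultdict

# Hypothetical toy ontology: entity -> {connected concept: edge weight}.
# All weights are invented for illustration only.
ONTOLOGY = {
    "Kobe Bryant": {
        "Los Angeles Lakers Players": 0.9,
        "Basketball": 0.7,
        "Famous people": 0.5,
        "American expatriates in Italy": 0.1,
    },
    "Los Angeles Lakers": {
        "Basketball": 0.9,
        "Los Angeles Lakers Players": 0.6,
    },
}


def infer_aboutness(entities, ontology=ONTOLOGY):
    """Rank concepts for a document, given the entities found in its text.

    Each recognized entity votes for the concepts it is connected to,
    and each vote counts for as much as the weight on that edge.
    """
    scores = defaultdict(float)
    for entity in entities:
        # Entities missing from the ontology contribute nothing.
        for concept, weight in ontology.get(entity, {}).items():
            scores[concept] += weight
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for concept, score in infer_aboutness(["Kobe Bryant", "Los Angeles Lakers"]):
        print(f"{concept}: {score:.2f}")
```

With these toy weights, a page that only mentions Kobe ranks “Los Angeles Lakers Players” first, while a page that mentions both Kobe and the Lakers tips toward “Basketball”. A weakly weighted edge like “American expatriates in Italy” stays near the bottom either way.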
What’s the point?
Per Mr. Singhal, Google is applying this semantic understanding of content to search. Would you like results about Kobe as a basketball player, or would you rather see pertinent celebrity gossip? The ontology allows Google and the user to make that distinction when applied to the set of content that includes Kobe as a component. You can also introduce any number of semantically proximate suggestions to searchers. Searchers for “surfing” could easily be presented with the opportunity to explore relevant results for the more abstract “water sports” or the more specific “longboards”. With an ontology we can place topics in their proper context within the set of everything else that exists.
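The “surfing” example can be sketched as a lookup over broader/narrower edges. Again, this is only an illustration under assumptions: the `BROADER` mapping and the `suggest` function are hypothetical names, and a real ontology would hold far more relationship types than a single parent link.

```python
# Hypothetical "is a narrower topic of" edges: topic -> list of broader topics.
BROADER = {
    "surfing": ["water sports"],
    "longboards": ["surfing"],
}


def suggest(topic, broader=BROADER):
    """Return more abstract and more specific topics adjacent to `topic`."""
    # Broader topics are read straight off the topic's own edges.
    more_abstract = list(broader.get(topic, []))
    # Narrower topics are those that list `topic` as one of their parents.
    more_specific = sorted(
        child for child, parents in broader.items() if topic in parents
    )
    return {"broader": more_abstract, "narrower": more_specific}


if __name__ == "__main__":
    print(suggest("surfing"))
```

Calling `suggest("surfing")` on this toy data yields “water sports” as the broader suggestion and “longboards” as the narrower one.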
We leverage similar technology to a very different end. By understanding what every article is actually about, we can look at which pages you engage with and build a holistic picture of the topics and concepts that actually matter to you (your Interest Graph). That picture can then be used to present you with content, ads, and other people that you’ll probably enjoy (see a lot more about that here).
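As a rough sketch of that aggregation step, assume each page you engage with has already been reduced to a set of topic scores. The `build_interest_graph` function below and its per-page normalization are assumptions for illustration, not a description of our production pipeline.

```python
from collections import Counter


def build_interest_graph(engaged_pages):
    """Fold per-page topic scores into one interest profile for a reader.

    `engaged_pages` is a list of {topic: score} dicts, one per page.
    Each page is normalized so it contributes a total weight of 1.0,
    which keeps a single topic-heavy page from dominating the profile.
    """
    interests = Counter()
    for topics in engaged_pages:
        total = sum(topics.values())
        if total <= 0:
            continue  # Skip pages with no usable topic signal.
        for topic, score in topics.items():
            interests[topic] += score / total
    return interests


if __name__ == "__main__":
    pages = [
        {"Basketball": 1.0, "Celebrity gossip": 1.0},
        {"Basketball": 2.0},
    ]
    for topic, weight in build_interest_graph(pages).most_common():
        print(f"{topic}: {weight:.2f}")
```

The topics at the top of the resulting profile are the ones that would drive recommendations in a setup like this.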
For those of you who are just discovering ontologies, I hope this was a helpful introduction. If you’re in the space, we always love talking shop. Drop us a line.
