Neo Technology execs: How Neo4j beat Oracle Database
- 04 February, 2013 11:17
Neo Technology, which was formed in 2007, offers Neo4j, a Java-based open source NoSQL graph database. A graph database is filled with nodes, which are then linked, so queries explore the connections between data, as when searching social network data. Neo4j can solve problems that require repeated network probing, and the company stresses Neo4j's high performance. InfoWorld Editor at Large Paul Krill recently talked with Neo CEO Emil Eifrem and Philip Rathle, Neo senior director of products, about the importance of graph database technology as well as Neo4j's potential in the mobile space. Eifrem also stressed his confidence in Java, despite recent security issues affecting the platform.
InfoWorld: Graph database technology is not the same as NoSQL, is it?
Eifrem: NoSQL is actually four different types of databases: There are key-value stores, like Amazon DynamoDB, for example. There are column-family stores like Cassandra. There are document databases like MongoDB. And then there are graph databases like Neo4j. There are actually four pillars of NoSQL, and graph databases are one of them. Cisco is building a master data management system based on Neo4j, and this is actually our first Fortune 500 customer. They found us about two years ago when they tried to build this big, complex hierarchy inside of Oracle RAC. In Oracle RAC, they had response times in minutes, and then when they replaced it [with] Neo4j, they had response times in milliseconds.
[ Neo4j was a 2013 InfoWorld Technology of the Year award winner. ]
InfoWorld: How did they do that?
Eifrem: It's because the graph model fundamentally assumes that data is connected, whereas the relational model was built back in the 1970s when databases were mostly used to store tabular data.
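The difference between tabular and connected access can be illustrated with a minimal Python sketch. It is not how Neo4j or Oracle RAC work internally; the sample hierarchy and the function names are hypothetical. The first function treats the hierarchy as rows of (parent, child) and rescans every row for each level of depth, much as a naive self-join would. The second follows direct links from node to node.

```python
from collections import defaultdict, deque

# Hypothetical hierarchy, stored as (parent, child) rows the way a table would hold it.
ROWS = [
    ("root", "a"), ("root", "b"),
    ("a", "a1"), ("a", "a2"),
    ("a1", "a1x"), ("b", "b1"),
]


def descendants_by_scanning(rows, start):
    """Table-style lookup: every level of depth rescans all rows."""
    found, frontier = [], {start}
    while frontier:
        next_frontier = set()
        for parent, child in rows:  # full scan of the table per level
            if parent in frontier:
                found.append(child)
                next_frontier.add(child)
        frontier = next_frontier
    return sorted(found)


def build_adjacency(rows):
    """Graph-style storage: each node keeps direct links to its children."""
    adjacency = defaultdict(list)
    for parent, child in rows:
        adjacency[parent].append(child)
    return adjacency


def descendants_by_traversal(adjacency, start):
    """Graph-style lookup: follow links outward from the start node only."""
    found, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, []):
            found.append(child)
            queue.append(child)
    return sorted(found)


if __name__ == "__main__":
    print(descendants_by_scanning(ROWS, "root"))
    print(descendants_by_traversal(build_adjacency(ROWS), "root"))
```

Both functions return the same descendants. The scanning version touches every row once per level, while the traversal version touches only the nodes reachable from the starting point, which is the kind of gap that grows with the size and depth of a hierarchy.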
InfoWorld: But you're not going to use a graph database for a payroll application or anything like that?
Eifrem: Payroll is actually the example that I always use, because payroll is first name, last name, age, salary, title maybe, and super-well-structured, very tabular, and awesome for a relational database. That's the original use case.
InfoWorld: Are graph databases limited to a single machine?
Eifrem: Neo4j is an open source project. It has a community edition, which is a fully featured graph database, but it runs just on one machine. But then there's Neo4j Enterprise, which is the commercial edition and that's the one that Cisco uses, that's the one that Adobe uses. We have more than 20 of the Global 2000 using Neo4j Enterprise. It's clustered across many, many machines.
InfoWorld: What would you say is the main takeaway from a graph database?
Eifrem: I would say two things. One is it's 1,000 times faster for a lot of queries on connected data than a relational database. The second thing is that it's a lot more intuitive to model many domains as a graph. If you have a domain that is very connected and messy and changing, it's very intuitive and easy to model it with a graph database.
InfoWorld: With Neo4j, can you query it over the Internet?
Eifrem: It has a RESTful API where you can query over the Web. Or you can run it locally. It runs in the cloud on Heroku.
InfoWorld: Neo4j is written in Java, correct?
Eifrem: It's written in Java, that's the 4j.
InfoWorld: Given recent problems with Java security, are you concerned about the security ramifications of Java?
Eifrem: The recent issue was serious. It was real. It was a browser problem. It was a real issue, but generally speaking, no, I'm not concerned because we're written in Java. Java is one of the strongest security platforms out there, actually. You should always take security very seriously, especially if you're focused on the enterprise like we are and we certainly do. But I don't think Java is any less secure than other languages. On the contrary, I think it's more secure than a lot of other languages.
InfoWorld: What makes it more secure?
Eifrem: Because it's fundamentally written with the sandboxing model, with a JVM, which has a very sophisticated security model. Sometimes you have bugs in it but historically if you look at the kind of proliferation that the Java platform has versus the amount of security issues that have been found, it's actually very, very low.
InfoWorld: What's the mobile story for graph databases in Neo4j?
Rathle: We actually do have some customers who are running mobile apps that use Neo4j in the background. There is, for example, a company in Germany that just started a project where they're building iPad apps that are used by salespeople who are working in the medical field and working with the hospital and using Neo4j on the back end to navigate the hierarchy between doctors and hospitals and insurance companies and providers. We also have another customer who has actually ported Neo4j to Android. That is not yet open source, but we're working on that.
InfoWorld: What can you do with Neo4j on Android?
Rathle: I can't share a lot about their use case, but it's a device that's taking measurements of things that are highly related and what they care about is understanding the relationships.
InfoWorld: So mobile is an opportunity for you guys?
Rathle: It's an opportunity, yes. And it's something that I think we'll start to see more of in the future. Most uses of Neo4j to date have been, or most enterprise uses, have been in these areas that Emil has shown, inside the enterprise, either in customer-facing websites or internally where there are serious performance challenges with hierarchical data, like in the Cisco case.
InfoWorld: What is the graph megatrend?
Eifrem: Facebook may be a very visible and very recent example of the use of graphs and very spectacular. [Users are] going to see graph search at the top bar of the Internet because Facebook is increasingly becoming the Internet for a lot of people. But they're not the first to use graphs. Here are some other examples. One is, of course, Google, who started out by taking the Web graph and making that searchable and then actually announced what they call the Knowledge Graph, which they did the same week as Facebook IPO'd. Twitter has the Interest Graph. In fact, I just saw an interview with Marissa Mayer where she said that her vision for Yahoo is to model the Interest Graph. Not just people knowing other people. But model -- what are you interested in? There's a bunch of companies that are trying to leverage these connected data structures.
InfoWorld: What are you going to do with the connected data?
Eifrem: Let's take one example, which is search. Search, pre-1999, basically worked the same way for all the 20, 30, 50 firms that tried to do Web search, which is all of them downloaded the entirety of the Web into their data centers, and then they searched into every individual document. If you search for Paul, it would look into every individual document and find if that document mentions Paul. And then they would serve that. Pretty simple. We call that atomic data. They use data only about every individual entity.
Then in 1999, along comes Google, which says that -- hey, on top of this atomic data, we're also going to look at how these pages are connected to one another, and they call that the Link Graph. And they called the algorithm PageRank, and that invention was enough to make them the most dominant company, I think, of the last decade. And they did that based on -- let's leverage this connected data rather than just atomic data.
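The contrast between atomic and connected data can be sketched in a few lines of Python. This is the simplified, commonly published form of PageRank computed by repeated iteration, not Google's production system. The sample pages, the link structure, and the function names are hypothetical, and the damping factor of 0.85 is a conventional choice.

```python
def keyword_matches(documents, term):
    """Atomic search: inspect each document on its own for the term."""
    term = term.lower()
    return sorted(name for name, text in documents.items() if term in text.lower())


def pagerank(links, damping=0.85, iterations=50):
    """Connected search signal: rank pages by who links to them."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical sample data: page text and the links between pages.
DOCUMENTS = {
    "home": "Paul runs this site",
    "about": "About Paul and the team",
    "blog": "Notes on graph databases",
    "orphan": "Paul Paul Paul",
}
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "orphan": ["home"],
}

if __name__ == "__main__":
    matches = keyword_matches(DOCUMENTS, "Paul")
    rank = pagerank(LINKS)
    print(sorted(matches, key=rank.get, reverse=True))
```

In this sample, the keyword match alone treats "home", "about", and "orphan" as equally valid hits for "Paul". Ordering the same hits by rank puts "home" first and "orphan" last, because several pages link to "home" and none link to "orphan", even though "orphan" mentions the term most often.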
And then, of course, 2012 and 2013, search made its next discontinuous leap, which was when Google announced their Knowledge Graph, which is not just how pages are related to other pages but they also start to model the actual entities in these pages. For example, if you have a Web page about a movie, previously Google only recorded what other pages this page linked to, but now they also look into the page and see that -- hey, this is a page about "Apollo 13," and "Apollo 13" actually stars Kevin Bacon, and Kevin Bacon has also starred in these other movies. And they build up this big connected data structure they call the Knowledge Graph.
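The movie example can be sketched as a small set of entity relationships in Python. This is only an illustration of the idea, not Google's Knowledge Graph or its data model. The `STARS` relationship name, the hand-picked sample of films, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample of (movie, relationship, person) facts.
TRIPLES = [
    ("Apollo 13", "STARS", "Kevin Bacon"),
    ("Apollo 13", "STARS", "Tom Hanks"),
    ("Footloose", "STARS", "Kevin Bacon"),
    ("A Few Good Men", "STARS", "Kevin Bacon"),
    ("Forrest Gump", "STARS", "Tom Hanks"),
]


def build_indexes(triples):
    """Index the STARS facts in both directions so either end can be followed."""
    cast_by_movie = defaultdict(set)
    movies_by_actor = defaultdict(set)
    for movie, relationship, person in triples:
        if relationship == "STARS":
            cast_by_movie[movie].add(person)
            movies_by_actor[person].add(movie)
    return cast_by_movie, movies_by_actor


def other_movies_of_cast(triples, movie):
    """Two hops: movie -> its actors -> the other movies those actors star in."""
    cast_by_movie, movies_by_actor = build_indexes(triples)
    result = {}
    for actor in sorted(cast_by_movie.get(movie, ())):
        result[actor] = sorted(movies_by_actor[actor] - {movie})
    return result


if __name__ == "__main__":
    for actor, movies in other_movies_of_cast(TRIPLES, "Apollo 13").items():
        print(actor, "->", movies)
```

Starting from the page's entity, "Apollo 13", the query follows one relationship to the actors and a second back out to their other films. A search that looked only at each page in isolation would have no way to answer that question.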
InfoWorld: What is the purpose of a Knowledge Graph or a graph database in an enterprise?
Eifrem: In an enterprise, the point is that it's going to help you deliver a lot -- much better search results. You're going to be able to look through information in a way [that] makes it much more targeted and much easier to find out relevant things.
This article, "Neo Technology execs: How Neo4j beat Oracle Database," was originally published at InfoWorld.com.