Stargate: A new way to think about databases
- 10 November, 2020 10:00
As with many corporate-sponsored open source projects, Stargate becomes most interesting when it gets beyond its roots.
DataStax open sourced Stargate “because we got tired of using different databases and different APIs depending on the work that we were trying to get done.” Billed as an “open source API framework for data,” the project aims to offer “a framework that can serve many APIs for a range of workloads.”
And yet Stargate starts with Apache Cassandra, the database upon which DataStax has built its business. For analyst Tony Baer, Stargate “could eventually turn Apache Cassandra into a multi-model database,” and he’s not wrong. But this isn’t what makes Stargate interesting.
No, Stargate becomes interesting if a community grows up around it to accomplish what the fictional “stargate” was designed to do: serve as a “bridge portal device... that allows practical, rapid travel between two distant locations.”
In other words, a “data gateway” that turns any database into a multi-model database with “pluggable back ends so that developers can work with data in any shape they want via an API,” as DataStax chief data officer Denise Gosnell put it in an interview.
Speaking the same database language
Let’s say you don’t spend nights and weekends obsessing over databases. What would Stargate mean for you? As Gosnell put it:
Think about building a new app. You have databases all over the place. You have requirements for your web apps that need SQL. They need documents/JSON, or they could need GraphQL. You’ve got all these requirements from your back-end engineering team, and you’d have a bunch of requirements from your front-end engineering team. Stargate is an API layer that sits right in between that conversation and makes it possible to serve both of them from one existing technology.
DataStax is getting there by open sourcing the Cassandra coordinator code. DataStax started with Cassandra for obvious reasons: The company knows the database well and Cassandra is a great option for handling distributed data requests.
But it’s that coordinator code that is the heart of Stargate, as Gosnell explained. The hardest aspect of the logic between a customer’s API and their back end is the distributed request coordination, i.e., ensuring proper load balancing, directing database requests to the right place, etc. This is what Cassandra’s coordinator code does well.
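To make "directing database requests to the right place" concrete, here is a minimal sketch of one common approach, a consistent-hash token ring. It is not Cassandra's or Stargate's actual coordinator code. The class name `ToyCoordinator`, the node names, and the use of MD5 as the hash are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a stable integer position on the ring.

    MD5 is an illustrative choice only; it is not claimed to be what
    any real database uses.
    """
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class ToyCoordinator:
    """Hypothetical coordinator that picks replica nodes for a partition key."""

    def __init__(self, nodes, replication_factor=2):
        if not nodes:
            raise ValueError("at least one node is required")
        # Never ask for more replicas than there are nodes.
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by ring position so lookups can use binary search.
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas_for(self, partition_key: str):
        """Return the nodes that own the key, walking clockwise around the ring."""
        start = bisect.bisect_right(self.tokens, token(partition_key)) % len(self.ring)
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


coordinator = ToyCoordinator(["node-a", "node-b", "node-c"], replication_factor=2)
for key in ("user:1", "user:2", "user:3"):
    print(key, "->", coordinator.replicas_for(key))
```

Because the node list is hashed once and each key is hashed on every request, any process holding the same node list routes a given key to the same replicas, which is what lets the routing logic sit in a stateless layer between the API and the storage nodes.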
The company wants more developers to “join our community and help us prioritise which features we need next” in Stargate, Gosnell stressed. It’s a great story, one that helps DataStax, of course, but also has the potential to be useful for other vendors and with other databases. And that is where Stargate could go from an interesting, single-vendor project to something much more noteworthy.
Opening up to real community
Consider Kubernetes, for example. It was cool technology when its Borg ancestor ran exclusively within Google, and it remained cool as an open source project to which Google employees, almost exclusively, contributed.
Today Kubernetes is an industry-defining open source project precisely because Google did the hard work to open it up. That meant recruiting outside contributors (Red Hat engineers being early to that party) and making would-be contributors feel welcome and productive (through documentation and more).
The same is true of Envoy, the open source edge and service proxy developed at Lyft by Matt Klein. As Klein related recently, coding Envoy was relatively easy compared to the effort required to successfully open source it. What kind of effort? “It was all leadership, public relations, marketing, documentation, etc., and I did it all myself and I nearly killed myself [doing it].”
So... Stargate. It’s cool. It’s a great way to take Cassandra beyond its wide-column database roots. But if that’s all that Stargate remains, it will be interesting and useful but not particularly important. Not important at the industry level, anyway. It certainly promises to be important for DataStax as it extends Cassandra’s capabilities to support the MongoDB API, for example, essentially turning Cassandra into a document database as well.
DataStax is knee-deep in this kind of work now, Gosnell noted. They’re working to ensure Stargate is pluggable for any back-end data store, as well as making the front-end API pluggable to handle any data shape. As she explained, it’s like a LEGO board, with different types of LEGO blocks plugging into the top and bottom of the board.
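The LEGO-board idea can be sketched in a few lines, purely as an illustration. None of the names below (`Backend`, `InMemoryBackend`, `DocumentApi`, `KeyValueApi`) come from Stargate; they are hypothetical stand-ins for "a storage plug-in on the bottom" and "API plug-ins on the top."

```python
import json
from typing import Any, Dict, Optional, Protocol


class Backend(Protocol):
    """Bottom of the board: anything that can store and fetch records."""

    def put(self, key: str, record: Dict[str, Any]) -> None: ...

    def get(self, key: str) -> Optional[Dict[str, Any]]: ...


class InMemoryBackend:
    """Hypothetical stand-in for a real data store; keeps records in a dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, record: Dict[str, Any]) -> None:
        self._rows[key] = dict(record)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        return self._rows.get(key)


class DocumentApi:
    """Top of the board: exposes records as JSON documents."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def insert(self, doc_id: str, json_text: str) -> None:
        self.backend.put(doc_id, json.loads(json_text))

    def find(self, doc_id: str) -> Optional[str]:
        record = self.backend.get(doc_id)
        return None if record is None else json.dumps(record, sort_keys=True)


class KeyValueApi:
    """Another top-side plug-in: exposes the same records field by field."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def set(self, key: str, field: str, value: Any) -> None:
        # Copy the existing record (if any) before updating one field.
        record = dict(self.backend.get(key) or {})
        record[field] = value
        self.backend.put(key, record)

    def get(self, key: str, field: str) -> Any:
        record = self.backend.get(key) or {}
        return record.get(field)


# Two different front ends share one back end.
store = InMemoryBackend()
documents = DocumentApi(store)
fields = KeyValueApi(store)

documents.insert("user:1", '{"name": "Ada"}')
fields.set("user:1", "role", "admin")
print(documents.find("user:1"))
```

Swapping `InMemoryBackend` for another class with the same two methods would leave both API classes untouched, which is the property the LEGO analogy describes.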
Gosnell acknowledges that DataStax, like Google and Lyft/Klein before them, may be “really early [but] we’re developing openly and we’re asking people to come join us.” They’re doing the hard work of generalising the code, so that it’s useful beyond DataStax, she says, and improving the documentation to make the project approachable to would-be contributors.
Today the likely contributor to Stargate is a deeply technical back-end engineer. But going forward, Gosnell suggested, “we need software engineers who are interested in building out new REST-based APIs or new access methods that they’d want to use on the front end.”