Taking aim at Amazon, Google, Joyent offers tools to change the data analytics game

Joyent melds compute/storage services, allowing processing and data analytics jobs to be done in the same system

Cloud computing provider Joyent, a company with robust engineering chops that's looking to compete with the heavyweights of the industry, today rolled out Manta, its newest service, which offers object storage with baked-in compute features.

Combining compute and storage into one offering allows processing and data analytics jobs to run without first transferring data from a storage system into a compute engine, which shortens processing time and makes the workflow more efficient.

Joyent engineers are no strangers to shaking up the cloud industry. Under the radar of even some in the cloud community, the company has developed its own SmartOS infrastructure stack, which includes an open source in-memory hypervisor. Joyent sells its SmartDataCenter as a public cloud IaaS option, or for users to install on their own premises.

CTO Jason Hoffman says Manta is the next logical iteration of the company's products. "This completes our platform," he says. Joyent worked on the project for about four years, used it internally for the last year and ran a six-month, multi-petabyte beta on it with some of its largest customers.

Gartner analyst Henry Baltazar says the key to Manta is the ability to run compute jobs without having to extract, transform and load (ETL) the data from object storage into compute servers. In traditional setups, object storage is meant for retaining data; if that data needs processing, a copy is usually made and transferred to somewhere the compute services can reach it. "You avoid the transformation and migration of the data because it's already sitting with the compute," he says.

It's ideal, he says, for archiving data-heavy information that may need some sort of processing. Joyent specifically lists use cases such as video and image coding and reformatting, data analytics and log processing.

Customer Konstantin Gredeskoul, who is CTO of Wanelo, an online retail community, is already raving about it after using it as part of the beta. The company hosts a website for about 8 million users that has more than 6 million products from more than 200,000 stores; shoppers can browse and search those products and are then sent to other websites to buy them. During the past few months, Wanelo has been using Manta to store log files of everything that happens on the company's website each day: what items people view, which are purchased and how products are rated. Storing those files in Manta allows commands to be run directly on the data.

Wanelo currently uses Amazon Web Services Simple Storage Service (S3) as the back-end repository for all of its images, but Gredeskoul hopes to migrate those over to Joyent's Manta service. If there's ever a point when Gredeskoul needs to convert petabytes' worth of images from one format to another, that process could be done directly within Manta. On AWS, he would likely have had to make a copy of the files, transfer them into Elastic Compute Cloud (EC2) or another compute service like Elastic MapReduce (EMR), run the processing job, then transfer the results back into storage. Hoffman says Manta is like having AWS's EC2 and S3 baked into a single offering.

Gartner's Baltazar says Manta is part of a larger wave of converging storage and compute into single systems, but that it's one of the first large-scale cloud offerings in the category. Other providers like SimpliVity and Nutanix offer customers so-called data centers in a box, which bake compute, storage and networking functionality into a single system that is usually run on customer premises. Baltazar calls this the "big squeeze" of infrastructure.

Joyent could prompt other cloud providers to do the same. Amazon already has a variety of storage options, including S3, Elastic Block Store (EBS), its Redshift data warehousing service and Glacier, a long-term cold storage option that is cheap to write into and slow to retrieve information from. Amazon offers analytics services on top of such storage options, but they're not architected to be baked into the storage services as Joyent's is. Google has its Hadoop-like BigQuery cloud-based data processing service, which is an adjacent offering to its application development PaaS and Compute Engine IaaS.

Manta is generally available starting today. Prices start at $0.043 per GB of storage, up to the first 1 TB, with volume discounts applied after that. Joyent's default setting is to make two copies of all data. Compute service is priced by the second at $0.00004/GB of DRAM.
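To put those list prices in perspective, here is a rough Python sketch of the arithmetic. It is an illustration only, not Joyent's billing logic: the function names are hypothetical, 1 TB is assumed to mean 1,000 GB, the billing period for storage is not stated above, and the volume-discount rates are not given, so the sketch refuses to guess beyond the first tier. Whether the storage rate already accounts for the two default copies is also not stated and is not modeled here.

```python
# Rates quoted in the article (USD).
STORAGE_PRICE_PER_GB = 0.043          # applies up to the first 1 TB
COMPUTE_PRICE_PER_GB_SECOND = 0.00004  # per GB of DRAM, billed by the second

# Assumption: 1 TB is treated as 1,000 GB; the article does not specify.
FIRST_TIER_LIMIT_GB = 1000


def storage_cost(gigabytes: float) -> float:
    """Storage cost in USD at the first-tier rate.

    Discounted rates above the first tier are not given in the article,
    so larger amounts raise an error instead of guessing.
    """
    if gigabytes < 0:
        raise ValueError("gigabytes must be non-negative")
    if gigabytes > FIRST_TIER_LIMIT_GB:
        raise ValueError("volume-discount rates above 1 TB are not published here")
    return gigabytes * STORAGE_PRICE_PER_GB


def compute_cost(dram_gb: float, seconds: float) -> float:
    """Compute cost in USD for a job holding dram_gb of DRAM for `seconds`."""
    if dram_gb < 0 or seconds < 0:
        raise ValueError("dram_gb and seconds must be non-negative")
    return dram_gb * seconds * COMPUTE_PRICE_PER_GB_SECOND


if __name__ == "__main__":
    # 500 GB of stored logs, plus a one-hour job using 1 GB of DRAM.
    print(f"storage: ${storage_cost(500):.2f}")
    print(f"compute: ${compute_cost(1, 3600):.3f}")
```

At these rates, an hour of compute on 1 GB of DRAM costs well under a dollar, so for a workload like Wanelo's log processing the storage line would likely dominate the bill.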

