You can pick an AI Platform Deep Learning VM Image with TensorFlow, TensorFlow Enterprise, PyTorch, R, or half a dozen other frameworks. All images can include JupyterLab, and images intended for use with GPUs can have CUDA drivers.
You can create an instance from the gcloud command line (installed via the Google Cloud SDK) or from the Google Cloud Marketplace. When you create your VM you can choose the machine type, which determines the number of virtual CPUs and the amount of RAM, and the number and kind of GPUs. You’ll see an estimate of your monthly cost based on the hardware you choose, with a discount for sustained usage. There’s no additional charge for the frameworks. If you choose a VM with GPUs, you need to allow a few minutes for installation of the CUDA drivers.
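As a rough illustration of the command-line route, the sketch below assembles the arguments for a `gcloud compute instances create` call without running it. The helper name `build_create_command` is hypothetical, and the instance name, zone, image family, machine type, and GPU type in the usage example are placeholder values, not recommendations from this review. The image project `deeplearning-platform-release` and the `install-nvidia-driver` metadata key are assumptions based on how Deep Learning VM images have commonly been launched, so check the current documentation before relying on them.

```python
from typing import List, Optional


def build_create_command(
    name: str,
    zone: str,
    image_family: str,
    machine_type: str,
    gpu_type: Optional[str] = None,
    gpu_count: int = 0,
) -> List[str]:
    """Return the argv list for a `gcloud compute instances create` call."""
    cmd = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--image-family={image_family}",
        # Assumption: Deep Learning VM images are published in this project.
        "--image-project=deeplearning-platform-release",
        # The machine type fixes both the vCPU count and the RAM.
        f"--machine-type={machine_type}",
    ]
    if gpu_type and gpu_count > 0:
        cmd += [
            f"--accelerator=type={gpu_type},count={gpu_count}",
            # GPU instances cannot live-migrate, so they terminate on maintenance.
            "--maintenance-policy=TERMINATE",
            # Assumption: this metadata key requests driver installation at first boot.
            "--metadata=install-nvidia-driver=True",
        ]
    return cmd


if __name__ == "__main__":
    # Placeholder values for illustration only.
    args = build_create_command(
        "my-dl-vm", "us-west1-b", "pytorch-latest-gpu",
        "n1-standard-8", gpu_type="nvidia-tesla-t4", gpu_count=1,
    )
    print(" ".join(args))
```

Building the argument list separately from running it makes it easy to review the hardware choices, and therefore the cost, before anything is created.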
AI Platform Deep Learning Containers
Google also supplies Deep Learning Containers suitable for use in Docker on your local machine or on Google Kubernetes Engine (GKE). The containers have all the frameworks, drivers, and support software you might want, unlike the VM Images, which allow you to select only what you need. Deep Learning Containers are currently in beta test.
AI Platform Pipelines
MLOps (machine learning operations) applies DevOps (development and operations) practices to machine learning workflows. Much of the Google Cloud AI Platform supports MLOps in some way, but AI Platform Pipelines is at the core of what MLOps is all about.
AI Platform Pipelines, currently in beta test, makes it easier to get started with MLOps by saving you the difficulty of setting up Kubeflow Pipelines with TensorFlow Extended (TFX). The open source Kubeflow project is dedicated to making deployments of machine learning workflows on Kubernetes simple, portable, and scalable. Kubeflow Pipelines, a component of Kubeflow that is currently in beta test, is a comprehensive solution for deploying and managing end-to-end machine learning workflows.
TensorFlow Extended is an end-to-end platform for deploying production machine learning pipelines. TFX makes it easier to implement MLOps by providing a toolkit that helps you orchestrate your machine learning process on various orchestrators, such as Apache Airflow, Apache Beam, and Kubeflow Pipelines. Google Cloud AI Platform Pipelines uses TFX Pipelines, which are DAGs (directed acyclic graphs), with Kubeflow Pipelines rather than Airflow or Beam.
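To make the DAG idea concrete, here is a minimal sketch that uses only the Python standard library, not the TFX or Kubeflow APIs. The stage names mirror standard TFX components, but the dependency edges are a simplified assumption chosen for illustration, and the helpers `execution_order` and `is_valid_dag` are hypothetical. The point is that an orchestrator can derive a valid run order from the declared dependencies, and that a cycle makes the pipeline unschedulable.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

# Each key is a pipeline stage; its value is the set of stages it depends on.
# Edges are a simplified assumption, not the exact wiring of any real pipeline.
pipeline = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "Pusher": {"Trainer", "Evaluator"},
}


def execution_order(dag):
    """Return the stages in an order that runs every dependency first."""
    return list(TopologicalSorter(dag).static_order())


def is_valid_dag(dag):
    """Return False if the dependency graph contains a cycle."""
    try:
        execution_order(dag)
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    print(" -> ".join(execution_order(pipeline)))
    # A stage that depends on itself through another stage is not a DAG.
    print(is_valid_dag({"Trainer": {"Evaluator"}, "Evaluator": {"Trainer"}}))
```

In a real orchestrator, stages with no path between them could run in parallel. This example happens to form a single chain, so only one order is possible.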
You manage AI Platform Pipelines from the Pipelines tab of the AI Platform in your Google Cloud console. Creating a new pipeline instance creates a Kubernetes cluster, a cloud storage bucket, and a Kubeflow Pipeline. Then you can define the pipeline either from an example or from scratch using TFX.
Spotify used TFX and Kubeflow to improve its MLOps. The company reports that some teams are producing 7x more experiments.
AI Platform Data Labeling Service
Google Cloud AI Platform Data Labeling Service lets you work with human labelers to generate highly accurate labels for a collection of data that you can use in machine learning models. The service is currently in beta test, and availability is very limited because of COVID-19.
AI Hub
Google Cloud AI Hub, currently in beta test, offers a collection of assets for developers and data scientists building artificial intelligence systems. You can both find and share assets. Even in beta form, AI Hub seems to be useful.
TensorFlow Enterprise
TensorFlow Enterprise provides users with a Google Cloud optimized distribution of TensorFlow with long-term version support. The TensorFlow Enterprise Distribution contains custom-built TensorFlow binaries and related packages. Each version of the TensorFlow Enterprise Distribution is anchored on a particular version of TensorFlow; all packages included are available in open source.
Google Cloud AI Solutions
Google aims AI Solutions at business executives rather than at data scientists or programmers. The solutions usually come with an optional consulting or contract development component; consulting services are also available separately. Underneath the covers, AI Solutions are based on AI Building Blocks and AI Platform.
Contact Center AI
Contact Center AI (CCAI) is a Google solution for contact centers designed to provide humanlike interactions. It builds on Dialogflow to provide a virtual agent, monitor customer intent, switch to live agents when necessary, and assist human agents. Google has half a dozen partners to help you develop and deploy a CCAI solution, and to support and train your agents.
Build and Use AI
Google Cloud Build and Use AI is a generically defined solution that essentially offers Google’s AI expertise, AI Building Blocks, and AI Platform to solve your business problems. Among other benefits, this solution helps you to set up MLOps with pipeline automation and CI/CD.
Document AI
Google Cloud Document AI applies the Google Vision API OCR building block along with Cloud Natural Language to extract and interpret information from business documents, typically supplied in PDF format. Additional components parse general forms and invoice forms. Industry-specific solutions for mortgage processing and procurement are currently under test. Google has half a dozen partners to help you implement Document AI solutions.
In praise of Google Cloud AI Platform
Overall, the Google Cloud AI and Machine Learning Platform is very good, both in scope and quality. Whether you need to analyze language or images, use services out of the box, or use them with customizations to handle your own data, you’ll find the Google Cloud AI Platform at or near the state of the art. You can get your hands only as dirty as you need to. The days when you needed a Ph.D. in data science and a deep knowledge of TensorFlow to get a model trained and deployed are long gone.
The Google Cloud AI and Machine Learning Platform does have a few gaps, such as the lack of a first-party Google data preparation service, the scarcity of on-premises options, and the fact that way too many of the services offered are still in beta test. Despite those issues, the Google Cloud AI Platform can serve almost any machine learning or deep learning need. I especially like the new pipeline capabilities, even though they are a beta test service based on open source products that are also still in beta test.
While I’m not going to speak further about the competition, some of which is also very good, I will say that in the sphere of AI applications, Google is essentially the new IBM, in that nobody ever gets fired for choosing Google AI.
(All pricing in USD)
Cloud AutoML Translation: Training: $76.00 per hour; Translation: $80 per million characters after the first 500K characters.
Cloud AutoML Natural Language: Training: $3.00 per hour; Classification: $5 per thousand text records after the first 30K records.
Cloud AutoML Vision: Training: $20 per hour after the first hour; Classification: $3 per thousand images after the first thousand images.
Cloud AutoML Tables: Training: 6 hours of free one-time use + $19.32 per hour (92 n1-standard-4 equivalent machines used in parallel); Batch prediction: 6 hours of free one-time use + $1.16 per hour (5.5 n1-standard-4 equivalent machines used in parallel); Online prediction: $0.21 per hour (1 n1-standard-4 equivalent machine); Deployment: $0.005 per GiB-hour x 9 machines (model replicated to 9 machines for low latency serving purposes).
Vision: $1.50/feature/1,000 units after the first 1,000 units/month for most features; more for web detection and object localization.
Video: $0.04 to $0.17/minute/feature after the first 1,000 minutes/month.
Natural Language: $0.50 to $2.00/feature/1,000 units after the first 5,000 units/month.
Translation: $20/million characters after the first 500,000 characters/month.
Media Translation: $0.068 to $0.084/minute after the first 60 minutes/month.
Text to speech: $4/million characters after the first 4 million characters/month, standard voices; $16/million characters after the first 1 million characters/month, WaveNet voices.
Speech to text: $0.004 to $0.009/15 seconds after the first 60 minutes/month.
Dialogflow CX Agent: $20/100 chat sessions, $45/100 voice sessions.
Dialogflow ES Agent: varies by mode, reflects underlying voice and natural language charges.
Recommendations AI: $2.50/node/hour for training and tuning; $0.27/1,000 predictions with tiered quantity discounts above 20 million requests/month.
GPUs: $0.11 to $2.48/GPU/hour.
TPUs: $1.35 to $8/TPU/hour.
AI Platform Training: $0.19 to $21.36/hour.
AI Platform Predictions: $0.045 to $1.13/node/hour plus GPUs at $0.45 to $2.48/GPU/hour.
All services run on the Google Cloud Platform; a few can also be run on-premises or in containers.
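Most of the usage-based prices above follow the same pattern: a free monthly allowance, then a fixed price per block of units. The sketch below works through that arithmetic with a hypothetical helper, `monthly_cost`. It assumes that usage beyond the free allowance is billed proportionally rather than rounded up to whole blocks, which the price list does not state, and it ignores tiered discounts.

```python
def monthly_cost(units, free_units, price, per_units):
    """Estimate a monthly charge in USD.

    units      -- total units consumed this month
    free_units -- monthly free allowance
    price      -- price in USD for each block of `per_units` units
    per_units  -- size of the pricing block (e.g. 1,000,000 characters)

    Assumption: billing is proportional past the free allowance.
    """
    billable = max(0, units - free_units)
    return billable / per_units * price


if __name__ == "__main__":
    # Translation: $20/million characters after the first 500,000 characters.
    print(monthly_cost(2_500_000, 500_000, 20.0, 1_000_000))   # 40.0

    # Text to speech, standard voices: usage inside the free 4 million characters.
    print(monthly_cost(3_000_000, 4_000_000, 4.0, 1_000_000))   # 0.0

    # AutoML Vision classification: $3 per thousand images after the first thousand.
    print(monthly_cost(11_000, 1_000, 3.0, 1_000))              # 30.0
```

For services priced as a range, such as Natural Language at $0.50 to $2.00 per feature per 1,000 units, running the helper once at each end of the range gives a lower and upper bound.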