Review: Google Cloud AI lights up machine learning
- 10 November, 2020 20:27
Google has one of the largest machine learning stacks in the industry, currently centering on its Google Cloud AI and Machine Learning Platform. Google spun out TensorFlow as open source years ago, but TensorFlow is still the most mature and widely cited deep learning framework. Similarly, Google spun out Kubernetes as open source years ago, but it is still the dominant container management system.
Google is one of the top sources of tools and infrastructure for developers, data scientists, and machine learning experts, but historically Google AI hasn’t been all that attractive to business analysts who lack serious data science or programming backgrounds. That’s starting to change.
The Google Cloud AI and Machine Learning Platform includes AI building blocks, the AI platform and accelerators, and AI solutions. The AI solutions are fairly new and aimed at business managers rather than data scientists. They may include consulting from Google or its partners.
The AI building blocks, which are pre-trained but customizable, can be used without intimate knowledge of programming or data science. Nevertheless, they are often used by skilled data scientists for pragmatic reasons, essentially to get stuff done without extensive model training.
The AI platform and accelerators are generally for serious data scientists, and require coding skill, knowledge of data preparation techniques, and lots of training time. I recommend going there only after trying the relevant building blocks.
There are still some missing links in Google Cloud’s AI offerings, especially in data preparation. The closest thing Google Cloud has to a data import and conditioning service is the third-party Cloud Dataprep by Trifacta; I tried it a year ago and was underwhelmed. The feature engineering built into Cloud AutoML Tables is promising, however, and it would be useful to have that sort of service available for other scenarios.
The seamy underside of AI has to do with ethics and responsibility (or the lack thereof), along with persistent model biases (often because of biased data used for training). Google published its AI Principles in 2018. It’s a work in progress, but it’s a basis for guidance as discussed in a recent blog post on Responsible AI.
There is lots of competition in the AI market (over a dozen vendors), and lots of competition in the public cloud market (over half-a-dozen credible vendors). To do the comparisons justice, I’d have to write an article at least five times as long as this one, so as much as I hate leaving them out, I’ll have to omit most product comparisons. For the top obvious comparison, I can summarize: AWS does most of what Google does, and is also very good, but generally charges higher prices.
Google Cloud AI Building Blocks
Google Cloud AI Building Blocks are easy-to-use components that you can incorporate into your own applications to add sight, language, conversation, and structured data. Many of the AI building blocks are pre-trained neural networks, but can be customized with transfer learning and neural network search if they don’t serve your needs out of the box. AutoML Tables is a little different, in that it automates the process a data scientist would use to find the best machine learning model for a tabular data set.
The Google Cloud AutoML services provide customized deep neural networks for language pair translation, text classification, object detection, image classification, and video object classification and tracking. They require tagged data for training, but don’t require significant knowledge of deep learning, transfer learning, or programming.
Google Cloud AutoML customizes Google’s battle-tested, high-accuracy deep neural networks for your tagged data. Rather than starting from scratch when training models from your data, AutoML implements automatic deep transfer learning (meaning that it starts from an existing deep neural network trained on other data) and neural architecture search (meaning that it finds the right combination of extra network layers) for language pair translation and the other services listed above.
In each area, Google already has one or more pre-trained services based on deep neural networks and huge sets of labeled data. These may well work for your data unmodified, and you should test that to save yourself time and money. If they don’t do what you need, Google Cloud AutoML helps you to create a model that does, without requiring that you know how to perform transfer learning or how to design neural networks.
Transfer learning offers two big advantages over training a neural network from scratch. First, it requires a lot less data for training, since most of the layers of the network are already well trained. Second, it trains a lot faster, since it’s only optimizing the final layers.
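To put a rough number on the second advantage, here is a minimal sketch that counts the parameters in a small fully connected network and compares the total with the share that is trained when only the final layer is optimized. The layer widths are hypothetical and are not taken from any Google model; real AutoML networks are far larger and are not necessarily fully connected.

```python
# Hypothetical layer widths for a small fully connected network:
# input -> three hidden layers -> 10-class output.
layer_sizes = [2048, 1024, 512, 256, 10]


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases for one fully connected layer."""
    return n_in * n_out + n_out


# One entry per layer, in order from the input side to the output side.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

# Transfer learning in its simplest form: freeze everything except the last layer.
trainable = per_layer[-1]
frozen = total - trainable

print(f"total parameters:   {total:,}")
print(f"frozen parameters:  {frozen:,}")
print(f"trained parameters: {trainable:,}")
print(f"fraction trained:   {trainable / total:.4%}")
```

In this toy configuration, well under one percent of the parameters are updated, which is the intuition behind both the smaller data requirement and the shorter training time.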
While the Google Cloud AutoML services used to be presented together as a package, they are now listed with their base pre-trained services. What most other companies call AutoML is performed by Google Cloud AutoML Tables.
The usual data science process for many regression and classification problems is to create a table of data for training, clean and condition the data, perform feature engineering, and try to train all of the appropriate models on the transformed table, including a step to optimize the best models’ hyperparameters. Google Cloud AutoML Tables can perform this entire process automatically once you manually identify the target field.
AutoML Tables automatically searches through Google’s model zoo for structured data to find the best model for your needs, ranging from linear/logistic regression models for simpler data sets to advanced deep, ensemble, and architecture-search methods for larger, more complex ones. It automates feature engineering on a wide range of tabular data primitives — such as numbers, classes, strings, timestamps, and lists — and helps you detect and take care of missing values, outliers, and other common data issues.
Its codeless interface guides you through the full end-to-end machine learning lifecycle, making it easy for anyone on your team to build models and reliably incorporate them into broader applications. AutoML Tables provides extensive input data and model behavior explainability features, along with guardrails to prevent common mistakes. AutoML Tables is also available in API and notebook environments.
AutoML Tables competes with Driverless AI and several other AutoML implementations and frameworks.
The Google Cloud Vision API is a pre-trained machine learning service for categorizing images and extracting various features. It can classify images into thousands of pre-trained categories, ranging from generic objects and animals found in the image (such as a cat), to general conditions (for example, dusk), to specific landmarks (Eiffel Tower, Grand Canyon), and identify general properties of the image, such as its dominant colors. It can isolate areas that are faces, then apply geometric (facial orientation and landmarks) and emotional analyses to the faces, although it does not recognize faces as belonging to specific people, except for celebrities (which requires a special usage license). Vision API uses OCR to detect text within images in more than 50 languages and various file types. It can also identify product logos, and detect adult, violent, and medical content.
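As an illustration of how several of these features can be requested in one call, the sketch below builds the JSON body for the Vision API's REST images:annotate method. It only constructs the payload; authentication and the HTTP call are left out, and the bucket path is a hypothetical placeholder.

```python
import json

# Public REST endpoint for synchronous image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_uri, feature_types, max_results=10):
    """Return the JSON body for one image and a list of feature types."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


body = build_annotate_request(
    "gs://example-bucket/photos/landmark.jpg",  # hypothetical object path
    [
        "LABEL_DETECTION",        # generic categories such as "cat"
        "LANDMARK_DETECTION",     # specific landmarks
        "FACE_DETECTION",         # face geometry and emotion likelihoods
        "TEXT_DETECTION",         # OCR
        "LOGO_DETECTION",         # product logos
        "SAFE_SEARCH_DETECTION",  # adult, violent, and medical content
        "IMAGE_PROPERTIES",       # dominant colors
    ],
)
print(json.dumps(body, indent=2))
```

The body would be sent as a POST to VISION_ENDPOINT with suitable credentials; each requested feature is billed separately, so it pays to ask only for what you need.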
Video Intelligence API
The Google Cloud Video Intelligence API automatically recognizes more than 20,000 objects, places, and actions in stored and streaming video. It also distinguishes scene changes and extracts rich metadata at the video, shot, or frame level. It additionally performs text detection and extraction using OCR, detects explicit content, automates closed captioning and subtitles, recognizes logos, and detects faces, persons, and poses.
Google recommends the Video Intelligence API for extracting metadata to index, organize, and search your video content. It can transcribe videos and generate closed captions, as well as flag and filter inappropriate content, all more cost-effectively than human transcribers. Use cases include content moderation, content recommendations, media archives, and contextual advertisements.
Natural Language API
Natural language processing (NLP) is a big part of the “secret sauce” that makes input to Google Search and the Google Assistant work well. The Google Cloud Natural Language API exposes that same technology to your programs. It can perform syntax analysis (see the image below), entity extraction, sentiment analysis, and content classification, in 10 languages. You may specify the language if you know it; otherwise, the API will attempt to auto-detect the language. A separate API, currently available for early access on request, specializes in healthcare-related content.
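The four kinds of analysis map onto four REST methods that share the same document payload. The following is a minimal sketch of building those requests, assuming plain-text input; the task names used as dictionary keys are my own labels, and leaving out the language field is what triggers auto-detection.

```python
import json

BASE_URL = "https://language.googleapis.com/v1/documents"

# Map the analyses named in the text to REST method names.
# The keys are illustrative labels, not part of the API.
METHODS = {
    "syntax": "analyzeSyntax",
    "entities": "analyzeEntities",
    "sentiment": "analyzeSentiment",
    "classification": "classifyText",
}


def build_language_request(task, text, language=None):
    """Return (url, body) for one analysis of a plain-text document."""
    if task not in METHODS:
        raise ValueError(f"unknown task: {task!r}")
    document = {"type": "PLAIN_TEXT", "content": text}
    if language is not None:
        # Omit this field to let the service auto-detect the language.
        document["language"] = language
    return f"{BASE_URL}:{METHODS[task]}", {"document": document}


url, body = build_language_request(
    "sentiment", "The new pipeline tools are a pleasure to use."
)
print(url)
print(json.dumps(body, indent=2))
```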
The Google Cloud Translation API can translate over a hundred language pairs, can auto-detect the source language if you don’t specify it, and comes in three flavors: Basic, Advanced, and Media Translation. The Advanced Translation API supports a glossary, batch translation, and the use of custom models. The Basic Translation API is essentially what is used by the consumer Google Translate interface. AutoML Translation allows you to train custom models using transfer learning.
The Media Translation API translates content directly from audio (speech), either audio files or streams, in 12 languages, and automatically generates punctuation. There are separate models for video and phone call audio.
The Google Cloud Text-to-Speech API converts plain text and SSML markup to sound, with a choice of over 200 voices and 40 languages and variants. Variants include different national accents, such as US, Great Britain, South African, Indian, Irish, and Australian English.
Basic voices often sound rather mechanical; WaveNet voices typically sound more natural, but cost slightly more to use. You can also create custom voices from your own studio-quality audio recordings.
You can tune the speed of synthesized voices by up to 4x faster or slower, and the pitch by up to 20 semitones up or down. SSML tags allow you to add pauses, numbers, date and time formatting, and other pronunciation instructions. You can also increase the volume gain by up to 16 dB or decrease it by up to 96 dB.
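Those tuning limits translate directly into the audioConfig section of a text:synthesize request. Here is a minimal sketch that validates the ranges before building the JSON body; the voice name and the choice of MP3 output are assumptions for illustration, and the helper itself is hypothetical rather than part of any Google client library.

```python
# Allowed ranges as described above: 0.25x to 4x speed,
# +/-20 semitones of pitch, and -96 dB to +16 dB of volume gain.
LIMITS = {
    "speaking_rate": (0.25, 4.0),
    "pitch": (-20.0, 20.0),
    "volume_gain_db": (-96.0, 16.0),
}


def build_synthesis_request(text, voice_name, language_code,
                            speaking_rate=1.0, pitch=0.0, volume_gain_db=0.0):
    """Return a text:synthesize JSON body, rejecting out-of-range settings."""
    settings = {
        "speaking_rate": speaking_rate,
        "pitch": pitch,
        "volume_gain_db": volume_gain_db,
    }
    for name, value in settings.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {
            "audioEncoding": "MP3",  # assumed output format
            "speakingRate": speaking_rate,
            "pitch": pitch,
            "volumeGainDb": volume_gain_db,
        },
    }


request = build_synthesis_request(
    "Your order has shipped.",
    voice_name="en-GB-Wavenet-A",  # assumed WaveNet voice name
    language_code="en-GB",
    speaking_rate=1.25,
    pitch=-2.0,
)
print(request["audioConfig"])
```

To send SSML instead of plain text, the input object would carry an "ssml" key in place of "text".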
The Google Cloud Speech-to-Text API converts speech into text using Google’s most advanced deep learning neural network algorithms for automatic speech recognition (ASR). It supports over 125 languages and variants, and can be deployed on-premises (with a license) as well as in Google Cloud. Speech-to-text can be run synchronously for short audio samples (one minute or less), asynchronously for longer audio (up to 480 minutes), and streaming for real-time recognition.
You can customize speech recognition to transcribe domain-specific terms and rare words by providing hints. There are specialized ASR models for video, phone calls, and command and search, as well as “default” (anything else). While you can embed encoded audio in your API request, more often you’ll provide a URI to a binary audio file stored in a Google Cloud storage bucket.
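To make the choice between synchronous and asynchronous recognition concrete, the sketch below picks a REST method from the audio duration, using the limits quoted above, and builds a request body that references a Cloud Storage URI and supplies phrase hints. The bucket path, phrases, and helper names are hypothetical; streaming recognition is not shown because it is not a simple request/response call.

```python
SYNC_LIMIT_SECONDS = 60          # one minute or less: synchronous
ASYNC_LIMIT_SECONDS = 480 * 60   # up to 480 minutes: asynchronous


def pick_method(duration_seconds):
    """Choose the REST method name for a stored audio file."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "speech:recognize"
    if duration_seconds <= ASYNC_LIMIT_SECONDS:
        return "speech:longrunningrecognize"
    raise ValueError("audio longer than 480 minutes must be split first")


def build_recognition_request(gcs_uri, language_code, model="default", hints=()):
    """Return a JSON body with an audio URI, a model, and optional phrase hints."""
    config = {"languageCode": language_code, "model": model}
    if hints:
        config["speechContexts"] = [{"phrases": list(hints)}]
    return {"config": config, "audio": {"uri": gcs_uri}}


method = pick_method(duration_seconds=15 * 60)
body = build_recognition_request(
    "gs://example-bucket/calls/support-0001.flac",  # hypothetical object path
    language_code="en-US",
    model="phone_call",
    hints=["Dialogflow", "AutoML Tables"],  # domain-specific terms
)
print(method)
print(body)
```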
Google Cloud’s Dialogflow Essentials builds on Speech-to-Text and Text-to-Speech, and can take advantage of over 40 prebuilt agents as templates, for small bots with single topic conversations. Dialogflow CX is an advanced development suite for creating conversational AI applications, including chatbots, voice bots, and IVR (interactive voice response) bots. It includes a visual bot-building platform (see screenshot below), collaboration and versioning tools, and advanced IVR feature support, and it is optimized for enterprise scale and complexity.
Cloud Inference API
Time series data often requires special handling, especially if you want to analyze streaming data in real time in addition to handling a large historical data set. The fully managed serverless Google Cloud Inference API, currently in limited alpha test, detects trends and anomalies with event time markers, handles data sets consisting of up to tens of billions of events, runs thousands of queries per second, and responds with low latency.
Building effective recommendation systems with machine learning is reputed to be tricky and time-consuming. Google has automated the process with the Recommendations API, currently in beta test. This fully managed service takes care of preprocessing your data, training and hypertuning machine learning models, and provisioning the infrastructure. It also corrects for bias and seasonality. It integrates with related Google services, such as Analytics 360, Tag Manager, Merchant Center, Cloud Storage, and BigQuery. Initial model training and tuning take two to five days to complete.
Google Cloud AI Platform
The Google Cloud AI Platform and accelerators are for developers, data scientists, and data engineers. Using the Google Cloud AI Platform to solve a problem is often a large effort. If you can avoid that effort by using AI Building Blocks, you should.
The Google Cloud AI Platform facilitates an end-to-end machine learning workflow for developers, data scientists, and data engineers. While it doesn’t help you source your data or code your model, it does help you tie together the rest of your machine learning workflow.
The AI Platform includes several model training services and a variety of machine types for training and tuning, including GPU and TPU accelerators. The Prediction service lets you serve predictions from any trained model; it’s not limited to models you trained yourself or models you trained on Google.
AI Platform Notebooks implement JupyterLab Notebooks on Google VMs, preconfigured with TensorFlow, PyTorch, and other deep learning packages. The AI Platform Data Labeling Service lets you request human labeling for a data set you want to use for training a model. AI Platform Deep Learning VM Images are optimized for data science and machine learning tasks, with key machine learning frameworks and tools and GPU support.
AI Platform Notebooks
For many data scientists, using Jupyter or JupyterLab Notebooks can be one of the easiest ways to develop and share models and machine learning workflows. Google Cloud AI Platform Notebooks make it simple to create and manage secure VMs preconfigured with JupyterLab, Git, GCP integration, and your choice of Python 2 or Python 3, R, Python and/or R core packages, TensorFlow, PyTorch, and CUDA.
Kaggle and Colab also support Jupyter notebooks, but Kaggle is aimed at enthusiasts and learning professionals and Colab at researchers and students, whereas Google Cloud AI Notebooks are aimed at enterprise users. For heavy lifting, AI Notebooks can work with Deep Learning VMs, Dataproc clusters, and Dataflow, and they can connect to GCP data sources such as BigQuery.
You can start developing with a small VM and later scale up to a beefier VM with more memory and CPUs, and possibly with GPUs or TPUs for deep learning training. You can also save notebooks in Git repositories and load them into other instances. Alternatively, you can use the AI Platform Training service discussed below.
I went through a hands-on code lab on using Google Cloud AI Notebooks. A few screenshots from that experience follow. I also noticed a directory full of sample notebooks pre-loaded into JupyterLab; they look interesting, but I don’t have enough space here to discuss them in depth.
Explainable AI and the What-if Tool
If you use TensorFlow as your framework to build and fit a model, you can use Google’s What-if Tool to understand how changes to values in the training data might affect the model. In other domains, that’s called a sensitivity study. The What-if Tool can also display a number of useful graphs.
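For readers unfamiliar with sensitivity studies, here is a minimal one-at-a-time sketch in plain Python. It does not use the What-if Tool or TensorFlow; the linear scoring function stands in for a trained model, and the feature names, weights, and baseline values are all invented for illustration.

```python
def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = {"age": 0.02, "income": 0.00001, "num_purchases": 0.15}
    return sum(weights[name] * value for name, value in features.items())


def sensitivity(model, baseline, relative_change=0.10):
    """Change in model output when each feature alone is scaled up by relative_change."""
    base_score = model(baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)  # copy so only one feature changes
        perturbed[name] = value * (1.0 + relative_change)
        effects[name] = model(perturbed) - base_score
    return effects


baseline = {"age": 40, "income": 50_000, "num_purchases": 6}
effects = sensitivity(predict, baseline)

# List the features from most to least influential for this data point.
for name, delta in sorted(effects.items(), key=lambda item: -abs(item[1])):
    print(f"{name:>14}: {delta:+.3f}")
```

The What-if Tool lets you explore the same kind of question interactively, on a real model and real data points.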
AI Platform Training
Model training often requires orders of magnitude more compute resources than model development. You can train simple models on small data sets in a Google Cloud AI Notebook or on your own machine. To train complex models on large data sets you may be better off using the AI Platform Training service.
The training service runs a training application stored in a Cloud Storage bucket against training and verification data stored in a Cloud Storage bucket, Cloud Bigtable, or another GCP storage service. If you run a built-in algorithm, you don’t need to build your own training application.
You can train models that use a code package from Cloud Storage, currently TensorFlow, Scikit-learn, and XGBoost, as well as models that use a custom container image from Cloud Storage and models that use built-in algorithms. You can also use a pre-built PyTorch container image derived from AI Platform Deep Learning Containers.
The current built-in algorithms are XGBoost, Distributed XGBoost, Linear Learner, Wide and Deep Learner, Image Classification, Image Object Detection, and TabNet. All of these algorithms except for Image Classification and Image Object Detection train from tabular data. All but the XGBoost algorithms currently rely on TensorFlow 1.14.
You can run AI Platform Training from the Jobs tab of the AI platform console, or by issuing a gcloud ai-platform jobs submit training command. The command-line invocation method can also automate uploading your model code to a Cloud Storage bucket.
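As a sketch of what such an invocation can look like, the following assembles a gcloud ai-platform jobs submit training command as an argument list. The job name, bucket, package layout, region, and version numbers are placeholders I chose for illustration; check them against the current documentation before use. Pointing --package-path at a local directory is what lets gcloud upload the code for you.

```python
import shlex


def build_submit_command(job_name, region, job_dir, package_path, module_name,
                         runtime_version, python_version,
                         scale_tier="BASIC", trainer_args=()):
    """Return the gcloud argument list for submitting a training job."""
    command = [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_name,
        "--region", region,
        "--job-dir", job_dir,            # Cloud Storage location for outputs
        "--package-path", package_path,  # local code that gcloud uploads
        "--module-name", module_name,    # Python module to run
        "--runtime-version", runtime_version,
        "--python-version", python_version,
        "--scale-tier", scale_tier,
    ]
    if trainer_args:
        # Everything after "--" is passed through to the training application.
        command.append("--")
        command.extend(trainer_args)
    return command


command = build_submit_command(
    job_name="census_training_001",               # hypothetical job name
    region="us-central1",
    job_dir="gs://example-bucket/census/output",  # hypothetical bucket
    package_path="trainer/",
    module_name="trainer.task",
    runtime_version="2.2",                        # assumed runtime version
    python_version="3.7",                         # assumed Python version
    trainer_args=["--epochs", "10"],              # hypothetical trainer flags
)
print(shlex.join(command))
```

The list can be handed to subprocess.run(command, check=True) on a machine where the Google Cloud SDK is installed and authenticated.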
You can monitor training jobs from the Jobs tab of the AI platform console, from a gcloud ai-platform jobs command, or from Python code. When a job completes, it normally saves a trained model to the Cloud Storage bucket you specified when you started the job.
You can perform distributed AI Platform training using Distributed XGBoost, TensorFlow, and PyTorch. The setup is different for each framework. For TensorFlow, there are three possible distribution strategies, and six options for “scale tiers,” which define the training cluster configuration.
Hyperparameter tuning works by performing multiple trainings of a model (to set variable weights) with different training process variables (to control the algorithm, for example by setting the learning rate). You can perform hyperparameter tuning on TensorFlow models fairly simply, as TensorFlow returns its training metric in its summary event reports. For other frameworks you may need to use the cloudml-hypertune Python package so that AI Platform Training can detect the model’s metric. You set up the hyperparameters to tune, their ranges, and the tuning search strategy when you define the training job.
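The idea of repeated trainings with different process variables can be shown in miniature without any cloud service. In the sketch below, each trial "trains" a one-weight model by gradient descent with a different learning rate, and a random search keeps the trial with the lowest final loss. The toy objective, the log-uniform sampling, and the trial count are my own assumptions; AI Platform Training's tuning service is more sophisticated than this.

```python
import math
import random


def train(learning_rate, steps=50):
    """Fit one weight w to minimize (w - 3)^2 by gradient descent; return final loss."""
    w = 0.0
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)
        w -= learning_rate * gradient
    return (w - 3.0) ** 2


def random_search(num_trials, low, high, seed=0):
    """Run num_trials trainings with learning rates drawn log-uniformly from [low, high]."""
    rng = random.Random(seed)  # fixed seed so the study is repeatable
    trials = []
    for _ in range(num_trials):
        exponent = rng.uniform(math.log10(low), math.log10(high))
        learning_rate = 10 ** exponent
        trials.append((train(learning_rate), learning_rate))
    return trials


trials = random_search(num_trials=10, low=1e-4, high=1.0)
best_loss, best_learning_rate = min(trials)  # tuples compare by loss first

for loss, learning_rate in sorted(trials):
    print(f"learning_rate={learning_rate:.5f}  final_loss={loss:.3e}")
print(f"best learning rate: {best_learning_rate:.5f}")
```

In a real tuning job, the final loss printed here corresponds to the metric that the service reads from TensorFlow's summaries or from the cloudml-hypertune package.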
You can use GPUs or TPUs for your training jobs. You typically need to specify an instance type that includes the GPUs or TPUs you want to use, and enable them from within the code. The larger and more complicated the model, the more likely it is that its training can be accelerated by GPUs or TPUs.
AI Platform Vizier
Another way to perform hyperparameter optimization is to use AI Platform Vizier, a black-box optimization service. Vizier runs studies made up of multiple trials and can solve many types of optimization problems, not just AI training. Vizier is still in beta test.
AI Platform Prediction
Once you have a trained model, you need to deploy it for prediction. AI Platform Prediction manages computing resources in the cloud to run your models. You export your model as artifacts that you can deploy to AI Platform Prediction. The models don’t have to be trained on Google Cloud AI.
AI Platform Prediction assumes that models will change over time, so models contain versions, and it is the versions that can be deployed. The versions can be based on completely different machine learning models, although it helps if all versions of a model use the same inputs and outputs.
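Because versions are the deployable unit, an online prediction request addresses either a specific version or the model's default version. A minimal sketch of building the REST URL and body follows; the project, model, and version names are hypothetical, and the shape of each instance depends entirely on the inputs your model expects.

```python
import json


def predict_url(project, model, version=None):
    """Return the online prediction URL for a model or one of its versions."""
    name = f"projects/{project}/models/{model}"
    if version is not None:
        # Without a version, the request goes to the model's default version.
        name += f"/versions/{version}"
    return f"https://ml.googleapis.com/v1/{name}:predict"


def predict_body(instances):
    """Wrap a list of input instances in the expected JSON envelope."""
    return {"instances": list(instances)}


url_v2 = predict_url("example-project", "census", version="v2")  # hypothetical names
url_default = predict_url("example-project", "census")
body = predict_body([
    {"age": 40, "hours_per_week": 45},  # hypothetical feature names
    {"age": 23, "hours_per_week": 20},
])
print(url_v2)
print(url_default)
print(json.dumps(body))
```

Keeping the same inputs and outputs across versions, as recommended above, means callers can switch versions by changing only the URL.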
AI Platform Prediction allocates nodes to handle online prediction requests sent to a model version. When you deploy a model version, you can customize the number and type of virtual machine that AI Platform Prediction uses for these nodes. Nodes aren’t exactly VMs, but the underlying machine types are similar.
You can allow AI Platform Prediction to scale nodes automatically or manually. If you use GPUs for a model version, you can’t scale nodes automatically. If you allocate machine types that are too big for your model, you can try to scale nodes automatically, but the CPU load conditions for scaling might never be met. Ideally, you’ll use nodes that are just big enough for your machine learning models.
In addition to predictions, the platform can provide AI Explanations in the form of feature attributions for a particular prediction. This is currently in beta test. You can get feature attributions as a bar graph for tabular data and as an overlay for image data.
AI Platform Deep Learning VM Images
When you start with a plain vanilla operating system, configuring your environment for machine learning and deep learning, CUDA drivers, and JupyterLab can sometimes take as long as training the model, at least for simple models. Using a pre-configured image solves that problem.
You can pick an AI Platform Deep Learning VM Image with TensorFlow, TensorFlow Enterprise, PyTorch, R, or half a dozen other frameworks. All images can include JupyterLab, and images intended for use with GPUs can have CUDA drivers.
You can create an instance from the gcloud command line (installed via the Google Cloud SDK) or from the Google Cloud marketplace. When you create your VM you can choose the number of virtual CPUs (the machine type, which also determines the RAM), and the number and kind of GPUs. You’ll see an estimate of your monthly cost based on the hardware you choose, with a discount for sustained usage. There’s no additional charge for the frameworks. If you choose a VM with GPUs, you need to allow a few minutes for installation of the CUDA drivers.
AI Platform Deep Learning Containers
Google also supplies Deep Learning Containers suitable for use in Docker on your local machine or on Google Kubernetes Engine (GKE). The containers have all the frameworks, drivers, and support software you might want, unlike the VM Images, which allow you to select only what you need. Deep Learning Containers are currently in beta test.
AI Platform Pipelines
MLOps (machine learning operations) applies DevOps (development and operations) practices to machine learning workflows. Much of the Google Cloud AI Platform supports MLOps in some way, but AI Platform Pipelines are at the core of what MLOps is all about.
AI Platform Pipelines, currently in beta test, makes it easier to get started with MLOps by saving you the difficulty of setting up Kubeflow Pipelines with TensorFlow Extended (TFX). The open source Kubeflow project is dedicated to making deployments of machine learning workflows on Kubernetes simple, portable, and scalable. Kubeflow Pipelines, a component of Kubeflow that is currently in beta test, is a comprehensive solution for deploying and managing end-to-end machine learning workflows.
TensorFlow Extended is an end-to-end platform for deploying production machine learning pipelines. TFX makes it easier to implement MLOps by providing a toolkit that helps you orchestrate your machine learning process on various orchestrators, such as Apache Airflow, Apache Beam, and Kubeflow Pipelines. Google Cloud AI Platform Pipelines uses TFX Pipelines, which are DAGs (directed acyclic graphs), with Kubeflow Pipelines rather than Airflow or Beam.
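To show why the DAG structure matters, here is a small sketch that orders a pipeline's steps so that every component runs only after the components it depends on. The component names follow TFX's standard components, but the dependency edges are a simplified assumption of mine, and the code uses Python's standard graphlib module (Python 3.9 or later) rather than TFX or Kubeflow.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components whose outputs it consumes.
# The edges are a simplified illustration, not an exact TFX pipeline.
dependencies = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "Pusher": {"Trainer", "Evaluator"},
}

# static_order() yields a valid execution order and raises
# graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(dependencies).static_order())

for step, component in enumerate(order, start=1):
    print(f"{step}. {component}")
```

An orchestrator such as Kubeflow Pipelines does essentially this, with the added ability to run independent steps in parallel and to cache step outputs.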
You manage AI Platform Pipelines from the Pipelines tab of the AI Platform in your Google Cloud console. Creating a new pipeline instance creates a Kubernetes cluster, a cloud storage bucket, and a Kubeflow Pipeline. Then you can either define the pipeline from an example, or from scratch using TFX.
Spotify used TFX and Kubeflow to improve its MLOps. The company reports that some teams are producing 7x more experiments.
AI Platform Data Labeling Service
Google Cloud AI Platform Data Labeling Service lets you work with human labelers to generate highly accurate labels for a collection of data that you can use in machine learning models. The service is currently in beta test, and availability is very limited because of COVID-19.
Google Cloud AI Hub, currently in beta test, offers a collection of assets for developers and data scientists building artificial intelligence systems. You can both find and share assets. Even in beta form, AI Hub seems to be useful.
TensorFlow Enterprise provides users with a Google Cloud optimized distribution of TensorFlow with long-term version support. The TensorFlow Enterprise Distribution contains custom-built TensorFlow binaries and related packages. Each version of the TensorFlow Enterprise Distribution is anchored on a particular version of TensorFlow; all packages included are available in open source.
Google Cloud AI Solutions
Google aims AI Solutions at business executives rather than at data scientists or programmers. The solutions usually come with an optional consulting or contract development component; consulting services are also available separately. Underneath the covers, AI Solutions are based on AI Building Blocks and AI Platform.
Contact Center AI
Contact Center AI (CCAI) is a Google solution for contact centers designed to provide humanlike interactions. It builds on Dialogflow to provide a virtual agent, monitor customer intent, switch to live agents when necessary, and assist human agents. Google has half a dozen partners to help you develop and deploy a CCAI solution, and to support and train your agents.
Build and Use AI
Google Cloud Build and Use AI is a generically defined solution that essentially offers Google’s AI expertise, AI Building Blocks, and AI Platform to solve your business problems. Among other benefits, this solution helps you to set up MLOps with pipeline automation and CI/CD.
Google Cloud Document AI applies the Google Vision API OCR building block along with Cloud Natural Language to extract and interpret information from business documents, typically supplied in PDF format. Additional components parse general forms and invoice forms. Industry-specific solutions for mortgage processing and procurement are currently under test. Google has half a dozen partners to help you implement Document AI solutions.
In praise of Google Cloud AI Platform
Overall, the Google Cloud AI and Machine Learning Platform is very good, both in scope and quality. Whether you need to analyze language or images, use services out of the box, or use them with customizations to handle your own data, you’ll find the Google Cloud AI Platform at or near the state of the art. You can get your hands only as dirty as you need to. The days when you needed a Ph.D. in data science and a deep knowledge of TensorFlow to get a model trained and deployed are long gone.
The Google Cloud AI and Machine Learning Platform does have a few gaps, such as the lack of a first-party Google data preparation service, the lack of many on-premises options, and the fact that way too many of the services offered are still in beta test. Despite those issues, Google Cloud AI can serve almost any machine learning or deep learning need. I especially like the new pipeline capabilities, even though they are a beta test service based on open source products that are also still in beta test.
While I’m not going to speak further about the competition, some of which is also very good, I will say that in the sphere of AI applications, Google is essentially the new IBM, in that nobody ever gets fired for choosing Google AI.
(All pricing in USD)
Cloud AutoML Translation: Training: $76.00 per hour; Translation: $80 per million characters after the first 500K characters.
Cloud AutoML Natural Language: Training: $3.00 per hour; Classification: $5 per thousand text records after the first 30K records.
Cloud AutoML Vision: Training: $20 per hour after the first hour; Classification: $3 per thousand images after the first thousand images.
Cloud AutoML Tables: Training: 6 hours of free one-time use + $19.32 per hour (92 n1-standard-4 equivalent machines used in parallel); Batch prediction: 6 hours of free one-time use + $1.16 per hour (5.5 n1-standard-4 equivalent machines used in parallel); Online prediction: $0.21 per hour (1 n1-standard-4 equivalent machine); Deployment: $0.005 per GiB-hour x 9 machines (model replicated to 9 machines for low latency serving purposes).
Vision: $1.50/feature/1,000 units after the first 1,000 units/month for most features; more for web detection and object localization.
Video: $0.04 to $0.17/minute/feature after the first 1,000 minutes/month.
Natural Language: $0.50 to $2.00/feature/1,000 units after the first 5,000 units/month.
Translation: $20/million characters after the first 500,000 characters/month.
Media Translation: $0.068 to $0.084/minute after the first 60 minutes/month.
Text to speech: $4/million characters after the first 4 million characters/month, standard voices; $16/million characters after the first 1 million characters/month, WaveNet voices.
Speech to text: $0.004 to $0.009/15 seconds after the first 60 minutes/month.
Dialogflow CX Agent: $20/100 chat sessions, $45/100 voice sessions.
Dialogflow ES Agent: varies by mode, reflects underlying voice and natural language charges.
Recommendations AI: $2.50/node/hour for training and tuning; $0.27/1000 predictions with tiered quantity discounts above 20 million requests/month.
GPUs: $0.11 to $2.48/GPU/hour.
TPUs: $1.35 to $8/TPU/hour.
AI Platform Training: $0.19 to $21.36/hour.
AI Platform Predictions: $0.045 to $1.13/node/hour plus GPUs at $0.45 to $2.48/GPU/hour.
All services run on the Google Cloud Platform; a few can also be run on-premises or in containers.
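Most of the usage-based prices above follow the same pattern: a free monthly allowance, then a flat rate per block of units. As a rough worked example, the sketch below applies that pattern to the Translation and WaveNet Text-to-Speech prices listed above. The monthly volumes are invented, and the helper ignores tiered discounts, taxes, and any price changes since this review was written.

```python
def metered_cost(units, free_units, price, per_units):
    """Monthly cost in USD: units beyond the free allowance, billed at price per per_units."""
    billable = max(0, units - free_units)
    return billable / per_units * price


# Translation: $20 per million characters after the first 500,000.
translation = metered_cost(2_000_000, 500_000, 20.0, 1_000_000)

# Text-to-Speech, WaveNet voices: $16 per million characters after the first 1 million.
wavenet = metered_cost(3_000_000, 1_000_000, 16.0, 1_000_000)

# Usage that stays inside the free allowance costs nothing.
small_job = metered_cost(400_000, 500_000, 20.0, 1_000_000)

print(f"Translation, 2M characters:   ${translation:.2f}")  # $30.00
print(f"WaveNet TTS, 3M characters:   ${wavenet:.2f}")      # $32.00
print(f"Translation, 400K characters: ${small_job:.2f}")    # $0.00
```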