The Google Cloud Text-to-Speech API converts plain text and SSML markup to speech, with a choice of more than 200 voices across more than 40 languages and variants. Variants include different national accents, such as US, British, South African, Indian, Irish, and Australian English.
Basic voices often sound rather mechanical; WaveNet voices typically sound more natural, but cost slightly more to use. You can also create custom voices from your own studio-quality audio recordings.
You can tune the speaking rate of synthesized voices by up to 4x faster or slower, and the pitch by up to 20 semitones up or down. SSML tags allow you to add pauses, control the pronunciation of numbers, dates, and times, and give other pronunciation instructions. You can also increase the volume gain by up to 16 dB or reduce it by as much as 96 dB.
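To make these knobs concrete, here is a minimal sketch of a v1 `text:synthesize` REST request body using the documented `speakingRate`, `pitch`, and `volumeGainDb` fields; the voice name and the SSML content are illustrative, not from the article.

```python
import json

# SSML with a pause and explicit date/number formatting.
ssml = (
    "<speak>"
    "On <say-as interpret-as=\"date\" format=\"yyyymmdd\">20210704</say-as>,"
    "<break time=\"300ms\"/> the total was "
    "<say-as interpret-as=\"cardinal\">1234</say-as>."
    "</speak>"
)

request = {
    "input": {"ssml": ssml},
    "voice": {"languageCode": "en-GB", "name": "en-GB-Wavenet-B"},  # illustrative voice
    "audioConfig": {
        "audioEncoding": "MP3",
        "speakingRate": 1.25,   # 0.25 (quarter speed) to 4.0 (4x speed)
        "pitch": -2.0,          # semitones, -20.0 to +20.0
        "volumeGainDb": 6.0,    # -96.0 to +16.0
    },
}

body = json.dumps(request)
```

You would POST this body to the `text:synthesize` endpoint with an authorized client; the response contains base64-encoded audio.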
The Google Cloud Speech-to-Text API converts speech into text using Google’s most advanced deep learning neural network algorithms for automatic speech recognition (ASR). It supports over 125 languages and variants, and can be deployed on-premises (with a license) as well as in Google Cloud. Speech-to-Text can run synchronously for short audio samples (one minute or less), asynchronously for longer audio (up to 480 minutes), and in streaming mode for real-time recognition.
You can customize speech recognition to transcribe domain-specific terms and rare words by providing hints. There are specialized ASR models for video, phone calls, and command and search, as well as a default model for everything else. While you can embed encoded audio in your API request, more often you’ll provide a URI to a binary audio file stored in a Cloud Storage bucket.
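A minimal sketch of a v1 `speech:recognize` request body shows how the model selection, phrase hints, and Cloud Storage URI fit together; the bucket path and hint phrases are illustrative, not from the article.

```python
import json

request = {
    "config": {
        "languageCode": "en-US",
        "model": "video",  # or "phone_call", "command_and_search", "default"
        "speechContexts": [
            # Hints that bias recognition toward domain-specific terms.
            {"phrases": ["PyTorch", "TensorFlow", "Dataproc"]}
        ],
    },
    # Reference audio in a Cloud Storage bucket rather than embedding it.
    "audio": {"uri": "gs://example-bucket/clip.flac"},
}

body = json.dumps(request)
```

For audio longer than a minute you would use the `longrunningrecognize` method instead, with the same `config` shape.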
Google Cloud’s Dialogflow Essentials builds on Speech-to-Text and Text-to-Speech, and can take advantage of over 40 prebuilt agents as templates, for small bots with single topic conversations. Dialogflow CX is an advanced development suite for creating conversational AI applications, including chatbots, voice bots, and IVR (interactive voice response) bots. It includes a visual bot-building platform (see screenshot below), collaboration and versioning tools, and advanced IVR feature support, and it is optimized for enterprise scale and complexity.
Cloud Inference API
Time series data often requires special handling, especially if you want to analyze it in real time on streaming data as well as against a large historical data set. The fully managed serverless Google Cloud Inference API, currently in limited alpha test, detects trends and anomalies with event time markers, handles data sets consisting of up to tens of billions of events, runs thousands of queries per second, and responds with low latency.
Building effective recommendation systems with machine learning is reputed to be tricky and time-consuming. Google has automated the process with the Recommendations API, currently in beta test. This fully managed service takes care of preprocessing your data, training and hyperparameter-tuning machine learning models, and provisioning the infrastructure. It also corrects for bias and seasonality. It integrates with related Google services, such as Analytics 360, Tag Manager, Merchant Center, Cloud Storage, and BigQuery. Initial model training and tuning take two to five days to complete.
Google Cloud AI Platform
The Google Cloud AI Platform and its accelerators are aimed at developers, data scientists, and data engineers. Solving a problem with the AI Platform is often a significant effort; if you can avoid that effort by using the AI Building Blocks instead, you should.
The AI Platform facilitates an end-to-end machine learning workflow. While it doesn’t help you source your data or code your model, it does help you tie together the rest of the workflow.
The AI Platform includes several model training services and a variety of machine types for training and tuning, including GPU and TPU accelerators. The Prediction service lets you serve predictions from any trained model; it’s not limited to models you trained yourself or models you trained on Google.
AI Platform Notebooks implement JupyterLab Notebooks on Google VMs, preconfigured with TensorFlow, PyTorch, and other deep learning packages. The AI Platform Data Labeling Service lets you request human labeling for a data set you want to use for training a model. AI Platform Deep Learning VM Images are optimized for data science and machine learning tasks, with key machine learning frameworks and tools and GPU support.
AI Platform Notebooks
For many data scientists, using Jupyter or JupyterLab notebooks is one of the easiest ways to develop and share models and machine learning workflows. Google Cloud AI Platform Notebooks make it simple to create and manage secure VMs preconfigured with JupyterLab, Git, GCP integration, and your choice of Python 2, Python 3, and/or R, along with core data science packages, TensorFlow, PyTorch, and CUDA.
While Kaggle and Colab also support Jupyter notebooks, Kaggle is aimed at enthusiasts and learners, Colab at researchers and students, and Google Cloud AI Notebooks at enterprise users. For heavy lifting, AI Notebooks can work with Deep Learning VMs, Dataproc clusters, and Dataflow, and they can connect to GCP data sources such as BigQuery.
You can start developing with a small VM and later scale up to a beefier VM with more memory and CPUs, and possibly with GPUs or TPUs for deep learning training. You can also save notebooks in Git repositories and load them into other instances. Alternatively, you can use the AI Platform Training service discussed below.
I went through a hands-on code lab on using Google Cloud AI Notebooks. A few screenshots from that experience follow. I also noticed a directory full of sample notebooks pre-loaded into JupyterLab; they look interesting, but I don’t have enough space here to discuss them in depth.
Explainable AI and the What-if Tool
If you use TensorFlow as your framework to build and fit a model, you can use Google’s What-if Tool to understand how changes to values in the training data might affect the model. In other domains, that’s called a sensitivity study. The What-if Tool can also display a number of useful graphs.
AI Platform Training
Model training often requires orders of magnitude more compute resources than model development. You can train simple models on small data sets in a Google Cloud AI Notebook or on your own machine. To train complex models on large data sets you may be better off using the AI Platform Training service.
The training service runs a training application stored in a Cloud Storage bucket against training and verification data stored in Cloud Storage, Cloud Bigtable, or another GCP storage service. If you run a built-in algorithm, you don’t need to build your own training application.
You can train models that use a code package from Cloud Storage (currently TensorFlow, scikit-learn, and XGBoost), as well as models that use a custom container image from Cloud Storage and models that use built-in algorithms. You can also use a pre-built PyTorch container image derived from AI Platform Deep Learning Containers.
The current built-in algorithms are XGBoost, Distributed XGBoost, Linear Learner, Wide and Deep Learner, Image Classification, Image Object Detection, and TabNet. All of these algorithms except for Image Classification and Image Object Detection train from tabular data. All but the XGBoost algorithms currently rely on TensorFlow 1.14.
You can run AI Platform Training from the Jobs tab of the AI Platform console, or by issuing a gcloud ai-platform jobs submit training command. The command-line invocation can also automate uploading your model code to a Cloud Storage bucket.
You can monitor training jobs from the Jobs tab of the AI Platform console, with a gcloud ai-platform jobs command, or from Python code. When a job completes, it normally saves the trained model to the Cloud Storage bucket you specified when you started the job.
You can perform distributed AI Platform training using Distributed XGBoost, TensorFlow, and PyTorch. The setup is different for each framework. For TensorFlow, there are three possible distribution strategies, and six options for “scale tiers,” which define the training cluster configuration.
Hyperparameter tuning works by training a model multiple times (to set its variable weights) with different training process variables, i.e. hyperparameters that control the algorithm, such as the learning rate. You can perform hyperparameter tuning on TensorFlow models fairly simply, as TensorFlow returns its training metric in its summary event reports. For other frameworks you may need to use the cloudml-hypertune Python package so that AI Platform Training can detect the model’s metric. You set up the hyperparameters to tune, their ranges, and the tuning search strategy when you define the training job.
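That job-definition step can be sketched as the hyperparameter section of a training job request, using the documented TrainingInput field names; the metric tag and parameter names here are illustrative assumptions.

```python
import json

# Hyperparameter spec for an AI Platform training job: the service will run
# up to maxTrials trainings, four at a time, searching the declared ranges
# to maximize the metric the trainer reports under "accuracy".
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "accuracy",  # metric name your trainer reports
    "maxTrials": 20,
    "maxParallelTrials": 4,
    "params": [
        {
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",   # search on a log scale
        },
        {
            "parameterName": "hidden_units",
            "type": "INTEGER",
            "minValue": 16,
            "maxValue": 256,
            "scaleType": "UNIT_LINEAR_SCALE",
        },
    ],
}

body = json.dumps({"trainingInput": {"hyperparameters": hyperparameters}})
```

Each trial receives its parameter values as command-line arguments to the trainer, which reports the metric back so the tuner can pick the next trial.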
You can use GPUs or TPUs for your training jobs. You typically need to specify an instance type that includes the GPUs or TPUs you want to use, and enable them from within the code. The larger and more complicated the model, the more likely it is that its training can be accelerated by GPUs or TPUs.
AI Platform Vizier
Another way to perform hyperparameter optimization is to use AI Platform Vizier, a black-box optimization service. Vizier performs studies consisting of multiple trials, and can solve many kinds of optimization problems, not just machine learning training. Vizier is still in beta test.
AI Platform Prediction
Once you have a trained model, you need to deploy it for prediction. AI Platform Prediction manages computing resources in the cloud to run your models. You export your model as artifacts that you can deploy to AI Platform Prediction. The models don’t have to be trained on Google Cloud AI.
AI Platform Prediction assumes that models will change over time, so models contain versions, and it is the versions that can be deployed. The versions can be based on completely different machine learning models, although it helps if all versions of a model use the same inputs and outputs.
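The model-and-version split shows up directly in the deployment commands. This is a hedged sketch using documented gcloud flags; the model name, version name, and bucket path are placeholders.

```shell
# Illustrative only: a model is a container for versions, and it is the
# version that points at exported model artifacts in Cloud Storage.
gcloud ai-platform models create my_model --regions us-central1

gcloud ai-platform versions create v1 \
  --model my_model \
  --origin gs://example-bucket/model/ \
  --runtime-version 2.3 \
  --framework tensorflow \
  --python-version 3.7
```

A later v2 could point at artifacts from a completely different framework, as long as clients can live with its inputs and outputs.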
AI Platform Prediction allocates nodes to handle online prediction requests sent to a model version. When you deploy a model version, you can customize the type and number of virtual machines that AI Platform Prediction uses for these nodes. Nodes aren’t exactly VMs, but the underlying machine types are similar.
You can allow AI Platform Prediction to scale nodes automatically or manually. If you use GPUs for a model version, you can’t scale nodes automatically. If you allocate machine types that are too big for your model, you can try to scale nodes automatically, but the CPU load conditions for scaling might never be met. Ideally, you’ll use nodes that are just big enough for your machine learning models.
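The two scaling modes correspond to two shapes of the Version resource. This sketch uses the documented `autoScaling` and `manualScaling` fields; names, paths, and machine types are illustrative.

```python
import json

# Version that scales automatically, keeping at least one node warm so the
# first request doesn't wait for a cold start.
auto_version = {
    "name": "v2_auto",
    "deploymentUri": "gs://example-bucket/model/",
    "runtimeVersion": "2.3",
    "framework": "TENSORFLOW",
    "machineType": "n1-standard-4",
    "autoScaling": {"minNodes": 1},
}

# Version with a fixed node count; this is the route to take when automatic
# scaling isn't available, such as for versions that use GPUs.
manual_version = {
    "name": "v2_manual",
    "deploymentUri": "gs://example-bucket/model/",
    "runtimeVersion": "2.3",
    "framework": "TENSORFLOW",
    "machineType": "n1-standard-8",
    "manualScaling": {"nodes": 3},
}

bodies = [json.dumps(v) for v in (auto_version, manual_version)]
```

Picking the smallest machine type that fits the model keeps per-node cost down and makes the CPU-load autoscaling thresholds reachable.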
In addition to predictions, the platform can provide AI Explanations in the form of feature attributions for a particular prediction. This is currently in beta test. You can get feature attributions as a bar graph for tabular data and as an overlay for image data.
AI Platform Deep Learning VM Images
When you start with a plain vanilla operating system, configuring your environment for machine learning and deep learning, CUDA drivers, and JupyterLab can sometimes take as long as training the model, at least for simple models. Using a pre-configured image solves that problem.