Offer pre-trained models and support transfer learning
Not everyone wants to spend the time and compute resources to train their own models — nor should they, when pre-trained models are available. For example, the ImageNet dataset is huge, and training a state-of-the-art deep neural network against it can take weeks, so it makes sense to use a pre-trained model for it when you can.
On the other hand, pre-trained models may not always identify the objects you care about. Transfer learning can help you customise the last few layers of the neural network for your specific data set without the time and expense of training the full network.
Offer tuned AI services
The major cloud platforms offer robust, tuned AI services for many applications, not just image identification. Example include language translation, speech to text, text to speech, forecasting, and recommendations.
These services have already been trained and tested on more data than is usually available to businesses. They are also already deployed on service endpoints with enough computational resources, including accelerators, to ensure good response times under worldwide load.
Manage your experiments
The only way to find the best model for your data set is to try everything, whether manually or using AutoML. That leaves another problem: Managing your experiments.
A good cloud machine learning platform will have a way that you can see and compare the objective function values of each experiment for both the training sets and the test data, as well as the size of the model and the confusion matrix. Being able to graph all of that is a definite plus.
Support model deployment for prediction
Once you have a way of picking the best experiment given your criteria, you also need an easy way to deploy the model. If you deploy multiple models for the same purpose, you’ll also need a way to apportion traffic among them for a/b testing.
Monitor prediction performance
Unfortunately, the world tends to change, and data changes with it. That means you can’t deploy a model and forget it. Instead, you need to monitor the data submitted for predictions over time. When the data starts changing significantly from the baseline of your original training data set, you’ll need to retrain your model.
Finally, you need ways to control the costs incurred by your models. Deploying models for production inference often accounts for 90 per cent of the cost of deep learning, while the training accounts for only 10 per cent of the cost.
The best way to control prediction costs depends on your load and the complexity of your model. If you have a high load, you might be able to use an accelerator to avoid adding more virtual machine instances.
If you have a variable load, you might be able to dynamically change your size or number of instances or containers as the load goes up or down. And if you have a low or occasional load, you might be able to use a very small instance with a partial accelerator to handle the predictions.