[Video] Artificial Intelligence, Mobile and Beyond
Presently, adding an Artificial Intelligence (AI) feature to an application helps it stand out, but in the near future it will likely be crucial for staying competitive. It’s important to make full use of what’s available, especially on mobile platforms, where we can take advantage of sensory inputs like GPS and the camera. There are several approaches to putting these inputs to work. In my recent talk, Artificial Intelligence (AI), Mobile and Beyond (captioned video embedded below), I take a case study approach and walk through the key components of AI integration in mobile applications.
There are two overarching categories of AI integration: the classical approach and the mobile approach. The key differentiator is that with the classical approach, the client application relies on an endpoint to return a prediction from a trained model. With the mobile approach, the client application pulls down the entire trained model and runs inference client-side.
ZenDrive, an app used to monitor and analyze users’ driving habits, is a good case where we would use the classical approach. Here, the client app would collect driving data and send it to a backend for analysis. The backend could be anything, but in my case study I talked about using a machine learning cloud service, namely Azure ML Studio. This service would hold the trained model and expose an endpoint to query.
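As a sketch of the classical approach, the client packages its sensor data and POSTs it to a scoring endpoint. The URL, field names, and auth scheme below are placeholders of my own; a real service such as an Azure ML Studio web service defines its own request schema:

```python
import json
import urllib.request

# Hypothetical endpoint -- substitute the URL your service exposes.
SCORE_URL = "https://example.com/driving-model/score"

def build_payload(speeds, timestamps):
    """Package raw sensor readings into a JSON body for the endpoint.
    The field names here are illustrative, not a real service's schema."""
    return json.dumps({"speeds": speeds, "timestamps": timestamps}).encode("utf-8")

def request_prediction(payload, url=SCORE_URL, api_key="YOUR-KEY"):
    """POST the sensor data and return the decoded prediction."""
    req = urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The client stays thin: all the model work happens behind the endpoint, which is exactly the appeal of the classical approach.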
AiCure, an app that makes sure the right patient gets the right medication at the right time, is a good example of where we would want the mobile approach. For this application, it would be preferable not to make the user rely on an API request for facial recognition and then again for medication detection. A high-level outline would be exposing a way for the app to download the trained model and then using something like TensorFlow Lite to run inference against that client-side model.
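The download step of the mobile approach can be as small as a cache check. A minimal sketch, with the fetch function injectable for testing (the URL and cache path are placeholders):

```python
import os
import urllib.request

def ensure_model(cache_path, model_url, fetch=urllib.request.urlretrieve):
    """Download the trained model once; reuse the local copy afterwards.

    `fetch` is injectable so the download can be stubbed out in tests.
    """
    if not os.path.exists(cache_path):
        fetch(model_url, cache_path)
    return cache_path
```

The returned path can then be handed to an on-device runtime, e.g. TensorFlow Lite’s `Interpreter(model_path=...)`, so every prediction after the first launch works without a network call.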
I created a flowchart that summarizes some of the possibilities in these approaches and can be used to go through the pros and cons.
When deciding between the mobile approach and classical approach, consider the importance of things like latency and network connection. If these factors are of low importance, the classical approach is the tried-and-true method, but this really depends on the feature. Imagine if the Google Translate app didn’t allow the user to download a trained model to work offline and instead always relied on a steady connection.
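Those trade-offs can be encoded as a small decision helper. This is a simplification of the flowchart, not a substitute for it, and the factor names are my own:

```python
def choose_approach(needs_offline, latency_sensitive, can_ship_model):
    """Rough encoding of the decision factors above: offline or
    latency-sensitive features push toward on-device inference,
    provided the trained model can ship with (or download to) the app."""
    if (needs_offline or latency_sensitive) and can_ship_model:
        return "mobile"
    return "classical"
```

For the Google Translate example, `needs_offline=True` and a downloadable model would steer you to the mobile approach; a feature that only ever runs with connectivity falls through to the classical one.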
In most cases there isn’t really a choice between inferring on the device and training on the device: at this time, most devices are not powerful enough to train locally, though I did want to mention that it is possible for very simple use cases. For inference on mobile see: TensorFlow Lite, Core ML
Predict on device:
- Speed: feedback doesn’t depend on a round trip to an endpoint, creating a more immersive experience.
- If users carry the trained model in their app, there is less to worry about in terms of scaling the machine learning backend.
- Trained models will increase the app size and overall management of the trained model would be harder.
- This is still pretty new stuff, so it may be harder to find resources and get set up compared to just setting up an endpoint.
In cases where there isn’t a problem with doing a network call, there are a lot of options out there in terms of where and how to get a trained model set up.
Ready-made services:
These are for cases where you don’t need to train your own model and can get away with a generic object-detection or voice-detection feature. Plenty of services offer their own trained models, most with a free tier too. See: Google Cloud Vision, Amazon Rekognition, Microsoft Azure Cognitive Services, IBM Watson, Clarifai
This site is also a good resource comparing the above services: https://blog.filestack.com/thoughts-and-knowledge/comparing-google-vision-microsoft-cognitive-amazon-rekognition-clarifai/
- You don’t need to train your own model.
- It’s very easy to add into an application.
- You don’t own the model. Most of these services don’t even let you customize it, so it’s entirely dependent on the service owner.
- You pay per request (there is usually a free tier).
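As a sketch of how little client work these services need, here is a builder for the request body Google Cloud Vision’s images:annotate endpoint expects for label detection. The endpoint URL, API key handling, and response parsing are omitted:

```python
import base64
import json

def vision_request_body(image_bytes, max_results=5):
    """Build the JSON body for Google Cloud Vision's images:annotate
    endpoint (label detection); the image is sent base64-encoded."""
    return json.dumps({
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    })
```

The other services in the list have similarly thin request shapes; the integration work is mostly auth and response handling.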
On your own servers:
On the other end of the spectrum is doing all of this on your own servers. This is less common, but if your model is simple it may be a good choice.
- You own the entire pipeline. You train and deploy how you see fit.
- This option is good for sandboxing and learning.
- You pay for and manage everything.
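If you do host everything yourself, the serving side can be as small as a single HTTP endpoint. A minimal sketch using Python’s standard library, with a hand-rolled stand-in for a real model (the weights and response fields are purely illustrative):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in for a trained model: a hand-rolled linear score.
    In practice you would load a serialized model here instead."""
    weights = [0.4, 0.6]  # illustrative weights only
    score = sum(w * x for w, x in zip(weights, features))
    return {"risky": score > 0.5, "score": score}

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, run the model, return the prediction.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        result = json.dumps(predict(body["features"])).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(result)

def serve(port=8080):
    """Blocks forever; run this on your own server."""
    HTTPServer(("localhost", port), ScoreHandler).serve_forever()
```

Everything around this (TLS, scaling, deployment, retraining) is then also yours to manage, which is the trade-off listed above.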
General cloud services:
The cloud is usually a better alternative to doing everything on your own servers: you get all the benefits of doing it yourself without managing the hardware. This involves setting up a machine learning (ML) pipeline to get a trained model in a general cloud service.
- You only need to rent these ML services when you need to re-train or get started. Otherwise, you own the model and can download it and ship it with your application, if you want to. This option pairs well with the mobile approach.
- You can train however you like.
- You pay for storage as well as computing.
- You’ll need to set everything up.
Machine learning services:
Lastly, there are machine learning services that are like the cloud services above, but with everything set up so you only need to focus on the machine learning part. See: Google Cloud Machine Learning, AWS Machine Learning, Azure ML Studio.
- You just need to get data and train.
- It’s easy to integrate and you can usually create an endpoint to get inference with the click of a couple buttons.
- With the exception of Google, these services don’t let you download the model, so you probably don’t want to use them if you are considering the mobile approach.
- You don’t have the option to train the model any way you like; you’re usually choosing from a selection.
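Calling one of these hosted endpoints from the client typically comes down to a JSON body plus a bearer token. The exact schema varies per service, so this builder is illustrative only; check the sample code your service generates for its real request shape:

```python
import json

def scoring_request(rows, api_key):
    """Illustrative only: every ML service defines its own schema.
    This shows the common shape -- a JSON body of input rows plus a
    bearer-token Authorization header."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    body = json.dumps({"inputs": rows})
    return headers, body
```

The "couple of buttons" these services advertise mostly amounts to generating this kind of request for you.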
The services above are the common ways of handling AI features in an application. Hopefully this summary gave some insight into what’s available to work with and which approach would be most useful if, and when, you decide to add a machine learning feature to your application.
Published on 02.15.18