Life is a game, take it seriously

Tool Tutorial: Google AI Platform for Hobbyists

In AI, App, deep learning, Machine Learning, Serious Stuffs on October 27, 2019 at 10:44 pm

by Li Yang Ku (Gooly)

In this post I am going to talk about the Google AI platform (previously called Google ML engine) and how to use it if deep learning is just your after-work hobby. I will provide links to other tutorials and details at the end so that you can try it out, but the purpose of this post is to give you a big picture of how it works without having to read through all the marketing phrases targeting company decision makers.

Google AI platform is part of Google Cloud and provides computing power for training and deploying deep networks. So what's the difference between this platform and other cloud computing services such as AWS (Amazon Web Services)? Google AI platform is specialized for deep learning and is supposed to simplify the process. If you are using TensorFlow (also developed by Google) with a fairly standard neural network architecture, it should be a breeze to train and deploy your model for online applications. There is no need to set up servers; all you need is a few lines of gcloud commands and your model will be trained and deployed in the cloud. (You also get a $300 credit for your first year when signing up for Google Cloud Platform, which is quite a lot for home projects.) Note that Google AI platform is not the only shop in town; take a look at Microsoft's Azure AI if you would like to shop around.
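If you plan to run things locally, the one-time setup is roughly the following. This is a minimal sketch assuming the Google Cloud SDK is already installed; the project ID is a placeholder you would replace with your own.

    # Authenticate the gcloud CLI and point it at your project (one-time setup).
    gcloud auth login
    gcloud config set project my-hobby-project    # placeholder project ID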

So how does it work? First of all, there are four ways to communicate with the Google AI platform:

  1. Locally: you have all the code on your computer and communicate with the platform directly through commands.
  2. On Google Colab: Colab is another Google project that is basically a Jupyter notebook in the cloud which you can share with others.
  3. On an AI platform notebook: similar to Colab but with more direct access to the platform and more powerful machines.
  4. On any other cloud server or Jupyter-notebook-like web service such as FloydHub.

The main difference between using Colab versus an AI platform notebook is pricing. Colab is free (even with GPU access), but has limitations such as a run time of up to 12 hours and shutting down after 90 minutes of idle time. It provides you with about 12 GB of RAM and 50 GB of disk space (although the disk is half full when started due to preinstalled packages). After disconnecting, you can still reconnect and keep whatever you wrote in the notebook, but you will lose whatever was in RAM and on disk. For a home project, Colab is probably sufficient; the disk space is not a limitation since we can store training data in Google Cloud Storage. (Note that it is also possible to connect Google Drive in Colab so that you don't need to start from scratch every time.) On the other hand, an AI platform notebook could be pricey if you want to keep it running ($0.137 per hour, about $99.89 per month, for a non-GPU machine).

Before we move on, we also have to understand the difference between computation and storage on the Google AI platform. Unlike personal computers, where disk space and computation are tightly integrated, they are separate in the cloud: some machines are responsible for computation and others for storage. Here, the Google AI platform is responsible for the computation while Google Cloud Storage takes care of the stored data and code. Therefore, before we start using the platform we first need to create a storage space called a bucket. This can easily be done with a one-line command once you have created a Google Cloud account.
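For example, here is a minimal sketch of creating a bucket with the gsutil tool that ships with the Google Cloud SDK; the bucket name and region below are placeholders (bucket names must be globally unique).

    # Create a regional bucket that will hold training code, data, and exported models.
    gsutil mb -l us-central1 gs://my-hobby-ml-bucket/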

If you are using Colab, you will also need to have the code for training your neural network downloaded to your Colab virtual machine. One common workflow is to keep your code in a version control service such as GitHub and simply clone the files to Colab every time you start (see the sketch below). It makes more sense to use Colab if you are collaborating with others or want to share how you train your model; otherwise doing everything locally might be simpler.
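In a Colab cell you can prefix shell commands with an exclamation mark; the repository URL below is a made-up placeholder standing in for wherever you keep your training code.

    # Clone your training code into the Colab virtual machine (runs in a Colab cell).
    !git clone https://github.com/your-username/your-training-repo.git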

So the whole training process looks like this:

  1. Create a Google Cloud project.
  2. Create a bucket that will store your code, training data, and outputs, and that the Google AI platform can read from and write to.
  3. With a single command, upload your code to the bucket and request that the AI platform perform training (see the sketch after this list).
  4. Optionally, perform hyperparameter tuning.
  5. If you want the trained model locally, simply download it from the bucket through the web interface or a command.
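Here is a minimal sketch of step 3, assuming your training code is packaged as a Python module named trainer.task under a ./trainer directory; the job name, bucket, region, and runtime versions below are placeholders you would adjust.

    # Package the code in ./trainer, upload it to the staging bucket,
    # and run trainer.task on the AI platform.
    gcloud ai-platform jobs submit training my_first_training_job \
      --package-path ./trainer \
      --module-name trainer.task \
      --staging-bucket gs://my-hobby-ml-bucket \
      --region us-central1 \
      --runtime-version 1.14 \
      --python-version 3.5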

A trained model is not very useful if it is not used. Google AI platform provides an easy way to deploy your model as a service in the cloud. Before continuing, we should clarify some Google terminology: on the Google AI platform, a "model" means an interface that solves a certain task, and a trained model is called a "version" of this "model" (reference). In the following, quotation marks are put around Google-specific terminology to avoid confusion.

The deployment and prediction process is then the following (a command-line sketch follows the list):

  1. Create a “model” at AI platform.
  2. Create a “version” of the “model” by providing the trained model stored in the bucket.
  3. Make predictions through one of the following approaches:
    • gcloud commands
    • Python interface
    • Java interface
    • REST API
      (the first three methods are just easier ways to generate a REST request)
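Here is a minimal sketch of steps 1 to 3 using gcloud commands, assuming the trained model was exported as a TensorFlow SavedModel to a directory in the bucket; all names, paths, and versions below are placeholders.

    # 1. Create a "model" (the interface).
    gcloud ai-platform models create my_model --regions us-central1

    # 2. Create a "version" from the SavedModel exported to the bucket.
    gcloud ai-platform versions create v1 \
      --model my_model \
      --origin gs://my-hobby-ml-bucket/export/my_saved_model \
      --runtime-version 1.14 \
      --framework tensorflow \
      --python-version 3.5

    # 3. Request online predictions; instances.json contains one JSON instance per line.
    gcloud ai-platform predict \
      --model my_model \
      --version v1 \
      --json-instances instances.json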

And that's all you need to give your home-made web application access to scalable deep learning prediction capabilities. You can run the whole process described above through this official tutorial in Colab, and more details about the tutorial can be found here. I will be posting follow-up posts on building specific applications on the Google AI platform, so stay tuned if you are interested.

How things work: Amazon Flow App Algorithm

In App, Computer Vision on January 29, 2012 at 6:49 pm

by Gooly (Li Yang Ku)

A9 Flow App

Amazon and Google are now the top players in the area of image query. Amazon's laboratory A9 acquired the image query company SnapTell in 2009 and released the smartphone app Flow in mid-2011. Google has long been in the image search business since it acquired Neven Vision in 2006. Google also released Google Goggles in early 2011.

Amazon's Flow is an app that lets users obtain product information by pointing their phone camera at a product. The idea is to allow consumers to buy stuff on Amazon while standing in a rival's physical store, and also to report the product's local price back to Amazon. This controversial business idea is considered immoral by many shop owners, but it might be unavoidable.

Google Goggles is a more general smartphone app that does image recognition and image query. I'll leave Google Goggles to the next post and focus on Flow for now.

The Flow app is made by A9's visual search group, which I believe is basically the SnapTell group. Before Amazon acquired SnapTell, SnapTell already had a visual search app of its own. It even used a similar logo (see below).

A9's official website reveals very little about the technology; we can only guess from the following flowchart. Apparently they are waiting for patents to be approved and don't want to reveal any details.

To get some sense of what this is all about, we have to dig deeper and see who the people behind this app are. From SnapTell.com we know that the people who probably influenced this app the most are Gautam Bhargava (CEO), Rajeev Motwani (Stanford professor), and G.D. Ramkumar (CTO). If you google their publications, none of them worked in the computer vision area: Gautam Bhargava worked on databases, Rajeev Motwani teaches theoretical computer science, and G.D. Ramkumar seems to have worked on geometric algorithms. Therefore the innovative part of this app very likely lies in the database side.

From the first part of the chart, the points look very much like SIFT-like features. According to vision wang's blog post, he believes the ASG (Accumulated Signed Gradient) algorithm is a SIFT-like feature. Judging from the name, I guess it might take several gradient values around a point and accumulate them into a vector descriptor, or it might simply mean that the algorithm accumulates several descriptors into one vector. The app works in real time, so I doubt anything too complicated was implemented.

However, according to a 2008 white paper I found on the web (see the flowchart below), text is used in addition to the image. Text inside the polygon bounded by the feature points could be used as additional information for visual search.

I would say there might not be anything amazing in the parts mentioned above; the core technology should be in how they query the image database. While I am not an expert on databases, I guess the database is a tree-like structure, something derived from the patent "Method and apparatus for classification of high dimensional data," written by G.D. Ramkumar before he co-founded SnapTell.

The problem Flow is trying to solve is actually very hard and complicated, and apparently they haven't solved it yet. I have so far only succeeded in matching books and a Coke can. For the app to work well on other non-planar objects, more effort needs to go into the first step.