Our team is no stranger to various flavors of AI, including deep learning (DL). That's why we noticed immediately when Google came out with its AutoML project, designed to make AI build other AIs.
Neural networks have recently gained popularity and wide practical applications. However, to get good results with neural networks, it is critical to pick the right network topology, which has always been a difficult manual task.
Google's recent project promises to help solve this task automatically with a meta-AI that designs the topology of the neural network. Google, however, did not offer documentation or examples of how to use this wonderful new technology. We liked the idea and were among the first to come up with a practical implementation that other people can follow as an example. This is similar in concept to AlphaGo, for instance.
Google's approach is based on the AI concept called reinforcement learning: the parent AI reviews the efficiency of the child AI and adjusts the neural network architecture (the number of layers, weights, regularization methods, and so on) to improve efficiency.
Automation eliminates guesswork from manual neural network design and significantly reduces the time required for each problem, since designing the neural network model is the most labor-intensive part of the task.
Although Google has recently open-sourced an example of NASNet, how they found the NASNet architecture is still unclear to most folks.
In addition, in our opinion, the name itself adds to the confusion with these technologies.
In this post, we will take a detailed, step-by-step look at implementing a simple model for neural architecture search with AutoML and reinforcement learning.
Note: To follow this post, you will need a working background in convolutional neural networks, recurrent neural networks, and reinforcement learning.
Links below will provide you with good background information:
Neural Architecture Search (NAS) with Reinforcement Learning is a method for finding good neural network architectures. For this post, we will try to find an optimal architecture for a Convolutional Neural Network (CNN) that recognizes handwritten digits.
For this implementation, we use TensorFlow 1.4, but if you want to try this at home, you can use version 1.1 or later, since NASCell first became available in TensorFlow 1.1. It is important not to confuse AutoML and NAS.
The full code is available on GitHub.
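To train the model, we will use the MNIST database of handwritten digits, with a training set of 55,000 examples and a test set of 10,000 examples. (MNIST ships 60,000 training images; the 55,000 figure corresponds to the common split that holds out 5,000 of them for validation.)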
The network we are building in this exercise consists of a controller and the actual neural network that we are trying to optimize. The Controller is a TensorFlow RNN built from NAS cells, with special reinforcement learning methods for training and collecting rewards. We define the "reward" as the accuracy of the desired neural network and train the Controller to maximize it. The Controller generates Actions that modify the architecture of the CNN. Specifically, an Action can modify, per layer: filters (the dimensionality of the output space), kernel_size (an integer specifying the length of the 1D convolution window), pool_size (an integer representing the size of the pooling window), and dropout_rate.
All convolutions employ Rectified Linear Units (ReLU) nonlinearity. Weights were initialized by the Xavier initialization algorithm.
For the Controller, we built a policy network based on NASCell. This network takes as inputs the current state (in this task, state and action are the same thing) and the maximum number of layers to search, and outputs a new Action to update the desired neural network. If, for some reason, NASCell is not available, you can use any RNNCell.
To allow hyperparameter tuning we put our code into a Reinforce class.
To instantiate the class we then pass the following arguments:
sess and optimizer: the TensorFlow session and optimizer, which are initialized separately.
Of course, we must also create variables and placeholders, consisting of logits and gradients. To do this, let's write a method create_variables:
After computing the initial gradients, we launch the gradient descent method. Now let's take a look at how reinforcement learning is implemented.
First, we multiply each gradient value by the discounted reward.
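The scaling step can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the code from the repository: the function names `discount_rewards` and `scale_gradients` and the default `discount_factor` are assumptions made for illustration.

```python
def discount_rewards(rewards, discount_factor=0.99):
    """Return the discounted return G_t = r_t + discount_factor * G_{t+1} for each step."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount_factor * running
        returns[t] = running
    return returns


def scale_gradients(gradients, discounted_reward):
    """Multiply every gradient value by the discounted reward (hypothetical helper)."""
    return [g * discounted_reward for g in gradients]


if __name__ == "__main__":
    print(discount_rewards([1.0, 1.0, 1.0], discount_factor=0.5))
    print(scale_gradients([0.5, -2.0], 2.0))
```

A high reward amplifies the gradients that produced the action, so the policy drifts toward architectures that scored well; a low reward shrinks the update.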
After defining the variables, we initialize them in the TensorFlow graph at the end of __init__:
Every Action depends on the previous state, but sometimes, for more effective training, we can generate random actions to avoid local minima.
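One way to mix random actions into the policy's output is sketched below in plain Python. This is an illustration under assumptions, not the repository's code: the name `choose_action`, the `exploration` probability, and the sampling range `low`..`high` are all hypothetical.

```python
import random


def choose_action(policy_action, exploration=0.3, low=1, high=35, rng=random):
    """Return the policy's action, or a random one with probability `exploration`."""
    if rng.random() < exploration:
        # Exploration: ignore the policy and sample every value uniformly.
        return [rng.randint(low, high) for _ in policy_action]
    # Exploitation: keep what the policy network proposed.
    return list(policy_action)


if __name__ == "__main__":
    # One layer described by four values, as an example input.
    print(choose_action([5, 32, 2, 1], exploration=0.5))
```

Setting `exploration` to 0.0 always follows the policy, and setting it to 1.0 always samples at random; values in between trade off the two.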
In each cycle, our network will generate an Action, get rewards and after that, take a training step.
The implementation of the training step includes store_rollout and train_step methods below:
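To make the bookkeeping concrete, here is a small plain-Python sketch of a rollout store. Only the method name `store_rollout` comes from the text above; the class name `RolloutBuffer`, the attribute names, and the `last` helper are assumptions for illustration, and the real `train_step` would feed the returned values to the TensorFlow graph.

```python
class RolloutBuffer:
    """Keeps the states and rewards seen so far, in the order they occurred."""

    def __init__(self):
        self.state_buffer = []
        self.reward_buffer = []

    def store_rollout(self, state, reward):
        """Record one (state, reward) pair after an episode."""
        self.state_buffer.append(list(state))
        self.reward_buffer.append(float(reward))

    def last(self, steps_count):
        """Return the most recent `steps_count` states and rewards for a training step."""
        if steps_count <= 0:
            raise ValueError("steps_count must be positive")
        return self.state_buffer[-steps_count:], self.reward_buffer[-steps_count:]


if __name__ == "__main__":
    buf = RolloutBuffer()
    buf.store_rollout([5, 32, 2, 1], 0.91)
    buf.store_rollout([3, 64, 2, 1], 0.95)
    print(buf.last(1))
```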
As mentioned above, we need to define rewards for each Action/State.
This is accomplished by generating a new CNN with a new architecture per Action, training it, and assessing its accuracy. Since this process generates a lot of CNNs, let's write a manager for it:
Then we form a batch of hyperparameters for every layer in "action" and create cnn_drop_rate, a list of dropout rates for every layer.
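The batching step amounts to cutting a flat action into fixed-size groups, one per layer. The sketch below assumes four values per layer with the dropout rate in the last position; that ordering, the function name `split_action`, and the sample numbers are assumptions for illustration rather than the repository's exact layout.

```python
def split_action(action, params_per_layer=4):
    """Split a flat action into per-layer hyperparameter groups and dropout rates."""
    if len(action) % params_per_layer != 0:
        raise ValueError("action length must be a multiple of params_per_layer")
    # One group of hyperparameters per layer.
    layers = [
        action[i:i + params_per_layer]
        for i in range(0, len(action), params_per_layer)
    ]
    # Assumed layout: the dropout rate is the last value in each group.
    cnn_drop_rate = [layer[-1] for layer in layers]
    return layers, cnn_drop_rate


if __name__ == "__main__":
    layers, cnn_drop_rate = split_action([5, 32, 2, 0.3, 3, 64, 2, 0.5])
    print(layers)
    print(cnn_drop_rate)
```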
Here we define a convolutional neural model with the CNN class. It can be any class that is able to generate a neural model from an action.
We created a separate container to avoid confusion in TF graph.
After creating a new CNN model, we can train it and get a reward.
As defined, the reward is the accuracy on the entire test set; for MNIST, that is 10,000 examples.
Now that we have everything in place, let's find the best architecture for MNIST. First, we will optimize the architecture for the number of layers. Let's set the maximum number of layers to 2. Of course, you can set this value higher, but every layer needs a lot of computing power.
We couldn't be sure about what we should feed to our policy network. First, we tried always feeding an array of 1.0 to our RNN per episode, but it yielded no results. Then we tried feeding every new state per episode, and it resulted in a good architecture. We concluded that the first state can be any non-zero array; to expedite finding a suitable architecture, we set the first state to: [[10.0, 128.0, 1.0, 1.0]*args.max_layers]
We update the weights after every episode; otherwise, our calculations would be useless. That's why our "batch size" for reinforce is 1.
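The overall search loop can be sketched in plain Python as follows. This is a simplified outline under assumptions, not the repository's training script: `run_search` and its callable parameters `get_action`, `get_reward`, and `train_step` are hypothetical stand-ins for the controller and the network manager. Note that the list multiplication in the first state repeats the four values once per layer inside a single row.

```python
def initial_state(max_layers):
    """First state fed to the policy network: four values repeated per layer."""
    return [[10.0, 128.0, 1.0, 1.0] * max_layers]


def run_search(get_action, get_reward, train_step, max_layers, episodes):
    """Run the search: one action, one reward, and one weight update per episode."""
    state = initial_state(max_layers)
    history = []
    for _ in range(episodes):
        action = get_action(state)    # the controller proposes an architecture
        reward = get_reward(action)   # train the child CNN and measure accuracy
        train_step(state, reward)     # update after every episode (batch size 1)
        history.append((action, reward))
        state = action                # the new state is the action just taken
    return history


if __name__ == "__main__":
    # Dummy stand-ins so the loop can be exercised without TensorFlow.
    demo = run_search(
        get_action=lambda s: [[v + 1.0 for v in s[0]]],
        get_reward=lambda a: 0.5,
        train_step=lambda s, r: None,
        max_layers=2,
        episodes=3,
    )
    print(len(demo))
```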
After 100 cycles, we get the following architecture:
Now that we've trained our "NAS model" on the MNIST dataset, we can compare the architecture our AI has created with architectures created manually. For comparable results, we will use a popular Convolutional Neural Network (CNN) architecture for MNIST [it's not the state-of-the-art architecture, but it's good for comparison]:
All weights were initialized by the Xavier algorithm.
We trained our models for 10 epochs and got an accuracy of 0.9987 for our "NAS model", compared to 0.963 for the popular manually defined neural network architecture.
We have presented a code example of a simple implementation that automates the design of machine learning models and:
Going forward, we will continue working on careful analysis and testing of these machine-generated architectures to help refine our understanding of them. Naturally, if we search over more parameters using our model, we'll achieve better results for MNIST, but more importantly, this simple example illustrates how this approach can be applied to problems that are much more complicated.
We built this model using some assumptions that are quite difficult to justify. If you notice any mistakes, please report them in the issues on GitHub.