AutoML refers to techniques for automatically discovering the best-performing model for a given dataset.
When applied to neural networks, this involves discovering both the model architecture and the hyperparameters used to train the model, generally referred to as neural architecture search.
AutoKeras is an open-source library for performing AutoML for deep learning models. The search is carried out using so-called Keras models by means of the TensorFlow tf.keras API.
It provides a simple and effective approach for automatically finding top-performing models for a wide range of predictive modeling tasks, including tabular or so-called structured classification and regression datasets.
In this tutorial, you will discover how to use AutoKeras to find good neural network models for classification and regression tasks.
After finishing this tutorial, you will understand:
- AutoKeras is an implementation of AutoML for deep learning that uses neural architecture search.
- How to use AutoKeras to find a top-performing model for a binary classification dataset.
- How to use AutoKeras to find a top-performing model for a regression dataset.
Let’s get going.
Tutorial Overview
This tutorial is divided into 3 parts; they are:
- AutoKeras for Deep Learning
- AutoKeras for Classification
- AutoKeras for Regression
AutoKeras for Deep Learning
Automated Machine Learning, or AutoML for short, refers to automatically discovering the best combination of data preparation, model, and model hyperparameters for a predictive modeling problem.
The benefit of AutoML is allowing machine learning practitioners to quickly and effectively address predictive modeling tasks with very little input, e.g. fire and forget.
Automated Machine Learning (AutoML) has become a very important research topic with wide applications of machine learning techniques. The goal of AutoML is to enable people with limited machine learning background knowledge to use machine learning models easily.
— Auto-keras: An efficient neural architecture search system, 2019.
AutoKeras is an implementation of AutoML for deep learning models using the Keras API, specifically the tf.keras API provided by TensorFlow 2.
It uses a process of searching through neural network architectures to best address a modeling task, referred to more generally as Neural Architecture Search, or NAS for short.
… we have developed a widely adopted open-source AutoML system based on our proposed method, namely Auto-Keras. It is an open-source AutoML system, which can be downloaded and installed locally.
— Auto-keras: An efficient neural architecture search system, 2019.
In the spirit of Keras, AutoKeras provides an easy-to-use interface for different tasks, such as image classification, structured data classification or regression, and more. The user is only required to specify the location of the data and the number of models to try and is returned a model that achieves the best performance (under the configured constraints) on that dataset.
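To make the "number of models to try" idea concrete, here is a toy sketch of the outer loop of a search. It uses plain random search over a made-up configuration space with a made-up scoring function, so it is not the algorithm AutoKeras uses; the names toy_architecture_search and hypothetical_validation_score are invented for illustration only.

```python
import random


def hypothetical_validation_score(config, rng):
    # Made-up scoring rule so the sketch runs without training anything.
    base = 0.6 + 0.05 * config["num_layers"] - 0.1 * abs(config["dropout"] - 0.25)
    return base + rng.uniform(-0.05, 0.05)


def toy_architecture_search(max_trials, seed=1):
    """Toy random search: try max_trials candidates and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    history = []
    for _ in range(max_trials):
        # Sample a hypothetical candidate architecture.
        config = {
            "num_layers": rng.choice([1, 2, 3]),
            "units": rng.choice([32, 64, 128, 256, 512]),
            "dropout": rng.choice([0.0, 0.25, 0.5]),
        }
        # Stand-in for "train the candidate and measure its validation score".
        score = hypothetical_validation_score(config, rng)
        history.append((config, score))
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, history


best_config, best_score, history = toy_architecture_search(max_trials=15)
print(len(history), best_config, round(best_score, 3))
```

The only part that carries over to the real library is the shape of the loop: a fixed budget of trials goes in, and the single best candidate found within that budget comes out.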
Note: AutoKeras provides a TensorFlow 2 Keras model (e.g. tf.keras) and not a standalone Keras model. As such, the library assumes that you have Python 3 and TensorFlow 2.1 or higher installed.
To install AutoKeras, you can use pip, as follows:
    sudo pip install autokeras
You can confirm the installation was successful and check the version number as follows:
    sudo pip show autokeras
You should see output like the following:
    Name: autokeras
    Version: 1.0.1
    Summary: AutoML for deep learning
    Home-page: http://autokeras.com
    Author: Data Analytics at Texas A&M (DATA) Lab, Keras Team
    Author-email: jhfjhfj1@gmail.com
    License: MIT
    Location: …
    Requires: scikit-learn, packaging, pandas, keras-tuner, numpy
    Required-by:
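If you would rather check from inside Python, a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later) is shown below. The helper name installed_version is my own and is not part of AutoKeras.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("autokeras")
print("autokeras version:", version if version is not None else "not installed")
```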
Once installed, you can then apply AutoKeras to find a good or great neural network model for your predictive modeling task.
We will take a look at two common examples where you may want to use AutoKeras: classification and regression on tabular data, so-called structured data.
AutoKeras for Classification
AutoKeras can be used to discover a good or great model for classification tasks on tabular data.
Recall that tabular data are those datasets composed of rows and columns, such as a table or data as you would see in a spreadsheet.
In this section, we will develop a model for the Sonar classification dataset for classifying sonar returns as rocks or mines. This dataset consists of 208 rows of data with 60 input features and a target class label of 0 (rock) or 1 (mine).
A naive model can achieve a classification accuracy of about 53.4 percent via repeated 10-fold cross-validation, which provides a lower bound. A good model can achieve an accuracy of about 88.2 percent, providing an upper bound.
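To see where a figure like 53.4 percent can come from, here is a small sketch of a majority-class baseline. The class counts used (111 mines, 97 rocks) are my assumption about the sonar data and are not stated above, and the sketch scores the whole dataset at once rather than using repeated cross-validation.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Assumed class counts for the sonar data: 111 mines (M) and 97 rocks (R).
labels = ["M"] * 111 + ["R"] * 97
print("Baseline accuracy: %.3f" % majority_class_accuracy(labels))
```

Any model worth keeping should beat this number comfortably on held-out data.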
You can learn more about the dataset here:
No need to download the dataset; we will download it automatically as part of the example.
First, we can download the dataset and split it into a randomly selected train and test set, holding 33 percent for test and using 67 percent for training.
The complete example is listed below.
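    # load the sonar dataset
    from pandas import read_csv
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import LabelEncoder
    # load dataset
    url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
    dataframe = read_csv(url, header=None)
    print(dataframe.shape)
    # split into input and output elements
    data = dataframe.values
    X, y = data[:, :-1], data[:, -1]
    print(X.shape, y.shape)
    # basic data preparation
    X = X.astype('float32')
    y = LabelEncoder().fit_transform(y)
    # separate into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
    print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)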
Running the example first downloads the dataset and summarizes the shape, showing the expected number of rows and columns.
The dataset is then split into input and output elements, then these elements are further split into train and test datasets.
    (208, 61)
    (208, 60) (208,)
    (139, 60) (69, 60) (139,) (69,)
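The 139 and 69 row counts follow from the 33 percent test fraction. As a quick check, the sketch below reproduces the arithmetic, assuming the test portion is rounded up and the training set gets the remainder (which matches the shapes reported above). The helper name split_sizes is my own.

```python
import math


def split_sizes(n_rows, test_size):
    """Row counts for a fractional test_size: test is rounded up, train gets the rest."""
    n_test = math.ceil(test_size * n_rows)
    n_train = n_rows - n_test
    return n_train, n_test


print(split_sizes(208, 0.33))
```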
We can use AutoKeras to automatically discover an effective neural network model for this dataset.
This can be achieved by using the StructuredDataClassifier class and specifying the number of models to search. This defines the search to perform.
    # define the search
    search = StructuredDataClassifier(max_trials=15)
We can then execute the search using our loaded dataset.
    # perform the search
    search.fit(x=X_train, y=y_train, verbose=0)
This may take a few minutes and will report the progress of the search.
Next, we can evaluate the model on the test dataset to see how it performs on new data.
    # evaluate the model
    loss, acc = search.evaluate(X_test, y_test, verbose=0)
    print('Accuracy: %.3f' % acc)
We then use the model to make a prediction for a new row of data.
    # use the model to make a prediction
    row = [0.0200,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,0.1609,0.1582,0.2238,0.0645,0.0660,0.2273,0.3100,0.2999,0.5078,0.4797,0.5783,0.5071,0.4328,0.5550,0.6711,0.6415,0.7104,0.8080,0.6791,0.3857,0.1307,0.2604,0.5121,0.7547,0.8537,0.8507,0.6692,0.6097,0.4943,0.2744,0.0510,0.2834,0.2825,0.4256,0.2641,0.1386,0.1051,0.1343,0.0383,0.0324,0.0232,0.0027,0.0065,0.0159,0.0072,0.0167,0.0180,0.0084,0.0090,0.0032]
    X_new = asarray([row]).astype('float32')
    yhat = search.predict(X_new)
    print('Predicted: %.3f' % yhat[0])
We can retrieve the final model, which is an instance of a TensorFlow Keras model.
    # get the best performing model
    model = search.export_model()
We can then summarize the structure of the model to see what was selected.
    # summarize the loaded model
    model.summary()
Finally, we can save the model to file for later use, which can be loaded using the TensorFlow load_model() function.
    # save the best performing model to file
    model.save('model_sonar.h5')
Tying this together, the complete example of applying AutoKeras to discover an effective neural network model for the Sonar dataset is listed below.
    # use autokeras to find a model for the sonar dataset
    from numpy import asarray
    from pandas import read_csv
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import LabelEncoder
    from autokeras import StructuredDataClassifier
    # load dataset
    url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
    dataframe = read_csv(url, header=None)
    print(dataframe.shape)
    # split into input and output elements
    data = dataframe.values
    X, y = data[:, :-1], data[:, -1]
    print(X.shape, y.shape)
    # basic data preparation
    X = X.astype('float32')
    y = LabelEncoder().fit_transform(y)
    # separate into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
    print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
    # define the search
    search = StructuredDataClassifier(max_trials=15)
    # perform the search
    search.fit(x=X_train, y=y_train, verbose=0)
    # evaluate the model
    loss, acc = search.evaluate(X_test, y_test, verbose=0)
    print('Accuracy: %.3f' % acc)
    # use the model to make a prediction
    row = [0.0200,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,0.1609,0.1582,0.2238,0.0645,0.0660,0.2273,0.3100,0.2999,0.5078,0.4797,0.5783,0.5071,0.4328,0.5550,0.6711,0.6415,0.7104,0.8080,0.6791,0.3857,0.1307,0.2604,0.5121,0.7547,0.8537,0.8507,0.6692,0.6097,0.4943,0.2744,0.0510,0.2834,0.2825,0.4256,0.2641,0.1386,0.1051,0.1343,0.0383,0.0324,0.0232,0.0027,0.0065,0.0159,0.0072,0.0167,0.0180,0.0084,0.0090,0.0032]
    X_new = asarray([row]).astype('float32')
    yhat = search.predict(X_new)
    print('Predicted: %.3f' % yhat[0])
    # get the best performing model
    model = search.export_model()
    # summarize the loaded model
    model.summary()
    # save the best performing model to file
    model.save('model_sonar.h5')
Running the example will report a lot of debug information about the progress of the search.
The models and results are all saved in a folder called “structured_data_classifier” in your current working directory.
The best-performing model is then evaluated on the hold-out test dataset.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the model achieved a classification accuracy of about 82.6 percent.
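Following the note above, one simple way to compare runs is to record the accuracy from each and report the mean and standard deviation. The sketch below does this with the standard library; the five accuracy values are made up for illustration and are not results from AutoKeras.

```python
from statistics import mean, stdev

# Hypothetical accuracies from five separate runs of the search (made-up values).
accuracies = [0.826, 0.797, 0.841, 0.812, 0.826]

print("Accuracy: %.3f (+/- %.3f) over %d runs"
      % (mean(accuracies), stdev(accuracies), len(accuracies)))
```

With only 69 test rows, a difference of one or two correct predictions moves the accuracy by a percentage point or more, so the spread is worth reporting alongside the mean.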
Next, the architecture of the best-performing model is reported.
We can see a model with two hidden layers with dropout and ReLU activation.
    Model: "model"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    input_1 (InputLayer)         [(None, 60)]              0
    _________________________________________________________________
    categorical_encoding (Catego (None, 60)                0
    _________________________________________________________________
    dense (Dense)                (None, 256)               15616
    _________________________________________________________________
    re_lu (ReLU)                 (None, 256)               0
    _________________________________________________________________
    dropout (Dropout)            (None, 256)               0
    _________________________________________________________________
    dense_1 (Dense)              (None, 512)               131584
    _________________________________________________________________
    re_lu_1 (ReLU)               (None, 512)               0
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 512)               0
    _________________________________________________________________
    dense_2 (Dense)              (None, 1)                 513
    _________________________________________________________________
    classification_head_1 (Sigmo (None, 1)                 0
    =================================================================
    Total params: 147,713
    Trainable params: 147,713
    Non-trainable params: 0
    _________________________________________________________________
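The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input-output pair plus one bias per output. The sketch below recomputes them from the layer widths in the summary; the helper name dense_params is my own.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for a fully connected layer."""
    return n_inputs * n_units + n_units


# Layer widths read from the summary above: 60 inputs -> 256 -> 512 -> 1 output.
widths = [60, 256, 512, 1]
per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
print(per_layer, sum(per_layer))
```

The ReLU, dropout, and sigmoid layers contribute no parameters, which is why the three dense layers account for the full total.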
AutoKeras for Regression
AutoKeras can also be used for regression tasks, that is, predictive modeling problems where a numeric value is predicted.
We will use the auto insurance dataset that involves predicting the total payment from claims given the total number of claims. The dataset has 63 rows and one input and one output variable.
A naive model can achieve a mean absolute error (MAE) of about 66 using repeated 10-fold cross-validation, providing a lower bound on expected performance. A good model can achieve a MAE of about 28, providing a performance upper bound.
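As a reminder of what MAE measures, the sketch below scores a naive model that always predicts the median of the training targets. This is one common naive strategy and is my assumption; the baseline quoted above may have been produced differently. The target values are made up and are not taken from the insurance dataset.

```python
from statistics import mean, median


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical target values (not from the auto insurance data).
y_train = [10.0, 20.0, 30.0, 100.0]
y_test = [15.0, 25.0, 80.0]

# Naive model: always predict the median of the training targets.
naive_prediction = median(y_train)
y_pred = [naive_prediction] * len(y_test)
print("Naive MAE: %.3f" % mean_absolute_error(y_test, y_pred))
```

Because MAE is an error, lower is better, so a useful model should score below the naive figure.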
You can learn more about this dataset here:
- Auto Insurance Dataset (auto-insurance.csv)
- Auto Insurance Dataset (auto-insurance.names)
We can load the dataset and split it into input and output elements and then train and test datasets.
The complete example is listed below.
    # load the auto insurance dataset
    from pandas import read_csv
    from sklearn.model_selection import train_test_split
    # load dataset
    url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
    dataframe = read_csv(url, header=None)
    print(dataframe.shape)
    # split into input and output elements
    data = dataframe.values
    data = data.astype('float32')
    X, y = data[:, :-1], data[:, -1]
    print(X.shape, y.shape)
    # separate into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
    print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
Running the example loads the dataset, confirming the number of rows and columns, then splits the dataset into train and test sets.
    (63, 2)
    (63, 1) (63,)
    (42, 1) (21, 1) (42,) (21,)
AutoKeras can be applied to a regression task using the StructuredDataRegressor class and configured for the number of models to trial.
    # define the search
    search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
The search can then be run and the best model saved, much like in the classification case.
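    # define the search
    search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
    # perform the search
    search.fit(x=X_train, y=y_train, verbose=0)
We can then use the best-performing model and evaluate it on the hold-out dataset, make a prediction on new data, and summarize its structure.
    # evaluate the model
    mae, _ = search.evaluate(X_test, y_test, verbose=0)
    print('MAE: %.3f' % mae)
    # use the model to make a prediction
    X_new = asarray([[108]]).astype('float32')
    yhat = search.predict(X_new)
    print('Predicted: %.3f' % yhat[0])
    # get the best performing model
    model = search.export_model()
    # summarize the loaded model
    model.summary()
    # save the best performing model to file
    model.save('model_insurance.h5')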
Tying this together, the complete example of using AutoKeras to discover an effective neural network model for the auto insurance dataset is listed below.
    # use autokeras to find a model for the insurance dataset
    from numpy import asarray
    from pandas import read_csv
    from sklearn.model_selection import train_test_split
    from autokeras import StructuredDataRegressor
    # load dataset
    url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
    dataframe = read_csv(url, header=None)
    print(dataframe.shape)
    # split into input and output elements
    data = dataframe.values
    data = data.astype('float32')
    X, y = data[:, :-1], data[:, -1]
    print(X.shape, y.shape)
    # separate into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
    print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
    # define the search
    search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
    # perform the search
    search.fit(x=X_train, y=y_train, verbose=0)
    # evaluate the model
    mae, _ = search.evaluate(X_test, y_test, verbose=0)
    print('MAE: %.3f' % mae)
    # use the model to make a prediction
    X_new = asarray([[108]]).astype('float32')
    yhat = search.predict(X_new)
    print('Predicted: %.3f' % yhat[0])
    # get the best performing model
    model = search.export_model()
    # summarize the loaded model
    model.summary()
    # save the best performing model to file
    model.save('model_insurance.h5')
Running the example will report a lot of debug info about the progress of the search.
The models and results are all saved in a folder called “structured_data_regressor” in your current working directory.
The best-performing model is then evaluated on the hold-out test dataset.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the model achieved a MAE of about 24.
Next, the architecture of the best-performing model is reported.
We can see a model with three hidden layers with ReLU activation.
    Model: "model"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    input_1 (InputLayer)         [(None, 1)]               0
    _________________________________________________________________
    categorical_encoding (Catego (None, 1)                 0
    _________________________________________________________________
    dense (Dense)                (None, 64)                128
    _________________________________________________________________
    re_lu (ReLU)                 (None, 64)                0
    _________________________________________________________________
    dense_1 (Dense)              (None, 512)               33280
    _________________________________________________________________
    re_lu_1 (ReLU)               (None, 512)               0
    _________________________________________________________________
    dense_2 (Dense)              (None, 128)               65664
    _________________________________________________________________
    re_lu_2 (ReLU)               (None, 128)               0
    _________________________________________________________________
    regression_head_1 (Dense)    (None, 1)                 129
    =================================================================
    Total params: 99,201
    Trainable params: 99,201
    Non-trainable params: 0
    _________________________________________________________________
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
- Automated machine learning, Wikipedia
- Neural architecture search, Wikipedia
- AutoKeras Homepage
- AutoKeras GitHub Project
- Auto-keras: An efficient neural architecture search system, 2019.
- Results for Standard Classification and Regression Machine Learning Datasets
Summary
In this tutorial, you discovered how to use AutoKeras to find good neural network models for classification and regression tasks.
Specifically, you learned:
- AutoKeras is an implementation of AutoML for deep learning that uses neural architecture search.
- How to use AutoKeras to find a top-performing model for a binary classification dataset.
- How to use AutoKeras to find a top-performing model for a regression dataset.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.