
SwiftDeepNeuralNetwork

SwiftDeepNeuralNetwork is a Swift library built on the SwiftMatrix library that aims to provide a step-by-step guide to Deep Neural Network implementation. It is largely based on code developed in the Neural Networks and Deep Learning course by Andrew Ng on Coursera.

SwiftDeepNeuralNetwork has been developed to show, in the simplest possible way, all the steps necessary to write Deep Learning code.


Installation

Using Xcode 12 or later, add a package dependency by selecting File > Swift Packages > Add Package Dependency... and entering the following URL: https://github.com/theolternative/SwiftDeepNeuralNetwork.git

SwiftDeepNeuralNetwork uses Swift 5 and the Accelerate framework, and depends on SwiftMatrix.
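Alternatively, if you manage dependencies through a Package.swift manifest, you can declare the package there. A minimal sketch (the from: "1.0.0" requirement is an assumption based on the v1.0 release listed below and may need adjusting to the actual tag; "MyApp" is a hypothetical target name):

// swift-tools-version:5.3
import PackageDescription

let package = Package(
    name: "MyApp",   // hypothetical project name
    dependencies: [
        .package(url: "https://github.com/theolternative/SwiftDeepNeuralNetwork.git", from: "1.0.0")
    ],
    targets: [
        .target(name: "MyApp", dependencies: ["SwiftDeepNeuralNetwork"])
    ]
)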

License

SwiftDeepNeuralNetwork is available under the MIT license. See the LICENSE file for more info.

Usage

You can check the Tests folder for examples. Currently the DNN supports binary, multiclass and multitask classification. The basic workflow is:

  • Define a training set X, Y where X is an n x m input matrix containing m examples of n features each, and Y is an r x m output matrix of labels. All values must be Double.
  • Create an instance of DeepNeuralNetwork passing an array of Int where the first element is the number of features n (it must match the number of rows of X), the last is always r (the number of rows of Y), and the elements in between specify the number of units per hidden layer. [2000, 20, 5, 1] means an input layer of 2000 features, 2 hidden layers of 20 and 5 units each, and an output layer of 1 unit.
  • Set learning_rate and num_iterations.
  • Train your model by calling L_layer_model_train().
  • Make predictions on a test set with L_layer_model_test(X_test, Y_test), where X_test must have size n x p and Y_test size r x p, with p >= 1.

// XOR example
import SwiftMatrix
import SwiftDeepNeuralNetwork

// X: 4 examples of 2 features each; ′ transposes it to a 2 x 4 matrix
let X = Matrix([ [0,0], [0,1], [1,0], [1,1] ])′
// Y: expected XOR output for each example (1 x 4)
let Y = Matrix([[ 0, 1, 1, 0 ]])
let dnn = DeepNeuralNetwork(layerDimensions: [2, 2, 1], X: X, Y: Y)
dnn.learning_rate = 0.15
dnn.num_iterations = 1000
let (accuracy, costs) = dnn.L_layer_model_train(nil)
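Once trained, the model can be evaluated on a test set as described above. A minimal continuation of the XOR example (the training data is reused as test data here for brevity; the return value of L_layer_model_test is an assumption, so check the Tests folder for reference usage):

// Evaluate the trained network; return value assumed to be an accuracy score
let testAccuracy = dnn.L_layer_model_test(X, Y)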

Initialize

You can initialize a new model as

let dnn = DeepNeuralNetwork( layerDimensions: [Int], X: Matrix, Y: Matrix, type: Classification )

where layerDimensions is an array of Int specifying the number of units per layer, X is the features matrix, Y is the labels matrix, and type is one of .binary, .multiclass or .multitask; if omitted it falls back to .binary. Suppose you're building a model to identify whether a set of images contains food. In .binary classification you train your model to identify one kind of food (e.g. pizza) in each image. In this case you build a separate model for each food you want to identify, and the Y matrix has only 1 row.

  • Model 1 for "Pizza" -> Y[0,i] = 1.0 if the image identified by features X[:,i] is a pizza, 0.0 otherwise
  • Model 2 for "Lasagna" -> Y[0,i] = 1.0 if the image identified by features X[:,i] is a lasagna, 0.0 otherwise, etc.

In .multiclass classification you train your model to identify one kind of food in each image among a set of different foods (e.g. pizza, lasagna, hamburger). In this case you build a single model for all the foods you want to identify. The Y matrix has as many rows as the number of foods to be identified ("classes"): for example, the first row can be labeled "Pizza", the second "Lasagna", the third "Hamburger". Only one row per column ("example") can be 1.0 (the food is present in the image); all others must be 0.0 (the food is not present).

Y[:,i]   = [[1.0], [0.0], [0.0]] // i-th image is a pizza
Y[:,i+1] = [[0.0], [0.0], [1.0]] // (i+1)-th image is a hamburger

In .multitask classification you train your model to identify multiple kinds of food (e.g. pizza, lasagna, hamburger) in each image. In this case you build a single model for all the foods you want to identify. The Y matrix has as many rows as the number of foods to be identified ("classes"): for example, the first row can be labeled "Pizza", the second "Lasagna", the third "Hamburger". Possible values are 1.0 (the food is present), 0.0 (the food is not present) and -1.0 (the food was not labeled, meaning we don't know whether it's present or not). Note that if all foods are present in the same image, all rows will be 1.0.

Y[:,i]   = [[1.0], [0.0], [0.0]]   // i-th image contains only pizza
Y[:,i+1] = [[1.0], [0.0], [1.0]]   // (i+1)-th image contains pizza and hamburger
Y[:,i+2] = [[-1.0], [1.0], [-1.0]] // (i+2)-th image contains lasagna; pizza and hamburger were not labeled

// Binary classification
let dnn1 = DeepNeuralNetwork( layerDimensions: [10000,20,5,1], X: X_train, Y: Y_train) 

// Multiclass classification
let dnn2 = DeepNeuralNetwork( layerDimensions: [10000,200,100,20,4], X: X_train1, Y: Y_train1, type: .multiclass )

// Multitask classification
let dnn3 = DeepNeuralNetwork( layerDimensions: [10000,200,100,20,4], X: X_train1, Y: Y_train1, type: .multitask )


Setting up training

The following hyperparameters can be set before training.

Learning rate

Name: learning_rate Type: Double Default value: 0.0075

The learning rate determines the overall speed of gradient descent. Low values (< 0.001) make training slower but tend to give better accuracy, while high values make training faster but can lead to worse accuracy or instability.

dnn.learning_rate = 0.15

Hint: start with a low value (0.0001) and increase it by an order of magnitude at a time if training is too slow.

Number of epochs

Name: num_iterations Type: Int Default value: 2500

This hyperparameter controls how many iterations of gradient descent are performed. Low values make training faster but can give worse accuracy, while high values give better accuracy with training time as the trade-off.

dnn.num_iterations = 2500

Initialization type

Name: weigth_init_type Type: Initialization Default value: .random

Weights matrices must be initialized with random values for training to work. The following types are available:

Value     Description
.zeros    Values set to 0 (only for debugging)
.random   Random values in range -1.0...1.0
.xavier   Xavier initialization
.he       He initialization
.custom   Random values in -1.0...1.0 multiplied by factor custom_weight_factor

dnn.weigth_init_type = .he
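For reference, Xavier initialization scales the random weights of layer l by sqrt(1/n[l-1]) and He initialization by sqrt(2/n[l-1]), where n[l-1] is the number of units in the previous layer (standard formulations from the course this library is based on; He initialization usually works best with ReLU activations).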

Regularization

Two types of regularization are provided to reduce variance between train and test sets: L2 and Dropout.

L2 Regularization

Name: λ Type: Double Default value: 0.0

If λ is set to a value greater than 0.0, L2 regularization is applied.
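For reference, L2 regularization adds a weight-decay term to the cross-entropy cost J (the standard formulation from the course this library is based on): J_L2 = J + λ/(2m) · Σ_l ‖W[l]‖², where m is the number of examples and the sum of squared entries runs over every layer's weight matrix.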

dnn.λ = 0.7

Dropout regularization

Name: keep_prob Type: Double Default value: 1.0

Dropout regularization randomly shuts off units during training, so that each one is kept with probability keep_prob. If a value less than 1.0 is set for keep_prob, dropout regularization is applied.

dnn.keep_prob = 0.85 // roughly 15% of units are zeroed at each iteration
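Conceptually, dropout draws a random mask and scales the surviving activations by 1/keep_prob so their expected values are unchanged. A minimal sketch of the standard ("inverted dropout") technique, not the library's internal code:

import Foundation

// Apply inverted dropout to a vector of activations (illustrative only).
func applyDropout(_ activations: [Double], keepProb: Double) -> [Double] {
    return activations.map { a in
        // Keep each unit with probability keepProb, zero it otherwise.
        let keep = Double.random(in: 0..<1) < keepProb
        // Scale kept units by 1/keepProb so expected values are unchanged.
        return keep ? a / keepProb : 0.0
    }
}

let hidden = [0.3, -1.2, 0.8, 0.5]
let dropped = applyDropout(hidden, keepProb: 0.85) // ~15% of units zeroed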

Optimization type

Name: optimization_type Type: Optimization Default value: .gradientdescent

The following choices are available:

Value             Description
.gradientdescent  Gradient descent
.momentum         Gradient descent with momentum
.adam             Adam optimization method

For the .momentum optimization method the β factor can be set (default is 0.1). For the .adam optimization method the β1 and β2 factors can be set (defaults are 0.9 and 0.999).

dnn.optimization_type = .adam
dnn.β1 = 0.91
dnn.β2 = 0.98
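As a reference for what these factors control, here is the standard Adam update for a single scalar parameter (an illustrative sketch of the textbook algorithm, not the library's internal code):

import Foundation

// Running state for one parameter being optimized with Adam.
struct AdamState {
    var v = 0.0   // first moment: momentum-like average of gradients (β1)
    var s = 0.0   // second moment: average of squared gradients (β2)
    var t = 0     // step counter, used for bias correction
}

// One Adam step: w is updated in place from its gradient g.
func adamStep(parameter w: inout Double, gradient g: Double,
              state: inout AdamState, learningRate: Double,
              β1: Double = 0.9, β2: Double = 0.999, ε: Double = 1e-8) {
    state.t += 1
    state.v = β1 * state.v + (1 - β1) * g        // momentum term
    state.s = β2 * state.s + (1 - β2) * g * g    // squared-gradient term
    let vHat = state.v / (1 - pow(β1, Double(state.t)))  // bias-corrected
    let sHat = state.s / (1 - pow(β2, Double(state.t)))
    w -= learningRate * vHat / (sHat.squareRoot() + ε)
}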

Batch type

Name: batch_type Type: Batch Default value: .batch

Batch type determines how many examples are computed at a time.

Value        Description
.batch       All examples at a time
.minibatch   mini_batch_size examples at a time
.stochastic  1 example at a time

dnn.batch_type = .minibatch
dnn.mini_batch_size = 64

Hint: stochastic gradient descent is very slow per epoch but can reach better accuracy after fewer epochs. When the number of examples grows (> 1000), training times get longer and the minibatch approach should be the default choice.
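For intuition, mini-batching simply walks the m example columns of X and Y in chunks of mini_batch_size, performing one gradient-descent update per chunk (a sketch of the idea; the library handles this internally):

// Split m examples into minibatch column ranges (illustrative only).
let m = 1000            // number of training examples
let miniBatchSize = 64
var batches: [Range<Int>] = []
var start = 0
while start < m {
    let end = min(start + miniBatchSize, m)
    batches.append(start..<end)   // columns [start, end) of X and Y
    start = end
}
// batches.count == 16: fifteen batches of 64 plus one final batch of 40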


Releases

v1.0 - 2021-01-09T13:49:26

Initial working release