AI & Deep Learning - Technical Terminology
Activation function : function applied by a neuron to its weighted inputs to determine whether it is activated or not, and to deliver an output to the next neurons. Common activation functions include the sigmoid, hyperbolic tangent (tanh), softmax, rectified linear unit (ReLU) and exponential linear unit (ELU).
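As a minimal sketch, the common activation functions listed above can be written in a few lines of NumPy (the input values and the ELU alpha of 1.0 are arbitrary choices for illustration):

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])       # arbitrary pre-activation values

sigmoid = 1.0 / (1.0 + np.exp(-x))              # squashes values into (0, 1)
tanh    = np.tanh(x)                            # squashes values into (-1, 1)
relu    = np.maximum(0.0, x)                    # zero for negative inputs, identity otherwise
elu     = np.where(x > 0, x, np.exp(x) - 1.0)   # smooth variant of ReLU (alpha = 1)
softmax = np.exp(x) / np.exp(x).sum()           # turns a vector into a probability distribution
```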
Batch-size : size of the subset of the training data that is processed together in a single weight update when training the model (see the mini-batch loop in the Epoch sketch below)
CNN : “Convolutional neural network”, a type of deep neural network that applies convolutions, i.e. slides small filters over subsets of an input matrix, combined with an activation mechanism, to generate reshaped results (feature maps)
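A minimal sketch of the convolution step, written in NumPy with an arbitrary 5x5 input and 2x2 filter (deep learning libraries implement this far more efficiently, and strictly speaking compute a cross-correlation):

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the filter over every position where it fully fits ("valid" convolution)
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 input matrix
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 filter
feature_map = conv2d_valid(image, kernel)          # 4x4 reshaped result (feature map)
```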
Dropout : regularization technique where some neurons are randomly ‘deactivated’ during training, independently of their inputs
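A minimal sketch of (“inverted”) dropout at training time, assuming an arbitrary dropout rate of 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(size=(4, 8))            # outputs of one layer for a batch of 4 examples

p_drop = 0.5                                     # hypothetical dropout rate
mask = rng.random(activations.shape) >= p_drop   # randomly deactivate ~50% of the neurons
dropped = activations * mask / (1.0 - p_drop)    # rescale the kept units so the expected value is unchanged
```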
Epoch : one complete cycle of learning through the full training set, which includes the forward propagation, the loss computation and the backpropagation used to adjust the weights of the model
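As an illustrative sketch (a toy linear model trained with mini-batch gradient descent; all values are arbitrary), one epoch corresponds to the outer loop below, and the batch-size to the size of the slices in the inner loop:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                                   # toy training set
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w = np.zeros(3)
learning_rate, batch_size, n_epochs = 0.1, 20, 50               # hypothetical hyper-parameters

for epoch in range(n_epochs):                                   # one epoch = one full pass over the data
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):                  # iterate over mini-batches
        batch = indices[start:start + batch_size]
        y_hat = X[batch] @ w                                    # forward propagation
        error = y_hat - y[batch]
        loss = np.mean(error ** 2)                              # loss computation (mean squared error)
        grad = 2.0 * X[batch].T @ error / len(batch)            # gradient of the loss w.r.t. the weights
        w -= learning_rate * grad                               # weight update (the backpropagation step)
```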
Exploding gradient : problem encountered in gradient descent optimization causing instability of the model when the error gradients accumulate and grow very large during backpropagation (see the numerical illustration under Vanishing gradient below)
GAN : “Generative Adversarial Networks”, an architecture that pits two neural networks against each other to improve the quality of the overall results: one network, called the “generator”, produces candidate outputs and competes with the “discriminator”, which tries to distinguish them from real data
GRU : “Gated Recurrent Unit”, gating mechanism used in some types of RNN that requires a relatively low number of parameters
Hyper-parameters : technical parameters of the AI model that are set before training rather than learned from the data. As various architectures or techniques might be used, the list is very broad. It includes, among others (list not exhaustive): the number of layers, the number of neurons per layer, the learning rate and forget rate, the activation function(s), the solver(s), the batch-size, the regularization parameters, the number of epochs, the loss function(s), the random state, the criteria for impurity measure, the splitter method, etc.
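As an illustration, a hedged sketch of how several of these hyper-parameters appear when configuring a small neural network with scikit-learn (the values chosen here are arbitrary):

```python
from sklearn.neural_network import MLPClassifier

# Hypothetical hyper-parameter choices; the argument names follow the scikit-learn MLPClassifier API
model = MLPClassifier(
    hidden_layer_sizes=(64, 32),   # number of hidden layers and neurons per layer
    activation="relu",             # activation function
    solver="adam",                 # solver
    batch_size=32,                 # batch-size
    learning_rate_init=0.001,      # learning rate
    alpha=0.0001,                  # L2 regularization parameter
    max_iter=200,                  # maximum number of epochs
    random_state=0,                # random state
)
```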
Layer : is the basic building block of a neural network. A neural network is made of (see the sketch after this list) :
- an input layer that contains the initial data fed to the neural network
- one or several hidden layers
- an output layer that produces the result for the given inputs
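A minimal sketch of these three kinds of layers as a NumPy forward pass (the layer sizes and random weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                        # input layer: one example with 4 features

W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)      # hidden layer: 8 neurons
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)      # output layer: 2 outputs

hidden = np.maximum(0.0, x @ W1 + b1)              # hidden layer with a ReLU activation
output = hidden @ W2 + b2                          # output layer producing the result for the given inputs
```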
Loss function : is the function used to evaluate the difference between the true value “y” and the estimated value “ŷ” (yhat) calculated by the model.
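Two common examples, sketched in NumPy with arbitrary values: the mean squared error for regression and the cross-entropy for classification:

```python
import numpy as np

y     = np.array([3.0, -0.5, 2.0])     # true values
y_hat = np.array([2.5,  0.0, 2.0])     # values estimated by the model
mse = np.mean((y - y_hat) ** 2)        # mean squared error, common for regression

p_true = np.array([1.0, 0.0, 0.0])     # one-hot encoded true class
p_hat  = np.array([0.7, 0.2, 0.1])     # predicted class probabilities
cross_entropy = -np.sum(p_true * np.log(p_hat))   # cross-entropy, common for classification
```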
LSTM : “Long Short-Term Memory” network is a specific RNN architecture. LSTM embeds feedback connections that allow it to handle sequences of data (speech, video, time series, …), with learning and forget mechanisms
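A hedged usage sketch with PyTorch (an assumed library choice; the batch size, sequence length and feature sizes are arbitrary):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)   # hypothetical sizes
x = torch.randn(4, 10, 8)            # batch of 4 sequences, 10 time steps, 8 features each
output, (h_n, c_n) = lstm(x)         # output per time step, plus final hidden and cell states
print(output.shape)                  # torch.Size([4, 10, 16])
```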
Regularization : is a set of techniques that make slight modifications to the learning algorithm to improve model generalization and performance
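One of the most common examples is the L2 (ridge) penalty, sketched below with arbitrary values: a term proportional to the squared weights is added to the loss to discourage large weights (dropout, defined above, is another regularization technique):

```python
import numpy as np

w = np.array([0.5, -1.2, 2.0])     # model weights
data_loss = 0.8                    # hypothetical value of the unregularized loss
lam = 0.01                         # regularization strength (a hyper-parameter)

total_loss = data_loss + lam * np.sum(w ** 2)   # L2 penalty discourages large weights
```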
RNN : “Recurrent neural networks”, a class of neural network models where connections between neurons form a directed graph along a temporal sequence, and which embed a double mechanism of learning and forgetting to regulate the flow of information. It includes the GRU and LSTM architectures
SVM (or Support Vector Machine) : type of supervised machine learning algorithm, based on discriminant analysis, that generates a hyperplane to discriminate between classes of data.
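A hedged usage sketch with scikit-learn on a synthetic dataset (the kernel and C value are arbitrary choices):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=4, random_state=0)   # toy dataset
clf = SVC(kernel="linear", C=1.0)    # linear kernel; C controls the margin / error trade-off
clf.fit(X, y)                        # learns the separating hyperplane
print(clf.predict(X[:5]))
```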
Tree / Decision Tree : set of techniques used to classify data by determining the relevant features on which to split the data, in order to generate the most homogeneous subsets or “leaves”
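A hedged usage sketch with scikit-learn, showing the impurity criterion and splitter method mentioned among the hyper-parameters above (the dataset and depth are arbitrary choices):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(criterion="gini",    # impurity measure used to evaluate splits
                              splitter="best",     # splitter method
                              max_depth=3,
                              random_state=0)
tree.fit(X, y)
print(tree.feature_importances_)     # which features drive the splits towards homogeneous leaves
```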
Vanishing gradient : problem encountered in gradient descent optimization causing the inability of the model to reach an optimum when the error gradients tend towards zero, leading the model to stop learning (illustrated below, together with the exploding case)
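A simple numerical illustration of both gradient problems: backpropagation multiplies many local derivatives together, so factors consistently below 1 make the gradient vanish, while factors above 1 make it explode (the factors 0.5 and 1.5 are arbitrary):

```python
# Product of `depth` local derivatives, with factors below and above 1
small, large = 0.5, 1.5
for depth in (10, 50, 100):
    print(depth, small ** depth, large ** depth)
# 10   ~9.8e-04   ~5.8e+01
# 50   ~8.9e-16   ~6.4e+08   (vanishing vs. exploding)
# 100  ~7.9e-31   ~4.1e+17
```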