Multi-class Classification of Mathematical-Numbers | DNN+dropout | Part2

  • Training dataset of 60,000 images.
  • Testing dataset of 10,000 images.
  • Xtrain is the training data set.
  • Ytrain is the set of labels to all the data in Xtrain.
  • Xtest is the test data set.
  • Ytest is the set of labels to all the data in Xtest.
  • Training dataset of 60,000 images, stored in Xtrain.
  • Testing dataset of 10,000 images, stored in Xtest.
  • The input images are converted into a tensor of size (28 * 28), using the “reshape()” function.
  • Then, each image is normalised, in order to convert the pixel values from range (0 to 255) to range (0 to 1). Also, we shall be converting the values into float type values.
  • Reshape → We shall be reshaping each image in the test dataset from 2d (each image of size 28 * 28) into 1d (each image of size 784).
  • Normalisation → We then perform normalisation where each pixel value is converted from range of (0 to 255) to the range of (0 to 1).
  • Xtrain contains the 60,000 images, which we shall be using for training.
  • Ytrain contains the corresponding labels for 60,000 images in Xtrain dataset.
  • Xtest contains the 10,000 images, which we shall be using for testing/validation.
  • Ytest contains the corresponding labels for 10,000 images in Xtest dataset.
  • We shall be defining “Categorical Entropy” as the Cost-Function / Loss-Function.
  • The metrics used for evaluating the model is “accuracy”.
  • We are planning to use ADAM optimiser on the top of SGD (Standard Gradient Descent), in order to minimise the cost-function more robustly.
  • Epoch :- One Epoch is when an ENTIRE dataset is passed forward and backward through the neural network only ONCE. We usually need many epochs, in order to arrive at a optimal learning curve.
  • Batch size: Since we have limited memory, probably we shall not be able to process the entire training instances(i.e. 60,000 instances) all at one forward pass. So, what is commonly done is splitting up training instances into subsets (i.e., batches), performing one pass over the selected subset (i.e., batch), and then optimising the network through back-propagation.
  • Validation Split: It means that out of total dataset, 2% of data is set aside as the cross-validation-set.
  • The last element of the history object (model_fit.history[‘loss’])gives us the final loss after training process.
  • The last element of the history object (model_fit.history[‘acc’])gives us the final accuracy after training process.
  • Changing the number of hidden layers itself. (Note that, we have used 4 layers in total in aforesaid demonstration).
  • Changing the number of neurons in each of the involved layers (input/dense/output layer).
  • By playing on the value of the eTa i.e. Learning-Rate.
  • Choice of Activation functions at each layer.
  • By modifying the various optimisers like RMSProp, etc.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
aditya goel

aditya goel

Software Engineer for Big Data distributed systems