Sentiment Classification using TensorFlow || NLP with DL
In case you have landed here directly, it’s recommended to read through this blog first.
In this blog, we shall use the concept of word embeddings to perform Sentiment Analysis. We will train our own word embeddings using a simple Keras model for a sentiment classification task, through the following steps :-
- Downloading the data from TensorFlow Datasets.
- Segregating training and testing sentences & labels.
- Preparing the data as padded sequences.
- Defining our Keras model with an Embedding layer.
- Training the model and exploring the weights from the embedding layer.
- Classifying a fresh movie review as either positive or negative.
Question: Demonstrate the entire process of performing the Sentiment Analysis task.
Phase-1 : Data Downloading and Understanding :-
Step #1.) Let’s first import the necessary libraries :-
Here is the version of TensorFlow that we shall be using :-
Step #2.) Let’s now download the ready-made dataset of IMDB movie reviews :-
Step #3.) Let’s first understand the data that we have downloaded. This is a dictionary type of dataset. There are two major components to this dataset :-
- Training dataset.
- Testing dataset.
Step #4.) Let’s now segregate the training and test data :-
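As a minimal sketch of Steps #2 to #4, assuming the data comes from the imdb_reviews dataset of the tensorflow_datasets package (the helper name load_imdb_splits is hypothetical), the download and the train/test segregation could look like this:

```python
def load_imdb_splits():
    """Download the IMDB reviews dataset and return its train and test splits.

    Assumption: the data comes from the `imdb_reviews` dataset of the
    tensorflow_datasets package; the helper name is hypothetical.
    """
    # Imported here so the package is only required when data is actually fetched.
    import tensorflow_datasets as tfds

    # as_supervised=True yields (sentence, label) pairs instead of feature dicts.
    imdb, info = tfds.load("imdb_reviews", with_info=True, as_supervised=True)

    # `imdb` behaves like a dictionary keyed by split name.
    train_data, test_data = imdb["train"], imdb["test"]
    return train_data, test_data, info
```

Calling train_data, test_data, info = load_imdb_splits() downloads the data on first use and hands back the two splits along with the dataset metadata.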
Step #5.) Let’s investigate any one example in the training data-set :-
Note that the dataset (that we have downloaded) contains each sentence as well as its true label.
- A label value of ZERO (0) means that the sentiment of the sentence is negative.
- A label value of ONE (1) means that the sentiment of the sentence is positive.
Step #6.) We now create empty lists to store the sentences and labels :-
Step #7.) Let’s iterate over the train-data and test-data, extract the sentences and labels from them, and append these to the empty lists declared above.
Step #8.) We now verify that all the sentences have been appended to the lists declared above.
Step #9.) Now, we convert the training-labels list into a NumPy array :-
Step #10.) Similarly, we convert the test-labels list into a NumPy array :-
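The extraction and conversion steps above can be sketched as below. This is only an illustration: the helper names are hypothetical and the two reviews are made up, standing in for the real train-data. The helper accepts plain Python pairs as well as the tensors yielded by a TensorFlow dataset.

```python
import numpy as np


def to_python(value):
    """Unwrap a tf.Tensor (which exposes .numpy()); pass plain values through."""
    return value.numpy() if hasattr(value, "numpy") else value


def extract_sentences_and_labels(pairs):
    """Split an iterable of (sentence, label) pairs into a list and a NumPy array."""
    sentences, labels = [], []
    for sentence, label in pairs:
        text = to_python(sentence)
        # Reviews arrive as UTF-8 bytes when read from a TensorFlow dataset.
        if isinstance(text, bytes):
            text = text.decode("utf8")
        sentences.append(text)
        labels.append(int(to_python(label)))
    return sentences, np.array(labels)


# Tiny stand-in for train_data, using made-up reviews purely for illustration.
demo_pairs = [(b"A wonderful film", 1), (b"A dull, lifeless film", 0)]
demo_sentences, demo_labels = extract_sentences_and_labels(demo_pairs)
print(demo_sentences, demo_labels)
```

With the real data, the same helper would be called once on the training split and once on the test split.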
Phase-2 : Data Preparation :-
Step #1.) We now instantiate an object of the Tokenizer class and fit it on the corpus of training_sentences.
Note that a vocabulary size of 10,000 means that, when we obtain the train_seqs of our sentences, only the ids of the 10,000 most frequent words are returned. Rarer words are either dropped or mapped to an out-of-vocabulary token.
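Here is a small sketch of how the vocabulary size behaves, assuming the Keras Tokenizer class is the one in use. The sentences are made up, the vocabulary is shrunk to 5 so the effect is visible, and the "<OOV>" token is an assumption rather than something stated in this blog.

```python
import tensorflow as tf

vocab_size = 5        # the blog uses 10,000; kept tiny here for readability
oov_token = "<OOV>"   # assumption: placeholder for out-of-vocabulary words

# Made-up sentences standing in for training_sentences.
sentences = [
    "I loved this movie",
    "I hated this movie",
    "This movie was great",
]

tokenizer = tf.keras.preprocessing.text.Tokenizer(
    num_words=vocab_size, oov_token=oov_token
)
tokenizer.fit_on_texts(sentences)

# word_index holds every word seen in the corpus, ordered by frequency.
word_index = tokenizer.word_index
print(len(word_index), word_index)

# Only ids below vocab_size survive encoding; rarer words become the OOV id.
sequences = tokenizer.texts_to_sequences(sentences)
print(sequences)
```

Notice that word_index is larger than vocab_size: the cap applies when sentences are encoded, not when the dictionary is built.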
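Step #2.) Let’s understand our word_index. This is nothing other than our dictionary, which maps each word to its id :-
Step #3.) Let’s check the size of our word_index. Note that the size of our dictionary is 88,583 words, because word_index keeps every word seen in the corpus, even though only the 10,000 most frequent ones are used for encoding.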
Step #4.) Next, let’s convert our training corpus of 25,000 sentences into the corresponding word-encodings :-
Note that the length of training_sentences was 25,000 and the length of train_seqs is also 25,000.
Step #5.) Let’s investigate the first training sentence from our corpus and its corresponding word-encoded version :-
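The encoded sentences have different lengths, so they need to be padded before they can be fed to the model. A minimal sketch is below, assuming the Keras pad_sequences utility; the sentences are made up and max_length = 6 is an arbitrary assumption, since the padded length used in this blog is not stated.

```python
import tensorflow as tf

max_length = 6  # assumption: the blog does not state the padded length it uses

# Made-up sentences standing in for training_sentences.
sentences = [
    "I loved this movie",
    "This movie was far too long and rather dull",
]

tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=100, oov_token="<OOV>")
tokenizer.fit_on_texts(sentences)
train_seqs = tokenizer.texts_to_sequences(sentences)

# Pad short sequences with zeros and cut long ones, so every row has max_length ids.
train_padded = tf.keras.preprocessing.sequence.pad_sequences(
    train_seqs, maxlen=max_length, truncating="post"
)
print(train_seqs[0])
print(train_padded[0])
print(train_padded.shape)
```

By default the zeros are added at the front of a short sequence, while truncating="post" removes the extra words from the end of a long one.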
Phase-3 : Model preparation :-
Step #1.) We first create a Keras Sequential model :-
- The input layer is an Embedding layer, which maps each word id of the input sequence to a word embedding that is learned during training. One thing to note here is that we have defined a 16-dimensional embedding.
- Next, we have defined a Flatten operation. The Keras Flatten() class is very important when we have to deal with multi-dimensional inputs, such as the output of the embedding layer here. keras.layers.Flatten() flattens the multi-dimensional tensor of each example into a single dimension, so that it can be passed to every neuron of the Dense layer that follows.
- Next, we have a Dense layer with 6 neurons and the ReLU activation function. A Dense layer implies that all neurons of one layer are connected to all neurons of the next layer. There are 6 outputs from this layer, because it has 6 neurons.
- Next, we have another Dense layer with 1 neuron and the sigmoid activation function. There is 1 output from this layer, because it has only 1 neuron.
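The four layers described above can be sketched as follows. The vocabulary size of 10,000 and the 16 dimensions come from the text; max_length = 120 is an assumption, since the padded sequence length is not stated here.

```python
import tensorflow as tf

vocab_size = 10000    # from the text
embedding_dim = 16    # from the text
max_length = 120      # assumption: padded sequence length is not stated in the blog

model = tf.keras.Sequential([
    # Each example is a sequence of max_length word ids.
    tf.keras.Input(shape=(max_length,)),
    # Learns one 16-dimensional vector per word id.
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # (max_length, 16) per example becomes a single vector of 120 * 16 values.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation="relu"),
    # One sigmoid output: the probability that the review is positive.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()
```

Under these assumptions almost all of the parameters sit in the embedding layer (10,000 x 16 = 160,000 of them), which is typical for this kind of small text model.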
Step #2.) We now proceed ahead with compiling this model :-
- We have used binary cross-entropy as the loss function, because our problem here is a binary classification problem.
- We use the Adam optimiser, an adaptive variant of SGD (Stochastic Gradient Descent), in order to minimise the cost-function.
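A minimal sketch of the compile step is below. The model is a shrunken stand-in so that the snippet runs on its own, the optimiser uses the Keras default settings, and tracking the accuracy metric is an assumption based on the accuracy graph plotted later.

```python
import tensorflow as tf

# Shrunken stand-in for the model defined earlier, so the snippet is self-contained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Embedding(50, 16),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Binary cross-entropy pairs naturally with a single sigmoid output.
model.compile(
    loss="binary_crossentropy",
    optimizer="adam",          # Keras default Adam settings
    metrics=["accuracy"],      # assumption: needed for the accuracy graph
)
```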
Usually, the values below are adopted for the hyper-parameters Delta and Gamma :-
Here is a ready comparative analysis for various types of Optimisers :-
Step #3.) Let’s proceed with the training of the model :- Given the 25K training sequences that we have, we now perform the model training :-
Let’s understand a few things about the Epoch :-
- One Epoch is when the ENTIRE dataset is passed forward and backward through the neural network exactly ONCE. We usually need many epochs in order to arrive at an optimal learning curve. Weights are iteratively tuned, Loss is gradually reduced and Model-accuracy is gradually improved.
- One epoch is usually not enough, as it may lead to under-fitting. As the number of epochs increases, the weights of the neural network are updated more times and the curve usually goes from under-fitting to optimal to over-fitting.
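To make the idea of epochs concrete, here is a small training sketch. It uses random stand-in data and a shrunken model so that it runs in seconds; num_epochs = 3 is an assumption, as the number of epochs used in this blog is not stated. With the real data, the padded training and test sequences and their label arrays would be passed instead.

```python
import numpy as np
import tensorflow as tf

vocab_size, embedding_dim, max_length = 50, 16, 10  # kept tiny for illustration
num_epochs = 3  # assumption: the blog does not state its epoch count

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-ins for the padded sequences and labels, for illustration only.
rng = np.random.default_rng(0)
train_padded = rng.integers(1, vocab_size, size=(64, max_length))
train_labels = rng.integers(0, 2, size=(64,))
test_padded = rng.integers(1, vocab_size, size=(32, max_length))
test_labels = rng.integers(0, 2, size=(32,))

# Each epoch is one full pass over train_padded; the test data is only evaluated.
history = model.fit(
    train_padded,
    train_labels,
    epochs=num_epochs,
    validation_data=(test_padded, test_labels),
    verbose=2,
)
print(sorted(history.history))
```

The history object records the loss and accuracy after every epoch for both the training and the validation data, which is what the graphs in the evaluation phase are drawn from.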
Step #4.) We now extract the learned weights of the embedding-layer :- Each word is represented in 16 dimensions.
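A minimal sketch of reading the embedding weights is shown below. The layer name "embedding" and max_length = 120 are assumptions made for the example, and the model here is untrained, so the values are just the random initial weights; the shape is what matters.

```python
import tensorflow as tf

vocab_size, embedding_dim = 10000, 16   # from the text
max_length = 120                        # assumption: not stated in the blog

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    # Naming the layer makes it easy to fetch later (the name is an assumption).
    tf.keras.layers.Embedding(vocab_size, embedding_dim, name="embedding"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The embedding layer holds a single weight matrix: one row per word id.
weights = model.get_layer("embedding").get_weights()[0]
print(weights.shape)  # (10000, 16)
```

Row i of this matrix is the 16-dimensional vector of the word whose id is i in word_index.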
Phase-4 : Model Evaluation :-
- We now plot the graph of Accuracy, which shows that our model has a problem of over-fitting, because there is a huge gap between the Training-Accuracy and the Testing-Accuracy.
- We now plot the graph of Loss. Here also, there is a huge gap between the Training-Loss and the Testing-Loss.
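Graphs like the two described above could be drawn with a small helper such as the one below. This is a sketch assuming matplotlib is available and that the per-epoch values are stored Keras-style, with a "val_" prefix for the testing curve; the function name plot_metric is hypothetical.

```python
import matplotlib.pyplot as plt


def plot_metric(history_dict, metric):
    """Plot a training metric against its validation counterpart, per epoch."""
    epochs = range(1, len(history_dict[metric]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history_dict[metric], label="training " + metric)
    ax.plot(epochs, history_dict["val_" + metric], label="testing " + metric)
    ax.set_xlabel("epoch")
    ax.set_ylabel(metric)
    ax.legend()
    return fig
```

After training, plot_metric(history.history, "accuracy") and plot_metric(history.history, "loss") followed by plt.show() would display the two graphs. A widening gap between the two curves is the over-fitting signal discussed above.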
Phase-5 : Model Usage :-
Let’s now use our trained model to perform the classification task, i.e. we shall use the model to detect whether a particular review is positive or negative.
Example #1) We supply the sentence (a review of the movie Shershah) and we can see that the output value is 1, which indicates that it’s a strongly positive review.
Example #2) We supply the sentence (a review of the movie Lal Singh Chaddha) and we can see that the output value is 5.45e-10, which is very close to 0 and indicates that it’s a strongly negative review.
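Scoring a fresh review can be sketched as below. The helper names are hypothetical, and the 0.5 cut-off between positive and negative is an assumption; the important point is that a new review must go through the same tokenizer and the same padding as the training data.

```python
import tensorflow as tf


def predict_sentiment(model, tokenizer, review, max_length):
    """Return the model's probability that `review` is positive."""
    # Encode with the tokenizer fitted on the training corpus.
    seqs = tokenizer.texts_to_sequences([review])
    # Pad and truncate exactly as was done for the training sequences.
    padded = tf.keras.preprocessing.sequence.pad_sequences(
        seqs, maxlen=max_length, truncating="post"
    )
    return float(model.predict(padded, verbose=0)[0][0])


def label_for(score, threshold=0.5):
    """Map a sigmoid output to a label (the 0.5 threshold is an assumption)."""
    return "positive" if score >= threshold else "negative"
```

With a trained model, label_for(predict_sentiment(model, tokenizer, review_text, max_length)) would return the label; a score near 1 reads as positive and a score near 0 as negative, as in the two examples above.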
That’s all for this blog. Thanks for reading till here. If you liked it, please do clap on this page. We shall see you in the next blog.