Simple Keras Neural Network

What is a Sentiment Analyser?

Want to know how a product owner can find heaps of insight about how their products are received by customers, without having to read millions of articles and reviews? A Sentiment Analyser is the answer. These things can be hooked up to Twitter, review sites, databases, or all of the above, using Neural Networks in Keras.

It's a great lazy way to understand how a product is viewed by a large group of customers in a very short space of time.

What is Keras?

Wikipedia quote: “Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible.”

So yeah, if you can code in Python and can't be bothered to learn TensorFlow, here is your answer!

So let's get to it…

Import and Format Data

For this simple piece of coding we will be using the commonly used IMDB review dataset; it's readily available and free to pull directly from Keras.

The first thing we need to do is import the IMDB dataset from Keras then split it into train and test datasets.

from keras.datasets import imdb 
top_words = 10000

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=top_words) 


You should see a download progress bar and a warning saying “Using TensorFlow backend” – nothing wrong with this, just let it do its thing.
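If you want a quick sanity check that the download worked, you can print a few basics – a minimal check, nothing more:

# Quick sanity check on what we just loaded
print(len(x_train), len(x_test))  # 25,000 reviews in each split
print(x_train[0][:10])            # each review is a list of word IDs, not raw text
print(y_train[:5])                # labels are 0 (negative) or 1 (positive)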

You can run the piece of code below to take a look at the dataset; it's basically a big dictionary of words from the English language with corresponding IDs. We will use these IDs to translate back later on, as Neural Nets only understand numbers, not words.

imdb.get_word_index()

Sample from Keras IMDB Dictionary
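You can also peek at a few entries yourself – a minimal sketch (the exact IDs may vary between Keras versions):

word_index = imdb.get_word_index()
# Print five sample (word, ID) pairs - more common words tend to have lower IDs
for word, idx in list(word_index.items())[:5]:
    print(word, idx)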

The next part is probably the only complex bit, and it's pretty much optional – in order to check that our data looks good, we need to grab our first review and reverse the dictionary keys to translate the IDs back into actual words.

word_dict = imdb.get_word_index()
# Keras reserves IDs 0-2 for special tokens, so every word ID is offset by 3
word_dict = { key:(value + 3) for key, value in word_dict.items() }
word_dict[''] = 0  # Padding
word_dict['>'] = 1 # Start
word_dict['?'] = 2 # Unknown word
reverse_word_dict = { value:key for key, value in word_dict.items() }
print(' '.join(reverse_word_dict[id] for id in x_train[0]))

(I say optional, but in the real world you're always going to want to do these checks. I suggest googling around and finding some code that does what you need – that's what I did – thanks Microsoft Academy.)

Here is the first review – with all the crap stripped out.


Next, a bit of tidying is required: we need to make sure that everything going into the NN is the same shape. We will specify a maximum review length (in words, not characters) and assign it to a variable max_review_length, then run through our dataset, truncating longer reviews and padding shorter ones with zeroes (at the start, which is the Keras default) so that everything is exactly 500 entries long.

from keras.preprocessing import sequence
max_review_length = 500
x_train = sequence.pad_sequences(x_train, maxlen=max_review_length)
x_test = sequence.pad_sequences(x_test, maxlen=max_review_length)

If you print x_train[1] you will see that there are 500 entries in there: a load of leading zeroes from the padding, followed by a bunch of numbers (that we already know correspond to words in our dictionary).
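To see this for yourself (a quick check, assuming this particular review is shorter than 500 words):

# Confirm the padded shape and where the zeroes ended up
print(len(x_train[1]))    # 500
print(x_train[1][:10])    # leading zeroes from the (default) 'pre' padding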

Build the Model

I have dropped the code for the model into one simple script so anyone can copy and paste it easily – it's a handy, reusable piece of code; check the comments for specifics. If you really want to understand how this all ties together, there is plenty of information on word embedding in this blog here.

#Build the Model

from keras.models import Sequential
from keras.layers import Dense, Flatten, Embedding

embedding_vector_length = 32

#Define the Model: this tells us we are using a Sequential model - i.e. one that is built from a number of layers stacked together. There's plenty of info on this in the Keras Documentation.
model = Sequential()

#Embedding layer - it's best practice to use this when dealing with text - it maps each word ID to a dense 32-dimensional vector, which is much easier for the network to learn from than raw IDs.
model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))

#Take the embedding output and flatten it so it can be consumed by the next (Dense) layer
model.add(Flatten())

#These dense layers control "how" the model learns - the first two layers have 16 neurons each; we can mess with these to alter the accuracy of the NN.
#ReLU is an activation function - there's more in the documentation, but for now we just need to know that ReLU is the most popular choice for what we are doing.
model.add(Dense(16, activation='relu'))
model.add(Dense(16, activation='relu'))

#This final dense layer is our output layer - it is set to 1 neuron because we only want one output for the whole thing - a sentiment score between 0 and 1 (hence the sigmoid activation)!
model.add(Dense(1, activation='sigmoid'))

#Compile (stick it all together). Again, as this is a beginner tutorial I won't dig into the optimisers - do check the Keras documentation for further info.
model.compile(loss='binary_crossentropy',optimizer='adam', metrics=['accuracy'])

#Print the summary to enable us to check we are happy with the parameters.
print(model.summary())

Here is a graphical representation of what we are doing. The middle “dense” layers will also be referred to as ‘hidden’ layers in the documentation and around the web. (*Not 100% accurate to what we have created – this is someone else's diagram re-purposed; I didn't have the work ethic to create my own.)


Run the code and see what happens – you should be presented with a nice summary of everything we just coded.

After running the code, your output should be the model summary – a table of layers, output shapes, and parameter counts.
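As a sanity check on the summary, the parameter counts can be worked out by hand from the model definition above (pure arithmetic, not Keras output):

# Embedding: one 32-dim vector per word in the 10,000-word vocabulary
embedding_params = 10000 * 32        # 320,000
# Flatten has no parameters; it just reshapes 500 x 32 into 16,000 inputs
dense1_params = 16000 * 16 + 16      # 256,016 (weights + biases)
dense2_params = 16 * 16 + 16         # 272
output_params = 16 * 1 + 1           # 17
print(embedding_params + dense1_params + dense2_params + output_params)  # 576,305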

Fitting the Model

Fitting the model – nothing new here. The epoch number is the number of complete passes we make through the training data, i.e. the number of times the whole dataset runs through the NN.
Notice we fit with the train data AND validate with the test data, all in the same line of code?
Batch size 128 is a somewhat arbitrary choice; generally a smaller batch size will be more accurate, but we still need to be aware of overfitting – discussed later on.

hist = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=5, batch_size=128)


Run the code, and you should be able to track the neural net through however many epochs you asked it to do – in my example 5, so 5 lines of result output.

This is where we need to be careful of overfitting. If we see a growing gap between the training accuracy (acc) and validation accuracy (val_acc) values then we could have an overfitting problem; a quick fix for this is to reduce the number of epochs.

Notice our training accuracy (acc) is 100% – is this something we need to look into, or are we happy with a validation accuracy (val_acc) of around 87%?
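One quick way to eyeball that gap is to print the two accuracies side by side from the history object – a minimal sketch (on newer Keras versions the keys are 'accuracy' and 'val_accuracy'):

train_acc = hist.history['acc']
val_acc = hist.history['val_acc']
for epoch, (t, v) in enumerate(zip(train_acc, val_acc), start=1):
    print('Epoch %d: train=%.3f, val=%.3f, gap=%.3f' % (epoch, t, v, t - v))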

Model Score

The scoring is built in. All we need to do is visualise it!

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

sns.set()
val = hist.history['val_acc']   # on newer Keras versions the key is 'val_accuracy'
epochs = range(1, len(val) + 1)

plt.plot(epochs, val, ':', label='Validation accuracy')
plt.title('Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(loc='upper left')
plt.show()

A line graph showing our success: the neural network gets more accurate as the number of epochs increases.

Using Our Code

Now all we need to do is code in a function that allows us to submit our own reviews and analyse them based on our model! The code below does that.

import string

def analyze(text):
    # Prepare the input by removing punctuation characters, converting
    # characters to lower case, and removing words containing numbers
    translator = str.maketrans('', '', string.punctuation)
    text = text.translate(translator)
    text = text.lower().split(' ')
    text = [word for word in text if word.isalpha()]

    # Generate an input sequence of word IDs: 1 is the "start" token
    input_seq = [1]
    for word in text:
        if word in word_dict and word_dict[word] < top_words:
            input_seq.append(word_dict[word])
        else:
            input_seq.append(2)  # 2 is the "unknown word" token
    padded_input = sequence.pad_sequences([input_seq], maxlen=max_review_length)

    # Invoke the model and return the sentiment score (0 = negative, 1 = positive)
    result = model.predict(padded_input)[0][0]
    return result

Testing the NN

Now we can test the neural network using our own reviews. All we have to do is type our review into the argument of the “analyze” function we just created.

We can see from the basic reviews I typed below that the very positive review has a very high score and the very negative review has a low score, so it appears our Sentiment Analyser Neural Network works!
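For example (these are made-up reviews, and your exact scores will vary from run to run):

print(analyze('This movie was fantastic, I loved every minute of it'))  # expect a score near 1
print(analyze('Terrible film, a complete waste of time'))               # expect a score near 0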


