Neural networks are now used in a wide range of applications, from image caption generation to breast cancer prediction. This diversity is a natural consequence of the wide variety of neural architectures (feed-forward neural networks, convolutional neural networks, etc.). Among these architectures, Long Short-Term Memory (LSTM) networks, a particular case of recurrent neural networks, have proven very successful on tasks such as machine translation, time series prediction, and more generally any task where the data is sequential. This is mainly due to their ability to memorize relatively long-term dependencies, which they achieve by taking previous information into account when making further predictions.

But LSTMs alone aren’t always enough. Sometimes we need to tweak these layers and adapt them to the task at hand.

At Kwyk, we provide online math exercises. Every time an exercise is answered, we collect data that is then used to tailor homework for each student using the website. To determine which exercises are most likely to help a student progress, we need to know how likely they are to succeed on each exercise at any given point in time. By correctly predicting these probabilities of success, we can choose the homework that maximizes overall progress.

This post introduces our modified LSTM cell, the “Multi-state LSTM”, on which we based our model to try to solve this problem.