Unleashing Creativity with Character-Level LSTM Text Generation

In [ ]:
from keras.models import Sequential, model_from_json
from keras.layers import Dense, Dropout, Flatten, LSTM
import numpy as np

In this blog post, we delve into the intriguing world of character-level LSTM text generation using Keras, a popular deep learning library in Python. We will build a small LSTM model, train it on one-hot encoded character sequences, save it to disk, and use it to generate text from a seed.

Model Architecture: The model is built using the Sequential API of Keras. It consists of an LSTM layer that returns the full output sequence, a dropout layer for regularization, a flatten layer, and a dense layer with a sigmoid activation function for binary classification.

In [ ]:
char_indices={'x': 0, 'i': 1, 'r': 2, '8': 3, 'c': 4, ',': 5, 'y': 6, '?': 7, 'h': 8, 'm': 9, 'z': 10, 'k': 11, '5': 12, 'a': 13, '0': 14, 'g': 15, ';': 16, ' ': 17, 'f': 18, 'u': 19, 'w': 20, 'n': 21, '4': 22, '!': 23, '2': 24, 'p': 25, 'd': 26, '"': 27, 'l': 28, '3': 29, '.': 30, 'UNK': 31, '6': 32, 'PAD': 33, 'o': 34, '7': 35, '1': 36, 'e': 37, ':': 38, 'q': 39, 'b': 40, 'j': 41, 't': 42, 's': 43, '9': 44, 'v': 45, "'": 46}
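
The vocabulary above covers the corpus characters plus two special tokens, UNK (unknown character) and PAD (padding). The post doesn't show how the mapping was produced, but one like it is typically derived from the training corpus itself; here is a minimal sketch, where raw_text is a hypothetical string holding the corpus:

In [ ]:
# Hypothetical: build a character-to-index mapping from a raw corpus string.
# 'raw_text' is an assumed variable, not defined in this post.
chars = sorted(set(raw_text.lower()))
char_indices = {char: i for i, char in enumerate(chars)}
char_indices['UNK'] = len(char_indices)  # fallback for out-of-vocabulary characters
char_indices['PAD'] = len(char_indices)  # filler for sequences shorter than maxlen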
In [ ]:
maxlen = 1000      # characters per input sequence
num_neurons = 40   # LSTM units
model = Sequential()
model.add(LSTM(num_neurons,
               return_sequences=True,
               input_shape=(maxlen, len(char_indices))))
In [ ]:
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm (LSTM)                 (None, 1000, 40)          14080     
                                                                 
 dropout (Dropout)           (None, 1000, 40)          0         
                                                                 
 flatten (Flatten)           (None, 40000)             0         
                                                                 
 dense (Dense)               (None, 1)                 40001     
                                                                 
=================================================================
Total params: 54,081
Trainable params: 54,081
Non-trainable params: 0
_________________________________________________________________
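
These counts are easy to verify by hand: an LSTM layer with n units and input dimension d has 4 * n * (n + d + 1) parameters (four gates, each with input weights, recurrent weights, and a bias), and the dense layer has one weight per flattened feature plus a bias:

In [ ]:
# Sanity check of the parameter counts reported by model.summary()
n, d = 40, 47                      # LSTM units, one-hot vocabulary size
lstm_params = 4 * n * (n + d + 1)  # 4 gates x (input + recurrent weights + bias)
dense_params = maxlen * n + 1      # one weight per flattened feature, plus bias
print(lstm_params, dense_params, lstm_params + dense_params)  # 14080 40001 54081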

Data Preparation: The model is trained on input data (x_train) and target labels (y_train) loaded from NumPy files; test data (x_test) and labels (y_test) are loaded the same way for evaluation. Each input sample is a sequence of maxlen one-hot encoded characters.

In [ ]:
x_test = np.load('x_test.npy')
x_train = np.load('x_train.npy')
y_test = np.load('y_test.npy')
y_train = np.load('y_train.npy')
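
The arrays are loaded ready-made, but the encoding is straightforward: each sample becomes a (maxlen, vocabulary) matrix with a single 1.0 per timestep. A minimal sketch of how they could have been built, assuming hypothetical samples (a list of strings) and labels (a list of 0/1 integers), neither of which appears in this post:

In [ ]:
# Hypothetical preprocessing: 'samples' and 'labels' are assumed inputs.
def encode_samples(samples, labels):
    x = np.zeros((len(samples), maxlen, len(char_indices)))
    for i, sample in enumerate(samples):
        for t in range(maxlen):
            if t < len(sample):
                idx = char_indices.get(sample[t], char_indices['UNK'])
            else:
                idx = char_indices['PAD']  # pad short samples to maxlen
            x[i, t, idx] = 1.0
    return x, np.array(labels)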

Training: The model is trained with the fit method, specifying the batch size and number of epochs; the test set is passed as validation data so the model's generalization can be monitored after each epoch. A quick check of x_test.shape confirms the expected input format of (samples, maxlen timesteps, vocabulary size):

In [ ]:
print(x_test.shape)
(29, 1000, 47)


In [ ]:
batch_size = 10
epochs = 10
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))
Epoch 1/10
12/12 [==============================] - 9s 466ms/step - loss: 0.8566 - accuracy: 0.4174 - val_loss: 0.7230 - val_accuracy: 0.5172
Epoch 2/10
12/12 [==============================] - 6s 515ms/step - loss: 0.5946 - accuracy: 0.7130 - val_loss: 0.8452 - val_accuracy: 0.3448
Epoch 3/10
12/12 [==============================] - 5s 413ms/step - loss: 0.4804 - accuracy: 0.8957 - val_loss: 0.8949 - val_accuracy: 0.3448
Epoch 4/10
12/12 [==============================] - 5s 418ms/step - loss: 0.3819 - accuracy: 0.9304 - val_loss: 0.7802 - val_accuracy: 0.4138
Epoch 5/10
12/12 [==============================] - 6s 519ms/step - loss: 0.2954 - accuracy: 0.9652 - val_loss: 0.7229 - val_accuracy: 0.5517
Epoch 6/10
12/12 [==============================] - 5s 414ms/step - loss: 0.1993 - accuracy: 0.9913 - val_loss: 1.1303 - val_accuracy: 0.4138
Epoch 7/10
12/12 [==============================] - 6s 520ms/step - loss: 0.1298 - accuracy: 1.0000 - val_loss: 0.8328 - val_accuracy: 0.5517
Epoch 8/10
12/12 [==============================] - 5s 406ms/step - loss: 0.0767 - accuracy: 1.0000 - val_loss: 1.0394 - val_accuracy: 0.5517
Epoch 9/10
12/12 [==============================] - 6s 496ms/step - loss: 0.0475 - accuracy: 1.0000 - val_loss: 1.1458 - val_accuracy: 0.5517
Epoch 10/10
12/12 [==============================] - 5s 421ms/step - loss: 0.0265 - accuracy: 1.0000 - val_loss: 1.5484 - val_accuracy: 0.4483
Out[ ]:
<keras.callbacks.History at 0x7b097cd5fc10>

The training log shows classic overfitting: training accuracy climbs to 1.0 while validation loss keeps rising, which is unsurprising with only about 115 training samples and 29 validation samples.

Model Saving: After training, the model structure is serialized to a JSON file (char_lstm_model3.json), and the weights are saved to an HDF5 file (char_lstm_weights3.h5).
In [ ]:
model_structure = model.to_json()
with open("char_lstm_model3.json", "w") as json_file:
  json_file.write(model_structure)
model.save_weights("char_lstm_weights3.h5")
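
With the structure and weights on disk, the model can be re-scored at any time; a quick sanity check using Keras's standard evaluate method (this call is not part of the original notebook):

In [ ]:
# Optional sanity check: recompute loss/accuracy on the held-out set.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print(f'test loss: {loss:.4f}, test accuracy: {acc:.4f}')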

Text Generation: After reloading the saved model, a function generate_text is defined to generate text with it. The function one-hot encodes the seed text, predicts with the model, and appends one character per iteration. Note that because the model ends in a single sigmoid unit, the rounded prediction can only ever map to character index 0 or 1.

In [ ]:
# Reload the saved model structure and weights

with open("char_lstm_model3.json", "r") as json_file:
    loaded_model_json = json_file.read()

model = model_from_json(loaded_model_json)
model.load_weights("char_lstm_weights3.h5")
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

def generate_text(model, input_text, max_length=10):
    # Start from the last 'maxlen' characters of the seed text
    generated_text = input_text[-maxlen:]
    for _ in range(max_length - len(input_text)):
        # One-hot encode the current window of generated text,
        # falling back to UNK for characters outside the vocabulary
        encoded_input = np.zeros((1, maxlen, len(char_indices)))
        for t, char in enumerate(generated_text[-maxlen:]):
            encoded_input[0, t, char_indices.get(char, char_indices['UNK'])] = 1.0

        # The sigmoid output is rounded and treated as a character index,
        # so only the characters at indices 0 and 1 ('x' and 'i') can appear
        predicted_char_prob = model.predict(encoded_input)[0][0]
        predicted_index = int(round(predicted_char_prob))
        predicted_char = next((char for char, index in char_indices.items() if index == predicted_index), "UNK")
        generated_text += predicted_char

    return generated_text

# User input
user_input = input("Enter a seed text: ")

# Generate text and display it
generated_text = generate_text(model, user_input)
print("Generated text:")
print(generated_text)
Enter a seed text: hi there 
1/1 [==============================] - 1s 772ms/step
Generated text:
hi there x
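
For reproducible runs, the seed text and target length can be fixed instead of read interactively, for example:

In [ ]:
# Non-interactive usage with a fixed seed and a longer target length
print(generate_text(model, "the quick brown fox", max_length=40))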