Building a Simple AI Model: A Step-by-Step Guide with Real Code and Outputs

Artificial intelligence (AI) is transforming industries by learning patterns from data and making decisions based on them. In this guide, we take a practical approach and build a simple AI model with Python and TensorFlow that classifies handwritten digits from the MNIST dataset. Alongside the code, we include the output of each step so you can follow what is happening.

What are we building?

We will build a neural network that classifies images of handwritten digits (0-9). The dataset consists of 28x28 grayscale images, and the model predicts which digit each image represents.

Step 1: Installing Dependencies

Start by installing the required libraries. Run the following command:

pip install tensorflow numpy matplotlib

Step 2: Importing Libraries

Now, import the libraries we need to build the model:

import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

Step 3: Loading and preprocessing the MNIST dataset

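TensorFlow includes a loader for the MNIST dataset (the data is downloaded the first time you call it), so loading it takes a single line. We then scale the pixel values from the 0-255 range down to 0-1, which generally helps gradient-based training converge.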

# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Normalize the pixel values to a range of 0 to 1
train_images = train_images / 255.0
test_images = test_images / 255.0

print(f'Train images shape: {train_images.shape}')
print(f'Train labels shape: {train_labels.shape}')

Output:

Train images shape: (60000, 28, 28)
Train labels shape: (60000,)
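
To see what the division by 255.0 does, here is a minimal sketch that uses NumPy only. The array fake_images is a hypothetical stand-in for MNIST (random uint8 pixels), so it runs without downloading anything; it is meant as an illustration of the scaling step, not as part of the tutorial's pipeline.

```python
import numpy as np

# Hypothetical stand-in for a few MNIST images: uint8 pixels in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by a float converts the array to floating point and
# rescales every pixel into the range [0, 1].
scaled = fake_images / 255.0

print(fake_images.dtype, fake_images.min(), fake_images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape stays (4, 28, 28); only the dtype and the value range change.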

Step 4: Visualize the data

Let's take a quick look at the data by plotting some images and their labels:

# Plot the first nine training images with their labels
plt.figure(figsize=(6,6))
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f'Label: {train_labels[i]}')
    plt.axis('off')
plt.show()

Output: a 3x3 grid showing the first nine training digits, each titled with its label.

Step 5: Building the neural network model

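Now let us define the architecture of the neural network. We will use TensorFlow's Keras API to build a simple model: a Flatten layer turns each 28x28 image into a 784-element vector, a 128-unit hidden layer applies ReLU, and a 10-unit softmax layer outputs one probability per digit.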

# Build the model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 784)               0         
 
 dense (Dense)               (None, 128)               100480    
 
 dense_1 (Dense)             (None, 10)                1290      
 
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
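
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below redoes that arithmetic and then traces the layer shapes with a NumPy forward pass. The helper dense_param_count and the random weights are illustrative assumptions, not what Keras uses internally.

```python
import numpy as np

def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened = 28 * 28                                # 784 values per image
hidden_params = dense_param_count(flattened, 128)  # 100480
output_params = dense_param_count(128, 10)         # 1290
total_params = hidden_params + output_params       # 101770
print(flattened, hidden_params, output_params, total_params)

# Forward pass with random (untrained) weights, only to trace the shapes.
rng = np.random.default_rng(0)
images = rng.random((2, 28, 28))                   # two fake images
w1, b1 = rng.normal(0, 0.05, (784, 128)), np.zeros(128)
w2, b2 = rng.normal(0, 0.05, (128, 10)), np.zeros(10)

x = images.reshape(len(images), -1)                # Flatten: (2, 784)
h = np.maximum(x @ w1 + b1, 0.0)                   # Dense + ReLU: (2, 128)
logits = h @ w2 + b2                               # Dense: (2, 10)
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)       # softmax: each row sums to 1
print(x.shape, h.shape, probs.shape)
```

The three numbers match the Param # column above, and the shapes match the Output Shape column with the batch size of 2 in place of None.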

Step 6: Train the model

Next, we train the model on the training data for 5 epochs, that is, five full passes over the 60,000 training images.

# Train the model
history = model.fit(train_images, train_labels, epochs=5)

Output:

Epoch 1/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2556 - accuracy: 0.9271
Epoch 2/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.1111 - accuracy: 0.9674
Epoch 3/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0766 - accuracy: 0.9770
Epoch 4/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0578 - accuracy: 0.9818
Epoch 5/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0445 - accuracy: 0.9861
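
Two numbers in this log can be reproduced by hand. The 1875 steps per epoch follow from the 60,000 training images and the default batch size of 32 that model.fit uses when none is given. The loss is the average negative log of the probability the model assigns to the correct digit. The function below is a simplified sketch of that formula (Keras also clips probabilities for numerical stability), and the tiny arrays are made-up values for illustration.

```python
import math
import numpy as np

# Steps per epoch: number of batches needed to cover the training set once.
num_train_images = 60000
batch_size = 32  # Keras default when batch_size is not passed to fit()
steps_per_epoch = math.ceil(num_train_images / batch_size)
print(steps_per_epoch)

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Made-up predictions for two samples over two classes.
example_probs = np.array([[0.9, 0.1],
                          [0.2, 0.8]])
example_labels = np.array([0, 1])  # integer labels, hence "sparse"
example_loss = sparse_categorical_crossentropy(example_probs, example_labels)
print(round(example_loss, 4))
```

Because the labels are plain integers rather than one-hot vectors, the "sparse" variant of the loss is the one that fits this dataset.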

Step 7: Evaluate the model

Let's evaluate the model on test data to see how well it generalizes.

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)

print(f'Test accuracy: {test_acc:.4f}')

Output:

313/313 [==============================] - 1s 2ms/step - loss: 0.0706 - accuracy: 0.9778
Test accuracy: 0.9778
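
Accuracy here is simply the fraction of images whose most probable class equals the true label. The following sketch shows the calculation on a small made-up array of probabilities (not real MNIST results), and also why the progress bar shows 313 steps: the 10,000 test images are processed in batches of 32.

```python
import math
import numpy as np

# Made-up softmax outputs for 4 images over 3 classes.
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])

predicted = probs.argmax(axis=1)                # most probable class per image
accuracy = float((predicted == labels).mean())  # fraction predicted correctly
print(predicted, accuracy)

# Number of evaluation batches for the MNIST test set.
eval_steps = math.ceil(10000 / 32)
print(eval_steps)
```

In the tutorial's run, the test accuracy (0.9778) is slightly below the final training accuracy (0.9861), which is the usual sign of mild overfitting.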

Step 8: Making predictions

Let's use the trained model to make predictions on test data and visualize some results.

# Make predictions
predictions = model.predict(test_images)

# Display a sample prediction
def plot_image(i, predictions_array, true_label, img):
    predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel(f"{predicted_label} ({100*np.max(predictions_array):.2f}%)", color=color)

plt.figure(figsize=(6, 3))
plot_image(0, predictions, test_labels, test_images)
plt.show()

Output:

Predicted label: 7 (with 99.99% confidence)

True label: 7
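
Each row returned by model.predict is a vector of ten probabilities, one per digit. As a small sketch of how plot_image turns such a row into a label and a confidence, the example below uses a hand-written probability vector (an assumption for illustration, not actual model output) and also lists the three most likely digits.

```python
import numpy as np

# Hand-written probability vector for one image (illustrative values).
probs = np.array([0.001, 0.002, 0.012, 0.005, 0.0,
                  0.0, 0.0, 0.97, 0.002, 0.008])

predicted_label = int(np.argmax(probs))  # index of the largest probability
confidence = float(np.max(probs))        # the probability itself
caption = f"{predicted_label} ({100 * confidence:.2f}%)"
print(caption)

# Digits sorted from most to least likely; keep the top three.
top3 = np.argsort(probs)[::-1][:3].tolist()
print(top3)
```

Looking at the runner-up classes can be useful for misclassified images, since it shows which digits the model confuses.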

Complete code example with output

Here is the complete script. It produces the same outputs as the individual steps shown above.

import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

# Load and preprocess data
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0

# Visualize some data
plt.figure(figsize=(6,6))
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f'Label: {train_labels[i]}')
    plt.axis('off')
plt.show()

# Build the model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_images, train_labels, epochs=5)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f'Test accuracy: {test_acc:.4f}')

# Make predictions
predictions = model.predict(test_images)

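# Helper that shows one test image with its predicted label and confidence
def plot_image(i, predictions_array, true_label, img):
    predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    # Blue caption for a correct prediction, red for a wrong one
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel(f"{predicted_label} ({100*np.max(predictions_array):.2f}%)", color=color)

# Visualize predictions
plt.figure(figsize=(6, 3))
plot_image(0, predictions, test_labels, test_images)
plt.show()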

Conclusion

In this guide, we walked through building a simple AI model using Python and TensorFlow to classify handwritten digits. We learned how to load the dataset, preprocess the data, build a neural network, train the model, evaluate its performance, and visualize predictions. This is just the beginning of AI and machine learning; there are countless possibilities to explore!

Manula Kavishka
