A Python-based machine learning project that recognizes handwritten digits (0-9) using TensorFlow/Keras and OpenCV. The project uses a neural network trained on the MNIST dataset to predict digits from custom image files.
- Overview
- Features
- Project Structure
- Installation
- Usage
- How It Works
- Model Architecture
- Sample Images
- Contributing
- License
This project implements a handwritten digit recognition system using a neural network trained on the MNIST dataset. The model can predict digits from custom PNG images stored in the digits folder, processing them automatically and displaying both the prediction and the image.
- Pre-trained Model: Uses a saved Keras model trained on MNIST dataset
- Automatic Image Processing: Processes multiple digit images sequentially
- Real-time Visualization: Displays each processed image with matplotlib
- Robust Image Handling: Includes error handling for missing or corrupted images
- Simple Interface: Easy-to-use script that processes all images in the digits folder
Handwritten-Digit-Recognition/
├── digits/ # Directory containing digit image samples
│ ├── digit1.png # First digit image
│ ├── digit2.png # Second digit image
│ ├── digit3.png # Additional digit images...
│ └── ... # More digit images (sequential numbering)
├── digit_recognize.py # Main recognition script
├── handwritten.keras # Pre-trained model file (generated after training)
├── .gitattributes # Git configuration file
└── README.md # Project documentation
- Python 3.7 or higher
- pip package manager
Install the required libraries using pip:
pip install tensorflow opencv-python matplotlib numpy
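To confirm the libraries are importable before running the script, a quick check along these lines may help. This is a minimal sketch: the helper name check_packages is hypothetical and not part of the repository. Note that the opencv-python package is imported under the name cv2.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each import name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names can differ from pip names: opencv-python provides cv2.
    required = ["tensorflow", "cv2", "matplotlib", "numpy"]
    for name, found in check_packages(required).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed with the pip command above.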
- Clone the repository:
git clone https://github.com/Savidilsh/Handwritten-Digit-Recognition.git
cd Handwritten-Digit-Recognition
- Ensure you have the pre-trained model file handwritten.keras in the root directory
- Add your digit images to the digits folder following the naming convention digit1.png, digit2.png, etc.
Simply execute the main script:
python digit_recognize.py
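- Format: PNG images
- Naming Convention: digit1.png, digit2.png, digit3.png, etc.
- Location: Place all images in the digits folder
- Size: 28x28 pixels, since the model's input layer is 28x28 and the prediction code shown in this README does not resize images
- Content: Black digits on a white background (the script inverts the colors before prediction to match MNIST's white-on-black digits)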
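Because processing stops at the first missing number in the sequence, a new image should take the lowest unused name. The sketch below shows one way to find it. The function next_digit_filename is a hypothetical helper and is not part of digit_recognize.py.

```python
import os
import re


def next_digit_filename(folder="digits"):
    """Return the first unused name in the digit1.png, digit2.png, ... sequence."""
    pattern = re.compile(r"^digit(\d+)\.png$")
    used = set()
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            used.add(int(match.group(1)))
    number = 1
    while number in used:
        number += 1  # skip numbers that are already taken
    return f"digit{number}.png"
```

With digit1.png, digit2.png and digit4.png present, this returns digit3.png, which fills the gap that would otherwise end the run early.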
- The script loads the pre-trained model
- It processes each image in sequence (digit1.png, digit2.png, etc.)
- For each image:
  - Reads the image using OpenCV
  - Preprocesses it (inverts colors, reshapes)
  - Makes a prediction using the neural network
  - Displays the result in the console
  - Shows the processed image using matplotlib
- Continues until no more sequential images are found
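The sequential walk described above can be sketched as a small generator. This is an illustration under assumptions, not the script's actual loop: the name iter_digit_paths is hypothetical, and numbering is assumed to start at 1 as in the sample files.

```python
import os


def iter_digit_paths(folder="digits"):
    """Yield digit1.png, digit2.png, ... and stop at the first missing file."""
    number = 1
    while True:
        path = os.path.join(folder, f"digit{number}.png")
        if not os.path.isfile(path):
            break  # a gap in the numbering ends the run
        yield path
        number += 1
```

A caller would loop with for path in iter_digit_paths(): and run the read, preprocess, predict and display steps on each path.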
img = cv2.imread(f"digits/digit{image_number}.png")[:,:,0] # Keep one channel (all three are equal for a grayscale image)
img = np.invert(np.array([img])) # Invert colors and add a batch dimension: shape (1, 28, 28)
prediction = model.predict(img)
result = np.argmax(prediction) # Index of the highest probability, i.e. the predicted digit
plt.imshow(img[0], cmap=plt.cm.binary) # Display in binary colormap
plt.show()
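The read step in the snippet above fails with an unhelpful error when a file is missing, because cv2.imread returns None rather than raising. A defensive version of the preprocessing might look like the sketch below. The function name preprocess and the explicit 28x28 check are assumptions added for illustration; the repository's own error handling may differ.

```python
import numpy as np


def preprocess(img):
    """Turn an OpenCV-style image array into an inverted (1, 28, 28) batch."""
    if img is None:
        # cv2.imread returns None for missing or unreadable files
        raise ValueError("image could not be read")
    if img.ndim == 3:
        img = img[:, :, 0]  # keep the first channel, as the snippet above does
    if img.shape != (28, 28):
        raise ValueError(f"expected a 28x28 image, got {img.shape}")
    return np.invert(np.array([img]))  # bitwise NOT, i.e. 255 - value for uint8
```

The result can be passed to model.predict in place of the inline expression, for example preprocess(cv2.imread(path)).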
The neural network uses a simple feedforward architecture:
Input Layer: 28x28 pixels (flattened to 784)
↓
Flatten Layer: Converts 28x28 to 1D array
↓
Dense Layer 1: 128 neurons, ReLU activation
↓
Dense Layer 2: 128 neurons, ReLU activation
↓
Output Layer: 10 neurons, Softmax activation (for digits 0-9)
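As a rough size check, the number of trainable parameters follows directly from the layer widths above: each dense layer has one weight per input-output pair plus one bias per output. The helper name dense_params is only for illustration.

```python
def dense_params(inputs, units):
    """Weights plus biases for one fully connected layer."""
    return inputs * units + units


layer_sizes = [28 * 28, 128, 128, 10]  # flattened input, two hidden layers, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
print(counts, total)  # [100480, 16512, 1290] 118282
```

The Flatten layer adds no parameters, so the total of 118,282 should match what model.summary() reports for this architecture.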
- Dataset: MNIST (28x28 handwritten digits)
- Optimizer: Adam
- Loss Function: Sparse Categorical Crossentropy
- Training Epochs: 3
- Data Normalization: Pixel values L2-normalized along axis 1 with tf.keras.utils.normalize, which keeps them in [0,1]
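The retraining code in this README normalizes with tf.keras.utils.normalize(x, axis=1). As documented by Keras, that call applies L2 normalization along the given axis rather than dividing by 255. The NumPy sketch below mimics that behavior for illustration. The function l2_normalize is a stand-in written for this example, not the Keras implementation.

```python
import numpy as np


def l2_normalize(x, axis=1):
    """Divide by the L2 norm along `axis`, leaving all-zero slices unchanged."""
    x = np.asarray(x, dtype=np.float64)
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    norm[norm == 0] = 1.0  # avoid division by zero for blank slices
    return x / norm


column = np.array([[[3.0], [4.0]]])  # shape (1, 2, 1): one image, two rows, one column
print(l2_normalize(column, axis=1).ravel())  # [0.6 0.8]
```

Because pixel values are non-negative, every normalized value falls between 0 and 1, but the scale depends on each image's content rather than on a fixed divisor.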
The project includes sample digit images:
- digit1.png: Contains the digit "2"
- digit2.png: Contains the digit "7"
Add more images following the same naming pattern for batch processing.
To retrain the model, uncomment the training code in digit_recognize.py:
# Uncomment these lines to retrain the model:
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
model.save('handwritten.keras')
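The compile call above uses sparse categorical crossentropy, which takes integer labels (0-9) instead of one-hot vectors. As a rough illustration of what that loss computes, here is a NumPy sketch. It is not the Keras implementation, and the clipping value used to avoid log(0) is an assumption made for this example.

```python
import numpy as np


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true integer labels."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]  # probability of the true class
    return float(-np.log(np.clip(picked, 1e-7, 1.0)).mean())
```

A model that spreads its softmax output evenly over the ten digits scores ln(10), about 2.30, so training losses well below that indicate the network has learned something.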
Contributions are welcome! Here are some ways you can contribute:
- Add more sample digit images
- Improve image preprocessing
- Enhance the model architecture
- Add evaluation metrics
- Create a graphical user interface
- Add support for different image formats
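For the evaluation metrics idea in the list above, a possible starting point is accuracy plus a confusion matrix computed from the model's softmax outputs. The function evaluate below is a hypothetical sketch, not existing project code. It assumes predictions has shape (N, 10) and labels holds N integers.

```python
import numpy as np


def evaluate(predictions, labels, num_classes=10):
    """Return (accuracy, confusion matrix) from softmax outputs and true labels."""
    predicted = np.argmax(predictions, axis=1)
    labels = np.asarray(labels)
    accuracy = float((predicted == labels).mean())
    confusion = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(confusion, (labels, predicted), 1)  # rows: true digit, columns: predicted
    return accuracy, confusion
```

It could be called as evaluate(model.predict(x_test), y_test) after training. Off-diagonal entries in the matrix show which digits the model confuses with each other.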
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is open source and available under the MIT License.
- TensorFlow and Keras teams for the framework and its built-in MNIST dataset loader
- OpenCV community for image processing capabilities
- Contributors to the machine learning community
For questions or feedback, please open an issue in the repository.
Note: Make sure you have the handwritten.keras model file in your project directory. If not, uncomment and run the training code first to generate the model.