Model Name
panda_cat_dog_classification
Model Description
This model classifies images of animals as pandas, cats, or dogs. It was trained using a custom CNN model.
- Developed by: Neelima Monjusha Preeti
- Model type: custom CNN model
- Language(s): Python
- License: MIT
- Contact: monjusha.2017@juniv.edu
Task Description
The panda_cat_dog_classification app classifies an image as a panda, cat, or dog. The input field takes an image belonging to one of these three classes, and the output shows the name of the predicted animal. The pipeline first preprocesses and resizes the data, then defines a custom CNN model, a loss function, and an optimizer. After that, the model is trained and tested, and the app is launched with Gradio on Hugging Face.
Data Preprocessing
The image dataset is preprocessed with the following transform:
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])
transforms.Resize((224, 224)) resizes the input image to (224, 224) pixels. transforms.ToTensor() converts the input image into a PyTorch tensor and scales its pixel values to the range [0, 1]. Neural networks operate on tensors, so this puts the image in a format suitable for further processing. transforms.Normalize(mean, std) normalizes each channel of the tensor by subtracting the given mean and dividing by the given standard deviation. The values used here are the per-channel means and standard deviations commonly used for ImageNet.
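To make the normalization step concrete, here is a minimal plain-Python sketch of the per-channel arithmetic, (value - mean) / std, applied to a single RGB pixel. The helper name normalize_pixel is hypothetical and is not part of torchvision. It assumes the pixel has already been scaled to [0, 1], as ToTensor() does.

```python
# Per-channel statistics taken from the transform in this README.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1].

    Hypothetical helper for illustration; torchvision applies the same
    arithmetic to every pixel of the image tensor.
    """
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


black = normalize_pixel((0.0, 0.0, 0.0))  # darkest possible pixel
white = normalize_pixel((1.0, 1.0, 1.0))  # brightest possible pixel
print(black)
print(white)
```

After normalization the red channel spans roughly -2.12 to 2.25, so the network receives values centered near zero rather than in [0, 1].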
Model Architecture
The model is a custom CNN class. The architecture consists of two convolutional layers, each followed by max pooling, and three fully connected layers. It is designed for a classification task with three classes. Note that the forward pass applies no activation function between layers.
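import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels, 6 output channels, 5x5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6 input channels, 16 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max pooling with stride 2
        self.fc1 = nn.Linear(16 * 53 * 53, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 3)        # one output score per class

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        x = self.conv2(x)
        x = self.pool(x)
        x = x.view(-1, 16 * 53 * 53)       # flatten the feature maps for the linear layers
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        return x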
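The 16 * 53 * 53 input size of fc1 follows from the layer arithmetic for a 224 x 224 input. The sketch below reproduces that calculation in plain Python using the standard output-size formula for convolution and pooling without padding. The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max pooling along one dimension."""
    return (size - kernel) // stride + 1


size = 224                  # input height/width after Resize((224, 224))
size = conv_out(size, 5)    # conv1 with 5x5 kernel
size = pool_out(size)       # first 2x2 pooling
size = conv_out(size, 5)    # conv2 with 5x5 kernel
size = pool_out(size)       # second 2x2 pooling

flat_features = 16 * size * size  # 16 channels come out of conv2
print(size, flat_features)
```

If the input resolution in the transform is changed, the first argument of fc1 and the view() call in forward() need to be recomputed the same way.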
Training used batch_size = 8, CrossEntropyLoss() as the loss function, and the Adam optimizer with a learning rate of 0.001.
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
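For intuition about what the loss measures, nn.CrossEntropyLoss takes the raw class scores (logits) from the model and an integer class index, and combines a log-softmax with the negative log-likelihood. The following plain-Python sketch computes the same quantity for a single sample. The function name cross_entropy and the example logits are hypothetical, and PyTorch additionally averages over the batch by default.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target]).

    Uses the log-sum-exp trick (subtracting the max) for numerical stability.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical logits for the three classes, for illustration only.
uniform = cross_entropy([0.0, 0.0, 0.0], target=1)      # model has no preference
confident = cross_entropy([0.1, 6.0, -0.3], target=1)   # strongly favors the right class
wrong = cross_entropy([6.0, 0.1, -0.3], target=1)       # strongly favors a wrong class
print(uniform, confident, wrong)
```

With three classes, an uninformed prediction gives a loss of ln(3), about 1.0986, which is a useful reference point when reading training logs.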
Training Loop
The training loop loads the data and splits it into mini-batches. For each batch, it runs the forward pass and computes the loss, then performs backward propagation and an optimization step:
optimizer.zero_grad()
loss.backward()
optimizer.step()
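The overall shape of such a loop can be sketched without PyTorch. The example below is not the project's training code: it swaps in plain gradient descent on a one-parameter toy model (y = w * x) instead of Adam on the CNN, purely to show how mini-batching and the reset/compute/update steps fit together. All names and the toy data are hypothetical.

```python
def iter_batches(samples, batch_size=8):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def train_one_epoch(samples, w, lr=0.1, batch_size=8):
    """One pass of mini-batch gradient descent for the toy model y = w * x."""
    for batch in iter_batches(samples, batch_size):
        grad = 0.0                                    # reset, like optimizer.zero_grad()
        for x, y in batch:
            # Gradient of the batch mean squared error with respect to w,
            # which loss.backward() would compute automatically in PyTorch.
            grad += 2 * (w * x - y) * x / len(batch)
        w -= lr * grad                                # update, like optimizer.step()
    return w


# Toy data generated from y = 3x (hypothetical, for illustration only).
data = [(x / 10, 3 * x / 10) for x in range(1, 21)]

w = 0.0
for epoch in range(50):
    w = train_one_epoch(data, w)
print(w)
```

With 20 samples and a batch size of 8, each epoch processes batches of 8, 8, and 4 samples, which matches the default behavior of a PyTorch DataLoader when the last batch is not dropped.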
Test data
The test data is loaded and the accuracy is calculated.
The test accuracy was 53.33%.
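Accuracy here is the percentage of test images whose predicted class matches the true class. A minimal sketch of that calculation follows. The function name accuracy_percent and the label lists are hypothetical, since the README does not state the size of the test set or the per-image predictions.

```python
def accuracy_percent(predicted, actual):
    """Percentage of positions where the predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


# Hypothetical labels (0 = cat, 1 = dog, 2 = panda), for illustration only.
actual = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
predicted = [0, 0, 0, 1, 2, 1, 1, 1, 0, 2, 2, 2, 0, 1, 0]

acc = accuracy_percent(predicted, actual)
print(f"{acc:.2f}%")
```

For comparison, random guessing over three equally represented classes would score about 33.3%.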
Result Analysis
The packages needed for creating the Hugging Face interface are loaded with:
import gradio as gr
import torch
from torchvision import transforms
The model was saved with the following:
model_scripted = torch.jit.script(model)
model_scripted.save('./models/cat_dog_cnn.pt')
HuggingFace Result analysis
First, the saved custom model cat_dog_cnn.pt is loaded. Then the output function is specified. Since this is an image classification model, the function returns a class name. The example images are stored under app_data:
|---app_data
| |---cat.jpg
| |---dog.jpg
| |---panda.jpg
|
The example images are loaded from this folder. The classes for prediction are CLASSES = ["Cat", "Dog", "Panda"]. The output function for prediction is:
def classify_image(inp):
    inp = transform(inp).unsqueeze(0)
    out = model(inp)
    return CLASSES[out.argmax().item()]
This returns the predicted class name for the input image.
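The last step of the prediction function, picking the highest score and mapping its index to a name, can be illustrated in plain Python. The helper predicted_class and the example scores are hypothetical. The sketch assumes the order of CLASSES matches the label indices used during training, which the README does not state explicitly.

```python
CLASSES = ["Cat", "Dog", "Panda"]


def predicted_class(logits):
    """Return the class name whose score is highest (an argmax over the list)."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return CLASSES[best_index]


# Hypothetical model outputs for one image, for illustration only.
label = predicted_class([0.2, -1.0, 1.7])
print(label)
```

If the class order used during training differs from CLASSES, the app will show the wrong names even when the model's scores are correct, so the two should be checked against each other.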
Interface Creation
To create the Hugging Face interface, the following portion is added:
iface = gr.Interface(fn=classify_image,
                     inputs=gr.Image(type="pil", label="Input Image"),
                     outputs="text",
                     examples=[
                         "./app_data/cat.jpg",
                         "./app_data/dog.jpg",
                         "./app_data/panda.jpg",
                     ])
This creates an interface that takes an image as input. The example images are specified, and the output is defined as text: one of the classes cat, dog, or panda. The app interface is then launched with:
iface.launch()
Project Structure
|
|---app_data
|   |---images (used for examples)
|
|---models
|   |---cat_dog_cnn.pt
|
|---train (image dataset for training)
|
|---test (image dataset for testing)
|
|---Readme.md (about project)
|
|---app.py (the interface for project)
|
|---requirements.txt (libraries needed for project)
|
|---main.ipynb (project code)
How to Run
git clone https://huggingface.co/spaces/neelimapreeti297/panda_cat_dog_classification
cd panda_cat_dog_classification
pip install -r requirements.txt
python app.py
License
This project is licensed under the MIT License.
Contributor
Neelima Monjusha Preeti - monjusha.stu2017@juniv.edu
link: https://huggingface.co/spaces/neelimapreeti297/panda_cat_dog_classification