Feather Vision

Project description

Ornithology enthusiasts and zoologists deserve our thanks for protecting species and discovering new ones on an almost daily basis. Their work is already demanding: they have to keep track of large amounts of information while staying on the lookout for rare sightings of exotic birds. FeatherVision aims to make one part of that job, identifying the many different species of birds, much easier.

The objective of our project is to create a machine learning model capable of accurately classifying bird species based on visual attributes in images. This way, anyone could become a beginner ornithologist by simply taking a picture and finding out about the fauna in their local area. Those more advanced in the field could spend less time identifying species and more time researching and caring for endangered ones.

The world of birds is enormous and with FeatherVision you will never get lost in it again!

Business Objective Definition

We believe that our solution will help in several areas. First of all, as mentioned above, ornithology would gain an easy-to-use tool for identifying bird species, which means that everyone could track the different species in their local area. Our solution could also serve everyday use by people whose hobby is observing bird species in their local area or while on vacation.

Our goal is to identify bird species from a photograph alone. You simply take a picture of the bird and our trained model tells you what kind of bird it is. Since our dataset is fully labeled, meaning every picture has a label (the bird species name) that the model can be checked against, we chose the supervised learning approach. This means we will most likely choose a classification model that maps an image to a species name.

Performance Metrics

We've chosen accuracy as our main performance metric. Accuracy measures the proportion of correctly classified instances out of all instances.

We don't see any specific benefit in focusing on either precision or recall. That would make sense if we wanted to ensure that a rare bird species is not missed (high recall) or to minimize false identifications of a species (high precision). To complement accuracy and gain deeper insight into our model's performance across different species, we'll also use a confusion matrix. This will help us see how well the model differentiates between species and highlight areas for improvement.
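As an illustration of how accuracy and a confusion matrix relate, here is a minimal sketch in plain Python. The three species labels and the predictions are made up for the example, and the helper names (`confusion_matrix`, `accuracy`, `most_confused_pairs`) are hypothetical rather than taken from the project code.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true labels and columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions sit on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def most_confused_pairs(matrix, labels, top=3):
    """List (count, true_label, predicted_label) for the largest off-diagonal cells."""
    pairs = [
        (matrix[i][j], labels[i], labels[j])
        for i in range(len(labels))
        for j in range(len(labels))
        if i != j and matrix[i][j] > 0
    ]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)[:top]


# Toy data with three made-up species labels (not from the real dataset).
labels = ["robin", "sparrow", "wren"]
y_true = ["robin", "robin", "sparrow", "sparrow", "wren", "wren"]
y_pred = ["robin", "robin", "sparrow", "wren", "wren", "sparrow"]

cm = confusion_matrix(y_true, y_pred, labels)
print(cm)
print(f"accuracy = {accuracy(cm):.3f}")
print(most_confused_pairs(cm, labels))
```

The off-diagonal cells are what accuracy alone hides: in this toy case the errors are concentrated between two of the three classes.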

Data exploration

We decided to use the birds classification dataset from Kaggle. It contains 525 bird species, 84635 training images, 2625 test images, and 2625 validation images (5 images per species for both test and validation). The author of the dataset notes that the 'best' pictures were chosen for the test and validation sets, so we might achieve slightly worse results if we split the data ourselves. This is not an issue, since we want to use the model on our own images anyway.
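The split sizes quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch based on the numbers stated in this section; the constant names are our own, and the chance-level figure assumes uniform random guessing on the balanced test set.

```python
NUM_SPECIES = 525
TRAIN_IMAGES = 84635
TEST_IMAGES = 2625
VALID_IMAGES = 2625

# 5 images per species in each of the test and validation sets.
test_per_species = TEST_IMAGES / NUM_SPECIES
valid_per_species = VALID_IMAGES / NUM_SPECIES

total_images = TRAIN_IMAGES + TEST_IMAGES + VALID_IMAGES
avg_train_per_species = TRAIN_IMAGES / NUM_SPECIES
holdout_share = (TEST_IMAGES + VALID_IMAGES) / total_images

# Assumption: a classifier guessing uniformly at random among all species.
chance_accuracy = 1 / NUM_SPECIES

print(f"test images per species:       {test_per_species:.0f}")
print(f"validation images per species: {valid_per_species:.0f}")
print(f"total images:                  {total_images}")
print(f"avg. training images/species:  {avg_train_per_species:.1f}")
print(f"test + validation share:       {holdout_share:.1%}")
print(f"chance-level accuracy:         {chance_accuracy:.2%}")
```

With 525 classes, chance-level accuracy is roughly 0.2%, which is a useful baseline when reading any accuracy figure for this dataset.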

Notes on the dataset

All the images are photographed and not AI generated.
Each picture contains only one bird.
In most pictures, the pixels of the bird take up at least 50% of the image.
Around 80% of the images show male birds, which are typically much more colorful than females. The model is therefore likely to perform worse on female birds, especially in species where they look different from the males.

Size and type of data

As mentioned earlier, the dataset contains almost 85K good-quality images.

The images are sized 224 x 224 pixels with 3 color channels (RGB), making them suitable as input to convolutional neural networks.
The images are in JPG format.
The average image size is around 20 kB.
The images were photographed during all seasons and should include species from every continent.
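To get a feel for what these dimensions mean in practice, the following back-of-the-envelope sketch estimates the size of the training set once decoded into arrays. It uses only the figures stated above; the 1 byte (uint8) and 4 bytes (float32) per value are assumptions about how the pixels might be held in memory, not a description of the project's pipeline.

```python
HEIGHT, WIDTH, CHANNELS = 224, 224, 3
TRAIN_IMAGES = 84635
AVG_JPG_BYTES = 20_000  # "around 20 kB" per image on disk

values_per_image = HEIGHT * WIDTH * CHANNELS

# Decoded size of the whole training set under two assumed dtypes.
train_bytes_uint8 = TRAIN_IMAGES * values_per_image * 1
train_bytes_float32 = TRAIN_IMAGES * values_per_image * 4

# Approximate size of the JPG files on disk.
train_bytes_on_disk = TRAIN_IMAGES * AVG_JPG_BYTES

print(f"values per image:        {values_per_image}")
print(f"decoded, uint8:          {train_bytes_uint8 / 1e9:.2f} GB")
print(f"decoded, float32:        {train_bytes_float32 / 1e9:.2f} GB")
print(f"JPG files on disk (est): {train_bytes_on_disk / 1e9:.2f} GB")
```

Under these assumptions the decoded training set is several times larger than the JPG files, which is one reason to stream images in batches or to work with compact feature vectors instead of raw pixels.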

Conclusion

In summary, we are satisfied with the project and the results it yielded. Our approach was incremental: we found better ways to tackle the problem over the course of development. After introducing the problem and exploring the data, the first shortlisted model was a Support Vector Machine (SVM) trained on the pixel distribution data of the bird pictures. This method was quickly abandoned, since the results were poor:

Train Accuracy: 0.323
Test Accuracy: 0.171

Pixel distribution alone does not hold enough information for the model in this application.
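The exact form of the pixel distribution features is not described here. One plausible reading, assumed for this sketch, is a per-channel color histogram. The function name `channel_histogram` and the choice of 8 bins are hypothetical. The example shows why such features lose information: two images with the same pixels in a different arrangement produce identical feature vectors.

```python
def channel_histogram(pixels, bins=8):
    """Return a normalized per-channel histogram for (r, g, b) pixels in 0..255.

    The result has 3 * bins entries: red bins first, then green, then blue.
    """
    bin_width = 256 // bins
    counts = [0] * (3 * bins)
    n_pixels = 0
    for rgb in pixels:
        for channel, value in enumerate(rgb):
            bin_index = min(value // bin_width, bins - 1)
            counts[channel * bins + bin_index] += 1
        n_pixels += 1
    if n_pixels == 0:
        raise ValueError("at least one pixel is required")
    return [count / n_pixels for count in counts]


# Two tiny made-up "images" with the same pixels in a different order.
image_a = [(255, 0, 0), (0, 0, 255)]
image_b = [(0, 0, 255), (255, 0, 0)]

features_a = channel_histogram(image_a)
features_b = channel_histogram(image_b)

print(len(features_a))
print(features_a == features_b)
```

Because the histogram discards where each color appears, shape and pattern cues are invisible to a model trained on it.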

After we switched to the Image2Vector library to extract feature vectors from the images and trained the SVM on those vectors, the model's performance skyrocketed to

Train Accuracy: 0.993
Test Accuracy: 0.951
Test Precision: 0.954
Test Recall: 0.951

which was very surprising, as we expected that only a convolutional neural network would be able to deliver such results.
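One detail in the figures above is that Test Recall matches Test Accuracy. How the precision and recall were averaged is not stated here, but if they were averaged over classes, this match is expected: on a test set with the same number of images per class, macro-averaged recall always equals accuracy, while precision can differ. The sketch below demonstrates this on made-up labels; the helper names are hypothetical.

```python
def class_counts(y_true, y_pred, labels):
    """Count true positives, actual occurrences and predictions per class."""
    tp = {label: 0 for label in labels}
    actual = {label: 0 for label in labels}
    predicted = {label: 0 for label in labels}
    for t, p in zip(y_true, y_pred):
        actual[t] += 1
        predicted[p] += 1
        if t == p:
            tp[t] += 1
    return tp, actual, predicted


def macro_recall(y_true, y_pred, labels):
    """Unweighted mean of per-class recall."""
    tp, actual, _ = class_counts(y_true, y_pred, labels)
    return sum(tp[c] / actual[c] for c in labels) / len(labels)


def macro_precision(y_true, y_pred, labels):
    """Unweighted mean of per-class precision (0 for classes never predicted)."""
    tp, _, predicted = class_counts(y_true, y_pred, labels)
    per_class = [tp[c] / predicted[c] if predicted[c] else 0.0 for c in labels]
    return sum(per_class) / len(labels)


# Balanced toy test set: two samples for each of three made-up classes.
labels = ["a", "b", "c"]
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "a", "c", "c"]

acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
rec = macro_recall(y_true, y_pred, labels)
prec = macro_precision(y_true, y_pred, labels)

print(f"accuracy        = {acc:.3f}")
print(f"macro recall    = {rec:.3f}")
print(f"macro precision = {prec:.3f}")
```

On a balanced test set, recall therefore adds little beyond accuracy, whereas per-class precision still shows which species attract wrong predictions.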

The neural network proved to be very computationally demanding: training it without a pre-trained model, on only a small subset of the original dataset, took more than 3 hours. Because Image2Vector had proved to be an effective tool, we used the image vectors as input to the second version of the neural network, which resulted in our best neural network model with

Validation Accuracy: 0.887
Test Accuracy: 0.910

This performance is slightly worse than the SVM with feature vectors, but offers the versatility of the neural network and allows us to train the model on the large original dataset in about 90 seconds.

Our goal was to recognize bird species in our own pictures using our models. We successfully showcase this in the last steps, which concludes our project.
