
NeuralNetwork-Based-Generic-Compression

Generic compression using an autoencoder.


Installation

Python 3.x is required for this to work.

  1. Create a Python 3 virtual environment.
  2. Install the necessary packages:
pip install tensorflow==2.1.0
pip install keras==2.3.1
pip install opencv-python==4.2.0.32
pip install Pillow==7.0.0
pip install image==1.5.28
pip install noisereduce
pip install numpy
pip install matplotlib
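For convenience, the pinned versions above can equivalently be collected in a requirements.txt (a hypothetical file, not part of this repo) mirroring the commands exactly:

```text
tensorflow==2.1.0
keras==2.3.1
opencv-python==4.2.0.32
Pillow==7.0.0
image==1.5.28
noisereduce
numpy
matplotlib
```

and installed in one step with `pip install -r requirements.txt`.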
  3. Download the required files from this GitHub repo, OR clone the repository.
  4. Put the downloaded files in a single folder.

Usage

Open a command prompt and navigate to the folder from step 4 of the Installation section.

Note: the model detects the file type (image/audio) automatically; you do not need to specify it.
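The automatic detection can be done purely from the file extension. A minimal sketch of such a helper (hypothetical; `detect_filetype` is not necessarily the name used in `main.py`), using the extensions from the Supported formats section below:

```python
from pathlib import Path

# Extensions taken from the "Supported formats" section of this README.
IMAGE_EXTS = {".jpeg", ".jpg", ".png", ".tiff"}
AUDIO_EXTS = {".wav"}

def detect_filetype(path):
    """Return 'image' or 'audio' based on the file extension."""
    ext = Path(path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in AUDIO_EXTS:
        return "audio"
    raise ValueError(f"Unsupported file extension: {ext}")
```

For example, `detect_filetype("myaudio.wav")` returns `"audio"`, so the audio model would be selected without any extra flag on the command line.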


Encode Mode

To Encode (compress), use the following command:

python main.py encode [input_file_path] [compressed_file_path]

Examples:
python main.py encode myimage.png mycompressed
python main.py encode myaudio.wav mycompressed

Note: include the extension on the input file path, but omit it on the compressed file path.


Decode Mode

To Decode (decompress), use the following command:

python main.py decode [compressed_file_path] [output_file_path]

Examples:
python main.py decode mycompressed my_image_output.png
python main.py decode mycompressed my_audio_output

Note: include the output file extension for image output only; audio output does not need one.
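Taken together, the two modes imply a simple command-line contract. A minimal sketch of the argument handling (hypothetical; the actual internals of `main.py` may differ, but the mode names and argument order follow the Usage section above):

```python
# Hypothetical sketch of main.py's argument handling.
def parse_args(argv):
    """Validate ['encode'|'decode', input_path, output_path] and return the triple."""
    if len(argv) != 3 or argv[0] not in ("encode", "decode"):
        raise SystemExit("usage: python main.py {encode|decode} <input> <output>")
    mode, src, dst = argv
    return mode, src, dst
```

In encode mode `src` is the raw image/audio file and `dst` the compressed output; in decode mode the roles are reversed.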


Supported formats

Audio:

  • .wav

Image:

  • .jpeg
  • .jpg
  • .png
  • .tiff

Datasets Used

Image Datasets

Note: Not all of the datasets' content was used, due to resource limitations.

https://www.kaggle.com/evgeniumakov/images4k
http://www.cs.toronto.edu/~kriz/cifar.html
https://www.kaggle.com/hsankesara/flickr-image-dataset
https://www.kaggle.com/vishalsubbiah/pokemon-images-and-types

  • All images are first processed with data_generator.py before being used for training.
  • All images are cut into 32x32 blocks to match the model's input size.
  • Around 1,000,000 32x32x3 image blocks are used for training (the datasets contain 15,000,000+).
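The block-cutting step described above can be sketched with NumPy (a hypothetical illustration, not the actual code in data_generator.py): each H x W x 3 image is trimmed to a multiple of 32 on each side and split into non-overlapping 32x32x3 tiles.

```python
import numpy as np

def to_blocks(image, block=32):
    """Split an H x W x C image array into non-overlapping block x block x C
    tiles, discarding any remainder at the right and bottom edges."""
    h, w, c = image.shape
    h_trim, w_trim = h - h % block, w - w % block
    image = image[:h_trim, :w_trim]
    # Reshape into a grid of tiles, then flatten the grid dimensions.
    tiles = image.reshape(h_trim // block, block, w_trim // block, block, c)
    return tiles.swapaxes(1, 2).reshape(-1, block, block, c)
```

A 70x65x3 image, for instance, yields a (4, 32, 32, 3) array: a 2x2 grid of tiles, with the 6- and 1-pixel remainders discarded.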

Audio Dataset

Note: A 2.49 GB portion of the dataset (125 WAV songs) was used for training.
