A C++ framework implementing the MapReduce programming model with multi-threading support, designed for efficient parallel data processing.
This project was developed by Noam Kimhi and Or Forshmit as part of the course
67808 - Operating Systems at
The Hebrew University of Jerusalem (HUJI).
The framework enables the distribution of data processing tasks across multiple threads,
allowing for scalable and efficient computation of large datasets.
🎓 Final Grade: 100
- 📘 Overview
- 🧾 Table of Contents
- ⚙️ Features
- 🛠️ Requirements
- 📦 Installation
- 🚀 Usage
- 🗂️ Project Structure
- 📄 License
- Multithreaded Execution: Processes data in parallel, leveraging multiple CPU cores.
- MapReduce Paradigm: Supports the standard `Map` and `Reduce` functions for data transformation and aggregation.
- Modular Design: Easily extendable and adaptable to various data processing tasks.
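
To make the multithreaded execution point concrete, here is a minimal sketch of one common way to spread work over several threads: each worker claims the next input index from a shared atomic counter. This is an illustration under stated assumptions, not this project's implementation; the names `parallelMap` and `numThreads` are hypothetical and do not appear in the framework.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of this project's API): applies `fn` to every
// element of `input` using `numThreads` worker threads.
std::vector<int> parallelMap(const std::vector<int>& input,
                             const std::function<int(int)>& fn,
                             unsigned numThreads) {
    std::vector<int> output(input.size());
    std::atomic<std::size_t> next{0};  // index of the next unclaimed item

    auto worker = [&]() {
        while (true) {
            // fetch_add hands each index to exactly one thread.
            std::size_t i = next.fetch_add(1);
            if (i >= input.size()) break;
            output[i] = fn(input[i]);  // every thread writes a distinct slot
        }
    };

    if (numThreads == 0) numThreads = 1;  // always run at least one worker
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numThreads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();  // wait until all items are processed
    return output;
}
```

Because each index is claimed exactly once and each result goes to its own slot, the workers need no lock around `output`. On some toolchains the program must be linked with `-pthread`.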
- C++20 or higher
- CMake 3.10 or higher (if using the provided `CMakeLists.txt`)
- GNU Make (if you use the provided `Makefile`)
- A C++ compiler supporting C++20 (e.g., GCC, Clang, MSVC)
- Clone the repository:
  `git clone https://github.com/OrF8/MapReduceFramework.git`
- Navigate to the project directory:
  `cd MapReduceFramework`
- Create a build directory:
  `mkdir build`
- Navigate to the build directory:
  `cd build`
- Build the project:
  - Using CMake:
    `cmake ..` followed by `make`
  - Or using GNU Makefile:
    `make`

  The build produces `libMapReduceFramework.a` in the `build/` directory.
- Additional Make and CMake targets are available, such as `tar`.
To use the MapReduce framework, follow these steps:
- Implement your own `Map` and `Reduce` functors by inheriting from the provided interfaces in `include/`.
- Call the `startMapReduceJob` function with your data.
- An example of how to use the framework can be found in the `examples/` directory.
  - To run the example with CMake:
    `mkdir build`, `cd build`, `cmake ..`, `make`, then `./SampleClient` (or `./SampleClient.exe` on Windows)
  - Or using GNU Makefile:
    `make runSampleClient`
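
The steps above can be sketched with a small character-count job. The snippet below is self-contained and deliberately does not include the project's headers: `SketchClient`, `CharCounter`, `Pair`, and `runSketchJob` are hypothetical stand-ins invented for illustration, and the real interfaces in `include/` and the real signature of `startMapReduceJob` may differ. The driver here is sequential; it only shows the map, shuffle, and reduce phases that a client plugs into.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<char, int>;  // (character, count)

// Hypothetical stand-in for the interface a client inherits from.
struct SketchClient {
    virtual ~SketchClient() = default;
    // Map: turn one input record into zero or more intermediate pairs.
    virtual void map(const std::string& record,
                     std::vector<Pair>& emitted) const = 0;
    // Reduce: collapse all values that share a key into one output pair.
    virtual Pair reduce(char key, const std::vector<int>& values) const = 0;
};

// A concrete client that counts how often each character occurs.
struct CharCounter : SketchClient {
    void map(const std::string& record,
             std::vector<Pair>& emitted) const override {
        for (char c : record) emitted.emplace_back(c, 1);
    }
    Pair reduce(char key, const std::vector<int>& values) const override {
        int total = 0;
        for (int v : values) total += v;
        return {key, total};
    }
};

// Sequential driver standing in for the framework's job runner.
std::vector<Pair> runSketchJob(const SketchClient& client,
                               const std::vector<std::string>& input) {
    // Map phase: every record emits intermediate pairs.
    std::vector<Pair> intermediate;
    for (const auto& record : input) client.map(record, intermediate);

    // Shuffle phase: group values by key (std::map keeps keys sorted).
    std::map<char, std::vector<int>> grouped;
    for (const auto& p : intermediate) grouped[p.first].push_back(p.second);

    // Reduce phase: one output pair per distinct key.
    std::vector<Pair> output;
    for (const auto& g : grouped) {
        output.push_back(client.reduce(g.first, g.second));
    }
    return output;
}
```

Calling `runSketchJob(CharCounter{}, input)` on a vector of strings returns one `(character, count)` pair per distinct character, ordered by character. In the actual framework, the equivalent work is started through `startMapReduceJob`, which runs the phases across multiple threads.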
.
├── example/ # Sample jobs (e.g. char count)
│ └── SampleClient.cpp
├── include/ # Public headers (MapReduceFramework API)
│ ├── Barrier.h
│ ├── JobStateManager.h
│ ├── MapReduceClient.h
│ └── MapReduceFramework.h
├── src/ # Framework implementation
│ ├── Barrier.cpp
│ ├── JobStateManager.cpp
│ └── MapReduceFramework.cpp
├── CMakeLists.txt # CMake build script
├── Makefile # Alternative Makefile build
├── LICENSE # The license file
└── README.md # ← you are here
This project is licensed under the MIT License. See the `LICENSE` file for details.