Distributed MapReduce

An implementation of the MapReduce model, as described in this paper:

MapReduce: Simplified Data Processing on Large Clusters

Overview

[Figure: MapReduce overview]

Execution

Master

  • keeps track of map jobs, reduce jobs and backlog jobs
  • pings each worker at regular intervals to check its job status
  • if a worker is down, pushes that worker's job to the backlog
  • when a worker asks for a job, assigns one from the backlog, map, or reduce jobs
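
The master's bookkeeping described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the repository's code: the `Master` class, its method names, and the `is_alive` callable (a stand-in for a network ping) are all hypothetical. It also assumes the backlog is served first, then map jobs, then reduce jobs, and it does not model waiting for all map jobs to finish before reduce jobs are handed out.

```python
from collections import deque


class Master:
    """Hypothetical in-memory master tracking map, reduce, and backlog jobs."""

    def __init__(self, map_jobs, reduce_jobs):
        self.map_jobs = deque(map_jobs)
        self.reduce_jobs = deque(reduce_jobs)
        self.backlog = deque()   # jobs recovered from workers that went down
        self.running = {}        # worker name -> job currently assigned

    def request_job(self, worker):
        """Assign the next job (assumed order: backlog, then map, then reduce)."""
        for queue in (self.backlog, self.map_jobs, self.reduce_jobs):
            if queue:
                job = queue.popleft()
                self.running[worker] = job
                return job
        return None  # nothing left to hand out

    def job_done(self, worker, job):
        """Record that a worker reported its job as finished."""
        if self.running.get(worker) == job:
            del self.running[worker]

    def check_workers(self, is_alive):
        """One ping round: move the job of every unreachable worker to the backlog."""
        for worker in list(self.running):
            if not is_alive(worker):
                self.backlog.append(self.running.pop(worker))
```

Serving the backlog first is one reasonable choice, since it keeps a failed worker's job from waiting behind work that has not started yet; the actual implementation may order things differently.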

Worker

  • connects to the master
  • asks for a job
  • pings the master when the job is done
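
The worker's side of this exchange could look something like the loop below. It is a sketch under stated assumptions: `StubMaster` is a hypothetical stand-in for the real network connection, `run_worker` and `execute` are illustrative names, and the worker is assumed to stop once the master has no job to offer.

```python
class StubMaster:
    """Hypothetical stand-in for the connection to the real master."""

    def __init__(self, jobs):
        self.jobs = list(jobs)
        self.done = []  # (worker, job) pairs reported as finished

    def request_job(self, worker):
        return self.jobs.pop(0) if self.jobs else None

    def job_done(self, worker, job):
        self.done.append((worker, job))


def run_worker(name, master, execute):
    """Ask for jobs until none remain, reporting each completion to the master."""
    completed = 0
    while True:
        job = master.request_job(name)
        if job is None:
            break                    # assumed: no job offered means stop
        execute(job)                 # run the map or reduce task
        master.job_done(name, job)   # tell the master the job is done
        completed += 1
    return completed
```

For example, `run_worker("worker_name", StubMaster(["map-0", "reduce-0"]), print)` would request, run, and report two jobs in turn.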

In Action

[Figure: MapReduce in action]

Usage

  • start a master: make master
  • start workers: make worker NAME=worker_name