MapReduce-Udacity

Example MapReduce and MapReduce design-pattern scripts, written in Python and based on the Udacity MapReduce class materials.

AccessLogMR

  • HitToPages: counts the hits for each distinct file on the website
  • HitsFromIP: counts the hits made to the site by each distinct IP address
  • MostPopularPath: finds the most popular file on the website, i.e. the file whose path occurs most often in access_log
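
As a rough illustration of the per-file hit count and the most-popular-path lookup described above, here is a minimal in-memory sketch. It assumes access_log lines follow the Common Log Format, with the request in double quotes; the function names and the sample lines are hypothetical and not taken from the repository's scripts, which may be structured differently (for example as separate mapper and reducer scripts reading stdin).

```python
from itertools import groupby


def map_hits(lines):
    """Mapper: emit (path, 1) for each parsable access-log line."""
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # skip lines without a quoted request
        request = parts[1].split()  # e.g. ['GET', '/index.html', 'HTTP/1.1']
        if len(request) < 2:
            continue
        yield request[1], 1


def reduce_hits(pairs):
    """Reducer: sum counts per path; expects pairs sorted by key, as after a shuffle."""
    for path, group in groupby(pairs, key=lambda kv: kv[0]):
        yield path, sum(count for _, count in group)


def hits_per_path(lines):
    # sorted() stands in for the shuffle-and-sort phase between map and reduce.
    return dict(reduce_hits(sorted(map_hits(lines))))


# Made-up sample lines, for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Jan/2012:09:00:00 -0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Jan/2012:09:00:05 -0800] "GET /about.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2012:09:00:09 -0800] "GET /index.html HTTP/1.1" 200 1024',
]
hits = hits_per_path(sample_log)
most_popular = max(hits.items(), key=lambda kv: kv[1])
```

Counting hits per IP address would follow the same shape, with the mapper emitting the first whitespace-separated field of each line instead of the request path.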

MapReducePatterns

  • StructualPatterns: reproduces the effect of a SQL natural join
  • SummarizationPatterns
    • Combiner: the combiner pattern, with an example that computes the average sales for each day
    • InvertedIndex: shows which pages each word in the text file appears on, like the index at the back of a book
    • NumericalSumm: computes a numerical summary of the data (using the mean as the example)
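
The combiner and mean examples above share one subtlety: a mean cannot be combined by averaging partial means, so the intermediate value usually carries a (sum, count) pair. The sketch below illustrates that idea under assumed inputs; the record layout (weekday, amount), the function names, and the two sample splits are hypothetical and not taken from the repository.

```python
from collections import defaultdict


def map_sale(record):
    """Mapper: turn one (weekday, amount) record into (weekday, (amount, 1))."""
    weekday, amount = record
    return weekday, (amount, 1)


def combine(pairs):
    """Combiner: pre-aggregate (sum, count) per key on one mapper's output."""
    partial = defaultdict(lambda: [0.0, 0])
    for key, (total, count) in pairs:
        partial[key][0] += total
        partial[key][1] += count
    return [(key, (total, count)) for key, (total, count) in partial.items()]


def reduce_mean(pairs):
    """Reducer: merge the partial (sum, count) pairs, then divide once."""
    merged = dict(combine(pairs))
    return {key: total / count for key, (total, count) in merged.items()}


# Two hypothetical input splits, each handled by its own mapper and combiner.
split_a = [("Mon", 10.0), ("Mon", 20.0), ("Tue", 5.0)]
split_b = [("Mon", 60.0), ("Tue", 15.0)]
combined = combine(map(map_sale, split_a)) + combine(map(map_sale, split_b))
means = reduce_mean(combined)
```

In this sample, averaging the two per-split means for "Mon" (15.0 and 60.0) would give 37.5, while carrying sums and counts gives the correct 30.0.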

SalesMR

  • HighestSale: finds the monetary value of the highest individual sale for each store
  • SalesPerCategory: gives a sales breakdown by product category across all of the stores
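
A minimal sketch of the highest-individual-sale-per-store computation described above. It assumes each input line has six tab-separated fields (date, time, store, item category, cost, payment method); that layout, the function names, and the sample lines are assumptions for illustration rather than the repository's actual code or data.

```python
def map_purchase(line):
    """Mapper: emit (store, cost) for a well-formed line, or None otherwise."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return None  # skip malformed lines
    _date, _time, store, _item, cost, _payment = fields
    return store, float(cost)


def reduce_max(pairs):
    """Reducer: keep the largest cost seen for each store."""
    highest = {}
    for store, cost in pairs:
        if store not in highest or cost > highest[store]:
            highest[store] = cost
    return highest


# Made-up sample lines in the assumed six-field layout.
sample_purchases = [
    "2012-01-01\t09:00\tSan Jose\tMusic\t214.05\tAmex",
    "2012-01-01\t09:01\tSan Jose\tToys\t88.50\tVisa",
    "2012-01-01\t09:02\tMiami\tBooks\t12.00\tCash",
    "malformed line",
]
mapped = [p for p in map(map_purchase, sample_purchases) if p is not None]
highest_sales = reduce_max(mapped)
```

A per-category breakdown would use the same mapper shape with the item category as the key and a sum in place of the maximum.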

StadiumMR (not from Udacity):

  • counts how many stadiums in the given dataset have a roof

WordCount:

  • lists each word in the given text file and how many times it occurs
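
The word count above can be sketched in a few lines. This is an in-memory illustration only: the tokenization rule (lowercase, runs of letters and apostrophes), the function names, and the sample text are assumptions, and the repository's script may split words differently.

```python
import re
from collections import Counter


def map_words(line):
    """Mapper: emit (word, 1) for every word in the line, lowercased."""
    return [(word, 1) for word in re.findall(r"[a-z']+", line.lower())]


def reduce_counts(pairs):
    """Reducer: sum the counts for each word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


# Made-up sample text, for illustration only.
text = ["the quick brown fox", "The lazy dog and the fox"]
pairs = [pair for line in text for pair in map_words(line)]
word_counts = reduce_counts(pairs)
```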