Introduction & Summarization Patterns:
Learning Objectives - In this module, you will be introduced to Design Patterns for MapReduce and to the general structure of the course and project work. The module also covers Summarization Patterns: patterns that give a summarized, top-level view of large data sets.
Topics - Review of MapReduce, Why Design Patterns are required for MapReduce, Discussion of different classes of Design Patterns, Discussion of project work and problem, About Summarization Patterns, Types of Summarization Patterns: Numerical Summarization Pattern, Inverted Index Pattern and Counting with Counters Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
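As a flavour of the Numerical Summarization Pattern listed above, the following is a minimal in-memory sketch in Python rather than Hadoop code. The record layout (user id, comment length) and the names `mapper`, `merge` and `run_job` are assumptions made for illustration. The point it shows is that the mapper emits a (min, max, count) triple, so the same associative merge can serve as both combiner and reducer.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input records: (user_id, comment_length)
RECORDS = [("u1", 12), ("u2", 40), ("u1", 7), ("u2", 3), ("u1", 30)]


def mapper(record):
    """Emit (key, (min, max, count)) so partial results merge associatively."""
    user_id, length = record
    yield user_id, (length, length, 1)


def merge(a, b):
    """Merge two partial summaries; usable as both combiner and reducer."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2])


def run_job(records):
    """Simulate map, shuffle (group by key) and reduce in a single process."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reduce(merge, values) for key, values in groups.items()}


summary = run_job(RECORDS)
```

Here `summary` maps each user id to its (min, max, count) triple. An average would need a (sum, count) pair instead, because averages of averages do not merge correctly.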
Filtering Patterns:
Learning Objectives - In this module, we will discuss Filtering Patterns: Patterns that create subsets of data for a more detailed view.
Topics - About Filtering Patterns, Explaining & distinguishing 4 different types of Filtering Patterns: Filtering Pattern, Bloom Filter Pattern, Top Ten Pattern and Distinct Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
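As a flavour of the Top Ten Pattern listed above, here is a minimal Python sketch (not Hadoop code). The names `map_top_n` and `reduce_top_n`, the (id, score) record layout and the sample splits are assumptions made for illustration. Each simulated mapper keeps only its local top N, and a single reducer merges those short lists into the global top N.

```python
import heapq


def map_top_n(split, n):
    """Mapper side: keep only the n highest-scoring records of one input split."""
    return heapq.nlargest(n, split, key=lambda rec: rec[1])


def reduce_top_n(partial_lists, n):
    """Reducer side: merge the local top-n lists into the global top n."""
    merged = [rec for partial in partial_lists for rec in partial]
    return heapq.nlargest(n, merged, key=lambda rec: rec[1])


# Hypothetical input splits of (record_id, score) pairs
SPLITS = [
    [("a", 5), ("b", 91), ("c", 17)],
    [("d", 64), ("e", 2), ("f", 88)],
]

top = reduce_top_n([map_top_n(split, 2) for split in SPLITS], 2)
```

Because every mapper forwards at most N records, the single reducer sees at most N times the number of mappers, however large the input is.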
Data Organization Patterns:
Learning Objectives - In this module, we will discuss Data Organization Patterns: Patterns that are about re-organizing and transforming data. Patterns in this category are often used together to achieve the end objective.
Topics - About Data Organization Patterns, Explaining 5 different types of Data Organization Patterns: Structured to Hierarchical Pattern, Partitioning Pattern, Binning Pattern, Total Order Sorting Pattern and Shuffling Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
Join Patterns:
Learning Objectives - In this module, we will discuss Join Patterns: Patterns to be used when your data is scattered across multiple sources and you want to uncover interesting relationships using these sources together.
Topics - About Join Patterns, Explaining 4 different types of Join Patterns: Reduce Side Join Pattern, Replicated Join Pattern, Composite Join Pattern and Cartesian Product Join Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
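As a flavour of the Reduce Side Join Pattern listed above, here is a minimal in-memory Python sketch (not Hadoop code). The two data sets, the "U"/"C" source tags and all function names are assumptions made for illustration. Each mapper tags its records with their source, the shuffle groups them by the join key, and the reducer pairs the two sides (an inner join in this sketch).

```python
from collections import defaultdict

# Hypothetical data sets sharing user_id as the join key
USERS = [(1, "alice"), (2, "bob"), (3, "carol")]
COMMENTS = [(1, "nice post"), (1, "thanks"), (3, "agreed")]


def map_users(rec):
    """Tag user records with 'U' so the reducer can tell the sources apart."""
    user_id, name = rec
    return user_id, ("U", name)


def map_comments(rec):
    """Tag comment records with 'C'."""
    user_id, text = rec
    return user_id, ("C", text)


def reduce_inner_join(key, tagged_values):
    """Pair every user value with every comment value for one join key."""
    names = [value for tag, value in tagged_values if tag == "U"]
    texts = [value for tag, value in tagged_values if tag == "C"]
    return [(key, name, text) for name in names for text in texts]


def run_join(users, comments):
    """Simulate map, shuffle (group by key) and reduce in a single process."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for rec in users:
        key, value = map_users(rec)
        groups[key].append(value)
    for rec in comments:
        key, value = map_comments(rec)
        groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_inner_join(key, groups[key]))
    return joined


joined = run_join(USERS, COMMENTS)
```

User 2 has no comments and is dropped by the inner join. An outer join would instead emit the unmatched side with a placeholder.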
Meta Patterns & Graph Patterns:
Learning Objectives - In this module, we will discuss Meta Patterns & Graph Patterns. Meta Patterns differ from the patterns discussed above: they are not basic patterns, but patterns about patterns. The module also introduces Graph Patterns.
Topics - About Meta Patterns, Types of Meta Patterns: Job Chaining (Description, use cases, chaining with driver, basic & parallel job chaining, chaining with shell scripts, chaining with job control, Example code walk-through); Chain Folding (Description, What to fold, Chain Mapper, Chain Reducer, Example code walk-through); Job Merging (Description, Steps for merging two jobs, Example code walk-through). Introduction to Graph Design Patterns, Types of Graph Design Patterns: In-mapper Combining Pattern, Shimmy Pattern and Range Partitioning Pattern, Pseudo-code for each pattern applied to the PageRank algorithm.
Input-Output Pattern & Project Review:
Learning Objectives - In this module, we will discuss Input-Output Patterns: Patterns that are about customizing input & output to increase the value of MapReduce. The module closes with a Project Review.
Topics - About Input-Output Patterns, Types of Input-Output Patterns: Customizing Input & Output, Generating Data, External Source Output, External Source Input and Partition Pruning. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis, Example code walk-through. Review of the project work solutions.