Design patterns for the MapReduce framework, until now, have been scattered among various research papers, blogs, and books. This handy guide brings together a unique collection of valuable MapReduce patterns that will save you time and effort regardless of the domain, language, or development framework you're using. Each pattern is explained in context, with pitfalls and caveats clearly identified - so you can avoid some of the common design mistakes when modeling your Big Data architecture. This book also provides a complete overview of MapReduce that explains its origins and implementations, and why design patterns are so important. Hadoop MapReduce code is provided to help you learn how to apply the design patterns by example.

Topics include:

- Basic patterns, including map-only filter, group by, aggregation, distinct, and limit
- Joins: traditional reduce-side join, reduce-side join with Bloom filter, replicated join with distributed cache, merge join, Cartesian products, and intersections
- Binning, sharding for other systems, sorting, sampling, unions, and other patterns for organizing data
- Job optimization patterns, including multi-job map-only job folding, and overloading the key grouping to perform two jobs at once
Still mulling it over; it needs more time to digest…
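Skimmed through it in about 3-4 hours and refreshed my memory of Input/OutputFormat, RecordReader/Writer, and InputSplit. I got almost nothing new out of it; it is better suited as a quick read for programmers who have only just learned to write MapReduce.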
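The author has some nerve: a handful of MR exercise problems is apparently enough to fill a book.
It just shows you how to implement SQL JOINs, aggregate functions, and the like with MR.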