Hadoop: The Definitive Guide


Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net, and IBM's developerWorks, and has spoken at several conferences, including ApacheCon 2008, where he spoke on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.

Publisher: O'Reilly Media
Author: Tom White
Producer:
Pages: 756
Translator:
Publication date: 2015-4-11
Price: USD 49.99
Binding: Paperback
ISBN: 9781491901632
Series:
Tags:
  • Hadoop
  • Big data
  • BigData
  • Computing
  • Distributed systems
  • hadoop
  • Machine learning
  • O'Reilly

Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.

Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.

Learn fundamental components such as MapReduce, HDFS, and YARN

Explore MapReduce in depth, including steps for developing applications with it

Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN

Learn two data formats: Avro for data serialization and Parquet for nested data

Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)

Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop

Learn the HBase distributed database and the ZooKeeper distributed configuration service


Reviews

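Your résumé now lists you as the translator of Hadoop: The Definitive Guide, but you don't deserve the credit. This is the most careless translation I have ever seen; the prose is clumsy throughout. Please don't force yourself: even the MapReduce shuffle mechanism is poorly translated. Admittedly, the original author's writing is only average, but chapters 1, 2, 5, 6, and 7 are translated terribly. Please don't fob readers off with Google Translate; it misleads learners ...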

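The translation falls short in many places, and you have to check it against the English original to understand it. Still, it remains a decent choice for learning quickly. I suggest the translator weigh the importance of each part: the unimportant parts can be translated roughly, but the important ones deserve real effort, rather than getting the priorities backwards. For example, the data flow section in Chapter 3, such a classic passage, has been translated into a complete mess. I don't know whether the translator will ...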

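A very good Hadoop tutorial, far more detailed than the online guides from Apache and Yahoo!. Many Hadoop implementation details that I could not figure out can be found in this book.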

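Chinese edition, p. 412: "So in theory, anything can be expressed in binary form and then converted into a string of long integers, or the data structure can be serialized directly, to serve as the key." Original, p. 460: "..., so theoretically anything can serve as row key, from strings to binary representations of long or even serialized ..."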

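The book does not reveal much about the implementation architecture; instead, it covers Hadoop mostly from a user's perspective, including MapReduce, HDFS, Hive, Pig, HBase, and ZooKeeper. It touches on almost everything about using Hadoop, including installation and day-to-day use. You can even install Hadoop on your own computer and work through the book's examples hands-on ...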

User comments

A good book for getting started with Hadoop.

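Reading the first two parts is enough; there is no need to go deep into the related Pig, Hive, and Spark material unless you are going to use them in practice. Having read the three Google papers in an undergraduate course, I got through this book quickly.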

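I finished it; this was my first exposure to big data. The book is quite comprehensive: the first part covers the principles, the middle part introduces Hadoop-based projects in detail, and the end gives concrete application examples. There is a lot I still don't fully understand, so I need to read further.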

I have read the first part for an overview. Superb. I will definitely come back for the details before the third year of my career.


