Mining of Massive Datasets online e-book. Book tags: data mining, computer science, machine learning, Data, Coursera, CS, data analysis, software engineering
Published 2025-03-22
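The content is good, but for a technical book it stays somewhat on the surface.
There are a great many bugs, and there is nowhere to report them. Reading it was extremely painful: I forgot the earlier parts as I read the later ones. Perhaps the algorithms in it are inherently like that. Bottom line: at least the latest research results of the past 15 years are strung together and explained in a way that even undergraduates can follow.
It took me six months, on and off, to finish. The ideas of hashing and approximation really broadened my horizons. My first read was rather rushed; this book deserves repeated reading and plenty of practice.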
This is the reference textbook for a course next semester. I hear the professor is quite good, so I plan to study this course properly.
Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. Problems he investigates are motivated by large scale data, the Web and on-line media. This research has won several awards including a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction.
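I see many people saying that this book is an outline, a table of contents, with little substance and shallow coverage. That is exactly right. The book is the lecture notes used for Stanford's CS246 course (MMDS), and it comes with accompanying slides and homework, so read it alongside the course; there is also a companion course on Coursera. See more detail: http://www.mmds.org/
This book is actually quite good, but you really have to read the English edition. It is one of the reference books for our course. There were parts of the English edition I did not understand, so I went looking for the Chinese edition. After reading it, I found that the translation is at about the level of the papers I translated for my advisor in my senior year and first year of graduate school. You can tell the book was probably translated by students, and by students who did not yet know the field well...
What characterizes Web data mining, and what theory and techniques does it add compared with ML? (1) It covers roughly 20 papers, presented in a uniform language and at a uniform mathematical depth. (2) Hashing is used a great deal, in various ways, as follows. a. Speeding up retrieval, e.g. an index. b. Randomly grouping data. c. Defining data mappings and repeating those mappings. This is the most basic function. But for new data the mapping will...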
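As a rough illustration of the first two uses of hashing that the review above lists (a. an index for fast retrieval, b. random grouping of data), here is a minimal Python sketch. It is not taken from the book; the function names `stable_hash`, `assign_group`, and `build_index`, the sample records, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def stable_hash(key: str) -> int:
    """Deterministic integer hash of a string.

    Python's built-in hash() for str is salted per process, so a
    cryptographic digest is used here purely for reproducibility.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_group(key: str, num_groups: int) -> int:
    """Use (b): pseudo-randomly assign a key to one of num_groups groups."""
    return stable_hash(key) % num_groups


def build_index(records):
    """Use (a): hash index from key to values, for fast expected lookup."""
    index = defaultdict(list)
    for key, value in records:
        index[key].append(value)
    return index


# Hypothetical sample data, for illustration only.
records = [("alice", 1), ("bob", 2), ("alice", 3)]
index = build_index(records)
groups = {key: assign_group(key, 4) for key, _ in records}
```

The same key always lands in the same group, which is what makes hash-based grouping usable across machines or passes over the data; the index relies on Python's `dict`, which is itself a hash table.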
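I finally finished this book. I read it rather roughly, but I still found many small errors. I do not know whether they are the authors' or the translators', but overall they give an impression of carelessness. The material itself is fairly easy to understand (although I did not retain much of it, embarrassingly), and I still picked up a good amount of knowledge. Hard work pays off!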
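Translators of foreign books, could you please have someone who actually knows the field read the book through first! Since the Chinese translation is abridged and inaccurate, I went to great trouble to track down the original English edition. One look at the table of contents and I was thrilled: the author is seriously impressive! One more thing to add: if this book had been titled something like "Fundamentals of Data Mining", I definitely would not be bashing it. It is, after all, the basics of the...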