ARTICLE "Composition Techniques with JCascalog" read this article now in
Manning's Free Content Center

ch 1 audio

Resources

Source Code Book Forum ARTICLE "Data Storage in the Batch Layer with Pail" ARTICLE "Composition Techniques with JCascalog" Register your pBook for a free eBook more

Become a
Reviewer

Help us create great books

Big Data

you own this product

Principles and best practices of scalable realtime data systems

Nathan Marz and James Warren

April 2015
ISBN 9781617290343
328 pages

Included with a Manning Online subscription

printed in black & white

Available translations: German, Polish, Russian, Simplified Chinese

catalog / Data Science / Big Data / Big Data Processing

read now

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

eBook

$39.99 $29.99

you save $10.00 (25%)

include audio $19.99 $14.99

Look inside

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

about the book

Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size, or speed. Fortunately, scale and simplicity are not mutually exclusive.

Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.

what's inside

Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills

about the reader

This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

about the authors

Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

eBook

$39.99 $29.99

you save $10.00 (25%)

include audio $19.99 $14.99

Transcends individual tools or platforms. Required reading for anyone working with big data systems.

Jonathan Esterhazy, Groupon

A comprehensive, example-driven tour of the Lambda Architecture with its originator as your guide.

Mark Fisher, Pivotal

Contains wisdom that can only be gathered after tackling many big data projects. A must-read.

Pere Ferrera Bertran, Datasalt

The de facto guide to streamlining your data pipeline in batch and near-real time.

Alex Holmes, Author of "Hadoop in Practice"

Big Data

pro $24.99 per month

lite $19.99 per month

team

about the book

what's inside

about the reader

about the authors

related titles

related titles

pro

team

pro

team

pro

team