Manning Early Access Program
Big Data
Principles and best practices of scalable realtime data systems


Nathan Marz and James Warren

MEAP Began: January 2012
Softbound print: April 2015 (est.) | 425 pages | B&W
ISBN: 9781617290343

Pre-Order options*
Order now and start reading Big Data today through MEAP                    
  MEAP + eBook only - $39.99
  MEAP + Print book (includes eBook) when available - $49.99
* For more information, please see the MEAP FAQs page.

Table of Contents, MEAP Chapters & Resources

PART 1: BATCH LAYER
  1. A new paradigm for Big Data - FREE
  2. Data model for Big Data - AVAILABLE
  3. Data model for Big Data: illustration - AVAILABLE
  4. Data storage on the batch layer - AVAILABLE
  5. Data storage on the batch layer: illustration - AVAILABLE
  6. Batch layer - AVAILABLE
  7. Batch layer: illustration - AVAILABLE
  8. An example batch layer: architecture and algorithms - AVAILABLE
  9. An example batch layer: implementation - AVAILABLE

10. Serving layer - AVAILABLE
11. Serving layer: illustration - AVAILABLE

12. Realtime views - AVAILABLE
13. Realtime views: illustration - AVAILABLE
14. Queuing and stream processing - AVAILABLE
15. Queuing and stream processing: illustration - AVAILABLE
16. Micro-batch stream processing - AVAILABLE
17. Micro-batch stream processing: illustration - AVAILABLE
18. Lambda Architecture in-depth - AVAILABLE


Services like social networks, web analytics, and intelligent e-commerce often need to manage data at a scale too big for a traditional database. Complexity increases with scale and demand, and handling big data is not as simple as just doubling down on your RDBMS or rolling out some trendy new technology. Fortunately, scalability and simplicity are not mutually exclusive—you just need to take a different approach. Big data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.

Big Data teaches you to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

Big Data shows you how to build the back end for a real-time service, our version of Google Analytics. As you read, you'll discover that many standard RDBMS practices become unwieldy with large-scale data. To handle the complexities of big data and distributed systems, you must drastically simplify your approach. This book introduces a general framework for thinking about big data, and then shows how to apply technologies like Hadoop, Thrift, and various NoSQL databases to build simple, robust, and efficient systems to handle it.


This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.


Nathan Marz is currently working on a new startup. Previously, he was the lead engineer at BackType before it was acquired by Twitter in 2011. At Twitter, he started the streaming compute team, which provides and develops shared infrastructure to support many critical realtime applications throughout the company. Nathan is the creator of Cascalog and Storm, open-source projects that are relied upon by over 50 companies around the world, including Yahoo!, Twitter, Groupon, The Weather Channel, and Taobao.

James Warren is an analytics architect at Storm8 with a background in big data processing, machine learning, and scientific computing.


This Early Access version of Big Data enables you to receive new chapters as they are being written. You can also interact with the authors to ask questions, provide feedback and errata, and help shape the final manuscript on the Author Online forum.

