Five or six years ago, analysts working with big datasets ran queries and got the results back overnight. The data world was revolutionized a few years ago when Hadoop and other tools made it possible to get query results in minutes. But the revolution continues. Analysts now demand sub-second, near real-time query results. Fortunately, we have the tools to deliver them. This report examines the tools and technologies that are driving real-time big data analytics.
The first edition of Ralph Kimball’s The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.
Nowadays, distributed systems are increasingly present, for public software applications as well as critical systems. This title and Distributed Systems: Design and Algorithms – from the same editors – introduce the underlying concepts, the associated design techniques and the related security issues.
This practical book is a complete guide to installing, configuring, and managing Pentaho Kettle. If you’re a database administrator or developer, you’ll first get up to speed on Kettle basics and how to apply Kettle to create ETL solutions – before progressing to specialized concepts such as clustering, extensibility, and data vault models. Learn how to design and build every phase of an ETL solution.
Aimed at helping business and IT managers communicate clearly with each other, this helpful book addresses concerns head-on and provides practical methods for building a collaborative data warehouse. You’ll get clear explanations of the goals and objectives of each stage of the data warehouse lifecycle while learning the roles that both business managers and technicians play at each stage. Discussions of the most critical decision points at each phase of the data warehouse lifecycle help you understand how business and IT management can make decisions that best meet unified objectives.
How can you learn to manage and analyze all kinds of data? Turn to Head First Data Analysis, where you’ll learn how to collect and organize your data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. The unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool.