Edward Capriolo is a system administrator at Media6degrees, a member of the Apache Software Foundation, and a committer for the Hadoop-Hive project.
Dean Wampler is a senior consultant at Think Big Analytics, specializing in big data problems and in tools such as Hadoop and machine learning.
Jason Rutherglen is a software architect at Think Big Analytics, specializing in big data, Hadoop, search, and security.
Table of Contents
Preface
1. Introduction
    An Overview of Hadoop and MapReduce
    Hive in the Hadoop Ecosystem
    Pig
    HBase
    Cascading, Crunch, and Others
    Java Versus Hive: The Word Count Algorithm
    What's Next
2. Getting Started
    Installing a Preconfigured Virtual Machine
    Detailed Installation
    Installing Java
    Installing Hadoop
    Local Mode, Pseudodistributed Mode, and Distributed Mode
    Testing Hadoop
    Installing Hive
    What Is Inside Hive?
    Starting Hive
    Configuring Your Hadoop Environment
    Local Mode Configuration
    Distributed and Pseudodistributed Mode Configuration
    Metastore Using JDBC
    The Hive Command
    Command Options
    The Command-Line Interface
    CLI Options
    Variables and Properties
    Hive "One Shot" Commands
    Executing Hive Queries from Files
    The .hiverc File
    More on Using the Hive CLI
    Command History
    Shell Execution
    Hadoop dfs Commands from Inside Hive
    Comments in Hive Scripts
    Query Column Headers
3. Data Types and File Formats
    Primitive Data Types
    Collection Data Types
    Text File Encoding of Data Values
    Schema on Read
4. HiveQL: Data Definition
    Databases in Hive
    Alter Database
    Creating Tables
    Managed Tables
    External Tables
    Partitioned, Managed Tables
    External Partitioned Tables
    Customizing Table Storage Formats
    Dropping Tables
    Alter Table
    Renaming a Table
    Adding, Modifying, and Dropping a Table Partition
    Changing Columns
    Adding Columns
    Deleting or Replacing Columns
    Alter Table Properties
    Alter Storage Properties
    Miscellaneous Alter Table Statements
5. HiveQL: Data Manipulation
    Loading Data into Managed Tables
    Inserting Data into Tables from Queries
    Dynamic Partition Inserts
    Creating Tables and Loading Them in One Query
    Exporting Data
    ……
6. HiveQL: Queries
7. HiveQL: Views
8. HiveQL: Indexes
9. Schema Design
10. Tuning
11. Other File Formats and Compression
12. Developing
13. Functions
14. Streaming
15. Customizing Hive File and Record Formats
16. Hive Thrift Service
17. Storage Handlers and NoSQL
18. Security
19. Locking
20. Hive Integration with Oozie
21. Hive and Amazon Web Services (AWS)
22. HCatalog
23. Case Studies
Glossary
Appendix: References
Index