---
title: Hive
updated: 2022-05-24 18:43:47Z
created: 2022-05-24 18:35:26Z
---

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command-line tool and a JDBC driver are provided to connect users to Hive.
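
To make "projecting structure onto data already in storage" more concrete, here is a minimal Python sketch of the general schema-on-read idea. It is not Hive code and makes no claim about how Hive is implemented; `RAW_LINES`, `SCHEMA`, and `project` are hypothetical names chosen for illustration. The stored lines stay untyped, and column names and types are applied only when the data is read.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical raw contents: untyped, comma-delimited text already "in storage".
RAW_LINES = [
    "1,alice,34.50",
    "2,bob,12.00",
]

# Hypothetical schema declared after the data was written:
# (column name, converter) pairs, applied in column order.
SCHEMA: List[Tuple[str, Callable[[str], Any]]] = [
    ("id", int),
    ("name", str),
    ("amount", float),
]


def project(
    lines: Iterable[str],
    schema: List[Tuple[str, Callable[[str], Any]]],
    delimiter: str = ",",
) -> Iterator[Dict[str, Any]]:
    """Apply the schema at read time; the stored lines are never rewritten."""
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) != len(schema):
            raise ValueError(f"expected {len(schema)} fields, got {len(fields)}: {line!r}")
        yield {name: convert(value) for (name, convert), value in zip(schema, fields)}


rows = list(project(RAW_LINES, SCHEMA))
total = sum(row["amount"] for row in rows)
```

Swapping in a different `SCHEMA` reinterprets the same stored lines without touching them, which is the property the paragraph above describes.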

Built on top of Apache Hadoop™, Hive provides the following features:

- Tools to enable easy access to data via SQL, thus enabling data warehousing tasks such as extract/transform/load (ETL), reporting, and data analysis.
- A mechanism to impose structure on a variety of data formats
- Access to files stored either directly in **Apache HDFS™** or in other data storage systems such as **Apache HBase™**
- Query execution via **Apache Tez™**, **Apache Spark™**, or **MapReduce**
- Procedural language support through HPL/SQL
- Sub-second query retrieval via Hive LLAP, Apache YARN, and Apache Slider.

Hive's SQL can also be extended with user code via user-defined functions (UDFs), user-defined aggregates (UDAFs), and user-defined table functions (UDTFs).
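
The three extension points differ mainly in how many rows they consume and produce. The Python sketch below illustrates only that difference. It does not use Hive's actual extension API (Hive UDFs are typically written in Java), and the function names `normalize_email`, `mean_amount`, and `explode_tags`, as well as the sample rows, are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator


def normalize_email(email: str) -> str:
    """UDF-style: one input value in, one output value out."""
    return email.strip().lower()


def mean_amount(amounts: Iterable[float]) -> float:
    """UDAF-style: many input rows in, a single aggregated value out."""
    values = list(amounts)
    if not values:
        raise ValueError("cannot aggregate an empty group")
    return sum(values) / len(values)


def explode_tags(row: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """UDTF-style: one input row in, zero or more output rows out."""
    for tag in row["tags"].split("|"):
        if tag:  # an empty tag string produces no output rows
            yield {"user": row["user"], "tag": tag}


# Hypothetical sample rows.
rows = [
    {"user": "u1", "email": " Alice@Example.COM ", "amount": 10.0, "tags": "a|b"},
    {"user": "u2", "email": "bob@example.com", "amount": 30.0, "tags": ""},
]

emails = [normalize_email(r["email"]) for r in rows]        # one value per row
average = mean_amount(r["amount"] for r in rows)            # one value per group
exploded = [out for r in rows for out in explode_tags(r)]   # row count may change
```

Here the UDF-style call keeps the row count unchanged, the UDAF-style call collapses the group to one value, and the UDTF-style call turns the first row into two rows and the second into none.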

Hive is not designed for online transaction processing (OLTP) workloads. It is best used for traditional data warehousing tasks.