---
title: Hive
updated: 2022-05-24 18:43:47Z
created: 2022-05-24 18:35:26Z
---

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.

Built on top of Apache Hadoop™, Hive provides the following features:

- Tools to enable easy access to data via SQL, thus enabling data warehousing tasks such as extract/transform/load (ETL), reporting, and data analysis.
- A mechanism to impose structure on a variety of data formats
- Access to files stored either directly in **Apache HDFS™** or in other data storage systems such as **Apache HBase™**
- Query execution via **Apache Tez™, Apache Spark™**, or **MapReduce**
- Procedural language with HPL-SQL
- Sub-second query retrieval via Hive LLAP, Apache YARN and Apache Slider.

Hive's SQL can also be extended with user code via user defined functions (**UDF**s), user defined aggregates (UDAFs), and user defined table functions (UDTFs).

Hive is not designed for online transaction processing (OLTP) workloads. It is best used for traditional data warehousing tasks.
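
As a rough illustration of projecting structure onto data already in storage, the HiveQL sketch below declares an external table over existing tab-delimited files and then runs a warehouse-style aggregate. The table name `page_views`, its columns, and the path `/data/page_views` are hypothetical and stand in for whatever data is actually present.

```sql
-- Hypothetical table, columns, and storage path: adjust to the real data.
-- EXTERNAL means Hive only records the schema; the files stay where they are
-- and are not deleted if the table is dropped.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  view_time  TIMESTAMP,
  user_id    BIGINT,
  page_url   STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- A reporting-style query over the projected schema.
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;
```

Because the schema is applied when the files are read, fields that do not parse as the declared type generally come back as `NULL` rather than causing the table definition to fail.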