This chapter explains how to create a table and how to insert data into it. The conventions of creating a table in HIVE is quite similar to creating a table using SQL.
Create Table is a statement used to create a table in Hive. The syntax and example are as follows:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [ROW FORMAT row_format] [STORED AS file_format]
Let us assume you need to create a table named employee using CREATE TABLE statement. The following table lists the fields and their data types in employee table:
Sr.No | Field Name | Data Type |
---|---|---|
1 | Eid | int |
2 | Name | String |
3 | Salary | Float |
4 | Designation | string |
The following data is a Comment, Row formatted fields such as Field terminator, Lines terminator, and Stored File type.
COMMENT ‘Employee details’ FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED IN TEXT FILE
The following query creates a table named employee using the above data.
hive> CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
If you add the option IF NOT EXISTS, Hive ignores the statement in case the table already exists.
On successful creation of table, you get to see the following response:
OK Time taken: 5.905 seconds hive>
The JDBC program to create a table is given example.
import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; public class HiveCreateTable { private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver"; public static void main(String[] args) throws SQLException { // Register driver and create driver instance Class.forName(driverName); // get connection Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", ""); // create statement Statement stmt = con.createStatement(); // execute statement stmt.executeQuery("CREATE TABLE IF NOT EXISTS " +" employee ( eid int, name String, " +" salary String, destignation String)" +" COMMENT ‘Employee details’" +" ROW FORMAT DELIMITED" +" FIELDS TERMINATED BY ‘\t’" +" LINES TERMINATED BY ‘\n’" +" STORED AS TEXTFILE;"); System.out.println(“ Table employee created.”); con.close(); } }
Save the program in a file named HiveCreateDb.java. The following commands are used to compile and execute this program.
$ javac HiveCreateDb.java $ java HiveCreateDb
Table employee created.
Generally, after creating a table in SQL, we can insert data using the Insert statement. But in Hive, we can insert data using the LOAD DATA statement.
While inserting data into Hive, it is better to use LOAD DATA to store bulk records. There are two ways to load data: one is from local file system and second is from Hadoop file system.
The syntax for load data is as follows:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
We will insert the following data into the table. It is a text file named sample.txt in /home/user directory.
1201 Gopal 45000 Technical manager 1202 Manisha 45000 Proof reader 1203 Masthanvali 40000 Technical writer 1204 Kiran 40000 Hr Admin 1205 Kranthi 30000 Op Admin
The following query loads the given text into the table.
hive> LOAD DATA LOCAL INPATH '/home/user/sample.txt' OVERWRITE INTO TABLE employee;
On successful download, you get to see the following response:
OK Time taken: 15.905 seconds hive>
Given below is the JDBC program to load given data into the table.
import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; public class HiveLoadData { private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver"; public static void main(String[] args) throws SQLException { // Register driver and create driver instance Class.forName(driverName); // get connection Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", ""); // create statement Statement stmt = con.createStatement(); // execute statement stmt.executeQuery("LOAD DATA LOCAL INPATH '/home/user/sample.txt'" + "OVERWRITE INTO TABLE employee;"); System.out.println("Load Data into employee successful"); con.close(); } }
Save the program in a file named HiveLoadData.java. Use the following commands to compile and execute this program.
$ javac HiveLoadData.java $ java HiveLoadData
Load Data into employee successful