Swift (OpenStack Object Storage) is a distributed, eventually consistent object/blob store. It provides cloud storage software for storing and retrieving large amounts of data through a simple API. Tajo supports Swift integration.
Swift integration requires a working Swift deployment and a Hadoop installation, since Tajo reaches Swift through Hadoop's file system layer.
Add the following properties to the Hadoop “core-site.xml” file −
<property>
   <name>fs.swift.impl</name>
   <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
   <description>File system implementation for Swift</description>
</property>
<property>
   <name>fs.swift.blocksize</name>
   <value>131072</value>
   <description>Split size in KB</description>
</property>
Hadoop uses these settings to access Swift objects. After making the changes, move to the Tajo directory to set the Swift environment variable.
Open the Tajo configuration file and set the environment variable as follows −
$ vi conf/tajo-env.sh
export TAJO_CLASSPATH=$HADOOP_HOME/share/hadoop/tools/lib/hadoop-openstack-x.x.x.jar
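The jar version depends on the installed Hadoop release, so the export line can also be resolved at startup instead of being hard-coded. The following is a minimal sketch, assuming the jar sits under $HADOOP_HOME/share/hadoop/tools/lib as in the path above. The helper name find_openstack_jar is hypothetical and is not part of Tajo or Hadoop.

```shell
# Hypothetical helper (not part of Tajo or Hadoop): print the first
# hadoop-openstack jar found under the given Hadoop home, or return
# non-zero if none exists.
find_openstack_jar() {
  hadoop_home="$1"
  for jar in "$hadoop_home"/share/hadoop/tools/lib/hadoop-openstack-*.jar; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Export the classpath only when the jar is found; otherwise warn on stderr.
if [ -n "${HADOOP_HOME:-}" ] && jar_path=$(find_openstack_jar "$HADOOP_HOME"); then
  export TAJO_CLASSPATH="$jar_path"
else
  echo "hadoop-openstack jar not found under \$HADOOP_HOME" >&2
fi
```

Like the plain export line, this sketch overwrites any existing TAJO_CLASSPATH value. If other entries are already set there, append the jar path instead.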
Now Tajo can query data stored in Swift.
Let’s create an external table to access Swift objects in Tajo as follows −
default> create external table swift(num1 int, num2 text, num3 float) using text with ('text.delimiter' = '|') location 'swift://bucket-name/table1';
After the table has been created, you can run SQL queries against it.
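The table definition above expects pipe-delimited text with three fields per row: an int, a text value, and a float. As a rough illustration, the sketch below writes a small file in that layout. The file name and the rows are made up for this example and do not come from any real data set.

```shell
# Illustrative data only: the file name and the rows are made up.
# Each line holds num1 (int), num2 (text), num3 (float), separated by '|'.
data_file="${TMPDIR:-/tmp}/table1-sample.tbl"

# printf reuses the format string for every group of three arguments.
printf '%s|%s|%s\n' \
  1 abc 1.5 \
  2 def 2.25 \
  3 ghi 3.0 > "$data_file"

# Show the rows that a table with 'text.delimiter' = '|' would parse.
cat "$data_file"
```

For queries to return these rows, the file would have to be uploaded to the Swift container under the table1 path. How that is done depends on the Swift client in use and is outside this sketch.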