Monday, January 27, 2014

Executing Hive Scripts

Step 1: Writing a Hive script.

A Hive script is a plain text file, conventionally saved with a .sql extension. Open a terminal in your Cloudera CDH4 distribution and give the following command to create a Hive script.
Command: sudo gedit sample.sql

The above command opens the file in an editor. Enter in it the list of all the Hive commands that need to be executed.

In this script, a table will be created and described, and data will be loaded into the table and retrieved from it.

1. To create the table in Hive:

Command: create table product (productid int, productname string, price float, category string) row format delimited fields terminated by ',';

Here, product is the table name and productid, productname, price and category are the columns of this table.

The clause fields terminated by ',' indicates that the columns in the input file are separated by the symbol ','.

By default the records in the input file are separated by a new line.

2. Describing the table:

Command: describe product;

3. Loading the data into the table.

To load the data into the table, we first need to create an input file containing the records to be inserted into the table.

Let us create an input file. The command is:

Command: sudo gedit input.txt

Edit the contents of the file as shown in the figure.
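The figure is not reproduced here, so the following is only a sketch of what such an input file could look like. The three records are made-up values, chosen to match the column order productid, productname, price, category; the last two lines are an optional sanity check and not part of the original steps.

```shell
# Hypothetical sample records (illustrative values, not taken from the original figure).
# Column order: productid,productname,price,category
cat > input.txt <<'EOF'
1,Laptop,55000.0,Electronics
2,Notebook,45.5,Stationery
3,Headphones,1200.0,Electronics
EOF

# Sanity check: count the lines that do NOT have exactly 4 comma-separated fields.
bad_lines=$(awk -F',' 'NF != 4 {n++} END {print n+0}' input.txt)
echo "lines with wrong field count: $bad_lines"
```

Each line is one record, and each comma marks a column boundary, which is what the table definition expects.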

4. Retrieving the data:

To retrieve the data, the select command is used.

Command: Select * from product;

The above command retrieves the values of all the columns in the table. The complete script should look like the one shown in the image below.

Now we are done with writing the Hive script. The file sample.sql can now be saved.
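For readers who prefer to create the file from the terminal, here is a minimal sketch of what sample.sql might contain, written with a shell here-document. The load data statement and the path /home/cloudera/input.txt are assumptions: the post does not show the load command, so adjust the path to wherever your input file actually lives.

```shell
# Write the four Hive statements into sample.sql:
# create the table, describe it, load the input file, select all rows.
# The input path in the load statement is an assumption.
cat > sample.sql <<'EOF'
create table product (productid int, productname string, price float, category string)
row format delimited fields terminated by ',';

describe product;

load data local inpath '/home/cloudera/input.txt' into table product;

select * from product;
EOF

# Count the lines that end a statement (lines ending in a semicolon).
statement_count=$(grep -c ';$' sample.sql)
echo "statements in script: $statement_count"
```

The keyword local tells Hive to read the file from the local file system rather than from HDFS.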

Step 2: Running the Hive Script

The following is the command to run the Hive script:

Command: hive -f /home/cloudera/sample.sql

While executing the script, make sure that you give the full path to the script file.
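As a small safeguard, the check on the path can be wrapped in a shell function. This is only a sketch: run_hive_script is a hypothetical helper name, not something that ships with Hive, and the absolute-path check simply follows the advice given here.

```shell
# run_hive_script is a hypothetical helper, not part of Hive.
# It refuses relative paths and missing files before calling hive -f.
run_hive_script() {
    script_path="$1"
    case "$script_path" in
        /*) ;;  # starts with a slash: a full path, continue
        *)  echo "error: give the full path, got: $script_path" >&2
            return 2 ;;
    esac
    if [ ! -f "$script_path" ]; then
        echo "error: script not found: $script_path" >&2
        return 1
    fi
    if ! command -v hive >/dev/null 2>&1; then
        echo "error: hive is not on the PATH" >&2
        return 127
    fi
    hive -f "$script_path"
}

# Example call (commented out so that nothing runs by accident):
# run_hive_script /home/cloudera/sample.sql
```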

We can see that all the commands are executed successfully.

This is how Hive scripts are run in CDH4.

Apache Hive Installation on Ubuntu

6:40 AM Karthik 1 comment

Hive Installation on Ubuntu:

Follow the steps below to install Apache Hive on Ubuntu:

Step 1: Download Hive tar.

Command: wget -c http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0-bin.tar.gz

Step 2: Extract the tar file.

Command: tar -xzvf hive-0.9.0-bin.tar.gz

Step 3: Edit the “.bashrc” file to update the environment variables for the user.

Command: sudo gedit .bashrc

Add the following at the end of the file:

export HADOOP_HOME=/home/user/hadoop-1.2.0

export HIVE_HOME=/home/user/hive-0.9.0-bin

export PATH=$PATH:$HIVE_HOME/bin

export PATH=$PATH:$HADOOP_HOME/bin

Step 4: Create Hive directories within HDFS.

Command:

hadoop fs -mkdir /user/hive/warehouse

hadoop fs -mkdir /temp

The directory ‘warehouse’ is the location where Hive stores its tables and their data.

The directory ‘temp’ is the temporary location for storing the intermediate results of processing.

Step 5: Set read/write permissions for the Hive directories.

Command:

hadoop fs -chmod g+w /user/hive/warehouse

hadoop fs -chmod g+w /temp

These commands give write permission to the group.

Step 6: Set the Hadoop path in hive-config.sh (located in $HIVE_HOME/bin).

Command: sudo gedit hive-config.sh
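The post does not show what to put in hive-config.sh. A common approach, assumed here, is to add an export HADOOP_HOME line pointing at the Hadoop directory used in Step 3. The sketch below does this from the terminal instead of the editor; add_hadoop_home is a hypothetical helper name.

```shell
# add_hadoop_home is a hypothetical helper, not part of Hive or Hadoop.
# It appends an "export HADOOP_HOME=..." line to the given config file,
# unless such a line is already present.
add_hadoop_home() {
    config_file="$1"
    hadoop_home="$2"
    if [ ! -f "$config_file" ]; then
        echo "error: not found: $config_file" >&2
        return 1
    fi
    grep -q '^export HADOOP_HOME=' "$config_file" ||
        echo "export HADOOP_HOME=$hadoop_home" >> "$config_file"
}

# Example call, using the paths from this post (commented out on purpose):
# add_hadoop_home "$HIVE_HOME/bin/hive-config.sh" /home/user/hadoop-1.2.0
```

Running the helper twice is harmless, since the line is added only when it is missing.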

Step 7: Launch Hive.

Command: hive

Step 8: Create sample tables.

Command: hive> CREATE TABLE shakespeare (freq INT, word STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
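To see what a data file for this table would look like, here is a small sketch that writes a few tab-separated rows from the shell. The rows and the file name shakespeare_sample.txt are made up for illustration; they are not real word counts.

```shell
# Made-up sample rows in the form frequency<TAB>word (not real counts).
# printf reuses the format string for each pair of arguments, one row per pair.
printf '%d\t%s\n' 120 the 85 and 42 king > shakespeare_sample.txt

# Count the rows that have exactly two tab-separated fields.
good_rows=$(awk -F'\t' 'NF == 2 {n++} END {print n+0}' shakespeare_sample.txt)
echo "well-formed rows: $good_rows"
```

The first field maps to the freq column and the second to the word column, matching the tab delimiter declared in the table definition.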

Step 9: To exit from Hive:

Command: hive> exit;

Video Tutorial for Data mining with Weka from University of Waikato

1:48 AM Karthik 355 comments

Week - 1 : Getting started with Weka

Week - 2 : Evaluation

Week - 3 : Simple Classifiers

Week - 4 : More Classifiers

Week - 5 : Putting it all together

Video Tutorial for MongoDB DBA Course from MongoDB University

6:08 AM Karthik 2 comments

Week - 1 :

Week - 2 :

Week - 3 :

Week - 4 :

Introduction to Replication
Replica Sets Overview
Replica Sets Demo
Replica Sets Demo (Windows)
Replica Sets - the Simple HTTP admin UI
Replica Set Configuration
getLastError and cluster-wide commits
Multi data center and sample configurations
ReadPreference (SlaveOK)

Week - 5 :

Week - 6 :

Week - 7 :

BI and Big Data Adventure via Open Source Technologies


Thursday, January 9, 2014

Video Tutorial for Data mining with Weka from University of Waikato

Tuesday, January 7, 2014

Video Tutorial for MongoDB DBA Course from MongoDB University
