Lesson 3 - Database Administration
The HAWQ gpadmin user and other users who are granted the necessary privileges can execute SQL commands to create HAWQ databases and tables. These commands may be invoked via scripts, programs, and from the psql client utility.
This lesson introduces basic HAWQ database administration commands and tasks using psql. You will create a database and a simple table, and add data to and query the table.
Prerequisites
Ensure that you have Set Up your HAWQ Runtime Environment and that your HAWQ cluster is up and running.
Exercise: Create the HAWQ Tutorial Database
In this exercise, you use the psql command line utility to create a HAWQ database.
Start the
psqlsubsystem:gpadmin@master$ psql -d postgresYou enter the
psqlinterpreter, connecting to thepostgresdatabase.postgresis a default template database created during HAWQ installation.psql (8.2.15) Type "help" for help. postgres=#The
psqlprompt is the database name followed by=#or=>.=#identifies the session as that of a database superuser. The defaultpsqlprompt for a non-superuser is=>.Create a database named
hawqgsdb:postgres=# CREATE DATABASE hawqgsdb; CREATE DATABASEThe
;at the end of theCREATE DATABASEstatement instructspsqlto interpret the command. SQL commands that span multiple lines are not interpreted until the;is entered.Connect to the
hawqgsdbdatabase you just created:postgres=# \c hawqgsdb You are now connected to database "hawqgsdb" as user "gpadmin". hawqgsdb=#Use the
psql\lmeta-command to list all HAWQ databases:hawqgsdb=# \l List of databases Name | Owner | Encoding | Access privileges -----------------+---------+----------+------------------- hawqgsdb | gpadmin | UTF8 | postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (4 rows)HAWQ creates two additional template databases during installation,
template0andtemplate1, as you see above. Your HAWQ cluster may list additional databases.Exit
psql:hawqgsdb=# \q
Exercise: Use psql for Table Operations
You manage and access HAWQ databases and tables via the psql utility, an interactive front-end to the HAWQ database. In this exercise, you use psql to create, add data to, and query a simple HAWQ table.
Start the
psqlsubsystem:gpadmin@master$ psql -d hawqgsdbThe
-d hawqgsdboption instructspsqlto connect directly to thehawqgsdbdatabase.Create a table named
first_tblthat has a single integer column namedi:hawqgsdb=# CREATE TABLE first_tbl( i int ); CREATE TABLEDisplay descriptive information about table
first_tbl:hawqgsdb=# \d first_tbl Append-Only Table "public.first_tbl" Column | Type | Modifiers --------+---------+----------- i | integer | Compression Type: None Compression Level: 0 Block Size: 32768 Checksum: f Distributed randomlyfirst_tblis a table in the HAWQpublicschema.first_tblhas a single integer column, was created with no compression, and is distributed randomly.Add some data to
first_tbl:hawqgsdb=# INSERT INTO first_tbl VALUES(1); INSERT 0 1 hawqgsdb=# INSERT INTO first_tbl VALUES(2); INSERT 0 1Each
INSERTcommand adds a row tofirst_tbl, the first adding a row with the valuei=1, and the second, a row with the valuei=2. EachINSERTalso displays the number of rows added (1).HAWQ provides several built-in functions for data manipulation. The
generate_series(<start>, <end>)function generates a series of numbers beginning with<start>and finishing at<end>. Use thegenerate_series()HAWQ built-in function to add rows fori=3,i=4, andi=5tofirst_tbl:hawqgsdb=# INSERT INTO first_tbl SELECT generate_series(3, 5); INSERT 0 3This
INSERTcommand uses thegenerate_series()built-in function to add 3 rows tofirst_tbl, starting withi=3and writing and incrementingifor each new row.Perform a query to return all rows in the
first_tbltable:hawqgsdb=# SELECT * FROM first_tbl; i ---- 1 2 3 4 5 (5 rows)The
SELECT *command queriesfirst_tbl, returning all columns and all rows.SELECTalso displays the total number of rows returned in the query.Perform a query to return column
ifor all rows infirst_tblwhereiis greater than 3:hawqgsdb=# SELECT i FROM first_tbl WHERE i>3; i ---- 4 5 (2 rows)The
SELECTcommand returns the 2 rows (i=4andi=5) in the table whereiis larger than 3 and displays the value ofi.Exit the
psqlsubsystem:hawqgsdb=# \qpsqlincludes an option,-c, to run a single SQL command from the shell command line. Perform the same query you ran in Step 7 using the-c <sql-command>option:gpadmin@master$ psql -d hawqgsdb -c 'SELECT i FROM first_tbl WHERE i>3'Notice that you enclose the SQL command in single quotes.
Set the HAWQ
PGDATABASEenvironment variable to identifyhawqsgdb:gpadmin@master$ export PGDATABASE=hawqgsdb$PGDATABASEidentifies the default database to which to connect when invoking the HAWQpsqlcommand.Re-run the query from the command line again, this time omitting the
-doption:gpadmin@master$ psql -c 'SELECT i FROM first_tbl WHERE i>3'When no database is specified on the command line,
psqlattempts to connect to the database identified by$PGDATABASE.Add the
PGDATABASEsetting to your.bash_profile:export PGDATABASE=hawqgsdb
Summary
You created the database you will use in later lessons. You also created, inserted data into, and queried a simple HAWQ table usingpsql.
For information on SQL command support in HAWQ, refer to the SQL Command reference.
For detailed information on the psql subsystem, refer to the psql reference page. Commonly-used psql meta-commands are identified in the table below.
| Action | Command |
|---|---|
| List databases | \l |
| List tables in current database | \dt |
| Describe a specific table | \d <table-name> |
| Execute an SQL script | \i <script-name> |
| Quit/Exit | \q |
Lesson 4 introduces the Retail demo, a more complicated data set used in upcoming lessons. You will download and examine the data set and work files. You will also load some of the data set into HDFS.
Lesson 4: Sample Data Set and HAWQ Schemas