Lesson 1 - Runtime Environment
This section introduces you to the HAWQ runtime environment. You will examine your HAWQ installation, set up your HAWQ environment, and execute HAWQ management commands. If installed in your environment, you will also explore the Ambari management console.
Prerequisites
Install a HAWQ commercial product distribution or HAWQ sandbox virtual machine or docker environment, or build and install HAWQ from source. Ensure that your HAWQ installation is configured appropriately.
Make note of the HAWQ master node hostname or IP address.
The HAWQ administrative user is named gpadmin. This is the user account from which you will administer your HAWQ cluster. To perform the exercises in this tutorial, you must:
- Obtain the gpadmin user credentials.
- Ensure that your HAWQ runtime environment is configured such that the HAWQ admin user gpadmin can run commands to access the HDFS Hadoop system accounts (hdfs, hadoop) via sudo without having to provide a password.
- Obtain the Ambari UI user name and password (optional, if Ambari is installed in your HAWQ deployment). The default Ambari user name and password are both admin.
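The passwordless sudo prerequisite can be checked before you start the exercises. The following is a minimal sketch, assuming a bash shell and a sudo that supports the -n (non-interactive) option. The function names can_sudo_as and report_sudo_access are hypothetical helpers for illustration, not HAWQ utilities.

```shell
# Hypothetical helper (not part of HAWQ): succeeds only if the current user
# can run a command as the given account through sudo without a password.
can_sudo_as() {
  # -n makes sudo fail instead of prompting; -u selects the target account.
  sudo -n -u "$1" true 2>/dev/null
}

# Print one line per HDFS system account named in the prerequisites.
report_sudo_access() {
  local account
  for account in hdfs hadoop; do
    if can_sudo_as "$account"; then
      echo "$account: passwordless sudo OK"
    else
      echo "$account: passwordless sudo NOT available"
    fi
  done
}

# Usage, as gpadmin on the HAWQ master node:
#   report_sudo_access
```

If either account is reported as not available, revisit the sudo configuration for gpadmin before continuing.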
Exercise: Set Up your HAWQ Runtime Environment
HAWQ installs a script that you can use to set up your HAWQ cluster environment. The greenplum_path.sh script, located in your HAWQ root install directory, sets $PATH and other environment variables to find HAWQ files. Most importantly, greenplum_path.sh sets the $GPHOME environment variable to point to the root directory of the HAWQ installation. If you installed HAWQ from a product distribution or are running a HAWQ sandbox environment, the HAWQ root is typically /usr/local/hawq. If you built HAWQ from source or downloaded the tarball, your $GPHOME may differ.
Perform the following steps to set up your HAWQ runtime environment:
Log in to the HAWQ master node using the gpadmin user credentials; you may not need to provide a password:
    $ ssh gpadmin@<master>
    Password:
    gpadmin@master$
Set up your HAWQ operating environment by sourcing the greenplum_path.sh file. If you built HAWQ from source or downloaded the tarball, substitute the path to the installed or extracted greenplum_path.sh file (for example /opt/hawq-2.1.0.0/greenplum_path.sh):
    gpadmin@master$ source /usr/local/hawq/greenplum_path.sh
Sourcing greenplum_path.sh sets:
- $GPHOME
- $PATH to include the HAWQ $GPHOME/bin/ directory
- $LD_LIBRARY_PATH to include the HAWQ libraries in $GPHOME/lib/
    gpadmin@master$ echo $GPHOME
    /usr/local/hawq/.
    gpadmin@master$ echo $PATH
    /usr/local/hawq/./bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/gpadmin/bin
    gpadmin@master$ echo $LD_LIBRARY_PATH
    /usr/local/hawq/./lib
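The three echo checks above can be combined into a single function that reports the first missing setting. This is a sketch under stated assumptions: path_contains and check_hawq_env are hypothetical helper names, and path entries are compared as exact strings, so an entry written with a trailing slash would not match.

```shell
# Hypothetical helper: succeeds if directory $2 is an entry of the
# colon-separated list $1 (the format used by PATH and LD_LIBRARY_PATH).
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical helper: checks the three variables this lesson says
# greenplum_path.sh sets. Prints a message and returns 1 on the first problem.
check_hawq_env() {
  if [ -z "${GPHOME:-}" ]; then
    echo "GPHOME is not set; source greenplum_path.sh first"
    return 1
  fi
  if ! path_contains "${PATH:-}" "$GPHOME/bin"; then
    echo "PATH does not include $GPHOME/bin"
    return 1
  fi
  if ! path_contains "${LD_LIBRARY_PATH:-}" "$GPHOME/lib"; then
    echo "LD_LIBRARY_PATH does not include $GPHOME/lib"
    return 1
  fi
  echo "HAWQ environment OK (GPHOME=$GPHOME)"
}
```

After sourcing greenplum_path.sh, run check_hawq_env in the same shell session to confirm the settings.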
Note: You must source greenplum_path.sh before invoking any HAWQ commands.
Edit your (gpadmin) .bash_profile or other shell initialization file to source greenplum_path.sh on login. For example, add:
    source /usr/local/hawq/greenplum_path.sh
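If you prefer to script this edit rather than open the file in an editor, one option is to append the line only when it is not already present. The sketch below assumes a bash shell and a standard grep. ensure_line is a hypothetical helper name, and it assumes any existing file ends with a newline.

```shell
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so running it twice does not duplicate the entry.
ensure_line() {
  local file="$1" line="$2"
  touch "$file"    # create the file if it does not exist yet
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -e "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Usage (path taken from the example in this lesson):
#   ensure_line "$HOME/.bash_profile" "source /usr/local/hawq/greenplum_path.sh"
```

Matching the whole line as a fixed string avoids false positives from commented-out or partially similar entries.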
Set the HAWQ-specific environment variables relevant to your deployment in your shell initialization file. These include PGDATABASE, PGHOST, PGOPTIONS, PGPORT, and PGUSER. You may not need to set any of these environment variables. For example, if you use a custom HAWQ master port number, make this port number the default by setting the PGPORT environment variable in your shell initialization file; add:
    export PGPORT=5432
Setting PGPORT simplifies psql invocation by providing a default for the port option value.
Similarly, setting PGDATABASE simplifies psql invocation by providing a default for the database option value.
Examine your HAWQ installation:
    gpadmin@master$ ls $GPHOME
    bin  docs  etc  greenplum_path.sh  include  lib  sbin  share
The HAWQ command line utilities are located in $GPHOME/bin. $GPHOME/lib includes HAWQ and PostgreSQL libraries.
View the current state of your HAWQ cluster, and if it is not already running, start the cluster. In practice, you will perform different procedures depending upon whether you manage your cluster from the command line or use Ambari. While you are introduced to both in this tutorial, lessons will focus on command line instructions, as not every HAWQ deployment will utilize Ambari.
Command Line:
    gpadmin@master$ hawq state
    Failed to connect to database, this script can only be run when the database is up.
If your cluster is not running, start it:
    gpadmin@master$ hawq start cluster
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Prepare to do 'hawq start'
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-You can find log in:
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_start_20170411.log
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-GPHOME is set to:
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-/usr/local/hawq/.
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start hawq with args: ['start', 'cluster']
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Gathering information and validating the environment...
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-No standby host configured
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start all the nodes in hawq cluster
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Starting master node 'master'
    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start master service
    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Master started successfully
    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start all the segments in hawq cluster
    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start segments in list: ['segment']
    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start segment service
    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Total segment number is: 1
    .....
    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-1 of 1 segments start successfully
    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-Segments started successfully
    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-HAWQ cluster started successfully
Get the status of your cluster:
    gpadmin@master$ hawq state
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- HAWQ instance status summary
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Master instance = Active
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- No Standby master defined
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total segment instance count from config file = 1
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Segment Status
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total segments count from catalog = 1
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total segment valid (at master) = 1
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total segment failures (at master) = 0
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total number of postmaster.pid files missing = 0
    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- Total number of postmaster.pid files found = 1
State information returned includes the status of the master node and standby master, the number of segment instances, and the number of valid and failed segments.
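For scripted health checks, the fields of interest can be pulled out of the hawq state output. The sketch below is an illustration only: summarize_hawq_state is a hypothetical helper, and it assumes the "name = value" line format shown in the sample output in this lesson, which may differ in other HAWQ versions.

```shell
# Hypothetical helper: reads `hawq state` output on stdin and prints a
# one-line summary. Returns 0 only if the master is Active and zero segment
# failures are reported. Assumes the "name = value" format from this lesson.
summarize_hawq_state() {
  awk -F' *= *' '
    # Remove trailing whitespace from a captured value.
    function trim(s) { sub(/[ \t\r]+$/, "", s); return s }
    /Master instance/                      { master = trim($2) }
    /Total segment valid \(at master\)/    { valid  = trim($2) }
    /Total segment failures \(at master\)/ { failed = trim($2) }
    END {
      printf "master=%s valid=%s failed=%s\n", master, valid, failed
      # Exit status 0 means healthy; anything else means a problem or no data.
      exit !(master == "Active" && failed != "" && failed + 0 == 0)
    }
  '
}

# Usage, with the HAWQ environment already sourced:
#   hawq state | summarize_hawq_state
```

Because the function sets its exit status, it can be used directly in a conditional, for example to decide whether to run hawq start cluster.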
Ambari:
If your deployment includes an Ambari server, perform the following steps to start and view the current state of your HAWQ cluster.
Start the Ambari management console by entering the following URL in your favorite (supported) browser window:
<ambari-server-node>:8080
Log in with the Ambari credentials (default admin:admin) and view the Ambari dashboard.
The Ambari dashboard provides an at-a-glance status of the health of your HAWQ cluster. A list of each running service and its status is provided in the left panel. The main display area includes a set of configurable tiles providing specific information about your cluster, including HAWQ segment status, HDFS disk usage, and resource manager metrics.
Navigate to the HAWQ service listed in the left pane. If the service is not running (i.e. no green checkmark to the left of the service name), start your HAWQ cluster by clicking the HAWQ service name, and then selecting the Start operation from the Service Actions menu button.
Log out of the Ambari console by clicking the admin button and selecting the Sign out drop-down menu item.
Summary
Your HAWQ cluster is now running. For additional information:
- HAWQ Files and Directories identifies HAWQ files and directories and their install locations.
- Environment Variables includes a complete list of HAWQ deployment-specific environment variables.
- Running a HAWQ Cluster provides an overview of the components comprising a HAWQ cluster, including the users (administrative and operating), deployment systems (HAWQ master, standby, and segments), databases, and data sources.
Lesson 2 introduces basic HAWQ cluster administration activities and commands.
Lesson 2: Cluster Administration