Client-Based HAWQ Load Tools
HAWQ supports data loading from Red Hat Enterprise Linux 5, 6, and 7 and Windows XP client systems. HAWQ Load Tools include both a loader program and a parallel file distribution program.
This topic presents the instructions to install the HAWQ Load Tools on your client machine. It also includes the information necessary to configure HAWQ databases to accept remote client connections.
RHEL Load Tools
The RHEL Load Tools are provided in a HAWQ distribution.
Installing the RHEL Loader
Download a HAWQ installer package or build HAWQ from source.
Refer to the HAWQ command line install instructions to set up your package repositories and install the HAWQ binary.
Install the libevent and libyaml packages. These libraries are required by the HAWQ file server. You must have superuser privileges on the system.
$ sudo yum install -y libevent libyaml
About the RHEL Loader Installation
The files and directories of interest in a HAWQ RHEL Load Tools installation include:
bin/ - data loading command-line tools (gpfdist and hawq load)
greenplum_path.sh - environment setup file
Configuring the RHEL Load Environment
A greenplum_path.sh file is located in the HAWQ base install directory following installation. Source greenplum_path.sh before running the HAWQ RHEL Load Tools to set up your HAWQ environment:
$ . /usr/local/hawq/greenplum_path.sh
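As a quick sanity check after sourcing the file, you can confirm that the load tools are reachable on your PATH. The following is a minimal sketch, not part of the HAWQ distribution: the check_load_tools function and the HAWQ_ENV_FILE variable are hypothetical names introduced for illustration, and the default path assumes the install location shown above.

```shell
#!/usr/bin/env bash
# Sketch: confirm the load tools are on PATH after sourcing the environment file.
# HAWQ_ENV_FILE and check_load_tools are illustrative names, not HAWQ-provided.
# The default path is an assumption; adjust it to your HAWQ install directory.
HAWQ_ENV_FILE="${HAWQ_ENV_FILE:-/usr/local/hawq/greenplum_path.sh}"

# Print one status line per tool; return non-zero if any tool is missing.
check_load_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Source the environment file only if it exists, then check both tools.
if [ -f "$HAWQ_ENV_FILE" ]; then
    . "$HAWQ_ENV_FILE"
fi
check_load_tools gpfdist hawq || echo "Some load tools are not on PATH."
```

If either tool is reported as missing, verify that the path in HAWQ_ENV_FILE matches your actual base install directory.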
Continue to Using the HAWQ File Server (gpfdist) for specific information about using the HAWQ load tools.
Windows Load Tools
Installing Python 2.5
The HAWQ Load Tools for Windows require the 32-bit version of Python 2.5 to be installed on your system.
Note: The 64-bit version of Python is not compatible with the HAWQ Load Tools for Windows.
Download the Python 2.5 installer for Windows. Make note of the directory to which it was downloaded.
Double-click the python-2.5.x.msi package to launch the installer.
Select Install for all users and click Next.
The default Python install location is C:\Pythonxx. Click Up or New to choose another location. Click Next.
Click Next to install the selected Python components.
Click Finish to complete the Python installation.
Running the Windows Installer
Download the greenplum-loaders-4.3.x.x-build-n-WinXP-x86_32.msi installer package from Pivotal Network. Make note of the directory to which it was downloaded.
Double-click the greenplum-loaders-4.3.x.x-build-n-WinXP-x86_32.msi file to launch the installer.
Click Next on the Welcome screen.
Click I Agree on the License Agreement screen.
The default install location for HAWQ Load Tools for Windows is C:\"Program Files (x86)"\Greenplum\greenplum-loaders-4.3.8.1-build-1. Click Browse to choose another location.
Click Next.
Click Install to begin the installation.
Click Finish to exit the installer.
About the Windows Loader Installation
Your HAWQ Windows Load Tools installation includes the following files and directories:
bin/ - data loading command-line tools (gpfdist and gpload)
lib/ - data loading library files
greenplum_loaders_path.bat - environment setup file
Configuring the Windows Load Environment
A greenplum_loaders_path.bat file is provided in your load tools base install directory following installation. This file sets the following environment variables:
GPHOME_LOADERS - base directory of loader installation
PATH - adds the loader and component program directories
PYTHONPATH - adds component library directories
Execute greenplum_loaders_path.bat to set up your HAWQ environment before running the HAWQ Windows Load Tools.
Enabling Remote Client Connections
The HAWQ master database must be configured to accept remote client connections. Specifically, you need to identify the client hosts and database users that will be connecting to the HAWQ database.
Ensure that the pg_hba.conf file on the HAWQ master is correctly configured to allow connections from the desired users operating on the desired database from the desired hosts, using the authentication method you choose. For details, see Configuring Client Access.
Make sure the authentication method you choose is supported by the client tool you are using.
If you edited the pg_hba.conf file, reload the server configuration. If you have any active database connections, you must include the -M fast option in the hawq stop command:
$ hawq stop cluster -u [-M fast]
Verify or configure the databases and roles you are using to connect, and confirm that the roles have the correct privileges on the database objects.
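To illustrate the pg_hba.conf step, the sketch below builds a single host record (database, role, client address in CIDR form, authentication method) and appends it to a file only if the identical line is not already present. Everything specific here is an assumption made for illustration: the helper names hba_entry and add_hba_entry, the database mydb, the role loader_user, the client address 192.0.2.10/32, and the md5 method. The example writes to a scratch file, not to a real pg_hba.conf.

```shell
#!/usr/bin/env bash
# Sketch: add a client access record to a pg_hba.conf-style file.
# hba_entry and add_hba_entry are illustrative helpers, not HAWQ commands.
# Database, role, address, and auth method below are hypothetical values.

# Build one "host" record: database, role, client CIDR address, auth method.
hba_entry() {
    printf 'host\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Append the record to the given file unless an identical line already exists.
add_hba_entry() {
    local file="$1" entry
    entry="$(hba_entry "$2" "$3" "$4" "$5")"
    if ! grep -qxF -- "$entry" "$file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$file"
    fi
}

# Demonstrate against a scratch file rather than the live configuration.
demo_file="$(mktemp)"
add_hba_entry "$demo_file" mydb loader_user 192.0.2.10/32 md5
cat "$demo_file"
rm -f "$demo_file"
```

When applying a change like this to the real pg_hba.conf on the master, back up the file first and then reload the configuration with the hawq stop cluster -u command described above.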