Basic Data Operations

This topic describes basic data operations that you perform in HAWQ.

Inserting Rows

Use the INSERT command to create rows in a table. This command requires the table name and a value for each column in the table; you may optionally specify the column names in any order. If you do not specify column names, list the data values in the order of the columns in the table, separated by commas.

For example, to specify the column names and the values to insert:

INSERT INTO products (name, price, product_no) VALUES ('Cheese', 9.99, 1);

To specify only the values to insert:

INSERT INTO products VALUES (1, 'Cheese', 9.99);

Usually, the data values are literals (constants), but you can also use scalar expressions. For example:

INSERT INTO films SELECT * FROM tmp_films WHERE date_prod <
'2004-05-07';

You can insert multiple rows in a single command. For example:

INSERT INTO products (product_no, name, price) VALUES
    (1, 'Cheese', 9.99),
    (2, 'Bread', 1.99),
    (3, 'Milk', 2.99);

To insert data into a partitioned table, you specify the root partitioned table, the table created with the CREATE TABLE command. You also can specify a leaf child table of the partitioned table in an INSERT command. An error is returned if the data is not valid for the specified leaf child table. Specifying a child table that is not a leaf child table in the INSERT command is not supported.

To insert large amounts of data, use external tables or the COPY command. These load mechanisms are more efficient than INSERT for inserting large quantities of rows. See Loading and Unloading Data for more information about bulk data loading.

Vacuuming the System Catalog Tables

Only HAWQ system catalog tables use multiple version concurrency control. Deleted or updated data rows in the catalog tables occupy physical space on disk even though new transactions cannot see them. Periodically running the VACUUM command removes these expired rows.

The VACUUM command also collects table-level statistics such as the number of rows and pages.

For example:

VACUUM pg_class;

Configuring the Free Space Map

Expired rows are held in the free space map. The free space map must be sized large enough to hold all expired rows in your database. If not, a regular VACUUM command cannot reclaim space occupied by expired rows that overflow the free space map.

Note: VACUUM FULL is not recommended with HAWQ because it is not safe for large tables and may take an unacceptably long time to complete. See VACUUM.

Size the free space map with the following server configuration parameters:

  • max_fsm_pages
  • max_fsm_relations