Transforming with GPLOAD

To transform data using the GPLOAD control file, you must specify both the file name for the TRANSFORM_CONFIG file and the name of the TRANSFORM operation in the INPUT section of the GPLOAD control file.

  • TRANSFORM_CONFIGspecifies the name of the gpfdist configuration file.
  • The TRANSFORM setting indicates the name of the transformation that is described in the file named in TRANSFORM_CONFIG.
---
VERSION: 1.0.0.1
DATABASE: ops
USER: gpadmin
GPLOAD:
INPUT:
- TRANSFORM_CONFIG: config.yaml
- TRANSFORM: prices_input
- SOURCE:
FILE: prices.xml

The transformation operation name must appear in two places: in the TRANSFORM setting of the gpfdist configuration file and in the TRANSFORMATIONS section of the file named in the TRANSFORM_CONFIG section.

In the GPLOAD control file, the optional parameter MAX_LINE_LENGTH specifies the maximum length of a line in the XML transformation data that is passed to hawq load.

The following diagram shows the relationships between the GPLOAD control file, the gpfdist configuration file, and the XML data file.