FTP/SFTP connector
Parent document: Connectors
Main functionalities
This connector reads files from FTP/SFTP servers in batch scenarios. Its main features are:
- Reading files from multiple directories
- Reading files in various formats
Maven dependency
<dependency>
  <groupId>com.bytedance.bitsail</groupId>
  <artifactId>bitsail-connector-ftp</artifactId>
  <version>${revision}</version>
</dependency>
Supported data types
- Basic data types supported:
  - Integer type:
    - tinyint
    - smallint
    - int
    - bigint
  - Float type:
    - float
    - double
    - decimal
  - Time type:
    - timestamp
    - date
  - String type:
    - string
    - varchar
    - char
  - Bool type:
    - boolean
  - Binary type:
    - binary
- Composite data types supported:
  - map
  - array
Parameters
Add the following parameters to the job.reader block of the job configuration. For an example, see ftp-connector-example.
Necessary parameters
Param name | Required | Optional value | Description |
---|---|---|---|
class | Yes | | Class name of the connector: com.bytedance.bitsail.connector.legacy.ftp.source.FtpInputFormat |
path_list | Yes | | Paths of the files to read. Separate multiple paths with ',' |
content_type | Yes | JSON/CSV | Format of the files to read. For details, see Supported formats |
columns | Yes | | Names and types of the fields |
port | Yes | | Server port, normally 21 for FTP and 22 for SFTP |
host | Yes | | Server host |
user | Yes | | Username |
password | Yes | | Password |
protocol | Yes | FTP/SFTP | Protocol |
success_file_path | Yes | | Path to the SUCCESS tag file |
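As a rough illustration, a job.reader block that sets only the necessary parameters might look like the sketch below. The host, credentials, paths, and column names are placeholders. The shape of the columns entries (objects with a name and a type) and the nesting of reader under job are assumptions inferred from the parameter descriptions, not something this page specifies. The linked configuration example is the authoritative layout.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.ftp.source.FtpInputFormat",
      "protocol": "FTP",
      "host": "ftp.example.com",
      "port": 21,
      "user": "example_user",
      "password": "example_password",
      "path_list": "/data/dir_a,/data/dir_b",
      "success_file_path": "/data/_SUCCESS",
      "content_type": "JSON",
      "columns": [
        {"name": "id", "type": "bigint"},
        {"name": "name", "type": "string"}
      ]
    }
  }
}
```

For an SFTP server, protocol would be SFTP and port would normally be 22.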
Optional parameters
Param name | Required | Default value | Optional value | Description |
---|---|---|---|---|
connect_pattern | No | PASV if FTP, NULL if SFTP | PASV/PORT/NULL | In FTP mode, the connect pattern can be PASV or PORT. In SFTP mode, it is NULL |
time_out | No | 5000ms | | Connection timeout |
enable_success_file_check | No | True | | Enabled by default. The job will not start if the SUCCESS tag file does not exist |
max_retry_time | No | 60 | | Maximum number of times to check for the SUCCESS tag file |
retry_interval_ms | No | 60s | | Interval between checks for the SUCCESS tag file |
charset | No | utf-8 | | File encoding |
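The optional parameters go in the same job.reader block as the necessary ones. The fragment below is a sketch that spells out the documented defaults for an FTP source. How the values are encoded is an assumption: it guesses that time_out and retry_interval_ms are plain numbers in milliseconds (so 60s becomes 60000) and that enable_success_file_check is a JSON boolean.

```json
{
  "job": {
    "reader": {
      "connect_pattern": "PASV",
      "time_out": 5000,
      "enable_success_file_check": true,
      "max_retry_time": 60,
      "retry_interval_ms": 60000,
      "charset": "utf-8"
    }
  }
}
```

Because these values match the defaults, leaving them out should behave the same way. Listing them is mainly useful as a starting point for overrides.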
Supported formats
The following formats are supported (configured by content_type):
JSON
Parses text files in JSON format. Each line must be a valid JSON string.
The following parameters adjust the JSON parsing style:
Parameter name | Default value | Description |
---|---|---|
job.common.case_insensitive | true | Whether to ignore the case of keys in the JSON fields |
job.common.json_serializer_features | | Features used by 'FastJsonUtil' when parsing, given as a ','-separated string, for example "QuoteFieldNames,UseSingleQuotes" |
job.common.convert_error_column_as_null | false | Whether to set fields that fail to parse to null |
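A sketch of how these settings might sit next to the reader block is shown below. It assumes that a dotted name such as job.common.case_insensitive maps to a nested common object under job, which this page does not state explicitly. The feature string is the one quoted in the table, and turning on convert_error_column_as_null is only an illustrative choice.

```json
{
  "job": {
    "common": {
      "case_insensitive": true,
      "json_serializer_features": "QuoteFieldNames,UseSingleQuotes",
      "convert_error_column_as_null": true
    },
    "reader": {
      "content_type": "JSON"
    }
  }
}
```

The reader block here is abbreviated to content_type. A real job would also need the connection and path parameters.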
CSV
Parses text files in CSV format. Each line must be a valid CSV string.
The following parameters adjust the CSV parsing style:
Parameter name | Default value | Description |
---|---|---|
job.common.csv_delimiter | ',' | CSV delimiter |
job.common.csv_escape | | Escape character |
job.common.csv_quote | | Quote character |
job.common.csv_with_null_string | | Conversion value for null fields. Null fields are not converted by default |
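For CSV input, a configuration along the following lines could be used. As with the other parameters under job.common, nesting them in a common object is an assumption. The escape character, quote character, and null string are illustrative values, not documented defaults.

```json
{
  "job": {
    "common": {
      "csv_delimiter": ",",
      "csv_escape": "\\",
      "csv_quote": "\"",
      "csv_with_null_string": "NULL"
    },
    "reader": {
      "content_type": "CSV"
    }
  }
}
```

In JSON the backslash and double quote must themselves be escaped, so the two values above denote the single characters \ and ".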
Related documents
Configuration examples: FTP/SFTP connector example