FTP/SFTP connector


Parent document: Connectors

Main functionalities

This connector reads files from FTP/SFTP servers in batch scenarios. Its main features are:

  • Reading files from multiple directories
  • Reading files in multiple formats

Maven dependency

<dependency>
   <groupId>com.bytedance.bitsail</groupId>
   <artifactId>bitsail-connector-ftp</artifactId>
   <version>${revision}</version>
</dependency>

Supported data types

  • Basic data types supported:
    • Integer type:
      • tinyint
      • smallint
      • int
      • bigint
    • Float type:
      • float
      • double
      • decimal
    • Time type:
      • timestamp
      • date
    • String type:
      • string
      • varchar
      • char
    • Bool type:
      • boolean
    • Binary type:
      • binary
  • Composite data types supported:
    • map
    • array

Parameters

Add the following parameters to the job.reader block of the job configuration. For an example, see: ftp-connector-example

Necessary parameters

| Param name | Required | Optional value | Description |
|---|---|---|---|
| class | Yes | | Class name of the connector: com.bytedance.bitsail.connector.legacy.ftp.source.FtpInputFormat |
| path_list | Yes | | Path of the files to read. Multiple paths can be specified, separated by ',' |
| content_type | Yes | JSON/CSV | Format of the files to read. For details, refer to Supported formats |
| columns | Yes | | Names and types of the fields |
| port | Yes | | Server port, normally 21 for FTP and 22 for SFTP |
| host | Yes | | Server host |
| user | Yes | | Username |
| password | Yes | | Password |
| protocol | Yes | FTP/SFTP | Protocol |
| success_file_path | Yes | | Path to the SUCCESS tag file |
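
As a rough illustration of how the required parameters fit together, the sketch below assembles a job.reader block as a Python dictionary and prints it as JSON. The host, credentials, paths, and column names are placeholder values, the shape of the columns entries (name/type pairs) is an assumption, and the helper missing_required is hypothetical and not part of BitSail.

```python
import json

# Required keys, taken from the "Necessary parameters" table.
REQUIRED_KEYS = [
    "class", "path_list", "content_type", "columns", "port",
    "host", "user", "password", "protocol", "success_file_path",
]

# Placeholder values; replace them with your own server details.
reader = {
    "class": "com.bytedance.bitsail.connector.legacy.ftp.source.FtpInputFormat",
    "protocol": "SFTP",
    "host": "sftp.example.com",
    "port": 22,
    "user": "demo_user",
    "password": "demo_password",
    # Multiple paths are separated by ','.
    "path_list": "/data/in/part1,/data/in/part2",
    "success_file_path": "/data/in/_SUCCESS",
    "content_type": "CSV",
    # Assumed shape: one name/type pair per field.
    "columns": [
        {"name": "id", "type": "bigint"},
        {"name": "name", "type": "string"},
    ],
}


def missing_required(reader_conf):
    """Return the required keys absent from reader_conf (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if key not in reader_conf]


job = {"job": {"reader": reader}}
print(json.dumps(job, indent=2))
```

Running a check such as missing_required(reader) before submitting a job is one cheap way to catch a forgotten key early.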

Optional parameters

| Param name | Required | Default value | Optional value | Description |
|---|---|---|---|---|
| connect_pattern | No | PASV if FTP, NULL if SFTP | PASV/PORT/NULL | In FTP mode, the connect pattern can be PASV or PORT. In SFTP mode, the connect pattern is NULL |
| time_out | No | 5000ms | | Connection timeout |
| enable_success_file_check | No | True | | Enabled by default. The job will not start if the SUCCESS tag file doesn't exist |
| max_retry_time | No | 60 | | Max times to check for the SUCCESS tag file |
| retry_interval_ms | No | 60s | | Retry interval when checking for the SUCCESS tag file |
| charset | No | utf-8 | | File encoding |
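
One plausible reading of enable_success_file_check, max_retry_time, and retry_interval_ms is a bounded polling loop. The sketch below is not BitSail's implementation: file_exists is a hypothetical callback standing in for an FTP/SFTP existence check, and treating max_retry_time as a number of attempts (with a 60000 ms interval) is an assumption.

```python
import time


def wait_for_success_file(file_exists, max_retry_time=60,
                          retry_interval_ms=60000, sleep=time.sleep):
    """Poll file_exists() up to max_retry_time times.

    Returns True as soon as the SUCCESS tag file is seen, False otherwise.
    The sleep argument is injectable so the loop can be tested without waiting.
    """
    for attempt in range(max_retry_time):
        if file_exists():
            return True
        # Wait before the next check, except after the final attempt.
        if attempt < max_retry_time - 1:
            sleep(retry_interval_ms / 1000.0)
    return False
```

Under these assumptions, a job would start only when the function returns True, which matches the documented behavior that the job does not start if the SUCCESS tag file is missing.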

Supported formats

The following formats are supported (configured by content_type):

JSON

Parses text files in JSON format. Each line must be a valid JSON string.

The following parameters adjust the JSON parsing style:

| Parameter name | Default value | Description |
|---|---|---|
| job.common.case_insensitive | true | Whether to ignore the case of keys in the JSON fields |
| job.common.json_serializer_features | | Parsing mode used by 'FastJsonUtil'. The format is a ','-separated string, for example "QuoteFieldNames,UseSingleQuotes" |
| job.common.convert_error_column_as_null | false | Whether to set a field that fails to parse to null |
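
To make the effect of case_insensitive and convert_error_column_as_null concrete, here is a minimal sketch of line-by-line JSON parsing. It uses Python's json module rather than FastJson, so it only approximates the behavior described above; parse_json_line and the (name, converter) column pairs are hypothetical.

```python
import json


def parse_json_line(line, columns, case_insensitive=True,
                    convert_error_column_as_null=False):
    """Parse one JSON line into a row ordered like columns.

    columns is a list of (field_name, converter) pairs, e.g. ("id", int).
    """
    record = json.loads(line)
    if case_insensitive:
        # Match keys regardless of case by lowercasing both sides.
        record = {key.lower(): value for key, value in record.items()}
    row = []
    for name, convert in columns:
        key = name.lower() if case_insensitive else name
        raw = record.get(key)
        try:
            row.append(None if raw is None else convert(raw))
        except (TypeError, ValueError):
            if not convert_error_column_as_null:
                raise
            # Replace the unparsable field with null instead of failing.
            row.append(None)
    return row
```

With columns = [("id", int), ("name", str)], a line whose keys are spelled "ID" and "Name" still maps onto the declared fields when case_insensitive is left at its default.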

CSV

Parses text files in CSV format. Each line must be a valid CSV record.

The following parameters adjust the CSV parsing style:

| Parameter name | Default value | Description |
|---|---|---|
| job.common.csv_delimiter | ',' | CSV delimiter |
| job.common.csv_escape | | Escape character |
| job.common.csv_quote | | Quote character |
| job.common.csv_with_null_string | | Conversion value for null fields. No conversion is done by default |
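
The CSV options above can be sketched with Python's csv module. This is an approximation, not the connector's parser: parse_csv_line is hypothetical, the default quote character '"' is an assumption (the table gives no default), and null_string is interpreted here as "the text that should be read back as null".

```python
import csv
import io


def parse_csv_line(line, delimiter=",", escape=None, quote='"',
                   null_string=None):
    """Split one CSV line into fields, optionally mapping null_string to None."""
    reader = csv.reader(io.StringIO(line), delimiter=delimiter,
                        escapechar=escape, quotechar=quote)
    # An empty line yields no record; treat it as an empty row.
    fields = next(reader, [])
    if null_string is None:
        # No conversion by default.
        return fields
    return [None if field == null_string else field for field in fields]
```

For example, calling parse_csv_line with delimiter="|" splits pipe-separated lines, while the quote character keeps a delimiter inside a quoted field from splitting it.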

Configuration examples: FTP/SFTP connector example