JDBC Connector

Parent document: Connectors

The JDBC Connector directly connects to the database through JDBC, and imports data into other storages or imports other stored data into the database in a batch manner. JDBC connectors internally read from slaves to minimize the impact on DB.

Currently, supports reading and writing three kinds of data sources including MySQL, Oracle, PgSQL, SqlServer.

Supported data types

Supported by MySQL

bit
tinyint
tinyint unsigned
smallint
smallint unsigned
mediumint
mediumint unsigned
enum
int
int unsigned
bigint
bigint unsigned
timestamp
datetime
float
float unsigned
double
double unsigned
decimal
decimal unsigned
real
date
time
year
char
varchar
longvarchar
nvar
nvarchar
longnvarchar
tinytext
mediumtext
longtext
text
string
json
blob
mediumblob
tinyblob
longblob
binary
longvarbinary
varbinary
geometry
multipolygon
set

Supported by Oracle

char
varchar
interval day
interval year
intervalds
intervalym
varchar2
nchar
nvarchar2
long
blob
clob
nclob
string
character
number
integer
int
smallint
float
double
double precision
numeric
decimal
real
bit
bool
date
timestamp
timestamp with time zone
timestamp with local time zone
datetime
blob
bfile
raw
long raw
rowid
urowid
xmltype
binary_float
binary_double

Supported by PgSQL

char
bpchar
varchar
text
character varying
string
character
bigint
int8
integer
int
int4
smallserial
serial
bigserial
smallint
int2
double
money
double precision
float8
numeric
decimal
real
float
float4
boolean
bool
date
time
timetz
timestamp
timestamptz
bytea
bit
bit varying
varbit
uuid
cidr
xml
inet
macaddr
enum
json
jsonb
aclitem
_aclitem
_int2
_int4
_float4
_text
_char
cid
inet
int2vector
interval
oid
_oid
pg_node_tree

Supported by SqlServer

char
varchar
text
nchar
nvarchar
ntext
bigint
int
int identity
integer
smallint
tinyint
float
double precision
numeric
decimal
money
real
bit
date
timestamp
datetime
datetime2
time
binary
varbinary
image
datetimeoffset
smalldatetime
sql_variant
uniqueidentifier
xml

JDBC Source

Please add property `permitMysqlScheme in connection url when use MySQL.

Main function

Supports multiple sharding algorithms
Support the reading of sub-database and sub-table database
Support table synchronization and SQL synchronization
Support filter statement

Parameters

General parameters

Param name	Default value	Required	Parameter type	Recommended value / Example value	Description
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.JDBCInputFormat	Reader class name for mysql
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.OracleInputFormat	Reader class name for Oracle
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.PostgresqlInputFormat	Reader class name for Pgsql
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.SqlServerInputFormat	Reader class name for SqlServer
columns	-	No	list<map<string,string>>	"[ { "name":"id", "type":"int" }, { "name":"name", "type":"varchar" } ]	Describing fields' names and types. It needs to be consistent with the number of columns specified by the writer.

Database connection configuration

Param name	Default value	Required	Parameter type	Recommended value / Example value	Description
user_name	-	Yes	string	abc	JDBC username
password	-	Yes	string	password	JDBC password
query_timeout_seconds	300	No	int	300	JDBC connection timeout (s)
query_retry_times	3	No	int	3	Max retry times for JDBC query
connections	-	Yes		[ { "slaves": [ {"db_url": "jdbc:mysql://address=(protocol=tcp)(host=192.168.1.202)(port=3306)/test?permitMysqlScheme&rewriteBatchedStatements=true&autoReconnect=true&useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull&jdbcCompliantTruncation=false"} ]} ]	JDBC connection urls

Table synchronization configuration parameters

Param name	Default value	Required	Parameter type	Recommended value / Example value	Description
db_name	-	Yes	string	db	JDBC connection database name
table_schema	-	No	string	schema	JDBC connection schema name, usually only used for PgSql
table_name	-	Necessary if using table synchronization	string	table	Table to read
split_pk	-	Necessary if using table synchronization	string	id	The primary key used by the shard
split_pk_jdbc_type	int	No	string	Int/String	Shard key field type, supports numeric and string types
shard_split_mode	accurate	No	string	quick, accurate, parallelism	Splitting mode accurate: ensure that only `reader_fetch_size` if pulled from table in each request. parallelism: Splitting all data according to the reader parallelism num. The splitting will be fast, but may be nonuniform.
reader_fetch_size	10000	No	int	10000	Number of data pulled each time

SQL Synchronization Configuration Parameters

Param name	Default value	Required	Parameter type	Recommended value / Example value	Description
customized_sql	-	Necessary if using SQL Synchronization	string	Select id,name from xx.xx where id > 10	Custom SQL Statement for pulling data from table

Other parameters

Param name	Default value	Required	Parameter type	Recommended value / Example value	Description
filter	-	No	string	id>100	Filter conditions when pulling data. Will be placed after the query statement through the where statement

JDBC Sink

Please add property `permitMysqlScheme in connection url when use MySQL.

Main function

Supports TTL.
- Before executing the import task, expired data will be deleted according to the TTL parameter configured by the user.
- The default TTL is 0, that is, the data is permanently valid.
Supports multiple write modes: clear write mode and overwrite write mode
- Clear write: A time partition field is required. When writing, if the time partition already exists, clear the existing time partition data, and then write.
- Overwrite write: No time partition field is required. When writing, the data is not cleared. According to the unique key upsert, the old data is overwritten with the new data. When a duplicate key appears in the write, the on duplicate key update operation will be performed to update the field. In addition, note that sharding and sharding do not support updating shards. You need to configure the job.writer.shard_key parameter. The value is sharding. Multiple shards are separated by ','.

Parameters

General parameters

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.JDBCOutputFormat	Writer class name for mysql
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.OracleOutputFormat	Writer class name for Oracle
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.PostgresqlOutputFormat	Writer class name for Pgsql
class	-	Yes	string	com.bytedance.bitsail.connector.legacy.jdbc.source.SqlServerOutputFormat	Writer class name for SqlServer
columns	-	Yes	list<map<string,string>>	"[ { "name":"id", "type":"int" }, { "name":"name", "type":"varchar" } ]	Describing fields' names and types. It needs to be consistent with the number of columns specified by the reader.

Database connection configuration

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
user_name	-	Yes	string	abc	JDBC username
password	-	Yes	string	password	JDBC password
connections	-	Yes		[ { "db_url": "jdbc:mysql://address=(protocol=tcp)(host=192.168.1.202)(port=3306)/test?rewriteBatchedStatements=true&autoReconnect=true&useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull&jdbcCompliantTruncation=false" } ]	JDBC connection urls
db_name	-	Yes	string	db	Database to connect
table_schema	"public" for PgSql "dbo" for Sqlserver	No	string	schema	Schema to connect，usually used only in PgSql
table_name	-	Yes	string	table	Table to write

Write Mode Configuration Parameter

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
write_mode	insert	No	string	insert	Insert Write mode. In order to ensure the consistency of repeated execution results, data is cleared according to the partition column before writing. The resulting write statement is similar to INSERT INTO xx (xx) VALUES (xx)
write_mode				overwrite	Overwrite write mode. Data is not cleared before writing. The resulting write statement looks like INSERT INTO xx (xx) VALUES (xx) ON DUPLICATE KEY UPDATE (xx) VALUES(xx)

In insert mode, data will be deleted according to partition information. The following parameters are for insert mode:

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
partition_name	-	Yes	string	date	Partition name, this is a logical concept, meaning the data of partition value will be deleted according to this field before writing data.
partition_value	-	Yes	string	20220727	Partition value
partition_pattern_format	-	No	string	yyyyMMdd/yyyy-MM-dd	Partition Field format
mysql_data_ttl	0	No	int	0	The number of days that data is kept in database. The delete operation will be performed according to the value of the configured ddl and partition_name fields. For example, if ttl is set to 3, partition name is date, and partition value is set to 20220727, all data with date<=20220724 in the database will be deleted.
delete_threshold	10000	No	int	10000	When deleting, the number of pieces of data deleted each time
delete_interval_ms	100	No	int	100	Interval between deletes

Parameters for batch write

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
write_batch_interval	100	No	int	100	write batch interval
write_retry_times	3	No	int	3	max retry time when writing
retry_interval_seconds	10	No	int	10	retry interval (s)

Other parameters

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
pre_query	-	No	string	Select 1	The first statement to execute after connecting to the database
verify_query	-	No	string	Select 1	Validation statement to be executed after the task runs
shard_key	-	Necessary for sharded database	string	id	The shard key of the sharded database, no need to configure the non-sharded database

Parameters for PgSql

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
primary_key	-	No	string	id	The primary key of the table, if you need to limit the rate when pgSQL deletes, you need to use the primary key value to use the select limit statement to limit the delete rate
upsert_key	-	No	string	id	Unique index, supports overwriting, PG only supports overwriting for a single unique index
delete_threshold_enabled	TRUE	No	string	Truefalse	Whether to limit the deletion rate, the default is true, when false, you do not need to provide the primary key
is_truncate_mode	FALSE	No	string	Truefalse	Whether it is truncate mode, true will delete the whole table first and no partition column is required; non-truncate mode requires a partition column

Parameters for Oracle

Param name	Default value	Is necessary	Parameter type	Recommended value / Example value	Description
primary_key	-	Yes in Sink	string (case sensitive)	ID	The primary key of the table, if you need to limit the rate when Oracle deletes, you need to use the primary key value to use the select limit statement to limit the delete rate
partition_name	-	Yes	string (case sensitive)	DATETIME	Same as general parameters except value is case sensitive.
db_name	-	Yes	string (case sensitive)	DB	Same as general parameters except value is case sensitive.
columns.nameopen in new window	-	No	string (case sensitive)	COLUMN	Same as general parameters except value is case sensitive.

Configuration examples: JDBC connector example

JDBC Connector

JDBC Connector

Supported data types

Supported by MySQL

Supported by Oracle

Supported by PgSQL

Supported by SqlServer

JDBC Source

Main function

Parameters

General parameters

Database connection configuration

Table synchronization configuration parameters

SQL Synchronization Configuration Parameters

Other parameters

JDBC Sink

Main function

Parameters

General parameters

Database connection configuration

Write Mode Configuration Parameter

Parameters for batch write

Other parameters

Parameters for PgSql

Parameters for Oracle

Documents

Community

More

JDBC Connector

# JDBC Connector

# Supported data types

# Supported by MySQL

# Supported by Oracle

# Supported by PgSQL

# Supported by SqlServer

# JDBC Source

# Main function

# Parameters

# General parameters

# Database connection configuration

# Table synchronization configuration parameters

# SQL Synchronization Configuration Parameters

# Other parameters

# JDBC Sink

# Main function

# Parameters

# General parameters

# Database connection configuration

# Write Mode Configuration Parameter

# Parameters for batch write

# Other parameters

# Parameters for PgSql

# Parameters for Oracle

# Related documents

JDBC Connector

Supported data types

Supported by MySQL

Supported by Oracle

Supported by PgSQL

Supported by SqlServer

JDBC Source

Main function

Parameters

General parameters

Database connection configuration

Table synchronization configuration parameters

SQL Synchronization Configuration Parameters

Other parameters

JDBC Sink

Main function

Parameters

General parameters

Database connection configuration

Write Mode Configuration Parameter

Parameters for batch write

Other parameters

Parameters for PgSql

Parameters for Oracle

Related documents