The JDBC connector connects directly to the database through JDBC and works in batch mode: it either imports database data into other storage systems or imports data from other storage systems into the database. When reading, the connector reads from slave instances to minimize the impact on the database.
Currently, it supports reading and writing four kinds of data sources: MySQL, Oracle, PgSQL, and SqlServer.
JDBC connection schema name, usually only used for PgSql
| Param name | Default value | Is necessary | Parameter type | Recommended value / Example value | Description |
|---|---|---|---|---|---|
| table_name | - | Necessary if using table synchronization | string | table | Table to read |
| split_pk | - | Necessary if using table synchronization | string | id | The primary key used for sharding |
| split_pk_jdbc_type | int | No | string | Int/String | Shard key field type; supports numeric and string types |
| shard_split_mode | accurate | No | string | quick, accurate, parallelism | Splitting mode. accurate: ensures that only reader_fetch_size rows are pulled from the table in each request. parallelism: splits all data according to the reader parallelism; splitting is fast, but the resulting splits may be uneven. |
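The difference between the accurate and parallelism modes can be illustrated with a minimal sketch over a numeric split key. The function names, the inclusive `(lo, hi)` ranges, and the assumption of dense, contiguous key values are illustrative only and are not taken from the connector's implementation.

```python
def split_by_fetch_size(min_pk, max_pk, fetch_size):
    """'accurate'-style sketch: each range spans at most fetch_size key values."""
    ranges = []
    lo = min_pk
    while lo <= max_pk:
        hi = min(lo + fetch_size - 1, max_pk)
        ranges.append((lo, hi))  # inclusive bounds
        lo = hi + 1
    return ranges


def split_by_parallelism(min_pk, max_pk, parallelism):
    """'parallelism'-style sketch: cut the key span into at most `parallelism` ranges."""
    total = max_pk - min_pk + 1
    step = -(-total // parallelism)  # ceiling division
    return split_by_fetch_size(min_pk, max_pk, step)


if __name__ == "__main__":
    print(split_by_fetch_size(1, 10, 4))
    print(split_by_parallelism(1, 100, 4))
```

When key values are sparse, ranges of equal width hold different numbers of rows, which is one way the parallelism mode can end up nonuniform.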
Clear write: A time partition field is required. When writing, if the time partition already exists, the existing data in that partition is cleared before the new data is written.
Overwrite write: No time partition field is required. When writing, existing data is not cleared. Rows are upserted by unique key, so old data is overwritten with new data: when a duplicate key appears during the write, an ON DUPLICATE KEY UPDATE operation updates the fields. Note that sharded databases and tables do not support updating the shard key, so you need to configure the job.writer.shard_key parameter with the shard key. Multiple shard keys are separated by ','.
Insert Write mode. In order to ensure the consistency of repeated execution results, data is cleared according to the partition column before writing. The resulting write statement is similar to INSERT INTO xx (xx) VALUES (xx)
write_mode
overwrite
Overwrite write mode. Data is not cleared before writing. The resulting write statement looks like INSERT INTO xx (xx) VALUES (xx) ON DUPLICATE KEY UPDATE xx = VALUES(xx)
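To make the two statement shapes concrete, here is a minimal sketch that builds MySQL-style SQL for each write mode. The function name `build_write_sql` and the use of `?` placeholders are assumptions for illustration; the connector's generated SQL may differ in detail.

```python
def build_write_sql(table, columns, write_mode):
    """Build a parameterized write statement for the given mode (illustrative)."""
    cols = ", ".join(columns)
    placeholders = ", ".join(["?"] * len(columns))
    sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"
    if write_mode == "overwrite":
        # MySQL upsert: on a duplicate key, update each column with the new value.
        updates = ", ".join(f"{c} = VALUES({c})" for c in columns)
        sql += f" ON DUPLICATE KEY UPDATE {updates}"
    elif write_mode != "insert":
        raise ValueError(f"unsupported write_mode: {write_mode}")
    return sql


if __name__ == "__main__":
    print(build_write_sql("t", ["id", "name"], "insert"))
    print(build_write_sql("t", ["id", "name"], "overwrite"))
```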
In insert mode, data will be deleted according to partition information. The following parameters are for insert mode:
| Param name | Default value | Is necessary | Parameter type | Recommended value / Example value | Description |
|---|---|---|---|---|---|
| partition_name | - | Yes | string | date | Partition name. This is a logical concept: before writing, data matching the partition value is deleted according to this field. |
| partition_value | - | Yes | string | 20220727 | Partition value |
| partition_pattern_format | - | No | string | yyyyMMdd/yyyy-MM-dd | Partition field format |
| mysql_data_ttl | 0 | No | int | 0 | The number of days that data is kept in the database. The delete operation is performed according to the configured ttl and the partition_name field. For example, if ttl is set to 3, the partition name is date, and the partition value is 20220727, all data with date<=20220724 in the database will be deleted. |
| delete_threshold | 10000 | No | int | 10000 | The number of rows deleted in each delete operation |
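The TTL arithmetic described for mysql_data_ttl can be checked with a short sketch. The function names and the generated `DELETE ... LIMIT` statement are assumptions for illustration, and the patterns use Python's `strptime` syntax (`%Y%m%d`) rather than the `yyyyMMdd` style accepted by partition_pattern_format.

```python
from datetime import datetime, timedelta


def ttl_cutoff(partition_value, ttl_days, pattern="%Y%m%d"):
    """Return the newest partition value that falls outside the TTL window."""
    day = datetime.strptime(partition_value, pattern)
    return (day - timedelta(days=ttl_days)).strftime(pattern)


def build_ttl_delete(table, partition_name, partition_value, ttl_days, delete_threshold):
    """Sketch of one batched delete: at most delete_threshold rows per statement."""
    cutoff = ttl_cutoff(partition_value, ttl_days)
    return (
        f"DELETE FROM {table} WHERE {partition_name} <= '{cutoff}' "
        f"LIMIT {delete_threshold}"
    )


if __name__ == "__main__":
    print(ttl_cutoff("20220727", 3))
    print(build_ttl_delete("orders", "date", "20220727", 3, 10000))
```

Repeating such a statement until it affects no rows would delete the expired data in batches of delete_threshold rows.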
The primary key of the table. To limit the delete rate in PgSQL, the primary key is required so that rows can be selected with a select limit statement and deleted in batches.
| Param name | Default value | Is necessary | Parameter type | Recommended value / Example value | Description |
|---|---|---|---|---|---|
| upsert_key | - | No | string | id | Unique index used for overwrite writes. PG only supports overwriting on a single unique index. |
| delete_threshold_enabled | TRUE | No | string | true/false | Whether to limit the deletion rate. The default is true; when false, you do not need to provide the primary key. |
| is_truncate_mode | FALSE | No | string | true/false | Whether to use truncate mode. When true, the whole table is deleted first and no partition column is required; non-truncate mode requires a partition column. |
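The way these two flags change which other options are needed can be sketched as a small check. This is a hedged illustration: the function name, the option key `primary_key`, and the choice to treat the two flags independently are assumptions and are not confirmed by the connector's validation logic.

```python
def missing_pg_writer_options(options):
    """Return the option names that the flags imply are required but absent."""
    def flag(name, default):
        # Flags are documented as strings, so compare case-insensitively.
        return str(options.get(name, default)).lower() == "true"

    required = []
    if not flag("is_truncate_mode", "false"):
        # Non-truncate mode deletes by partition, so a partition column is needed.
        required.append("partition_name")
    if flag("delete_threshold_enabled", "true"):
        # Rate-limited deletes need the primary key (hypothetical key name).
        required.append("primary_key")
    return [name for name in required if name not in options]


if __name__ == "__main__":
    print(missing_pg_writer_options({}))
    print(missing_pg_writer_options(
        {"is_truncate_mode": "true", "delete_threshold_enabled": "false"}
    ))
```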
The primary key of the table. To limit the delete rate in Oracle, the primary key is required so that rows can be selected with a select limit statement and deleted in batches.
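| Param name | Default value | Is necessary | Parameter type | Recommended value / Example value | Description |
|---|---|---|---|---|---|
| partition_name | - | Yes | string (case sensitive) | DATETIME | Same as the general parameter, except that the value is case sensitive. |
| db_name | - | Yes | string (case sensitive) | DB | Same as the general parameter, except that the value is case sensitive. |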