LarkSheet connector-v1
Parent document: Connectors
The BitSail LarkSheet connector supports reading Lark sheets. Its main features are:
- Supports batch reads from one or more Lark sheets at once.
- Supports authentication by static token or by application credentials.
- Supports reading a subset of columns from a sheet.
Maven dependency
<dependency>
  <groupId>com.bytedance.bitsail</groupId>
  <artifactId>connector-larksheet</artifactId>
  <version>${revision}</version>
</dependency>
LarkSheet reader
Supported data types
The BitSail LarkSheet reader processes all data as strings.
Parameters
Add the following parameters to the job.reader block, for example:
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://e4163pj5kq.feishu.cn/sheets/shtcnQmZNlZ9PjZUJKT5oU3Sjjg?sheet=ZbzDHq",
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}
Necessary parameters
Param name | Required | Optional value | Description |
---|---|---|---|
class | Yes | | LarkSheet reader class name: com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat |
sheet_urls | Yes | | List of sheet URLs to read. Separate multiple URLs with commas. |
columns | Yes | | Describes the names and types of the fields. |
The following parameters are for authentication. You must set either sheet_token, or both app_id and app_secret, in your configuration.
Param name | Required | Optional value | Description |
---|---|---|---|
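sheet_token | Set at least one of: (1) sheet_token; (2) app_id and app_secret | | Token that grants permission to access the Feishu open API. |
app_id | Required together with app_secret if sheet_token is not set | | Used with app_secret to generate a token for accessing the Feishu open API. |
app_secret | Required together with app_id if sheet_token is not set | | Used with app_id to generate a token for accessing the Feishu open API. |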
Note that if you use sheet_token, it may expire while the job is running. If you use app_id and app_secret, the token is refreshed when it expires.
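As a minimal sketch of the application-based option, the reader block below reads two sheets and authenticates with app_id and app_secret instead of sheet_token. The sheet URLs and credential values are hypothetical placeholders, not real identifiers; replace them with your own.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://example.feishu.cn/sheets/<spreadsheet_1>?sheet=<sheet_id_1>,https://example.feishu.cn/sheets/<spreadsheet_2>?sheet=<sheet_id_2>",
      "app_id": "<your_app_id>",
      "app_secret": "<your_app_secret>",
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}
```

Because the token is generated from the application credentials, this form is likely the safer choice for long-running jobs where a static sheet_token might expire.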
Optional parameters
Param name | Required | Optional value | Description |
---|---|---|---|
reader_parallelism_num | No | | Reader parallelism. |
batch_size | No | | Number of lines extracted at a time. |
skip_nums | No | | A list of numbers indicating how many lines to skip in each sheet. |
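The sketch below shows how the optional parameters might be combined with the required ones in a single reader block. It assumes that skip_nums is written as a JSON array with one entry per sheet, in the same order as sheet_urls; that format is inferred from the description above and is not confirmed here. All numeric values, the sheet URL, and the token are illustrative placeholders, not recommended settings.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://example.feishu.cn/sheets/<spreadsheet>?sheet=<sheet_id>",
      "sheet_token": "<your_sheet_token>",
      "reader_parallelism_num": 1,
      "batch_size": 1000,
      "skip_nums": [1],
      "columns": [
        {
          "name": "id",
          "type": "string"
        }
      ]
    }
  }
}
```

Under the stated assumption, this configuration would skip one line in the single listed sheet and extract up to 1000 lines at a time.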
Related documents
Configuration examples: LarkSheet connector example