Technical Reference
Setup details
User and role management
Consider not using the default user; instead, create a dedicated one to use it with this Fivetran
destination only. The following commands, executed with the default user, will create a new fivetran_user with the
required privileges.
Additionally, you can revoke access to certain databases from the fivetran_user.
For example, by executing the following statement, we restrict access to the default database:
You can execute these statements in the ClickHouse SQL console.
Advanced configuration
The ClickHouse Cloud destination supports an optional JSON configuration file for advanced use cases. This file allows you to fine-tune destination behavior by overriding the default settings that control batch sizes, parallelism, connection pools, and request timeouts.
NOTE: This configuration is entirely optional. If no file is uploaded, the destination uses sensible defaults that work well for most use cases.
The file must be valid JSON and conform to the schema described below.
If you need to modify the configuration after the initial setup, you can edit the destination configurations in the Fivetran dashboard and upload an updated file.
The configuration file has a top-level section:
Inside of it you can specify the following configurations that control the internal behavior of the ClickHouse destination connector itself. These configurations affect how the connector processes data before sending it to ClickHouse.
| Setting | Type | Default | Allowed Range | Description |
|---|---|---|---|---|
write_batch_size | integer | 100000 | 5,000 – 100,000 | Number of rows per batch for insert, update, and replace operations. |
select_batch_size | integer | 1500 | 200 – 1,500 | Number of rows per batch for SELECT queries used during updates. |
mutation_batch_size | integer | 1500 | 200 – 1,500 | Number of rows per batch for ALTER TABLE UPDATE mutations in history mode. Lower it if you are experiencing large SQL statements. |
hard_delete_batch_size | integer | 1500 | 200 – 1,500 | Number of rows per batch for hard delete operations in history mode. Lower it if you are experiencing large SQL statements. |
All fields are optional. If a field is not specified, the default value is used. If a value is outside the allowed range, the destination will report an error during sync. Unknown fields are silently ignored (a warning is logged) and do not cause errors, which allows forward compatibility when new settings are added.
Example:
Type transformation mapping
The Fivetran ClickHouse destination maps Fivetran data types to ClickHouse types as follows:
| Fivetran type | ClickHouse type |
|---|---|
| BOOLEAN | Bool |
| SHORT | Int16 |
| INT | Int32 |
| LONG | Int64 |
| BIGDECIMAL | Decimal(P, S) |
| FLOAT | Float32 |
| DOUBLE | Float64 |
| LOCALDATE | Date |
| LOCALDATETIME | DateTime |
| INSTANT | DateTime64(9, 'UTC') |
| STRING | String |
| BINARY | String * |
| XML | String * |
| JSON | String * |
* BINARY, XML, and JSON are stored as String because ClickHouse's String type can represent an arbitrary set of bytes. The destination adds a column comment to indicate the original data type. The ClickHouse JSON data type is not used as it was marked as obsolete and never recommended for production usage.
Destination tables
The ClickHouse Cloud destination uses
Replacing engine type of
SharedMergeTree family
(specifically, SharedReplacingMergeTree), versioned by the _fivetran_synced column.
Every column except primary (ordering) keys and Fivetran metadata columns is created
as Nullable(T), where T is a
ClickHouse Cloud type based on the data types mapping.
Every destination table includes the following metadata columns:
| Column | Type | Description |
|---|---|---|
_fivetran_synced | DateTime64(9, 'UTC') | Timestamp when the record was synced by Fivetran. Used as the version column for SharedReplacingMergeTree. |
_fivetran_deleted | Bool | Soft delete marker. Set to true when the source record is deleted. |
_fivetran_id | String | Auto-generated unique identifier. Only present when the source table has no primary keys. |
Single primary key in the source table
For example, source table users has a primary key column id (INT) and a regular column name (STRING).
The destination table will be defined as follows:
In this case, the id column is chosen as a table sorting key.
Multiple primary keys in the source table
If the source table has multiple primary keys, they are used in order of their appearance in the Fivetran source table definition.
For example, there is a source table items with primary key columns id (INT) and name (STRING), plus an
additional regular column description (STRING). The destination table will be defined as follows:
In this case, id and name columns are chosen as table sorting keys.
No primary keys in the source table
If the source table has no primary keys, a unique identifier will be added by Fivetran as a _fivetran_id column.
Consider an events table that only has the event (STRING) and timestamp (LOCALDATETIME) columns in the source.
The destination table in that case is as follows:
Since _fivetran_id is unique and there are no other primary key options, it is used as a table sorting key.
Selecting the latest version of the data without duplicates
SharedReplacingMergeTree performs background data deduplication
only during merges at an unknown time.
However, selecting the latest version of the data without duplicates ad-hoc is possible with the FINAL keyword and
select_sequential_consistency
setting:
See also Duplicate records with ReplacingMergeTree in the troubleshooting guide.
Retries on network failures
The ClickHouse Cloud destination retries transient network errors using the exponential backoff algorithm.
This is safe even when the destination inserts the data, as any potential duplicates are handled by
the SharedReplacingMergeTree table engine, either during background merges,
or when querying the data with SELECT FINAL.