Version: nightly

Configuration

GreptimeDB supports layered configuration with the following precedence order (where each item overrides the one below it):

Greptime command line options
Configuration file options
Environment variables
Default values

You only need to set up the configurations you require. GreptimeDB will assign default values for any settings not configured.
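
The layering can be pictured as a lookup that tries each source in turn. The following is a minimal Python sketch of that idea, not GreptimeDB's implementation; it follows the precedence order listed above. The function name `resolve_config`, the flat dotted keys, and the sample values `/var/lib/greptimedb` and `0.0.0.0:4000` are assumptions made for illustration.

```python
from collections import ChainMap


def resolve_config(cli, file, env, defaults):
    """Return the effective settings; layers listed first win over later ones.

    Hypothetical helper: it only illustrates the precedence order listed above.
    """
    # ChainMap searches its mappings left to right, so the first layer
    # that defines a key supplies the value.
    return dict(ChainMap(cli, file, env, defaults))


# Illustrative layers keyed by dotted option names (an assumption of this sketch).
defaults = {"http.addr": "127.0.0.1:4000", "storage.data_home": "/tmp/greptimedb"}
env = {"storage.data_home": "/data/greptimedb"}
file = {"storage.data_home": "/var/lib/greptimedb"}
cli = {"http.addr": "0.0.0.0:4000"}

effective = resolve_config(cli, file, env, defaults)
print(effective["http.addr"])          # value taken from the command line layer
print(effective["storage.data_home"])  # value taken from the configuration file layer
```

Any option that no layer sets explicitly falls through to the default value.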

How to set up configurations

Greptime command line options

You can specify several configurations using command line arguments. For example, to start GreptimeDB in standalone mode with a configured HTTP address:

greptime standalone start --http-addr 127.0.0.1:4000

For all the options supported by the Greptime command line, refer to the GreptimeDB Command Line Interface.

Configuration file options

You can specify configurations in a TOML file. For example, create a configuration file standalone.example.toml as shown below:

[storage]
type = "File"
data_home = "/tmp/greptimedb/"

Then, specify the configuration file using the command line argument -c [file_path].

greptime [standalone | frontend | datanode | metasrv] start -c config/standalone.example.toml

For example, to start in standalone mode:

greptime standalone start -c standalone.example.toml

Example files

Below are example configuration files for each GreptimeDB component, including all available configurations. In practice, configure only the options you need; you do not have to set every option shown in the sample files.

Environment variable

Every item in the configuration file can be mapped to environment variables. For example, to set the data_home configuration item for the datanode using an environment variable:

# ...
[storage]
data_home = "/data/greptimedb"
# ...

Set the environment variable with a shell command in the following format:

export GREPTIMEDB_DATANODE__STORAGE__DATA_HOME=/data/greptimedb

Environment Variable Rules

Each environment variable must start with the component prefix, for example:
- GREPTIMEDB_FRONTEND
- GREPTIMEDB_METASRV
- GREPTIMEDB_DATANODE
- GREPTIMEDB_STANDALONE

Use double underscores (__) as separators, both between the prefix and the key and between the levels of the key. For example, the configuration key storage.data_home becomes STORAGE__DATA_HOME.

Environment variables also accept lists, with items separated by commas (,), for example:

GREPTIMEDB_METASRV__META_CLIENT__METASRV_ADDRS=127.0.0.1:3001,127.0.0.1:3002,127.0.0.1:3003
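
The naming rules above can be expressed as a small transformation. This is a Python sketch for illustration only; the helper names `env_var_name` and `env_list_value` are hypothetical and are not part of GreptimeDB.

```python
def env_var_name(component, key_path):
    """Build the environment variable name for a dotted configuration key.

    Hypothetical helper: prefix with GREPTIMEDB_<COMPONENT>, upper-case each
    level of the key, and join everything with double underscores.
    """
    levels = [level.upper() for level in key_path.split(".")]
    return "__".join(["GREPTIMEDB_" + component.upper()] + levels)


def env_list_value(items):
    """Encode a list value as the comma-separated form shown above."""
    return ",".join(items)


name = env_var_name("metasrv", "meta_client.metasrv_addrs")
value = env_list_value(["127.0.0.1:3001", "127.0.0.1:3002", "127.0.0.1:3003"])
print(f"{name}={value}")
```

Running the sketch reproduces the variable assignment from the list example above.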

Options

This section introduces the main configuration options. For all options, refer to the Configuration Reference on GitHub.

Protocol options

Protocol options are valid in frontend and standalone subcommands, specifying protocol server addresses and other protocol-related options.

Below is an example configuration with default values. You can change the values or disable certain protocols in your configuration file. For example, to disable OpenTSDB protocol support, set the enable parameter to false. Note that the HTTP and gRPC protocols cannot be disabled, because the database needs them to function correctly.

[http]
addr = "127.0.0.1:4000"
timeout = "30s"
body_limit = "64MB"

[grpc]
addr = "127.0.0.1:4001"
runtime_size = 8

[mysql]
enable = true
addr = "127.0.0.1:4002"
runtime_size = 2

[mysql.tls]
mode = "disable"
cert_path = ""
key_path = ""

[postgres]
enable = true
addr = "127.0.0.1:4003"
runtime_size = 2

[postgres.tls]
mode = "disable"
cert_path = ""
key_path = ""

[opentsdb]
enable = true

[influxdb]
enable = true

[prom_store]
enable = true
with_metric_engine = true

The following table describes the options in detail:

Option	Key	Type	Description
http			HTTP server options
	addr	String	Server address, "127.0.0.1:4000" by default
	timeout	String	HTTP request timeout, "30s" by default
	body_limit	String	HTTP max body size, "64MB" by default
	is_strict_mode	Boolean	Whether to enable the strict verification mode of the protocol, which will slightly affect performance. False by default.
grpc			gRPC server options
	addr	String	Server address, "127.0.0.1:4001" by default
	runtime_size	Integer	The number of server worker threads, 8 by default
mysql			MySQL server options
	enable	Boolean	Whether to enable MySQL protocol, true by default
	addr	String	Server address, "127.0.0.1:4002" by default
	runtime_size	Integer	The number of server worker threads, 2 by default
influxdb			InfluxDB Protocol options
	enable	Boolean	Whether to enable InfluxDB protocol in HTTP API, true by default
opentsdb			OpenTSDB Protocol options
	enable	Boolean	Whether to enable OpenTSDB protocol in HTTP API, true by default
prom_store			Prometheus remote storage options
	enable	Boolean	Whether to enable Prometheus Remote Write and read in HTTP API, true by default
	with_metric_engine	Boolean	Whether to use the metric engine on Prometheus Remote Write, true by default
postgres			PostgreSQL server options
	enable	Boolean	Whether to enable PostgreSQL protocol, true by default
	addr	String	Server address, "127.0.0.1:4003" by default
	runtime_size	Integer	The number of server worker threads, 2 by default

Storage options

Storage options are valid in datanode and standalone mode. They specify the database data directory and other storage-related options.

GreptimeDB supports storing data in the local file system, AWS S3 and compatible services (including MinIO, DigitalOcean Spaces, Tencent Cloud Object Storage (COS), Baidu Object Storage (BOS), and so on), Azure Blob Storage, Aliyun OSS, and Google Cloud Storage.

Option	Key	Type	Description
storage			Storage options
	type	String	Storage type, supports "File", "S3" and "Oss" etc.
File			Local file storage options, valid when type="File"
	data_home	String	Database storage root directory, "/tmp/greptimedb" by default
S3			AWS S3 storage options, valid when type="S3"
	bucket	String	The S3 bucket name
	root	String	The root path in S3 bucket
	endpoint	String	The API endpoint of S3
	region	String	The S3 region
	access_key_id	String	The S3 access key id
	secret_access_key	String	The S3 secret access key
Oss			Aliyun OSS storage options, valid when type="Oss"
	bucket	String	The OSS bucket name
	root	String	The root path in OSS bucket
	endpoint	String	The API endpoint of OSS
	access_key_id	String	The OSS access key id
	secret_access_key	String	The OSS secret access key
Azblob			Azure Blob Storage options, valid when type="Azblob"
	container	String	The container name
	root	String	The root path in container
	endpoint	String	The API endpoint of Azure Blob Storage
	account_name	String	The account name of Azure Blob Storage
	account_key	String	The access key
	sas_token	String	The shared access signature
Gcs			Google Cloud Storage options, valid when type="Gcs"
	root	String	The root path in GCS bucket
	bucket	String	The GCS bucket name
	scope	String	The GCS service scope
	credential_path	String	The GCS credentials path
	endpoint	String	The API endpoint of GCS

A file storage sample configuration:

[storage]
type = "File"
data_home = "/tmp/greptimedb/"

An S3 storage sample configuration:

[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"

Storage engine provider

[[storage.providers]] sets up the table storage engine providers. Based on these providers, you can create a table with a specified storage; see create table:

# Allows using multiple storages
[[storage.providers]]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"

[[storage.providers]]
type = "Gcs"
bucket = "test_greptimedb"
root = "/greptimedb"
credential_path = "<gcs credential path>"

All configured providers can be used as the storage option when creating tables.

Object storage cache

When using S3, OSS or Azure Blob Storage, it is recommended to enable object storage caching to speed up data queries:

[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"
## Enable object storage caching
cache_path = "/var/data/s3_local_cache"
cache_capacity = "256MiB"

The cache_path is the local file directory that keeps cache files, and the cache_capacity is the maximum total file size in the cache directory.

WAL options

The [wal] section in the datanode or standalone configuration file configures the Write-Ahead Log (WAL):

Local WAL

[wal]
provider = "raft_engine"
file_size = "256MB"
purge_threshold = "4GB"
purge_interval = "10m"
read_batch_size = 128
sync_write = false

dir: the directory where WAL logs are written. When using File storage, it's {data_home}/wal by default. It must be configured explicitly when using other storage types such as S3.
file_size: the maximum size of the WAL log file, 256MB by default.
purge_threshold and purge_interval: control the purging of WAL files. purge_threshold is 4GB by default; the example above sets purge_interval to 10m.
sync_write: whether to call fsync on every log write.

Remote WAL

[wal]
provider = "kafka"
broker_endpoints = ["127.0.0.1:9092"]
max_batch_bytes = "1MB"
consumer_wait_timeout = "100ms"
backoff_init = "500ms"
backoff_max = "10s"
backoff_base = 2
backoff_deadline = "5mins"

broker_endpoints: The Kafka broker endpoints.
max_batch_bytes: The max size of a single producer batch.
consumer_wait_timeout: The consumer wait timeout.
backoff_init: The initial backoff delay.
backoff_max: The maximum backoff delay.
backoff_base: The exponential backoff rate.
backoff_deadline: The deadline of retries.
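
To see how the four backoff options interact, the sketch below computes a retry schedule in Python. It assumes that each delay is the previous one multiplied by backoff_base, capped at backoff_max, and that retrying stops once the accumulated wait would exceed backoff_deadline. The exact behavior of the Kafka client may differ, and the function name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(init_ms=500, max_ms=10_000, base=2, deadline_ms=300_000):
    """Return the list of retry delays in milliseconds.

    Assumed semantics (not confirmed against the client implementation):
    next delay = base * current delay, capped at max_ms, and retries stop
    when the total wait would exceed deadline_ms.
    """
    delays = []
    total = 0
    delay = init_ms
    while total + delay <= deadline_ms:
        delays.append(delay)
        total += delay
        delay = min(delay * base, max_ms)
    return delays


# Defaults mirror the sample values above: 500ms, 10s, 2, and 5mins.
schedule = backoff_schedule()
print(schedule[:6])   # the delay grows until it hits the cap
print(sum(schedule))  # total wait stays within the deadline
```

With the sample values, the delay reaches the 10s cap after five retries, and every later retry waits 10s until the deadline is reached.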

Remote WAL Authentication (Optional)

[wal.sasl]
type = "SCRAM-SHA-512"
username = "user"
password = "secret"

This is the SASL configuration for the Kafka client. The available SASL mechanisms are PLAIN, SCRAM-SHA-256, and SCRAM-SHA-512.

Remote WAL TLS (Optional)

[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"

This is the TLS configuration for the Kafka client. The supported modes are TLS (using system ca certs), TLS (with specified ca certs), and mTLS.

Examples:

TLS (using system ca certs)

[wal.tls]

TLS (with specified ca cert)

[wal.tls]
server_ca_cert_path = "/path/to/server_cert"

mTLS

[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"
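
The three examples differ only in which keys are present. The Python sketch below makes that mapping explicit for a [wal.tls] table that has already been parsed into a dictionary. The function name `kafka_tls_mode` is hypothetical, and the rule that a client certificate and client key must be given together is an assumption of this sketch, not documented behavior.

```python
def kafka_tls_mode(tls):
    """Classify a parsed [wal.tls] table into one of the modes listed above.

    `tls` is a dict of the keys in the table, or None when the table is absent.
    Hypothetical helper for illustration only.
    """
    if tls is None:
        return "no TLS"
    has_ca = bool(tls.get("server_ca_cert_path"))
    has_cert = bool(tls.get("client_cert_path"))
    has_key = bool(tls.get("client_key_path"))
    # Assumption: a client certificate is only usable together with its key.
    if has_cert != has_key:
        raise ValueError("client_cert_path and client_key_path must be set together")
    if has_cert:
        return "mTLS"
    if has_ca:
        return "TLS (with specified ca certs)"
    return "TLS (using system ca certs)"


print(kafka_tls_mode({}))
print(kafka_tls_mode({"server_ca_cert_path": "/path/to/server_cert"}))
print(kafka_tls_mode({
    "server_ca_cert_path": "/path/to/server_cert",
    "client_cert_path": "/path/to/client_cert",
    "client_key_path": "/path/to/key",
}))
```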

Logging options

frontend, metasrv, datanode and standalone can all configure log and tracing related parameters in the [logging] section:

[logging]
dir = "/tmp/greptimedb/logs"
level = "info"
enable_otlp_tracing = false
otlp_endpoint = "localhost:4317"
append_stdout = true
[logging.tracing_sample_ratio]
default_ratio = 1.0

dir: log output directory.
level: output log level. Available levels are info, debug, error, and warn; the default is info.
enable_otlp_tracing: whether to turn on tracing. Off by default.
otlp_endpoint: the target endpoint to which traces are exported using the gRPC-based OTLP protocol. The default value is localhost:4317.
append_stdout: whether to append logs to stdout. Defaults to true.
tracing_sample_ratio: configures the sampling rate of tracing. For usage, refer to How to configure tracing sampling rate.

For how to use distributed tracing, refer to Tracing.

Region engine options

The parameters corresponding to different storage engines can be configured for datanode and standalone in the [region_engine] section. Currently, only options for the mito region engine are available.

Frequently used options:

[[region_engine]]
[region_engine.mito]
num_workers = 8
manifest_checkpoint_distance = 10
max_background_jobs = 4
auto_flush_interval = "1h"
global_write_buffer_size = "1GB"
global_write_buffer_reject_size = "2GB"
sst_meta_cache_size = "128MB"
vector_cache_size = "512MB"
page_cache_size = "512MB"
sst_write_buffer_size = "8MB"
scan_parallelism = 0

[region_engine.mito.inverted_index]
create_on_flush = "auto"
create_on_compaction = "auto"
apply_on_query = "auto"
mem_threshold_on_create = "64M"
intermediate_path = ""

[region_engine.mito.memtable]
type = "time_series"

The mito engine provides an experimental memtable that optimizes write performance and memory efficiency when handling a large number of time series. Its read performance might not be as fast as that of the default time_series memtable.

[region_engine.mito.memtable]
type = "partition_tree"
index_max_keys_per_shard = 8192
data_freeze_threshold = 32768
fork_dictionary_bytes = "1GiB"

Available options:

Key	Type	Default	Descriptions
`num_workers`	Integer	`8`	Number of region workers.
`manifest_checkpoint_distance`	Integer	`10`	Number of meta action updates that triggers a new checkpoint for the manifest.
`max_background_jobs`	Integer	`4`	Max number of running background jobs.
`auto_flush_interval`	String	`1h`	Interval to auto flush a region if it has not flushed yet.
`global_write_buffer_size`	String	`1GB`	Global write buffer size for all regions. If not set, it defaults to 1/8 of OS memory, up to a maximum of 1GB.
`global_write_buffer_reject_size`	String	`2GB`	Global write buffer size threshold to reject write requests. If not set, it defaults to 2 times `global_write_buffer_size`.
`sst_meta_cache_size`	String	`128MB`	Cache size for SST metadata. Set it to 0 to disable the cache. If not set, it defaults to 1/32 of OS memory, up to a maximum of 128MB.
`vector_cache_size`	String	`512MB`	Cache size for vectors and arrow arrays. Set it to 0 to disable the cache. If not set, it defaults to 1/16 of OS memory, up to a maximum of 512MB.
`page_cache_size`	String	`512MB`	Cache size for pages of SST row groups. Set it to 0 to disable the cache. If not set, it defaults to 1/16 of OS memory, up to a maximum of 512MB.
`sst_write_buffer_size`	String	`8MB`	Buffer size for SST writing.
`scan_parallelism`	Integer	`0`	Parallelism to scan a region (default: 1/4 of cpu cores). - `0`: using the default value (1/4 of cpu cores). - `1`: scan in current thread. - `n`: scan in parallelism n.
`inverted_index`	--	--	The options for inverted index in Mito engine.
`inverted_index.create_on_flush`	String	`auto`	Whether to create the index on flush. - `auto`: automatically - `disable`: never
`inverted_index.create_on_compaction`	String	`auto`	Whether to create the index on compaction. - `auto`: automatically - `disable`: never
`inverted_index.apply_on_query`	String	`auto`	Whether to apply the index on query - `auto`: automatically - `disable`: never
`inverted_index.mem_threshold_on_create`	String	`64M`	Memory threshold for performing an external sort during index creation. Setting to empty will disable external sorting, forcing all sorting operations to happen in memory.
`inverted_index.intermediate_path`	String	`""`	File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`).
`memtable.type`	String	`time_series`	Memtable type. - `time_series`: time-series memtable - `partition_tree`: partition tree memtable (experimental)
`memtable.index_max_keys_per_shard`	Integer	`8192`	The max number of keys in one shard. Only available for `partition_tree` memtable.
`memtable.data_freeze_threshold`	Integer	`32768`	The max rows of data inside the actively writing buffer in one shard. Only available for `partition_tree` memtable.
`memtable.fork_dictionary_bytes`	String	`1GiB`	Max dictionary bytes. Only available for `partition_tree` memtable.
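
Several defaults in the table are derived from the amount of OS memory. The following Python sketch works through that arithmetic; it is not GreptimeDB code. The function name `mito_memory_defaults` is hypothetical, and treating 1MB and 1GB as 1024-based units is an assumption of this sketch.

```python
MIB = 1024 ** 2  # assumption: sizes are treated as 1024-based units
GIB = 1024 ** 3


def mito_memory_defaults(os_memory_bytes):
    """Compute the memory-derived defaults described in the table above.

    Each value is a fraction of OS memory capped at the listed maximum;
    the reject size is twice the global write buffer size.
    """
    write_buffer = min(os_memory_bytes // 8, 1 * GIB)
    return {
        "global_write_buffer_size": write_buffer,
        "global_write_buffer_reject_size": 2 * write_buffer,
        "sst_meta_cache_size": min(os_memory_bytes // 32, 128 * MIB),
        "vector_cache_size": min(os_memory_bytes // 16, 512 * MIB),
        "page_cache_size": min(os_memory_bytes // 16, 512 * MIB),
    }


# Print the derived sizes in MiB for a small and a large machine.
for memory_gib in (4, 64):
    sizes = mito_memory_defaults(memory_gib * GIB)
    print(memory_gib, {key: size // MIB for key, size in sizes.items()})
```

On a machine with 4 GiB of memory the fractions apply to most values, while on a machine with 64 GiB every value is limited by its cap.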

Specify meta client

The meta_client options are valid in datanode and frontend mode. They specify the Metasrv client information.

[meta_client]
metasrv_addrs = ["127.0.0.1:3002"]
timeout = "3s"
connect_timeout = "1s"
ddl_timeout = "10s"
tcp_nodelay = true

The meta_client section configures the Metasrv client, including:

metasrv_addrs: the Metasrv address list.
timeout: operation timeout, 3s by default.
connect_timeout: connect server timeout, 1s by default.
ddl_timeout: DDL execution timeout, 10s by default.
tcp_nodelay: TCP_NODELAY option for accepted connections, true by default.

Monitor metrics options

These options are used to save system metrics to GreptimeDB itself. For instructions on how to use this feature, please refer to the Monitoring guide.

[export_metrics]
# Whether to enable export_metrics
enable=true
# Export time interval
write_interval = "30s"

enable: Whether to enable export_metrics, false by default.
write_interval: Export time interval.

`self_import` method

Only frontend and standalone support exporting metrics using self_import method.

[export_metrics]
# Whether to enable export_metrics
enable=true
# Export time interval
write_interval = "30s"
[export_metrics.self_import]
db = "information_schema"

db: The default database used by self_import is information_schema. You can also create another database for saving system metrics.

`remote_write` method

The remote_write method is supported by datanode, frontend, metasrv, and standalone. It sends metrics to a receiver compatible with the Prometheus Remote-Write protocol.

[export_metrics]
# Whether to enable export_metrics
enable=true
# Export time interval
write_interval = "30s"
[export_metrics.remote_write]
# URL specified by Prometheus Remote-Write protocol
url = "http://127.0.0.1:4000/v1/prometheus/write?db=information_schema"
# Some optional HTTP parameters, such as authentication information
headers = { Authorization = "Basic Z3JlcHRpbWVfdXNlcjpncmVwdGltZV9wd2Q=" }

url: URL specified by Prometheus Remote-Write protocol.
headers: Some optional HTTP parameters, such as authentication information.
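
The Authorization value in the example above is a standard HTTP Basic credential: the Base64 encoding of username:password. The Python sketch below shows how such a value can be produced. The helper name `basic_auth_header` is hypothetical; the credentials greptime_user and greptime_pwd are the ones encoded in the sample header.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header.

    Hypothetical helper: Base64-encodes "username:password" and prefixes
    the result with the "Basic" scheme.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Reproduces the sample header value from the configuration above.
print(basic_auth_header("greptime_user", "greptime_pwd"))
```

Replace the sample credentials with your own before putting the resulting value into the headers option.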

Mode option

The mode option is valid in datanode, frontend and standalone. It specifies the running mode of the component.

In the configuration files of datanode and frontend of distributed GreptimeDB, the value must be set to distributed:

mode = "distributed"

In the configuration files of standalone GreptimeDB, the value must be set to standalone:

mode = "standalone"

Metasrv-only configuration

# The working home directory.
data_home = "/tmp/metasrv/"
# The bind address of metasrv, "127.0.0.1:3002" by default.
bind_addr = "127.0.0.1:3002"
# The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
server_addr = "127.0.0.1:3002"
# Etcd server addresses, "127.0.0.1:2379" by default.
store_addr = "127.0.0.1:2379"
# Datanode selector type.
# - "lease_based" (default value).
# - "load_based"
# For details, please see "https://docs.greptime.com/contributor-guide/meta/selector".
selector = "lease_based"
# Store data in memory, false by default.
use_memory_store = false
## Whether to enable region failover.
## This feature is only available on GreptimeDB running on cluster mode and
## - Using Remote WAL
## - Using shared storage (e.g., s3).
enable_region_failover = false

## Procedure storage options.
[procedure]

## Maximum number of retries for a procedure.
max_retry_times = 12

## Initial retry delay of procedures, increases exponentially.
retry_delay = "500ms"

# Failure detectors options.
[failure_detector]

## The threshold value used by the failure detector to determine failure conditions.
threshold = 8.0

## The minimum standard deviation of the heartbeat intervals, used to calculate acceptable variations.
min_std_deviation = "100ms"

## The acceptable pause duration between heartbeats, used to determine if a heartbeat interval is acceptable.
acceptable_heartbeat_pause = "10000ms"

## The initial estimate of the heartbeat interval used by the failure detector.
first_heartbeat_estimate = "1000ms"

## Datanode options.
[datanode]

## Datanode client options.
[datanode.client]

## Operation timeout.
timeout = "10s"

## Connect server timeout.
connect_timeout = "10s"

## `TCP_NODELAY` option for accepted connections.
tcp_nodelay = true

[wal]
# Available wal providers:
# - `raft_engine` (default): metasrv has no raft-engine WAL configuration, because it is currently only involved in remote WAL.
# - `kafka`: metasrv **must be** configured with the Kafka WAL config when datanodes use the kafka WAL provider.
provider = "raft_engine"

# Kafka wal config.

## The broker endpoints of the Kafka cluster.
broker_endpoints = ["127.0.0.1:9092"]

## Number of topics to be created upon start.
num_topics = 64

## Topic selector type.
## Available selector types:
## - `round_robin` (default)
selector_type = "round_robin"

## A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.
topic_name_prefix = "greptimedb_wal_topic"

## Expected number of replicas of each partition.
replication_factor = 1

## Timeout for topic creation; a creation operation that exceeds it is cancelled.
create_topic_timeout = "30s"

## The initial backoff for kafka clients.
backoff_init = "500ms"

## The maximum backoff for kafka clients.
backoff_max = "10s"

## Exponential backoff rate, i.e. next backoff = base * current backoff.
backoff_base = 2

## Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate.
backoff_deadline = "5mins"

# The Kafka SASL configuration.
# **It's only used when the provider is `kafka`**.
# Available SASL mechanisms:
# - `PLAIN`
# - `SCRAM-SHA-256`
# - `SCRAM-SHA-512`
# [wal.sasl]
# type = "SCRAM-SHA-512"
# username = "user_kafka"
# password = "secret"

# The Kafka TLS configuration.
# **It's only used when the provider is `kafka`**.
# [wal.tls]
# server_ca_cert_path = "/path/to/server_cert"
# client_cert_path = "/path/to/client_cert"
# client_key_path = "/path/to/key"

Key	Type	Default	Descriptions
`data_home`	String	`/tmp/metasrv/`	The working home directory.
`bind_addr`	String	`127.0.0.1:3002`	The bind address of metasrv.
`server_addr`	String	`127.0.0.1:3002`	The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
`store_addr`	String	`127.0.0.1:2379`	Etcd server address.
`selector`	String	`lease_based`	Datanode selector type. - `lease_based` (default value). - `load_based` For details, see Selector
`use_memory_store`	Bool	`false`	Store data in memory.
`enable_region_failover`	Bool	`false`	Whether to enable region failover. This feature is only available on GreptimeDB running on cluster mode and - Using Remote WAL - Using shared storage (e.g., s3).
`procedure`	--	--	Procedure storage options.
`procedure.max_retry_times`	Integer	`12`	Maximum number of retries for a procedure.
`procedure.retry_delay`	String	`500ms`	Initial retry delay of procedures, increases exponentially.
`failure_detector`	--	--	--
`failure_detector.threshold`	Float	`8.0`	The threshold value used by the failure detector to determine failure conditions.
`failure_detector.min_std_deviation`	String	`100ms`	The minimum standard deviation of the heartbeat intervals, used to calculate acceptable variations.
`failure_detector.acceptable_heartbeat_pause`	String	`10000ms`	The acceptable pause duration between heartbeats, used to determine if a heartbeat interval is acceptable.
`failure_detector.first_heartbeat_estimate`	String	`1000ms`	The initial estimate of the heartbeat interval used by the failure detector.
`datanode`	--	--	Datanode options.
`datanode.client`	--	--	Datanode client options.
`datanode.client.timeout`	String	`10s`	Operation timeout.
`datanode.client.connect_timeout`	String	`10s`	Connect server timeout.
`datanode.client.tcp_nodelay`	Bool	`true`	`TCP_NODELAY` option for accepted connections.
`wal`	--	--	--
`wal.provider`	String	`raft_engine`	--
`wal.broker_endpoints`	Array	--	The broker endpoints of the Kafka cluster.
`wal.num_topics`	Integer	`64`	Number of topics to be created upon start.
`wal.selector_type`	String	`round_robin`	Topic selector type. Available selector types: - `round_robin` (default)
`wal.topic_name_prefix`	String	`greptimedb_wal_topic`	A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.
`wal.replication_factor`	Integer	`1`	Expected number of replicas of each partition.
`wal.create_topic_timeout`	String	`30s`	Timeout for topic creation; a creation operation that exceeds it is cancelled.
`wal.backoff_init`	String	`500ms`	The initial backoff for kafka clients.
`wal.backoff_max`	String	`10s`	The maximum backoff for kafka clients.
`wal.backoff_base`	Integer	`2`	Exponential backoff rate, i.e. next backoff = base * current backoff.
`wal.backoff_deadline`	String	`5mins`	Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate.
`wal.sasl`	--	--	The Kafka SASL configuration.
`wal.sasl.type`	String	--	The SASL mechanisms, available values: `PLAIN`, `SCRAM-SHA-256`, `SCRAM-SHA-512`.
`wal.sasl.username`	String	--	The SASL username.
`wal.sasl.password`	String	--	The SASL password.
`wal.tls`	--	--	The Kafka TLS configuration.
`wal.tls.server_ca_cert_path`	String	--	The path of trusted server ca certs.
`wal.tls.client_cert_path`	String	--	The path of the client cert (used to enable mTLS).
`wal.tls.client_key_path`	String	--	The path of the client key (used to enable mTLS).

Datanode-only configuration

node_id = 42
rpc_hostname = "127.0.0.1"
rpc_addr = "127.0.0.1:3001"
rpc_runtime_size = 8

Key	Type	Description
node_id	Integer	The datanode identifier, should be unique.
rpc_hostname	String	Hostname of this node.
rpc_addr	String	gRPC server address, `"127.0.0.1:3001"` by default.
rpc_runtime_size	Integer	The number of gRPC server worker threads, 8 by default.
