Description
Describe the issue
I am using the guide at https://clickhouse.com/docs/en/architecture/replication. It took me 2 days to get replication working because the guide does not give clear instructions.
The section https://clickhouse.com/docs/en/architecture/replication#replication-and-sharding-configuration should include the following.
For self-hosters who run the servers and keepers on different machines, as the guide recommends, follow these steps to configure config.xml:
- On each server, run this on the command line: hostname --fqdn
- Use exactly that value for the <host> of each replica under <remote_servers>; otherwise replication will not work at all!
Example:
<replica>
    <host>dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal</host>
    <port>9000</port>
</replica>
<replica>
    <host>dev-us-east-infra.us-east4-a.c.leftoverstoday-dev.internal</host>
    <port>9000</port>
</replica>
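A quick sanity check for this can be sketched in Python. This is only a rough sketch, not part of ClickHouse: the function names replica_hosts and local_host_is_listed and the cluster name example_cluster are made up for illustration. It also assumes socket.getfqdn() returns the same value as hostname --fqdn on the machine, which may not hold everywhere, so the FQDN can be passed in explicitly instead.

```python
import socket
import xml.etree.ElementTree as ET
from typing import List, Optional

# Hypothetical <remote_servers> fragment; "example_cluster" is a made-up name.
SAMPLE = """
<remote_servers>
    <example_cluster>
        <shard>
            <replica>
                <host>dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>dev-us-east-infra.us-east4-a.c.leftoverstoday-dev.internal</host>
                <port>9000</port>
            </replica>
        </shard>
    </example_cluster>
</remote_servers>
"""


def replica_hosts(remote_servers_xml: str) -> List[str]:
    """Return every <host> value found directly under a <replica> element."""
    root = ET.fromstring(remote_servers_xml)
    return [
        (host.text or "").strip()
        for replica in root.iter("replica")
        for host in replica.findall("host")
    ]


def local_host_is_listed(remote_servers_xml: str,
                         local_fqdn: Optional[str] = None) -> bool:
    """Check whether this machine's FQDN matches a replica host exactly.

    If local_fqdn is None, fall back to socket.getfqdn() (assumed here to
    agree with the output of `hostname --fqdn`).
    """
    fqdn = local_fqdn if local_fqdn is not None else socket.getfqdn()
    return fqdn in replica_hosts(remote_servers_xml)


if __name__ == "__main__":
    print(replica_hosts(SAMPLE))
    # Full name from `hostname --fqdn` on the west server: listed.
    print(local_host_is_listed(
        SAMPLE, "dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal"))
    # Short name as returned by SELECT hostname(): not listed.
    print(local_host_is_listed(SAMPLE, "dev-us-west-infra"))
```

Running something like this on each server against its own config.xml fragment would flag a short or mistyped host name before any ON CLUSTER query gets stuck.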
Additional context
Ideally, SELECT hostname(); would also return the complete name dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal instead of just dev-us-west-infra. The short name is confusing, and it is not what system.distributed_ddl_queue uses as the initiator_host.
Please publish a self-host guide where each component is on a different VM and the values actually match what the system expects to function.
Without the above fix, you end up with this issue:
ClickHouse/ClickHouse#18341
DDLWorker: Will not execute task query-0000000005: There is no a local address in host list