Bootstrapping the Cluster
This section explains how to start a Cluster for the first time depending on the consensus choice (
raft) made during initialization.
The first start of the Cluster is the most critical step during the lifetime. We must ensure that peers are able to contact each other (connectivity) and discard common configuration errors (like using different value for
Starting a cluster peer is as easy as running:
BUT, unlike the IPFS daemon, which by default connects to the public IPFS network and can discover other peers in it by first connecting to a well known list of available bootstrappers, a Cluster peer runs on a private network and does not have any public peer to bootstrap to.
Thus, when starting IPFS Cluster peers for the first time, it is important to provide information so that they can discover the other peers and join the Cluster. Once a peer has successfully started once, it can be subsequently re-started with the command above. During shutdown, each peer’s
peerstore file will be updated to remember known addresses for other peers.
As we will see below, the first start has slightly different requirements depending on whether you will be running a crdt-based or a raft-based Cluster. You can read more about the differences between the two in the CRDT vs Raft table.
Bootstrapping the Cluster in CRDT mode
This is the easiest option to start a cluster because the only requirement a crdt-based peer has to become part of a Cluster is to contact at least one other peer. This can be achieved in several ways:
- Pre-filling the
peerstorefile with addresses for other peers (as we saw in the previous section).
- Running with the
--bootstrap <peer-multiaddress1,peer-multiaddress2>flag. Note that using this flag will automatically trust the given peers. For more information about trust, read the CRDT section.
- In local networks with mDNS discovery support, peers will autodiscover each other and no additional measures are necessary.
Example 1. Starting the first peer in a CRDT-based Cluster:
ipfs-cluster-service daemon --consensus crdt
Example 2. Starting more peers in a CRDT-based cluster by customizing the peerstore. The given multiaddress corresponds to the first peer:
echo "/dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt" >> ~/.ipfs-cluster/peerstore ipfs-cluster-service daemon --consensus crdt
Example 3. Starting more peers in a CRDT-based cluster using the
--bootstrap flag. The given multiaddress corresponds to the first peer:
ipfs-cluster-service daemon --consensus crdt --bootstrap /dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt
Bootstrapping the Cluster in Raft mode
In Raft Clusters, the first start of a peer must not only contact a different peer, but also complete the task of becoming a member of the Raft Cluster. Therefore the first start of a peer must always use the
Example 1. Starting the first peer in a Raft-based Cluster:
ipfs-cluster-service daemon --consensus raft
Example 2. Starting more peers in a Raft-based cluster. The given multiaddress corresponds to the first peer:
ipfs-cluster-service daemon --consensus raft --bootstrap /dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt
Example 3. Subsequent starts when the peer already successfully joined a Raft cluster before:
ipfs-cluster-service daemon --consensus raft
Verifying a successful bootstrap
After starting your Cluster peers (especially the first time you are doing so), you should check that things are working correctly:
Check for errors in the logs. A successful peer start will print the “READY” message:
INFO cluster: ** IPFS Cluster is READY **
ipfs-cluster-ctl id to verify the details of your local Cluster peer. You should be able to see information for the Cluster peer and for the IPFS daemon it is connected to:
$ ipfs-cluster-ctl id QmYY1ggjoew5eFrvkenTR3F4uWqtkBkmgfJk8g9Qqcwy51 | peername | Sees 3 other peers > Addresses: - /ip4/127.0.0.1/tcp/9096/ipfs/QmYY1ggjoew5eFrvkenTR3F4uWqtkBkmgfJk8g9Qqcwy51 - /ip4/192.168.1.10/tcp/9096/ipfs/QmYY1ggjoew5eFrvkenTR3F4uWqtkBkmgfJk8g9Qqcwy51 > IPFS: QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq - /ip4/127.0.0.1/tcp/4001/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq - /ip4/127.0.0.1/tcp/4002/ws/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq - /ip6/::1/tcp/4001/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq - /ip6/::1/tcp/4002/ws/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq
- CRDT Cluster peers may take a few minutes to discover additional peers in the Cluster (depending on how they were bootstrapped), even after the READY message.
- Raft Cluster peers will fail to start after a few seconds if they have not successfully joined or re-joined the Raft Cluster. The READY message indicates that the peer is fine, even though it may show errors when contacting other peers that are down.
Here are some things that usually go wrong during the first boot. For more instructions on how to debug a cluster peer, see the Troubleshooting guide.
Different secret among Cluster peers
Using different Cluster secrets makes Cluster peers unable to communicate while causing libp2p to throw unobvious errors on the logs:
dial attempt failed: incoming message was too large
No connectivity between peers
The following error messages indicate connectivity issues. Make sure that the necessary ports are open and that connectivity can be established between peers:
dial attempt failed: context deadline exceeded dial backoff dial attempt failed: connection refused