Preparing for mainnet launch: a quick guide to operating Libra nodes

The author is Bison Trails, a blockchain infrastructure company and member of the Libra Association. In this article, Bison Trails shares how Libra node operators can prepare to run validators and full nodes, and gives three recommendations.

As a member of the Libra Association, Bison Trails gained in-depth experience running the first non-Calibra validator node on the Libra testnet. In this article, we detail the lessons learned in practice and offer suggestions to other validator node operators on how to optimize node performance.

Getting ready to run a validator node (simplified version)

Before we dive into the lessons we've learned, we recommend that you download and run the Libra network software. The Libra project team has published this open-source software on GitHub, with excellent documentation on the Libra developer site. The documentation is a guide to the Libra blockchain: it introduces the Move programming language and details how to build and run a validator node. We won't go into depth here, but running a simplified version of a node through Docker is as simple as checking out the source code. You can take either of two approaches:

1. Follow the instructions in the "Docker" directory on the testnet branch of Libra Core and run the network locally via Docker.

2. Use Terraform to run the network on AWS. Likewise, follow the instructions in the "Terraform" directory on the testnet branch of Libra Core.

In either case, you should use the testnet branch of the code, as it is more stable and is the branch recommended by the Libra developer documentation.

Running validator nodes with either method is relatively straightforward. We recommend that you first run locally through Docker, get familiar with the node configuration, use the docker logs command to view the logs, and see how the validator nodes discover each other. Once you're comfortable with the local environment, the Terraform deployment will launch a more realistic network of validators that communicate with each other over the Internet.

Once you have tried running the software with these methods, the recommendations below will make more sense.

Three ways to prepare for mainnet

Next, we will give three recommendations based on our experience with running Libra nodes and our previous experience with other blockchain networks.

1. Persist the blockchain

Once the Libra network launches, the ledger state will grow over time as the number of accounts increases and as validators execute transactions that produce new ledger states. The database that stores the ledger state will grow accordingly. It is important that validators and full nodes can recover quickly when the validator process restarts, for whatever reason. In the worst case a node can, in theory, always resynchronize the entire history from the genesis block, but this expensive and time-consuming synchronization can be avoided by storing the blockchain on a persistent volume.

By convention, Libra validators are usually configured to store blockchain data in the directory /opt/libra/data. You can store it elsewhere by changing the storage section of /opt/libra/etc/node.config.toml, but we recommend keeping the default location.

Figure 1. Recommended storage configuration from node.config.toml

  [storage]
  dir = "/opt/libra/data"

Regardless of which directory your node uses to store the blockchain, you need to mount a persistent volume at that location in the directory tree. When running through Docker (which we recommend), this is as simple as specifying the mount details with the --volume or --mount flag. For example, suppose you have a persistent volume of a few terabytes mounted on the host at /data, and your configuration files are available on a secure volume at /libra-config. You can then invoke Docker with those volumes as follows:

Figure 2. Persistence with volume flags

  $ docker run -v /data:/opt/libra/data -v /libra-config:/opt/libra/etc libra_e2e

In fact, the Terraform templates provided in the Libra source code use this kind of configuration to store Libra blockchain data on an EBS (Elastic Block Store) volume.
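The same two bind mounts can also be expressed with Docker's --mount flag, which spells out the mount type, source, and target explicitly. The sketch below only prints the resulting command so that it can be reviewed before use. The helper name libra_mount_cmd is hypothetical; the host paths and the image name are the ones assumed in the example above.

```shell
#!/bin/sh
# Sketch only: print a `docker run` command that uses --mount instead of -v.
# libra_mount_cmd is a hypothetical helper, not part of Libra or Docker.
libra_mount_cmd() {
    data_src="$1"      # host directory on the persistent volume
    config_src="$2"    # host directory holding the node configuration
    printf 'docker run'
    printf ' --mount type=bind,source=%s,target=/opt/libra/data' "$data_src"
    printf ' --mount type=bind,source=%s,target=/opt/libra/etc' "$config_src"
    printf ' libra_e2e\n'
}

# Print the command for review; run the printed line to actually start the node.
libra_mount_cmd /data /libra-config
```

Unlike -v, a bind mount declared with --mount makes Docker report an error if the source directory does not exist on the host, which helps catch a missing volume before the node starts writing data somewhere unintended.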

At Bison Trails, we also have a dedicated system that periodically snapshots blockchain data. If we lose a volume or a particular data center becomes unavailable (not uncommon when you run thousands of blockchain nodes around the world), we can quickly start a new node with a new volume or in a different location. Independently of these more advanced setups, the first thing we do for our own Libra validator nodes is store the blockchain directory in a persistent location.

2. Metrics and alerts

At Bison Trails, we routinely add a monitoring layer around the blockchain software we run, so that we can anticipate and take whatever action the normal evolution of the network requires, and react to unexpected events.

In the case of the Libra blockchain, the core development team has given all validators a good head start by already exposing very useful metrics through Prometheus. Prometheus is a very good time-series data solution that is becoming a gold standard among development teams for metrics, and it can also issue alerts. The best way to get a feel for these metrics is to run a validator network with the Terraform method described above when you first start running validators. As shown in the screenshot below, it provides an out-of-the-box dashboard with many key metrics for individual nodes and for the network.

Figure 3. Libra Core ships with operational metrics and a sample dashboard

Through the experience of running nodes on many networks, we have established a fairly comprehensive and rigorous approach to monitoring our nodes. We look at the metrics in three general categories:

  • System metrics such as CPU/memory/disk utilization
  • Blockchain node metrics, such as process health, node connection status, and data transfer
  • Blockchain application metrics, such as block rate, transaction rate, and validator data

Each metric we track has an alert notification, which falls roughly into one of two classes: critical and non-critical. Because the Libra mainnet has not yet launched and core development is moving fast, no one at Bison Trails is currently alerted if a validator process stops. As the launch approaches, however, we will tighten alert thresholds and severities, and we recommend that all Libra Association members running nodes monitor key performance indicators and set up alerts where appropriate.
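As a small illustration of threshold-based alerting on a system metric, the sketch below sorts disk utilization into the two alert classes described above. It is not how Bison Trails or Libra implement alerting: the function names are hypothetical and the 80 and 95 percent thresholds are arbitrary assumptions chosen for the example. In practice, rules like this would more likely live in Prometheus alerting rules than in a shell script.

```shell
#!/bin/sh
# Sketch: classify disk utilization into two alert severities.
# The 80/95 percent defaults are illustrative assumptions, not recommendations.
classify_usage() {
    used_pct="$1"
    warn_pct="${2:-80}"
    crit_pct="${3:-95}"
    if [ "$used_pct" -ge "$crit_pct" ]; then
        echo critical
    elif [ "$used_pct" -ge "$warn_pct" ]; then
        echo non-critical
    else
        echo ok
    fi
}

# Percentage of space used on the filesystem holding the given path.
# `df -P` selects the POSIX output format, where column 5 is the capacity.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example: classify the root filesystem; a node operator would pass the
# directory that holds the blockchain data instead.
classify_usage "$(disk_used_pct /)"
```

Tightening the alerts as launch approaches then amounts to lowering the two thresholds, or promoting a check from the non-critical class to the critical one.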

3. Protect your keys

Our last recommendation concerns key management for Libra nodes. First, note that operational key management for validators is still evolving, so what we describe here will not apply directly to mainnet. It is meant to give association members and other node operators ideas about key management. The following methods will certainly change as operational questions around keys, key rotation, HSMs, and other security concerns are resolved in the coming months.

The three key pairs a Libra validator currently uses are stored in two configuration files:

  • A consensus key stored in /opt/libra/etc/consensus_keypair.config.toml
  • Network identity and signing key stored in /opt/libra/etc/network_keypair.config.toml

At Bison Trails, we take a layered approach to keeping keys secure. Because the Libra validator needs to read its keys from files, we have two suggestions:

1. Restrict key file permissions: Whichever user it runs as, the validator process is the only process that needs to read these files, and no process needs to write to them, so we recommend setting the permission mode to 400, which means the owning user can read the file while no one else can read or write it.

2. Don't touch the disk: We recommend at least using a tmpfs volume for your Docker container and including bootstrap code that makes the configuration files available on that tmpfs volume.
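The two suggestions can be combined in a small bootstrap step, sketched below under stated assumptions. The function stage_keys and both directory arguments are hypothetical. The source directory stands in for wherever your bootstrap code obtains the key files, and the destination is assumed to be a memory-backed tmpfs mount, for example one created with Docker's --tmpfs flag. Only the two file names come from the list above.

```shell
#!/bin/sh
# Sketch of a bootstrap step: copy key files into a memory-backed directory
# and make them readable only by the owning user (mode 400).
# stage_keys and its two directory arguments are hypothetical.
stage_keys() (
    src_dir="$1"     # where the bootstrap code obtained the key files
    dest_dir="$2"    # assumed to be a tmpfs mount, e.g. from `docker run --tmpfs`
    umask 077        # never create the copies with group or other access
    for name in consensus_keypair.config.toml network_keypair.config.toml; do
        cp -f "$src_dir/$name" "$dest_dir/$name" || exit 1
        chmod 400 "$dest_dir/$name" || exit 1
    done
)

# Demonstration with throwaway directories standing in for the real locations.
demo_src=$(mktemp -d)
demo_dest=$(mktemp -d)
: > "$demo_src/consensus_keypair.config.toml"
: > "$demo_src/network_keypair.config.toml"
stage_keys "$demo_src" "$demo_dest" && ls -l "$demo_dest"
rm -rf "$demo_src" "$demo_dest"
```

The function body runs in a subshell, so the stricter umask does not leak into the calling script. Setting it before the copy avoids a short window in which a key file would exist with wider permissions than intended.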

If you are only testing a validator node locally, you do not need to protect the keys, but be sure to distinguish between what you do in development and what you do in production, so that you are prepared for the mainnet launch.