Translation: 孙 元 超 @Cdot Network
Programming is difficult.
Not because our bodies are complex in structure, but because we are all just humans. Our attention span is limited and our memories are not permanent-in other words, we often make mistakes. Computers and software are everywhere: in space, in the sky, on the ground, on our bodies, and even in our bodies. Every day, more and more systems are automated, and more and more lives depend on software and its quality, avionics, autonomous vehicles, nuclear power plants, traffic control systems, implantable cardiac pacemakers. Bugs in these systems almost always endanger human life.
There is a huge difference between "program correctness is tested by testing" and "program correctness is logically verified." Unfortunately, even if we test every line of code, we still cannot be sure that it is correct. However, having a formal system to prove that our code is correct (at least in some ways) is another matter.
"Rust as a programming language" differs not because of its fancy syntax or popular community, but because people gain confidence when using it to write programs. Rust is very strict and meticulous. The compiler checks every variable you use and every memory address referenced. It may seem to prevent you from writing efficient and expressive code, but surprisingly, the opposite is true: writing an effective and authentic Rust program is actually easier than writing a potentially dangerous program. In the latter case, you will conflict with the compiler, because almost everything you try will cause memory safety issues.
The right part of the figure above shows the issues related to concurrency and memory safety. The root cause of these problems cannot occur in regular (non-unsafe block) Rust code. So they only need to switch to Rust, and they can eliminate about half of the bugs during this time. At the same time, buffer overflows are among the most dangerous bugs because they often lead to "key leaks", denial of service, and remote code execution vulnerabilities.
The above figure also shows that the thoughts of "one person only needs to know how to write C" and "just need to leave the low-level stuff to professionals" are not enough. The Linux kernel was written by the best 5% in the industry. However, the old friend of memory bugs contributed CVE (Common Vulnerability Disclosure) year after year.
Of course, these 50 bugs are nothing compared to the millions of lines of working code in the kernel. But do you remember the life and death issues mentioned before? When we talk about critical systems, even the smallest mistake can lead to catastrophic consequences. These 50 bugs have not been mentioned yet. Who knows how much remains undiscovered? If we use Rust, we will know these issues beforehand.
How fast is Rust?
You might think: Of course, Rust may provide so many things to eliminate these hidden dangers, but how much does it cost? Generally, memory security in modern programming languages requires the overhead of the garbage collector. Concurrency issues are usually resolved by using a special synchronization primitive to lock all affected data structures and execution paths.
This is not the case for Rust. Its power comes from the clever type system, which can solve all these problems at compile time. The "Design of the Type System" prevents both memory issues and data race issues.
Just like in C ++, you can use only what you need. For example, in Rust, you only use mutex locks when you absolutely need them. And, the location that the Rust compiler will need forces you to use it, so never forget it. All of this is essentially zero cost. Since most checks are performed at compile time, the compiled assembly is not much different from an assembly generated by a C or C ++ compiler.
For this reason, Rust is now very promising in the areas of "embedded electronics", "Internet of Things", and even "operating system development", which were previously dominated by the C language because of the high control requirements, Resources and performance are also severely limited.
The latest version of Rust even introduces SIMD support in user space. Previously, it was only available in the nightly version due to API stability limitations. Now you can unleash the full potential of your hardware by using vector instructions directly or by using the "convenient wrapper library". And, even if you don't plan to do this, the compiler will still automatically vectorize loop statements and other things when possible, and in many cases, you can achieve the performance level of manually writing vector instruction code.
Why we use Rust
Parity Technologies uses Rust for the same reason. Because it allows us to write complex and high-performance software without fear. We are free to experiment because we are convinced that Rust will support us. It doesn't make a difference whether it is implementing a simple command line utility or a multi-threaded behemoth. Rust ensures that our programs are free from undefined behavior, data races, or any memory safety issues. Not to mention, Rust is "very fast", it's fun to write, easy to read, and has almost no runtime.
Memory bugs are hard to find because you can't easily write tests to catch them. If you don't find a bug during beta testing, it could be in the code for years, just like the time bomb waits for the moment to explode. Of course, there are tools like "Valgrind" that can help catch these bugs. But even Valgrind will not catch the bug if the problem occurs when it is not executed in debug mode, or if it does not appear to be a memory issue when executed.
Therefore, by using Rust, we eliminate the most complex and unpredictable type of error.
The role of the test
Of course, memory safety is only part of the problem. For example, we can write a function to sum its integer parameters, but it only returns a constant at will. Or we wrote a random number generator but generated predictable values. This behavior does not violate Rust's memory safety guarantees, but it is clearly incorrect.
This is where testing comes in. Testing allows us to check for invariants that the compiler cannot understand. Basically, we need to ensure that the corresponding test covers every result returned and every point in the program where a decision is made. In the above example, the test must check whether the function does return the sum of its arguments and whether the random number generated is sufficiently random.
In a sense, logical errors are easier to handle. By definition, they are the same areas that programmers consider when writing programs (and memory bugs are not among them).
Fortunately, we know how to deal with these bugs. Over the past few decades, programmers and computer scientists have created a set of methods and tools that we can use to reduce the number of logical errors and keep them to a minimum.
The power of mathematics
In the most rigorous and complex approach, program correctness is verified, not verified. Languages like "Iris" and "Coq" can be used to prove the correctness of the entire procedure. Rather than checking the validity of some inputs like a test, it is proved as a mathematical theorem, once and for all possible inputs and every possible scenario. Only by constructing such a proof can you gain confidence that the program is correct (as long as your specifications and understanding are correct).
Basically, Rust does the same thing, but only for some limited special problems, such as concurrency and memory safety. In fact, it uses logic to prove that your program is correct in these respects. Think about it, just by writing regular Rust code, you might have the same level of confidence as having a group of mathematicians study a theorem every time you compile a project.
Unfortunately, proving every part of the system is so complex and time-consuming that it is usually only the most critical parts of the software that are verified, such as the operating system kernel, cryptographic algorithms, and in some cases, also Standard library of languages.
For a long time, a killer feature of functional programming languages like Haskell is the ability to formally prove code, while traditional imperative programming languages have widely used shared variability, unsafe pointer operations, and uncontrollable Side effects still cannot be applied formally. But Rust can change that. Although it is an imperative language, it is already "on the way to formal proof."
"Ralf Jung et al. From the RustBelt project" has published some "papers" that prove that the basic invariants of the Rust language declaration are indeed included in some important primitives of the standard library.
The problem is that, for performance reasons, the Rust standard library contains many potentially unsafe code and raw pointer operations.
To prove the correctness of the standard library, Ralf Jung and colleagues designed a method called λrust that can use separate logic and their own calculation process without security constraints. Through this calculus, they tried to prove that the standard library primitives and containers worked as expected, and that they did not violate Rust's basic invariants. As a by-product, they even found bugs in synchronization primitives such as "MutexGuard" and "Arc".
But this work is far from complete. As the author points out:
We hope that one day we will be able to bring the same level of proof of correctness to the code written by Parity Technologies. Combined with its level of control, ability to capture memory issues and concurrency issues, Rust is becoming one of the most advanced mainstream general-purpose languages that can be successfully used to write robust, secure, and efficient programs.
- Rust as a programming language: https://www.rust-lang.org/en-US/
- Key leak: http://heartbleed.com/
- TypeSystem: https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html
- Embedded Electronics: http://blog.japaric.io/
- Internet of Things: https://www.tockos.org/
- Operating system development: https://wiki.osdev.org/Rust
- Library: https://github.com/AdamNiederer/faster
- Faster: https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/rust.html
- Iris: http://iris-project.org
- Coq: https://coq.inria.fr
- Formalizing: https://www.ralfj.de/blog/2015/10/12/formalizing-rust.html
- Ralf Jung: http://plv.mpi-sws.org/rustbelt/)
- Ralf Jung related papers: https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf
- MutexGuard: https://www.ralfj.de/blog/2017/06/09/mutexguard-sync.html
- Arc: https://www.ralfj.de/blog/2018/07/13/arc-synchronization.html