GTech Booster • Is Rust the low-level-ish

Rust is the great hope for a safe low-level programming language. This is often expressed in the motto “fearless concurrency”, but who is to say that it really is better? Perhaps it’s just as bad in different ways.

New research by Zeming Yu, Linhai Song and Yiying Zhang of Pennsylvania State University and Purdue University aims to find out whether Rust’s approach to concurrency really does protect the programmer from making the sorts of mistakes so common in C-like languages.

Rust is a language that all the cool kids are talking about, if not using. If you have missed its back story, it was invented in 2006 by Graydon Hoare, is championed and used by Mozilla and is rapidly gaining a reputation as a serious alternative to C/C++ for systems work. Rust is special because it stays close to the metal and it implements safety in ways that are low-cost in terms of performance and memory use. It uses the concept of ownership to avoid the problem of sharing variables between threads, and it enforces the use of locks. Other languages have locks too, but they are generally applied in a co-operative manner: if you want to access a resource without taking the lock that you declared you were going to use, you can.
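To make the point about enforced locks concrete, here is a minimal sketch using the standard library's Mutex. The helper name count_with_threads is made up for illustration; the point is only that the counter lives inside the Mutex, so safe code has no way to reach it without taking the lock first.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: several threads increment one shared counter.
fn count_with_threads(threads: usize, per_thread: u32) -> u32 {
    // The integer is owned by the Mutex. Safe code can only reach it
    // through lock(), so "forgetting" the lock is not an option.
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                    // The guard is dropped at the end of each iteration,
                    // which releases the lock.
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_with_threads(4, 1000));
}
```

Trying to share a plain mutable integer between the threads instead is rejected by the compiler, which is the ownership rule doing its job.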

Rust has all the low-level things you would expect but with restrictions on how they are used. If you really need to go native then you can code using, say, raw pointers, but you have to mark the code as unsafe. In principle, if you don’t write any unsafe code, or if you write unsafe code correctly, it is claimed that Rust avoids all of the race hazards inherent in other languages.
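As a rough illustration of what "marking the code as unsafe" looks like, the sketch below updates an integer through a raw pointer. The function name bump_through_raw_pointer is hypothetical. Creating the raw pointer is allowed anywhere; it is the dereference that has to sit inside an unsafe block.

```rust
// Hypothetical helper: increments an integer via a raw pointer.
fn bump_through_raw_pointer(value: &mut i32) -> i32 {
    // Turning a reference into a raw pointer is allowed in safe code.
    let p: *mut i32 = value;

    // Dereferencing it is not: the compiler can no longer prove the
    // pointer is valid, so the programmer takes responsibility here.
    unsafe {
        *p += 1;
        *p
    }
}

fn main() {
    let mut x = 41;
    println!("x is now {}", bump_through_raw_pointer(&mut x));
}
```

In this case the pointer obviously comes from a live reference, so the unsafe block is sound. The trouble starts when that reasoning is less obvious.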

Can this be true?

The researchers looked at code on GitHub to find out how Rust is used and what sort of errors actually occurred. The first observation was that there are a lot of uses of unsafe. Ironically it is Mozilla’s Servo project that uses unsafe most often. It seems that Rust is restrictive enough to force developers to turn to unsafe code to get the job done. It might be better to have a more nuanced version of unsafe that only turns off the checks that need to be turned off.

Rust programs also seem to implement concurrency differently, preferring channels and atomics to mutexes. It is suggested that this might make it more difficult to find tools that help with concurrency problems, because existing tools are more familiar with mutexes.
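For readers who have not met the channel and atomic style, here is a small sketch of what it can look like with the standard library. The function sum_via_channel and its worker scheme are invented for illustration and are not taken from the study.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

// Hypothetical helper: workers 1..=n each send their id over a channel
// and bump an atomic counter. Returns (sum of ids, messages counted).
fn sum_via_channel(workers: u32) -> (u32, usize) {
    let (tx, rx) = mpsc::channel();
    let sent = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (1..=workers)
        .map(|id| {
            let tx = tx.clone();
            let sent = Arc::clone(&sent);
            thread::spawn(move || {
                // Ownership of the value moves to the receiver.
                tx.send(id).unwrap();
                // A single atomic read-modify-write, no mutex involved.
                sent.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    // Drop the original sender so the receiving loop ends once every
    // worker's clone has been dropped.
    drop(tx);

    let total: u32 = rx.iter().sum();
    for handle in handles {
        handle.join().unwrap();
    }
    (total, sent.load(Ordering::SeqCst))
}

fn main() {
    let (total, count) = sum_via_channel(4);
    println!("sum = {}, messages = {}", total, count);
}
```

Note that forgetting the drop(tx) line would leave the receiver waiting forever, which is one way a channel can be misused to produce a hang.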

To find out about errors relating to concurrency, the team looked at bugs classified as race or deadlock bugs. It was possible to cause deadlock by misusing channels, but of the ten instances of deadlock, seven were caused by double locks and the use of implicit unlock. Of the data race errors found, five out of eight were due to unsafe code. What was surprising is that there were three cases of data races inside supposedly safe code. What could be the problem? It was found that all of these data races were caused by misuse of atomic operations, which are not checked for ownership. This apparently makes them susceptible to reordering by the CPU or the compiler. Life is complicated for software running on a modern processor.
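The double lock problem comes from the fact that a Rust lock is released implicitly, when its guard goes out of scope, rather than by an explicit unlock call. The sketch below is an illustration only, with a made-up function name. It uses try_lock in place of a second lock so that it reports the problem instead of hanging.

```rust
use std::sync::Mutex;

// Hypothetical helper. Returns whether a second lock attempt succeeds
// (a) while the first guard is still alive and (b) after it has gone.
fn second_lock_attempts() -> (bool, bool) {
    let m = Mutex::new(0);

    let while_held;
    {
        let _guard = m.lock().unwrap();
        // A plain m.lock() here would be the double lock bug: the first
        // guard is still alive, so the call could never succeed.
        // try_lock reports the failure without blocking.
        while_held = m.try_lock().is_ok();
    } // _guard goes out of scope here: this is the implicit unlock.

    let after_release = m.try_lock().is_ok();
    (while_held, after_release)
}

fn main() {
    let (while_held, after_release) = second_lock_attempts();
    println!("second lock while held: {}", while_held);
    println!("second lock after release: {}", after_release);
}
```

Because nothing in the source text says "unlock", it is easy to misjudge how long a guard lives, especially when it is a temporary inside a larger expression.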

The suggestion made by the researchers is:

“Race detection techniques are needed for Rust, and they should focus on unsafe code and atomic operations in safe code.”
Zeming Yu, Linhai Song, Yiying Zhang at Pennsylvania State University and Purdue University
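To give a feel for the kind of atomic misuse that can slip through in safe code, here is a hypothetical check-then-act pattern. It is not one of the bugs from the study, and the function names are invented. Each atomic call is safe on its own, but in claim_racy the load and the store are two separate steps, so two threads can both see false and both believe they have won. The compare_exchange version does the check and the update as one step.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Racy (hypothetical): compiles without unsafe, and works when called
// from one thread, but another thread can slip in between the load
// and the store.
fn claim_racy(flag: &AtomicBool) -> bool {
    if !flag.load(Ordering::SeqCst) {
        flag.store(true, Ordering::SeqCst);
        true
    } else {
        false
    }
}

// One possible fix: check and update in a single atomic operation.
fn claim_once(flag: &AtomicBool) -> bool {
    flag.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
}

// Counts how many threads manage to claim the flag with claim_once.
fn count_winners(threads: usize) -> usize {
    let flag = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let flag = Arc::clone(&flag);
            thread::spawn(move || claim_once(&flag))
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .filter(|won| *won)
        .count()
}

fn main() {
    let flag = AtomicBool::new(false);
    println!("single-threaded racy claim: {}", claim_racy(&flag));
    println!("winners with compare_exchange: {}", count_winners(8));
}
```

The compiler accepts both versions without complaint, which is why a dedicated race detection tool would be needed to flag the first one.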

There is more to be done and this is a very preliminary look at Rust. On the whole, it does appear that Rust delivers a higher level of safe concurrency. If you can avoid having to use unsafe code then there are only a small number of ways that you can make a mess of it. The real question is, can we avoid unsafe code? Is it really essential? Or are programmers simply not thinking in the right way? Or are they converting solutions from other languages that are simply not appropriate?

We need more analysis.