Making Rust a Better Fit for CHERI and Other Platforms

Recently, Aria Beingessner published an interesting essay “Rust’s Unsafe Pointer Types Need An Overhaul”. There are several parts to this essay, all of which address issues which cause me pain in Rust, and which I would love to see fixed. I want to focus on one issue: Rust’s usize conflates address, data, and object widths into a single static type. While this conflation is unproblematic on many major platforms, it causes problems on others such as CHERI. In my essay from last year on static integer types, I went into some depth on this, and you might find the motivation therein a useful additional explainer to Aria’s essay.

The basic problem with Rust and CHERI is that capabilities – think “an address alongside additional data” – require additional room over traditional pointers, which contain just an address. Modern CHERI capabilities are typically twice the width of what is often called the “native machine word size”. However, Rust’s integer type hierarchy mandates that usize must be wide enough to allow any pointer type to be cast into it without information loss. That means that the obvious way of making Rust work with CHERI is to double the width of usize so that capabilities can be cast to a usize — but this wastes 50% of bits when usize is used to represent sizes or addresses alone.
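To make that cost concrete, here is a minimal sketch of the arithmetic. The function names and the bit widths passed in are illustrative assumptions, not part of any Rust or CHERI API.

```rust
use std::mem::size_of;

/// Fraction of a `usize`'s bits that carry no information when it holds a
/// plain address or size, given the width `usize` would have on the target.
fn wasted_fraction(addr_bits: u32, usize_bits: u32) -> f64 {
    assert!(addr_bits <= usize_bits);
    f64::from(usize_bits - addr_bits) / f64::from(usize_bits)
}

/// On today's mainstream targets, `usize` and a thin raw pointer have the
/// same width, which is why the conflation usually goes unnoticed.
fn usize_is_pointer_width() -> bool {
    size_of::<usize>() == size_of::<*const ()>()
}
```

With 64-bit addresses and a hypothetical 128-bit usize, wasted_fraction(64, 128) evaluates to 0.5, which is the 50% figure above.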

Aria then makes an intriguing suggestion for how to push Rust in the right direction. The basic idea is to: break the link between pointers and usize so one cannot directly cast [1] between the two; make *mut () the root of the integer/pointer part of the type system (in other words, as well as being a pointer type, *mut () also serves roughly the same purpose as C’s uintptr_t); and make the address portion of a pointer a usize to allow bit-fiddling operations on that portion. On common platforms (e.g. x86_64) this scheme makes everything work much as it does now, but on systems like CHERI, pointers can now be wider than usize.

There are several clever aspects of this proposal. Most importantly, the proposal is pragmatic: it allows Rust to support more hardware platforms [2], but does so in a way that doesn’t require radical changes. By breaking the link between usize and pointers, it makes issues around pointer provenance simpler to think about. It also provides a plausible migration path for existing Rust code. In essence, one would first add deprecation warnings to casts, so that code like 0x12345678 as *const () becomes a warning (I think one would also need to do the same for at least transmute). This would allow people to systematically identify many of the major parts of their code where now-incorrect assumptions about pointers and usize are made. A future Rust edition would then make such casts a hard error and formally break the link between usize and pointers in the language’s semantics. Any code opting into that edition [3] would have to respect those new semantics or be at risk of entering the badlands of undefined behaviour.
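As a rough illustration of what such warnings would catch, the sketch below collects the kinds of conversion that compile silently today on mainstream targets. The function names are hypothetical, and the transmute variant only compiles where usize and *const () have the same size.

```rust
use std::mem::transmute;

/// Integer-to-pointer cast: the kind of code the proposal would first warn
/// about and later reject.
fn int_to_ptr(addr: usize) -> *const () {
    addr as *const ()
}

/// Pointer-to-integer cast: keeps the address but discards anything else the
/// pointer carried (on CHERI, the rest of the capability).
fn ptr_to_int(ptr: *const ()) -> usize {
    ptr as usize
}

/// The same conversion spelled with `transmute`, which would need similar
/// treatment. Creating the pointer is fine; dereferencing it would not be.
fn int_to_ptr_transmute(addr: usize) -> *const () {
    unsafe { transmute::<usize, *const ()>(addr) }
}
```

On a platform where pointers are wider than usize, none of these round trips can preserve the whole pointer, so each one marks a place to start auditing.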

It’s important to note that the cast warnings (or errors) are not enough on their own to help make code correct: additional auditing will be required. However, at the very least, this will give programmers a good idea of where to start auditing their code [4]. Some people might choose not to adapt their code, or find it too difficult to do so, but my guess is that most people will choose to do so.

However, I don’t think Aria’s proposal is quite the end of the road for two minor, and one major, reasons.

The first minor reason is that on a platform like CHERI one needs to be able to access the non-address parts of a pointer. CHERI’s C API provides various accessor functions which give access to these bits (e.g. cheri_length_get returns the number of bytes of memory a capability can access), and Rust will also need to provide similar methods for CHERI [5]. Although it’s a niche use case, I also think that to fully support techniques like pointer tagging [6] the new *mut () API needs to provide methods which give raw access to loading and storing the non-address bits of a pointer. A problem with this is that, by definition, the number of non-address bits is platform specific. Perhaps the easiest way is to allow the loading / storing of an array of usize where the size of the array is a compile-time constant along the lines of load_non_addr(&self) -> [usize; NON_ADDR_USIZES] [7]. On, say, x86, NON_ADDR_USIZES would be 0; on pure capability CHERI (more on that below…) it would be 1. The challenge for this API on CHERI is that storing all the non-address bits will invalidate the capability, which sometimes one will be OK with, but not always. That suggests that one might also want to expose a way of saying something like “store as many non-address bits as possible without invalidating the capability.” I must admit that I’m not immediately sure what a good API for that might look like.

The second minor reason is that CHERI capabilities are not contiguous in memory. For example on Morello (CHERI for Arm), addresses are 64 bits, but capabilities are conceptually 129 bits — the “extra” bit simply confirms whether the “main” 128 bits form a valid capability or not [8]. This validity bit is stored somewhere secret by the hardware: you can read it (indirectly), but you can never set it. Fortunately, because the validity bit can’t be set, it means that we don’t have to worry about pointer tagging schemes having access to the bit, so the final sentence of the preceding paragraph is safe. That means that all that Rust-for-CHERI would need to add to *mut () is a is_valid() -> bool method.
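The "readable but never settable" property can be modelled in ordinary Rust. This is a toy sketch under stated assumptions: TaggedCap, from_hardware and with_raw_bits are hypothetical names, and a private bool stands in for the out-of-band validity bit that real hardware keeps.

```rust
/// Toy model of a capability: 128 "main" bits plus an out-of-band validity
/// bit that callers can observe but never set.
#[derive(Clone, Copy, Debug)]
struct TaggedCap {
    bits: u128,
    // Private on purpose: there is no setter for this field.
    valid: bool,
}

impl TaggedCap {
    /// Stands in for the hardware or OS handing out a valid capability.
    fn from_hardware(bits: u128) -> Self {
        TaggedCap { bits, valid: true }
    }

    /// Read-only view of the validity bit.
    fn is_valid(&self) -> bool {
        self.valid
    }

    /// The 128 in-memory bits.
    fn bits(&self) -> u128 {
        self.bits
    }

    /// Overwriting the raw bits always yields an invalid capability in this
    /// model, mirroring the invalidation described above.
    fn with_raw_bits(self, bits: u128) -> Self {
        TaggedCap { bits, valid: false }
    }
}
```

Because no method can turn valid back to true, a pointer tagging scheme built on with_raw_bits can never forge a valid capability in this model.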

The major issue is, in my opinion, much more surprising. In essence, most (all?) CHERI devices allow traditional (single width) pointers to be used alongside (double width) capabilities. Conventionally a program which uses only capabilities is said to be compiled and running in “pure capability mode” while a program which uses both traditional pointers and capabilities is said to be compiled and running in “hybrid mode” [9]. Most discussion around CHERI presupposes pure capability mode, but the lesser known hybrid mode has many uses [10]. Hybrid mode does, however, mean that we can no longer assume that all pointers are capabilities.

Fortunately, I believe that Aria’s proposal can be adapted such that a Rust-for-CHERI can cope with both pure capability and hybrid modes. In essence, one needs explicit, separate, types for both traditional pointers and capabilities. *mut () suffices for the former and a wrapper of some sort, e.g. Cap<*mut ()> for the latter. Conceptually this means that *mut () is no longer the root of the integer/pointer type system, because one cannot convert Cap<*mut ()> to *mut () without losing information (the capability’s extra bits). My gut feeling is that in practice, most code can treat *mut () as the root of the integer/pointer type system, and only code which really cares about capabilities need know of Cap’s existence. An additional nice property is that one will be able to write Rust code in a way that can be agnostic about pure capability mode (where size_of::<*mut ()>() == size_of::<Cap<*mut ()>>()) and hybrid mode (where size_of::<*mut ()>() < size_of::<Cap<*mut ()>>()).
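The size relationship between the two modes can be written down as a small model. The byte counts here assume 64-bit addresses and 128-bit in-memory capabilities (as on Morello), and Mode, ptr_bytes and cap_bytes are hypothetical names standing in for what size_of would report.

```rust
/// Which CHERI compilation mode a program targets (modelled, not queried).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mode {
    PureCap,
    Hybrid,
}

const ADDR_BYTES: usize = 8; // assumed: 64-bit addresses
const CAP_BYTES: usize = 16; // assumed: 128 in-memory capability bits

/// Models `size_of::<*mut ()>()` in each mode.
fn ptr_bytes(mode: Mode) -> usize {
    match mode {
        Mode::PureCap => CAP_BYTES,
        Mode::Hybrid => ADDR_BYTES,
    }
}

/// Models `size_of::<Cap<*mut ()>>()`, which is the same in both modes.
fn cap_bytes(_mode: Mode) -> usize {
    CAP_BYTES
}

/// Bytes of information dropped when a capability is narrowed to a pointer.
fn bytes_lost_narrowing(mode: Mode) -> usize {
    cap_bytes(mode) - ptr_bytes(mode)
}
```

Code that only ever asks for ptr_bytes and cap_bytes, rather than hard-coding either, is agnostic about the mode in the sense described above.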

There are a number of different ways one might design the resulting API, and while I’ve sketched out several in my mind, I’m not going to pretend that I’ve fully thought through all the trade-offs of each. A high-level version of the API that I’m currently inclined towards looks as follows. On all platforms (CHERI and non-CHERI):

impl<T: ?Sized> *mut T {
  // These two methods as per Aria's proposal
  fn addr(self) -> usize;
  fn with_addr(self, addr: usize) -> Self;

  // Allow access to the non-address parts of a pointer.
  // PTR_NON_ADDR_USIZES is a constant (e.g. x86_64: 0, AArch64: 0,
  //   CHERI hybrid: 0, CHERI pure cap: 1)
  fn non_addr(self) -> [usize; PTR_NON_ADDR_USIZES];
  fn with_non_addr(self, non_addr: [usize; PTR_NON_ADDR_USIZES]) -> Self;
}
}
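Since methods cannot be added to raw pointer types outside the standard library, the shape of this API can only be tried out on a stand-in type. The following is a minimal sketch: ModelPtr is hypothetical, and its const generic parameter N plays the role of PTR_NON_ADDR_USIZES.

```rust
/// Toy model of a pointer: an address plus `N` words of non-address bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ModelPtr<const N: usize> {
    addr: usize,
    non_addr: [usize; N],
}

impl<const N: usize> ModelPtr<N> {
    fn new(addr: usize, non_addr: [usize; N]) -> Self {
        ModelPtr { addr, non_addr }
    }

    /// The address portion only.
    fn addr(self) -> usize {
        self.addr
    }

    /// Same non-address bits, new address.
    fn with_addr(self, addr: usize) -> Self {
        ModelPtr { addr, ..self }
    }

    /// The non-address portion, as a platform-sized array.
    fn non_addr(self) -> [usize; N] {
        self.non_addr
    }

    /// Same address, new non-address bits.
    fn with_non_addr(self, non_addr: [usize; N]) -> Self {
        ModelPtr { non_addr, ..self }
    }
}

// Stand-ins for two platforms: no extra words on x86_64, one on pure cap CHERI.
type X86Ptr = ModelPtr<0>;
type PureCapPtr = ModelPtr<1>;
```

On the X86Ptr stand-in, non_addr returns a zero-length array, so generic code can call it unconditionally and simply finds nothing to inspect.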

On Rust-for-CHERI (both pure capability and hybrid modes) additionally an explicit wrapper for capabilities:

use std::marker::PhantomData;

// A wrapper type for capabilities, only on Rust-for-CHERI.
#[derive(Clone, Copy)]
struct Cap<T> {
  cap: u128, // Assuming 128 (well, 129...) bit capabilities
  phantom: PhantomData<T>
}

impl<T: ?Sized> Cap<*mut T> {
  // These two methods as per Aria's proposal
  fn addr(self) -> usize;
  fn with_addr(self, addr: usize) -> Self;

  // Allow access to the non-address parts of a capability.
  // CAP_NON_ADDR_USIZES is a constant (1 on most CHERI devices).
  fn non_addr(self) -> [usize; CAP_NON_ADDR_USIZES];
  fn with_non_addr(self, non_addr: [usize; CAP_NON_ADDR_USIZES]) -> Self;

  fn is_valid(self) -> bool; // Is this capability valid?
  fn bounds_len(self) -> usize; // How many bytes do the bounds span?
}

We then need a mechanism for casting pointers to capabilities. In pure capability mode this is trivial: we add a method which is effectively a non-bit-changing transmute:

impl<T: ?Sized> *mut T {
  fn as_cap(self) -> Cap<*mut T>;
}

However, in hybrid mode, where do the extra capability bits come from? For example, casting from void * to void * __capability in CHERI hybrid C takes capability provenance from the global DDC register, but you might perhaps want to take capability provenance from a different capability. I would prefer the user to have to specify which capability those extra bits come from, meaning that the following method would additionally be available in both pure capability and hybrid modes:

impl<T: ?Sized> *mut T {
  fn as_cap_with_perms_from(self, othercap: Cap<*mut ()>) -> Cap<*mut T>;
}
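To show what "taking the extra bits from another capability" might mean, here is a toy model in plain Rust. PlainPtr and BoundedCap are hypothetical stand-ins for a hybrid-mode pointer and for Cap, and the bounds and permissions fields are a deliberate simplification of a real capability's metadata.

```rust
/// Toy capability: an address plus the bounds and permissions that travel
/// with it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BoundedCap {
    addr: usize,
    base: usize,
    len: usize,
    perms: u32,
}

impl BoundedCap {
    fn addr(&self) -> usize {
        self.addr
    }

    /// How many bytes do the bounds span?
    fn bounds_len(&self) -> usize {
        self.len
    }

    fn perms(&self) -> u32 {
        self.perms
    }

    /// Does the address fall inside the bounds?
    fn in_bounds(&self) -> bool {
        self.addr >= self.base && self.addr - self.base < self.len
    }
}

/// Toy hybrid-mode pointer: just an address.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PlainPtr {
    addr: usize,
}

impl PlainPtr {
    /// Keep this pointer's address; take bounds and permissions from `other`.
    fn as_cap_with_perms_from(self, other: BoundedCap) -> BoundedCap {
        BoundedCap { addr: self.addr, ..other }
    }
}
```

In this model the caller always names the capability that supplies the extra bits, and an address outside that capability's bounds is visible immediately via in_bounds.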

I’m not going to pretend that this API is perfect (e.g. should there be separate Cap and CapMut wrappers? what are the provenance implications if we cast a capability into a u128?), but I hope it gives a rough idea of what might be possible.

There is also another implication in some of what I’ve written above, and it’s contained in the awkward “Rust-for-CHERI” name. I think Aria’s proposal (with the first minor tweak I suggested above) is an unqualified improvement on existing Rust and benefits all platforms. However, the Cap<...> wrapper is something quite specific to CHERI: because of the (minor but noticeable) complexity it adds to the language, I’m not sure every Rust programmer should be forced to know of its existence. That implies that there might be a fork or conditionally compiled version of rustc that exposes this feature only when producing a compiler for CHERI’s hybrid mode. There is, though, a disadvantage to this: library authors who want to do something clever for CHERI’s hybrid mode will also have to conditionally compile all references to Cap<...>. Exactly what the right trade-off would be I’m unsure.

In summary, I think that Aria’s proposal will make Rust a better target for a variety of platforms, including CHERI. With some additional tweaks, it can support both of CHERI’s modes. My suggestions above are one possible way of doing so, but there are definitely other, possibly better, ways of achieving the same outcome.

Acknowledgements: Thanks to Jacob Bramley for many ideas and comments. Any errors and infelicities are my own.

2022-04-13 08:00

Footnotes

[1]

I think as-style casting in Rust is unfortunate and I’d prefer it was removed entirely, but that’s perhaps a more general point!

[2]

Altering Rust to support the same variety of platforms that C does is, I believe, infeasible: it would require such radical changes to the language that they would stand no chance of being accepted. As I said in the static integer types essay, my belief is that unless one thinks about these things early in a language’s design, it’s not possible to fully fix them later without breaking too much code in too many subtle ways.

[3]

I can see reasonable arguments for making the cast warnings part of an existing edition (so the “full” change can happen in the next edition), or part of a future edition (so the “full” change takes at least two editions), but I consider this a minor issue either way.

[4]

To some extent I’m feeling slightly smug about this, as I’ve long believed that Rust’s usize assumptions were unfortunate: I’ve embedded lots of static and dynamic assertions in code I’ve written in case these assumptions change. However, I’m feeling slightly less smug than I’d like, because my experience is that the sort of code that embeds such assumptions is invariably much more fiddly than average, with extensive use of unsafe. I guarantee that I’ve missed some assertions and that I’ll make some subtle mistakes when adapting it.

[5]

My experience is that CHERI’s current standard C API is rather confusing for newcomers: it might be possible to tidy this up a bit in a Rust context.

[6]

One doesn’t necessarily have to actually tag a pointer, but the term “pointer tagging” is so commonly used that there’s little point fighting against it.

[7]

Using this approach, it would also be easy to allow access to all the bits of a pointer (address and non-address) in one go.

[8]

Unfortunately CHERI calls this the “tag” bit, which confuses people in the context of “pointer tagging”. I’m hoping the community will call it the “validity” bit (or something similar).

[9]

I’m choosing my terms as carefully as I can because, in a sense, “pure capability” and “hybrid” modes are a fiction. They have implications for things like the size of pointer and integer types, which is why there are two CHERI C compilers, one for “purecap” and another for “hybrid”.

Different CHERI processors might then run programs in special or generic modes. For example, Morello can run in either: A64 mode, with a mostly traditional Arm instruction set, and normal width pointers; or C64 mode, with a CHERIfied instruction set, and double width capabilities. Not only can you switch the entire processor from A64 to C64 mode on-the-fly, but in A64 mode you can access double width capabilities alongside normal width pointers (e.g. A64 mode not only allows access to traditional 64-bit registers but also the processor’s 129-bit capability registers), and in C64 mode you can access normal pointers. Morello’s two modes have different trade-offs, but as this might suggest, the distinction between pure capability and hybrid modes is more fluid than it may at first seem.

[10]

The “obvious” use case is to make it possible to run existing systems on a CHERI system without fully porting them to full capabilities. Personally I think a more important use case is that you can use special registers (e.g. the DDC and PCC) to create sub-process like compartments on code running in non-capability mode: this concept has only partly been explored in existing work, and I think there’s going to be a lot of mileage in exploring it further.

In the context of Rust, it’s also worth asking whether it’s worth making all pointers double width (which will undoubtedly have measurable, though perhaps small, memory and performance costs). After all, most Rust code is safe, and the compiler can guarantee that pointers can’t be easily misused for things like buffer overruns. Rather, I see the main utility for a language like Rust being to impose various “sub-process” like compartments, with only relatively small portions of the code needing to use capabilities explicitly.
