Recently, Aria Beingessner published an interesting essay “Rust’s Unsafe Pointer Types Need An Overhaul”. There are several parts to this essay, all of
which address issues which cause me pain in Rust, and which I would love to see
fixed. I want to focus on one issue: Rust’s usize conflates
address, data, and object widths into a single static type. While this conflation is
unproblematic on many major platforms, it causes problems on others such as CHERI. In my
essay from last year on static integer types, I went into some depth on this, and you might find the motivation
therein a useful additional explainer to Aria’s essay.
The basic problem with Rust and CHERI is that capabilities –
think “an address alongside additional data” – require additional room
over traditional pointers, which contain just an address. Modern CHERI
capabilities are typically twice the width of what is often called the “native machine
word size”. However, Rust’s integer type hierarchy mandates that usize
must be wide enough to allow any pointer type to be cast
into it without information loss. That means that the obvious way of making
Rust work with CHERI is to double the width of usize so that
capabilities can be cast to a usize, but this wastes 50% of the
bits when usize is used to represent sizes or addresses alone.
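To make that cost concrete, here is a small sketch. The function names and the bit widths passed in are my own illustrative assumptions rather than part of any Rust or CHERI API; the only real library call is size_of.

```rust
use std::mem::size_of;

/// Fraction of a `usize`'s bits that carry no information when it holds
/// only an address or a size that fits in `addr_bits` bits.
/// (Hypothetical helper, purely for illustration.)
fn wasted_fraction(addr_bits: u32, usize_bits: u32) -> f64 {
    assert!(addr_bits <= usize_bits);
    f64::from(usize_bits - addr_bits) / f64::from(usize_bits)
}

/// On today's mainstream targets `usize` and a thin raw pointer have the
/// same width, which is exactly the assumption that CHERI breaks.
fn usize_matches_thin_pointer() -> bool {
    size_of::<usize>() == size_of::<*mut ()>()
}
```

With 64-bit addresses and a usize doubled to 128 bits, wasted_fraction(64, 128) is 0.5: every length, index, and offset stored as a usize would carry 64 unused bits.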
Aria then makes an intriguing suggestion for how to push Rust in the right
direction. The basic idea is to: break the link between pointers and
usize so one cannot directly cast [1] between the two; make *mut ()
the root of the integer/pointer part of the type system (in other words,
as well as being a pointer type, *mut () also serves roughly the
same purpose as C’s uintptr_t); and make the address portion of a pointer a
usize to allow bit-fiddling operations on that portion. On common
platforms (e.g. x86_64) this scheme makes everything work much as it does now, but on
systems like CHERI, pointers can now be wider than usize.
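As a rough illustration of the "address portion as a usize" idea, the following sketch emulates the two operations on a conventional platform. The free functions ptr_addr and ptr_with_addr are hypothetical stand-ins of mine (deliberately named so they cannot clash with any method of a similar name the standard library may provide), and the emulation assumes a target where a thin pointer is just an address.

```rust
/// Hypothetical stand-in for the proposed `addr` operation: the address
/// portion of a pointer as a `usize`.
fn ptr_addr<T>(p: *mut T) -> usize {
    p as usize
}

/// Hypothetical stand-in for the proposed `with_addr` operation: a pointer
/// with address `addr`, derived from `p` by pointer arithmetic so that no
/// integer is ever cast back into a pointer.
fn ptr_with_addr<T>(p: *mut T, addr: usize) -> *mut T {
    let delta = (addr as isize).wrapping_sub(p as usize as isize);
    (p as *mut u8).wrapping_offset(delta) as *mut T
}

/// Bit-fiddling on the address only: stash a flag in the low bit of an
/// aligned pointer, then strip it again before dereferencing.
fn roundtrip_with_tag() -> u32 {
    let mut x: u32 = 42;
    let p: *mut u32 = &mut x;
    let tagged = ptr_with_addr(p, ptr_addr(p) | 1);
    let untagged = ptr_with_addr(tagged, ptr_addr(tagged) & !1);
    // `untagged` has the same address as `p` and was derived from it.
    unsafe { *untagged }
}
```

The point of the design is visible in ptr_with_addr: the new pointer is always derived from an existing pointer, so on a platform like CHERI the non-address bits have somewhere to come from.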
There are several clever aspects of this proposal. Most importantly,
the proposal is pragmatic: it allows Rust to support more hardware platforms [2],
but does so in a way that doesn’t require radical changes. By
breaking the link between usize and pointers, it makes issues
around pointer provenance simpler to think about. It also provides a
plausible migration path for existing Rust code. In essence, one would first
add deprecation warnings to casts, so that code like 0x12345678 as *const ()
will produce a warning (I think one would also need to do the
same for at least transmute). This would allow people to
systematically identify many of the major parts of their code where
now-incorrect assumptions about pointers and usize are made.
A future Rust edition would then make such casts a hard error and
formally break the link between usize and pointers in the
language’s semantics. Any code opting into that edition [3] would have to respect those new semantics or be at
risk of entering the badlands of undefined behaviour.
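To give a feel for what such a migration might involve, here is a sketch of the kind of line that would start to warn, alongside one possible cast-free rewrite. The function names are mine, and the rewrite is only one option among several; I am not claiming it is the replacement the proposal would bless.

```rust
use std::ptr;

/// Today's style: an integer cast straight to a pointer. Under the
/// migration path described above, this is the kind of line that would
/// first produce a warning and later become a hard error.
fn cast_style() -> *const () {
    0x12345678usize as *const ()
}

/// One possible rewrite (an assumption on my part): start from a null
/// pointer and move it to the wanted address with pointer arithmetic.
/// The result carries an address but must never be dereferenced.
fn arithmetic_style() -> *const () {
    ptr::null::<u8>().wrapping_add(0x12345678) as *const ()
}
```

Both functions produce a pointer with the same address; the difference is that the second never claims that an arbitrary integer is a usable pointer.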
It’s important to note that the cast warnings (or errors) are not enough on their own to help make code correct: additional auditing will be required. However, at the very least, this will give programmers a good idea of where to start auditing their code [4]. Some people might choose not to adapt their code, or find it too difficult to do so, but my guess is that most people will choose to do so.
However, I don’t think Aria’s proposal is quite the end of the road for two minor, and one major, reasons.
The first minor reason is that on a platform like CHERI one needs to be able
to access the non-address parts of a pointer. CHERI’s C API provides various
accessor functions which give access to these bits (e.g. cheri_length_get returns
the number of bytes of memory a capability can access), and Rust will also need
to provide similar methods for CHERI [5]. Although
it’s a niche use case, I also think that to fully support techniques like pointer
tagging [6] the new *mut () API
needs to provide methods which give raw access to loading and storing the non-address bits
of a pointer. A problem with this is that, by definition, the number of
non-address bits is platform specific. Perhaps the easiest way is to allow the
loading / storing of an array of usize where the size of the array is a compile-time
constant, along the lines of load_non_addr(&self) -> [usize; NON_ADDR_USIZES]
[7]. On, say, x86, NON_ADDR_USIZES would
be 0; on pure capability CHERI (more on that below…) it would be 1. The
challenge for this API on CHERI is that storing all the non-address bits will
invalidate the capability, which sometimes one will be OK with, but not
always. That suggests that one might also want to expose a way of saying
something like “store as many non-address bits as possible without
invalidating the capability.” I must admit that I’m not immediately sure
what a good API for that might look like.
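Here is a minimal sketch of what the array-based accessors might look like on a conventional platform, where there are no non-address bits at all. Everything here is hypothetical: the constant NON_ADDR_USIZES, the free functions load_non_addr and store_non_addr, and the choice of free functions rather than methods are assumptions made purely so the example compiles today.

```rust
/// Number of `usize`s needed to hold a pointer's non-address bits.
/// Hypothetical constant: 0 on conventional targets such as x86; a pure
/// capability CHERI target would set it to 1.
const NON_ADDR_USIZES: usize = 0;

/// Load the non-address bits of `p`. On this (conventional) target the
/// result is a zero-length array, which costs nothing at run time.
fn load_non_addr<T>(_p: *mut T) -> [usize; NON_ADDR_USIZES] {
    [0; NON_ADDR_USIZES]
}

/// Return a copy of `p` with its non-address bits replaced. With no
/// non-address bits to replace, the pointer comes back unchanged.
fn store_non_addr<T>(p: *mut T, _non_addr: [usize; NON_ADDR_USIZES]) -> *mut T {
    p
}
```

Code written against these signatures is the same on every platform: only the constant changes, and on platforms where it is 0 the calls should compile away entirely.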
The second minor reason is that CHERI capabilities are not contiguous in
memory. For example on Morello (CHERI for Arm), addresses are 64 bits, but
capabilities are conceptually 129 bits — the “extra” bit simply confirms
whether the “main” 128 bits form a valid capability or not [8]. This validity bit is stored somewhere secret by the hardware:
you can read it (indirectly), but you can never set it. Fortunately, because the
validity bit can’t be set, it means that we don’t have to worry about pointer
tagging schemes having access to the bit, so the final sentence of the
preceding paragraph is safe. That means that all that Rust-for-CHERI would
need to add to *mut () is an is_valid() -> bool method.
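The behaviour of the validity bit can be modelled in ordinary software, which I find helps when thinking about the API. The sketch below is only a model: ModelCap, from_hardware, and the field layout are my own inventions, the encoding of the non-address bits is not modelled at all, and real hardware keeps the validity bit out of band rather than in a struct field.

```rust
/// A software model of a capability (not a real CHERI type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct ModelCap {
    addr: u64,     // the address half
    non_addr: u64, // the bounds/permissions half (encoding not modelled)
    valid: bool,   // stands in for the out-of-band validity bit
}

impl ModelCap {
    /// Stand-in for the hardware handing out a valid capability.
    fn from_hardware(addr: u64, non_addr: u64) -> Self {
        ModelCap { addr, non_addr, valid: true }
    }

    /// The validity bit can be read...
    fn is_valid(self) -> bool {
        self.valid
    }

    /// ...but never set: overwriting the non-address bits wholesale
    /// yields a capability whose validity bit is cleared.
    fn with_non_addr(self, non_addr: u64) -> Self {
        ModelCap { addr: self.addr, non_addr, valid: false }
    }
}
```

Note that the model deliberately has no method which turns an invalid capability back into a valid one.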
The major issue is, in my opinion, much more surprising. In essence, most (all?) CHERI devices allow traditional (single width) pointers to be used alongside (double width) capabilities. Conventionally a program which uses only capabilities is said to be compiled and running in “pure capability mode” while a program which uses both traditional pointers and capabilities is said to be compiled and running in “hybrid mode” [9]. Most discussion around CHERI presupposes pure capability mode, but the lesser known hybrid mode has many uses [10]. Hybrid mode does, however, mean that we can no longer assume that all pointers are capabilities.
Fortunately, I believe that Aria’s proposal can be adapted such that a Rust-for-CHERI
can cope with both pure capability and hybrid modes. In essence, one needs explicit, separate, types for
both traditional pointers and capabilities. *mut () suffices for the
former and a wrapper of some sort, e.g. Cap<*mut ()>, for the
latter. Conceptually this means that *mut () is no longer the root
of the integer/pointer type system, because one cannot convert Cap<*mut ()>
to *mut () without losing information (the
capability’s extra bits). My gut feeling is that
in practice, most code can treat *mut () as the root of the
integer/pointer type system, and only code which really cares about capabilities need
know of Cap’s existence. An additional nice property is
that one will be able to write Rust code in a way that can be agnostic about pure
capability mode (where size_of::<*mut ()>() == size_of::<Cap<*mut ()>>()) and hybrid mode (where
size_of::<*mut ()>() < size_of::<Cap<*mut ()>>()).
There are a number of different ways one might design the resulting API, and while I’ve sketched out several in my mind, I’m not going to pretend that I’ve fully thought through all the trade-offs of each. A high-level version of the API that I’m currently inclined towards looks as follows. On all platforms (CHERI and non-CHERI):
    impl<T: ?Sized> *mut T {
        // These two methods as per Aria's proposal
        fn addr(self) -> usize;
        fn with_addr(self, addr: usize) -> Self;

        // Allow access to the non-address parts of a pointer.
        // PTR_NON_ADDR_USIZES is a constant (e.g. x86_64: 0, AArch64: 0,
        // CHERI hybrid: 0, CHERI pure cap: 1)
        fn non_addr(self) -> [usize; PTR_NON_ADDR_USIZES];
        fn with_non_addr(self, non_addr: [usize; PTR_NON_ADDR_USIZES]) -> Self;
    }
On Rust-for-CHERI (both pure capability and hybrid modes) additionally an explicit wrapper for capabilities:
    // A wrapper type for capabilities, only on Rust-for-CHERI.
    #[derive(Clone, Copy)]
    struct Cap<T> {
        cap: u128, // Assuming 128 (well, 129...) bit capabilities
        phantom: PhantomData<T>
    }

    impl<T: ?Sized> Cap<*mut T> {
        // These two methods as per Aria's proposal
        fn addr(self) -> usize;
        fn with_addr(self, addr: usize) -> Self;

        // Allow access to the non-address parts of a capability.
        // CAP_NON_ADDR_USIZES is a constant (1 on most CHERI devices).
        fn non_addr(self) -> [usize; CAP_NON_ADDR_USIZES];
        fn with_non_addr(self, non_addr: [usize; CAP_NON_ADDR_USIZES]) -> Self;

        fn is_valid(self) -> bool; // Is this capability valid?
        fn bounds_len(self) -> usize; // How many bytes do the bounds span?
    }
We then need a mechanism for casting pointers to capabilities. In pure capability mode this is trivial: we add a method which is effectively a non-bit-changing transmute:
    impl<T: ?Sized> *mut T {
        fn as_cap(self) -> Cap<*mut T>;
    }
However in hybrid mode, where do the extra capability bits come from? For
example, casting from void * to void * __capability
in CHERI hybrid C takes capability provenance from the global DDC register, but you might
perhaps want to take capability provenance from a different capability. I would
prefer the user to have to specify which capability those extra bits come
from, meaning that the following method would additionally be available
in both pure capability and hybrid modes:
    impl<T: ?Sized> *mut T {
        fn as_cap_with_perms_from(self, othercap: Cap<*mut ()>) -> Cap<*mut T>;
    }
I’m not going to pretend that this API is perfect (e.g. should there be
separate Cap and CapMut wrappers? what are the
provenance implications if we cast a capability into a u128?), but
I hope it gives a rough idea of what might be possible.
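To show how the explicit "take the extra bits from this capability" conversion might behave, here is a software model that runs on any platform. It is very much a sketch under stated assumptions: ModelCap, its fields, and can_access are hypothetical; plain addresses stand in for traditional pointers; permissions are reduced to a base and a length; and bounds are checked only at access time, which simplifies what real CHERI hardware does.

```rust
/// A software model of a capability: an address plus the bounds and
/// validity that a traditional pointer lacks. Not a real CHERI type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ModelCap {
    addr: usize,
    base: usize, // lowest address the capability may access
    len: usize,  // number of bytes the bounds span
    valid: bool, // stands in for the out-of-band validity bit
}

impl ModelCap {
    fn is_valid(self) -> bool {
        self.valid
    }

    fn bounds_len(self) -> usize {
        self.len
    }

    /// Would an access of `nbytes` bytes at `addr` be permitted?
    fn can_access(self, nbytes: usize) -> bool {
        self.valid
            && self.addr >= self.base
            && nbytes <= self.len
            && self.addr - self.base <= self.len - nbytes
    }
}

/// Model of `as_cap_with_perms_from`: combine a plain address with the
/// bounds and validity of a capability the caller names explicitly.
fn as_cap_with_perms_from(addr: usize, other: ModelCap) -> ModelCap {
    ModelCap { addr, base: other.base, len: other.len, valid: other.valid }
}
```

In this model, passing a capability that plays the role of the DDC reproduces what the hybrid C cast does implicitly, while passing a narrower capability gives the resulting pointer narrower bounds. The caller always has to say which.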
There is also another implication in some of what I’ve written above, and it’s
contained in the awkward “Rust-for-CHERI” name. I think Aria’s proposal (with
the first minor tweak I suggested above) is an unqualified improvement on
existing Rust and benefits all platforms. However, the
Cap<...> wrapper is something quite specific to CHERI:
because of the (minor but noticeable)
complexity it adds to the language, I’m not sure every Rust programmer should
be forced to know of its existence. That implies that there might be a
fork or conditionally compiled version of rustc that
exposes this feature only when producing a compiler for CHERI’s hybrid mode.
There is, though, a disadvantage to this: library authors who want to do
something clever for CHERI’s hybrid mode will also have to conditionally
compile all references to Cap<...>. Exactly what the right
trade-off would be I’m unsure.
In summary, I think that Aria’s proposal will make Rust a better target for a variety of platforms, including CHERI. With some additional tweaks, it can support both of CHERI’s modes. My suggestions above are one possible way of doing so, but there are definitely other, possibly better, ways of achieving the same outcome.
Acknowledgements: Thanks to Jacob Bramley for many ideas and comments. Any errors and infelicities are my own.
Footnotes
I think as-style casting in Rust is unfortunate and I’d prefer it was
removed entirely, but that’s perhaps a more general point!
Altering Rust to support the same variety of platforms that C does is, I believe, infeasible: it would require such radical changes to the language that they would stand no chance of being accepted. As I said in the static integer types essay, my belief is that unless one thinks about these things early in a language’s design, it’s not possible to fully fix them later without breaking too much code in too many subtle ways.
I can see reasonable arguments for making the cast warnings part of an existing edition (so the “full” change can happen in the next edition), or part of a future edition (so the “full” change takes at least two editions), but I consider this a minor issue either way.
To some extent I’m feeling slightly smug about this, as I’ve long believed that
Rust’s usize assumptions were unfortunate: I’ve embedded
lots of static and dynamic assertions in code I’ve written in case this
assumption is changed. However, I’m feeling slightly less smug than I’d like,
because my experience is that the sort of code that embeds such assumptions is
invariably much more fiddly than average, with extensive use of
unsafe. I guarantee that I’ve missed some assertions and that I’ll
make some subtle mistakes when adapting it.
My experience is that CHERI’s current standard C API is rather confusing for newcomers: it might be possible to tidy this up a bit in a Rust context.
One doesn’t necessarily have to actually tag a pointer, but the term “pointer tagging” is so commonly used that there’s little point fighting against it.
Using this approach, it would also be easy to allow access to all the bits of a pointer (address and non-address) in one go.
Unfortunately CHERI calls this the “tag” bit, which confuses people in the context of “pointer tagging”. I’m hoping the community will call it the “validity” bit (or something similar).
I’m choosing my terms as carefully as I can because, in a sense, “pure capability” and “hybrid” modes are a fiction. They have implications for things like the size of pointer and integer types, which is why there are two CHERI C compilers, one for “purecap” and another for “hybrid”.
Different CHERI processors might then run programs in special or generic modes. For example, Morello can run in either: A64 mode, with a mostly traditional Arm instruction set, and normal width pointers; or C64 mode, with a CHERIfied instruction set, and double width capabilities. Not only can you switch the entire processor from A64 to C64 mode on-the-fly, but in A64 mode you can access double width capabilities alongside normal width pointers (e.g. A64 mode not only allows access to traditional 64-bit registers but also the processor’s 129-bit capability registers), and in C64 mode you can access normal pointers. Morello’s two modes have different trade-offs, but as this might suggest, the distinction between pure capability and hybrid modes is more fluid than it may at first seem.
The “obvious” use case is to make it possible to run existing systems on a CHERI system without fully porting them to full capabilities. Personally I think a more important use case is that you can use special registers (e.g. the DDC and PCC) to create sub-process like compartments on code running in non-capability mode: this concept has only partly been explored in existing work, and I think there’s going to be a lot of mileage in exploring it further.
In the context of Rust, it’s also worth asking whether it’s worth making all pointers double width (which, though perhaps small, will undoubtedly have measurable memory and performance costs). After all, most Rust code is safe, and the compiler can guarantee that pointers can’t be easily misused for things like buffer overruns. Rather, I see the main utility for a language like Rust being to impose various “sub-process” like compartments, with only relatively small portions of the code needing to use capabilities explicitly.