Rust and the Cloneable Globals Pattern

The Rust programming language offers a sleek way to manage shared memory. Rust uses an ownership-and-borrowing scheme that accomplishes at least 2 things that language designers have been attempting since C++ (or possibly even earlier!):

  • Automatic and eager deallocation of unneeded objects without a garbage collector.
  • Compile-time verification of thread-safe access to shared memory.

However, Rust’s borrowing scheme takes some getting used to: some coding patterns that programmers use frequently won’t work (easily) in Rust.

One coding pattern is the known-evil Global Variable. A related pattern is the not-yet-understood-as-evil Singleton Pattern. Singletons provide a bit of encapsulation, but they share a common problem with global variables: they do not offer any protection against concurrent modification by other threads. Furthermore, both globals and singletons share an initialization problem: if the initialization of a global or singleton requires executing code, then the initialization must be lazy — which means that the order in which globals and singletons are initialized is frequently not obvious, and sometimes completely undefined.

But in practice, I need to use global variables or something very much like them. There are 2 things that I discovered quite quickly when I started writing my first non-trivial Rust program:

  1. I had some objects that needed to be widely accessible from many parts of the program. For example, I had an OpenGL rendering context that was needed by many functions.
  2. I needed to have a centralized mechanism for loading configuration data. Storing everything in compile-time constants doesn’t allow for run-time or user-set preferences, and configuration data needs to be cached and widely accessible (it would be unreasonable to reload this data separately in many different parts of the program).

People can be remarkably persistent at trying to adapt the same old patterns they’ve used in the past, regardless of whether those patterns are good, bad, or ugly. I briefly considered using lazy statics to solve the globals problem, but quickly rejected them as not solving the fundamental problem in a satisfactory way. As stated by one person in response to a question about global mutable singletons: “Look hard at yourself in the mirror before deciding that you want global mutable variables.”

The standard solution endorsed by Rust is to initialize things in the main function, then pass whatever’s needed to subfunctions as required. But there are a couple of big problems here:

  1. Functions’ parameter lists can get quite large if every single reference and bit of configuration data is passed as a parameter. Such an approach would visually obscure the meaning of a function call, as the “important” parameters are lost in a sea of common references.
  2. Adding a new common reference or configuration variable would cause a cascade of changes as new parameters would have to be added to all functions that need those references, all functions that call those functions, etc. It’s a code maintenance nightmare.

So what’s my solution?

Quite simply, I introduced a cloneable Globals object that is first created in the main function, and then borrowed by every function that needs access to one of its members. Essentially, this means that most functions will have a parameter that takes a Globals reference, but no other parameters will ever have to be introduced to handle common references or configuration data.

Here’s an example that creates a Globals object and passes references to it:

[Screenshot of the code listing, 2017-03-22 10.34.37; image not preserved]

This code first creates a Globals object, passing in any references to already-initialized objects that we want to be globally accessible. The globals object is then borrowed mutably (via &mut) by any function that needs access to any of globals’ members.
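For readers who want something to compile, here is a minimal sketch of that calling pattern. It is not the original listing: the fields of Display and Defaults, the draw_calls counter, and the functions draw_shape and render_scene are all hypothetical stand-ins chosen for illustration.

```rust
// Hypothetical stand-in for a type that wraps an OpenGL rendering context.
struct Display {
    title: String,
    draw_calls: u32,
}

// Hypothetical stand-in for cached configuration data.
struct Defaults {
    line_width: u32,
}

struct Globals {
    display: Display,
    defaults: Defaults,
}

impl Globals {
    // All initialization happens here, called once from main.
    fn new(display: Display) -> Globals {
        Globals {
            display,
            defaults: Defaults { line_width: 2 },
        }
    }
}

// Only the "important" parameter (the shape) plus a single Globals reference.
fn draw_shape(globals: &mut Globals, shape: &str) -> String {
    globals.display.draw_calls += 1;
    format!(
        "{} drawn on '{}' with line width {}",
        shape, globals.display.title, globals.defaults.line_width
    )
}

// Intermediate functions just hand the same borrow further down.
fn render_scene(globals: &mut Globals) -> Vec<String> {
    vec![draw_shape(globals, "circle"), draw_shape(globals, "square")]
}

fn main() {
    let display = Display {
        title: String::from("main window"),
        draw_calls: 0,
    };
    let mut globals = Globals::new(display);
    for line in render_scene(&mut globals) {
        println!("{}", line);
    }
    println!("draw calls: {}", globals.display.draw_calls);
}
```

Adding a new widely-shared value under this scheme means adding a field to Globals; the signatures of draw_shape and render_scene stay the same.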

The definition of Globals itself is the following:

[Screenshot of the code listing, 2017-03-22 10.53.46; image not preserved]

This Globals object (so far) contains references to 2 other objects: a Display object (which contains an OpenGL rendering context), and a Defaults object (which loads and caches configuration data). Notable features of this definition include:

  • A specific initialization time. All initialization that needs to be done is executed when Globals::new is called from the main function. Initialization order is clear and unambiguous.
  • clone ability. When passing a Globals object to another thread (or when solving tricky borrowing problems), the Globals object may need to be cloned. Essentially, each instance of Globals is thread-local, but may contain references to thread-shared objects. In this example, the Display object is cloned, and the Defaults object is re-created (implementation detail: every instance of Defaults re-loads configuration data on-demand from disk).
  • All global definitions are localized in a single Globals type. No project needs more than one Globals type, all global variables can be seen at a glance from the definition of the Globals structure, and adding new global variables or constants can be done without modifying any other part of the program.
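A rough sketch of a definition with these features might look like the following. It is an approximation, not the original code: the Arc-wrapped context inside Display, the cache field, the window.width key, and the hard-coded load function (standing in for a real read from disk) are all assumptions made so the example is self-contained.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for a type wrapping a rendering context.
// Cloning copies only the Arc handle, so all clones share one context.
#[derive(Clone)]
struct Display {
    context: Arc<String>,
}

// Hypothetical configuration cache: starts empty, fills on first use.
struct Defaults {
    cache: Option<HashMap<String, String>>,
}

impl Defaults {
    fn new() -> Defaults {
        Defaults { cache: None }
    }

    // Stand-in for reading a configuration file from disk.
    fn load() -> HashMap<String, String> {
        let mut map = HashMap::new();
        map.insert(String::from("window.width"), String::from("800"));
        map
    }

    // Loads the data on the first call, then answers from the cache.
    fn get(&mut self, key: &str) -> Option<String> {
        let cache = self.cache.get_or_insert_with(Defaults::load);
        cache.get(key).cloned()
    }
}

struct Globals {
    display: Display,
    defaults: Defaults,
}

impl Globals {
    // The single, explicit initialization point.
    fn new(display: Display) -> Globals {
        Globals {
            display,
            defaults: Defaults::new(),
        }
    }
}

impl Clone for Globals {
    fn clone(&self) -> Globals {
        Globals {
            // Shared object: clone the handle.
            display: self.display.clone(),
            // Re-created: the new instance reloads its data on demand.
            defaults: Defaults::new(),
        }
    }
}

fn main() {
    let display = Display {
        context: Arc::new(String::from("gl-context")),
    };
    let mut globals = Globals::new(display);
    println!("{:?}", globals.defaults.get("window.width"));

    // Each thread gets its own Globals instance.
    let mut for_worker = globals.clone();
    let handle = std::thread::spawn(move || {
        println!("worker sees context {}", for_worker.display.context);
        for_worker.defaults.get("window.width")
    });
    println!("{:?}", handle.join().unwrap());
}
```

Writing Clone by hand rather than deriving it is what lets each member choose its own strategy: share, copy, or start fresh.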

I found it useful to distinguish between 2 different types of global data:

  1. Shared mutable data. Shared mutable data should be protected by locks. Locking has a performance cost, so be sure you really need the data to be both shared and mutable before you introduce locks.
  2. Referentially transparent data. You can expect some data to be equivalent across all threads (like the Defaults objects above). This data can be safely re-computed for each instance of Globals, copied when the Globals object is cloned, or some combination of both. Re-computation can be on-demand, so that each thread only computes the information that it needs.
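To make the distinction concrete, here is a small sketch under assumed names: frames_rendered is a hypothetical piece of shared mutable data behind a lock, and gamma_table is a hypothetical piece of referentially transparent data that is simply copied into each clone. Neither comes from my actual program.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone)]
struct Globals {
    // Shared mutable data: every clone points at the same locked counter.
    frames_rendered: Arc<Mutex<u64>>,
    // Referentially transparent data: every clone owns an identical copy.
    gamma_table: Vec<u8>,
}

// Pure function of no inputs, so any thread computing it gets the same table.
fn build_gamma_table() -> Vec<u8> {
    (0..=255u32).map(|v| ((v * v) / 255) as u8).collect()
}

fn render_frames(globals: Globals, count: u64) {
    for _ in 0..count {
        // Reading the thread-local copy needs no synchronization.
        let _pixel = globals.gamma_table[128];
        // Updating the shared counter must take the lock.
        *globals.frames_rendered.lock().unwrap() += 1;
    }
}

// Spawns `workers` threads, each with its own clone of Globals,
// and returns the shared counter once they have all finished.
fn run_workers(globals: &Globals, workers: usize, frames_each: u64) -> u64 {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = globals.clone();
            thread::spawn(move || render_frames(local, frames_each))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *globals.frames_rendered.lock().unwrap();
    total
}

fn main() {
    let globals = Globals {
        frames_rendered: Arc::new(Mutex::new(0)),
        gamma_table: build_gamma_table(),
    };
    let total = run_workers(&globals, 4, 10);
    println!("frames rendered across all threads: {}", total);
}
```

Only the counter pays for a lock; the table costs one copy per clone, which is often the cheaper trade when the data never changes.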

It surprises me how much global data falls into the “referentially transparent” category. The cost of locking (and other synchronization primitives) can be non-trivial, and if shared-state mutation is not really needed, it can be faster to copy or re-compute the data. Lazy statics are a poorly-thought-out attempt to apply a design pattern that should really be deprecated: they demand mutable shared state where it is not needed, produce unnecessary synchronization overhead, can easily introduce ambiguity in initialization order, and (in extreme cases) raise the possibility of infinite loops during initialization. Global variables and singletons are very well-known design patterns. Try not to use them.

The Cloneable Globals approach I outlined here is my attempt at solving the problem of widely-shared variables and other data in a manner consistent with the thread-safe borrowing mechanisms of Rust.
