Code Style#
The legate source code “style” is heavily based on the Google style guide and is rigorously enforced through the use of
clang-format and other automatic linting tools.
While seemingly trivial at first, a unified code style helps to significantly increase the readability of a code-base. If all code is written in a similar style and flow, then a reader may begin to form implicit assumptions (which should hold), and does not need to adapt to new forms. In short, the point of a common style is not to enforce one that is “pretty” or “elegant”. It is first and foremost to improve the understand-ability of the code.
However, there are some areas where automatic enforcement is not yet sufficient, or not yet possible. These are discussed below. In all other cases the following rules apply:
- Some particular format is mandated by automated style-checkers. - Unless there are extenuating circumstances (such as a bug in the style-checker, or extreme loss in readability), the word of the checker is law. You should not attempt to override the checker via e.g. - // clang-format offunless it has been specifically approved by a maintainer.
- No particular format is mandated by automated style-checkers. - The author should try to stay consistent with existing practice. More often than not, a similar construct exists somewhere else in the library, which the author should apply to their own situation. The point of this rule is to make future refactoring easier to do. Instead of needing to special-case future checkers for every bespoke solution (and potentially miss cases), they can be tailored to just a handful of cases. 
Functions#
- Functions should be short and sweet. They should do one thing, and do it very well. A general rule is that if a particular function exceeds 50 lines, or 3 levels of nesting then it should be broken up into smaller pieces. 
- Functions that are used only in the current translation unit should be moved inside anonymous namespaces. 
- Functions should not take in-out parameters if this can be avoided. If possible, if a function produces some effect, it should return it as a value. 
- Do not use raw pointer arguments for input parameters. Use either smart pointers, references, or, when necessary, - std::reference_wrapper. Similarly, never use- nullptrto indicate lack of value for a pointer-like type. If a pointer argument is optional to the caller, then wrap it in a- std::optionalinstead. Legate assumes all pointers to be non-NULL.
- Functions taking out-parameters (if it cannot be avoided), should take them as a pointer, not by reference. Out-parameters must come last in the function definition. For example: - // GOOD void foo(const T& in, T value, T* out_1, T* out_2); // BAD: out-param taken by reference void foo(T& out); // BAD: out-param must come last void foo(T* out, const T& in); // BAD: all out-params must come last void foo(const T& in, T* out, T value, T* out_2); - This is to help readability at the call-site: - SomeType f; foo(f); // Does foo() modify f? Who knows? foo(&f); // OK, foo() potentially modifies f 
- Functions that only need to take a view of linear memory or containers (e.g. those taking a - const std::vector&), should always take that view as a- legate::Span<const T>. For example:- // GOOD void foo(Span<const int> values); // BAD void foo(const std::vector<int>& values); - Likewise, functions that take a - const std::string&should instead take a- std::string_view. Both- Spanand- std::string_viewshould always be taken by value.
- For functions in header files, declarations and definitions must always be separate. If the function is a template (or otherwise very very small), the definition should go in the corresponding - .inl. If not, the definition should go in the- .cc. This also applies for any member functions of classes.- This does not apply to translation unit-local functions defined in anonymous namespaces. These may be defined and declared in the same place. - Note - “Small” in this case refers to the code-gen, not the size of the source code itself. Anything that is reasonably expected to be completely optimized away is considered “small”. This is usually either: - Constructors, where everything is - std::move()-ed (which usually end up compiling away to a bunch of pointer shuffling).
- Getters returning some kind of reference (which end up compiling away to just returning a pointer). 
 - Things that are not considered “small” are: - Functions that throw exceptions. 
- Functions that allocate memory. 
- Functions that call other non-small functions. 
 - To elaborate on the rationale for this rule: the goal of defining things in an - .inlis to facilitate compiler optimizations. In the snippet below:- struct Foo { int get_bar() const { return this->bar_; } int bar_; }; int foo(const Foo& f) { return f->get_bar(); } - The compiler should be able to see - get_bar(), because with it, it sees that the code effectively reduces to:- int foo(const Foo& f) { return f.bar_; } - If, however, after unwrapping the various functions, the compiler still has to emit an indirect function call (to a function defined in the - .cc), then there is no point in having it be in the- .inl.- In this case, it is better to have it defined in the - .cc, where the compiler can inline the other function call, resulting in more efficient code overall.
Variables#
- Declarations and code should be separated by a single empty line. Separating declarations and logic helps to improve readability of the code. For example: - // GOOD std::size_t SIZE = 10; std::vector<int> y; y.reserve(SIZE); std::size_t vec_size = y.size(); // BAD: no empty lines before or after declarations std::size_t SIZE = 10; std::vector<int> y; y.reserve(SIZE); std::size_t vec_size = y.size(); 
- Use - autowhenever you already name a type. For example, when using- static_cast()(or the other casts), or when initializing variables. Additionally, use- autowhen the resulting type would be a large template or other such construct whose type may detract from the readability of the code. For example:- // GOOD auto* x = static_cast<int*>(y); auto object = get_complex_type_object(); auto sv = std::string_view{}; // BAD: we already know it will be an int* from the cast int* x = static_cast<int*>(x); // BAD: the type of the variable both is very complex, and needlessly pollutes the code std::unordered_map<std::unique_ptr<SomeType, CustomDeleter>, std::deque<Foo, CustomDeleter>> object = get_complex_type_object(); // BAD: we already know what type it is based on the constructor name std::string_view sv = std::string_view{} 
- When using - autowith pointers, they should be matched with- auto*.
- When using - autowith references, they should be matched with- auto&.
- If it is unclear whether the return value of a function returns by - constreference, value, or r-value reference, the type should be matched with- auto&&.
- Declare variables as - constwhenever possible.
- Variables should be named readably. Unfortunately, there is no hard and fast rule to follow for this, and generally speaking it relies on a programmer’s good judgment, however a few hard rules are: - Never use Hungarian typing. The compiler knows what type it is. You also know what type it is, because the function is short enough to fit on your screen (as per the short functions rule). There is no need to encode this in the name of the variable as well. 
- Never use acronyms. You might know what “ - dcv” (dimension-less color vector) means now, but nobody else does, and neither will you in 4 week’s time.
- Use commonly understood names for common things. For example, - for-loop indices should almost always be- i,- j, or- k. If something returns an iterator, it should probably be called- it, not e.g.- pos,- idx, or- finder.
- Don’t use verbose names. - iterator_into_vectoris no more informative than e.g.- it.- temporary_vectoris no better (arguably, it is worse) than just- tmp.
 
- Instead of - Type var; if (cond) { var = ...; } else { var = ...; } - Prefer returning a value from an immediate lambda: - const auto var = [&] { if (cond) { return ...; } return ...; }(); // Note, lambda is executed immediately - This has 2 main benefits: - It ensures that - varis always initialized (you’ll never forget to set it if you have a lot of if branches).
- You can make - var- const.
 
Misc#
- Use of - LEGATE_CHECK()and- LEGATE_ABORT():- LEGATE_CHECK()and- LEGATE_ABORT()will abort the program if the operand evaluates to false. As such they are to be used only when catching library mistakes. They are semantically equivalent to the standard- abort()macro, in that they enforce foundational pre-conditions or post-conditions which must never be violated in bug-free library code.- For instance, they must never be used to check user-supplied values. In these cases, legate should throw an exception that can be caught and handled by the user. For example: - void foo(int dim) { if (dim < 0) { // The user has given us a bad value, so we should handle this by exception. throw an_exception{...}; } dim = detail::fixup_dim(dim); // The internal function returned a non-positive dim. This can only happen if // there exists a bug within legate itself, in which case the library is probably // in an inconsistent state and the program cannot continue. LEGATE_CHECK(dim > 0); }