Blog

terebinth update

A little while back, I was working on writing an interpreted programming language using C++ as the medium for the interpreter implementation. The repository for that project can be found here, and it has since been converted to a public archive. I was in the middle of investigating the feasibility of modernizing the underlying C++ implementation by moving from C++17 to C++20 with modules, but that task proved Herculean. I abandoned the project for quite a while until I made a concerted effort a few days ago to get involved again. I soon realized why I had abandoned the modernization effort in the first place, and I considered ending the project altogether.

I then had the idea to just archive the original implementation of the project as-is, but instead of discarding the project entirely, I have now decided to pursue reimplementing the code in Rust. This would ensure that the foundation of the terebinth language would be memory-safe and robust. I also decided to change the way I approached the terebinth language. Instead of creating an interpreter, terebinth will now be a compiled language with the full compiler stack written in Rust.

This effort will require a lot of time and dedication, but I am willing to invest the resources necessary to complete this project. The new terebinth repository can be found on GitHub, and a release is available on crates.io. It should be noted that the release is not at all a working product; I published a rough cut just so I could reserve the package name. Subsequent releases will follow as I make progress on the compiler. I will post updates here as well, as I have done with previous projects such as open-dis-rust (which should be moving out of alpha and into beta releases soon).

I plan to clean up a lot of the loose ends that I currently have on GitHub, whether that means deleting or archiving repositories or getting in-development projects to a state where they are finalized and released. Once that has been accomplished, I will have more attention to devote to finishing open-dis-rust and terebinth. As things progress, updates will be posted here when the changes warrant it.

This project was something that I wanted to do simply out of curiosity and out of desire for a challenge. I am still very much learning the fundamentals of writing compilers and developing a good programming language, and I don’t anticipate that terebinth will ever be useful to the industry; that doesn’t matter to me, though. I see this as an opportunity for growth and development as a professional in the field of computer science, so the success of the project is not as important as the completion of it. I hope this project will serve as a stepping stone, a building block for bigger and better software in the future.

Release: opendis6 v0.1.0

In tandem with creating an Open DIS package for Rust, I have also been working on modernizing an existing C++ library implementation of the standard. The original library can be found here. The original was autogenerated using an XML parser library that read in the IEEE’s documentation and output C++ code that implemented classes for all the PDUs defined within. This code is several years old and does not adhere to modern C++ style and standards. It is also bug-prone, and it has not received much TLC since its inception. I originally opened a PR in an attempt to clean it up, but I decided that the changes needed to actually complete it and modernize it warranted its own project. With that, opendis6 and opendis7 were born. The latter library has not been implemented yet, but it will be released in the future. For now, opendis6 is available on GitHub via source or on conan.io as a Conan package.

The “6” in the library name is in reference to the sixth revision of the standard: IEEE 1278.1a-1998. This is still widely used across the industry for distributed simulations, so it is vital that it is modernized and kept tidy. As for the “7” in the opendis7 library, that refers to the seventh and latest revision: IEEE 1278.1-2012. Although it is the latest version of the standard, developers can be stubborn when it comes to managing versions of dependencies. Moving to the latest technologies or standards can be challenging, and the effort to migrate to the latest and greatest can sometimes outweigh the benefits. I will eventually get opendis7 released, but it is not much of a priority with the previous thought in mind.

I will continue to maintain opendis6 for as long as I can, but the current version should suffice for most applications. There are a few more improvements I want to add, so subsequent releases should be expected. In the meantime, I will also continue my efforts to get the Rust crate in a releasable state. As with all things, I will post updates to this website.

If you notice any bugs or issues, please report them on GitHub by opening an Issue or Bug Report. If you have changes you want to make, feel free to open a PR and tag me as a reviewer.

Pre-release: open-dis-rust v0.1.0-alpha.8

This is the second pre-release of the day! With this publication, the Logistics family of PDUs has been completed. In reviewing the README within the source tree, I realized that I had two PDU types listed that do not actually exist within the standard. I was originally using the Open DIS C++ library as a reference, and it, too, referenced PDUs that do not exist in the IEEE 1278.1 standard. With those errors removed, only 17 PDUs are left to implement. The remaining PDUs are primarily members of the Entity Management and Synthetic Environment protocol families.

Pre-release: open-dis-rust v0.1.0-alpha.7

With this pre-release version, the Minefield protocol family has been implemented. There are now 24 outstanding PDUs that need to be implemented according to the 2012 version of the IEEE 1278.1 standard. All the latest documentation regarding this pre-release is available on the crates.io page for the package.

Pre-release: open-dis-rust v0.1.0-alpha.5

The Radio Communications protocol family has been implemented in this latest pre-release version. Only 28 PDUs are unimplemented at this point, with a majority of those being within the Minefield protocol family. The next pre-release will address the Minefield PDUs and some other minor cleanup issues. As it stands now, I anticipate there being two or three more pre-release versions in the alpha stage before this moves into beta. From there, it is all bug testing and cleanup before v0.1.0 is cut and released.

As always, I will post here with major updates concerning this project.

Coding with Acne: The pImpl Pattern

Certain class implementations carry with them a lot of compile-time dependencies, and making changes to those will require recompilation of everything that uses a representation of the class. The pImpl (pointer to Implementation) pattern alleviates compile-time dependencies by moving all the implementation-level details of a class into a different class that is accessible via an opaque pointer, which is a fancy name for a pointer that specifically hides implementation details from the caller.

In what use cases would someone need to utilize this pattern? Sometimes, the inclusion of external libraries and dependencies carries unwanted baggage: for instance, the infamous Windows.h header defines macros that can pollute the global scope and introduce undesired behavior. Using the pImpl pattern for a class that must leverage Windows.h confines the include to the implementation file, so those macros do not propagate into other areas of the code.

Other use cases include ABI (application binary interface) stability for libraries. This essentially ensures that the forward-facing binary interface of a library remains intact when a behind-the-scenes implementation change occurs. The public would be none the wiser since all the implementation details for the class are hidden anyway.

The standard implementation of the pattern would look like so:

// firewall.hpp
#pragma once

#include <cstdint>
#include <memory>

class Firewall {
 public:
  /// Default constructor
  Firewall();
  /// Explicit constructor
  explicit Firewall(std::uint64_t data);
  /// Destructor; defined in firewall.cpp, where impl is a complete type
  ~Firewall();

  /// Copy constructor
  Firewall(Firewall const& other);
  /// Copy assignment operator
  Firewall& operator=(Firewall const& rhs);

  /// Simple mutator
  void SetData(std::uint64_t data);
  /// Simple accessor
  std::uint64_t GetData() const;

 private:
  /// Forward declaration of the implementation class
  class impl;
  /// Opaque pointer to implementation details
  std::unique_ptr<impl> pimpl_;
};

// firewall.cpp
#include "firewall.hpp"

/// Implementation details, visible only in this translation unit
class Firewall::impl {
 public:
  /// Constructor
  explicit impl(std::uint64_t data) : data_(data) {}

  /// Public so that Firewall's member functions can access it;
  /// an enclosing class has no special access to a nested class's privates
  std::uint64_t data_;
};

/// Firewall default ctor
Firewall::Firewall() : pimpl_(std::make_unique<impl>(0)) {}
/// Firewall explicit ctor
Firewall::Firewall(std::uint64_t data)
    : pimpl_(std::make_unique<impl>(data)) {}
/// Firewall dtor
Firewall::~Firewall() = default;

/// Copy constructor: deep-copies the implementation object
Firewall::Firewall(Firewall const& other)
    : pimpl_(std::make_unique<impl>(*other.pimpl_)) {}

/// Copy assignment operator: copies the implementation's state
/// (rhs is const, so its pointer cannot be swapped away)
Firewall& Firewall::operator=(Firewall const& rhs) {
  *pimpl_ = *rhs.pimpl_;
  return *this;
}

/// Accessor
std::uint64_t Firewall::GetData() const {
  return pimpl_->data_;
}

/// Mutator
void Firewall::SetData(std::uint64_t data) {
  pimpl_->data_ = data;
}

Wow. That is a lot of code for a simple class, especially when compared to a pattern like the singleton. The complexity of implementing the pattern alone may be daunting enough to drive some developers away. The example provided above is a very bare-bones version of the pattern; more robust (and consequently more complicated) versions exist.

Benefits

The example class provided above is named Firewall in reference to the fact that the pImpl pattern creates a “compilation firewall.” This was briefly discussed at the beginning of the article, but what this implies is that changes to the implementation-level details of the class do not require recompilation of the code that references or uses the class.

Another pro is again something else touched on previously: binary compatibility. When updating a library that uses this pattern, as long as the ABI remains the same, the end user will be able to link to the latest version of the library with no issue.
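
To make the binary compatibility point more concrete, here is a rough single-file sketch. The `Widget` class, its `Impl` struct, and the `kWidgetSize` constant are hypothetical names invented for this illustration; they are not part of the Firewall example. The idea is that the size of the public class is fixed before the implementation is even defined, so growing or shrinking the implementation later does not change the layout that callers were compiled against. The sketch assumes that `std::unique_ptr` with the default deleter is stored as a single pointer, which holds for the mainstream standard libraries but is not strictly guaranteed by the standard.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical class with the same layout idea as Firewall: one opaque pointer.
class Widget {
 public:
  Widget();
  ~Widget();
  std::uint64_t Checksum() const;

 private:
  struct Impl;                  // defined later, invisible to callers
  std::unique_ptr<Impl> impl_;  // the only data member callers ever see
};

// Everything a caller's object layout depends on is known at this point,
// even though Impl is still an incomplete type.
constexpr std::size_t kWidgetSize = sizeof(Widget);

// --- what would normally live in widget.cpp ---
struct Widget::Impl {
  std::array<std::uint64_t, 64> table{};  // adding or removing members here
  std::uint64_t seed = 42;                // does not change sizeof(Widget)
};

Widget::Widget() : impl_(std::make_unique<Impl>()) {}
Widget::~Widget() = default;

// Sums the seed and every table entry (all zero in this sketch).
std::uint64_t Widget::Checksum() const {
  std::uint64_t sum = impl_->seed;
  for (std::uint64_t entry : impl_->table) {
    sum += entry;
  }
  return sum;
}
```

In this sketch, `kWidgetSize` is computed before `Impl` has any members at all, which is exactly why a later change to `Impl` only requires relinking against the new library rather than recompiling the calling code.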

Drawbacks

The obvious issue here is a hit to runtime performance. The main benefits of this pattern are realized at compile time, and they are paid for at runtime. Hiding the class implementation details behind an opaque pointer adds a layer of indirection: every read or write of the class's state must go through the unique pointer to the implementation, and the implementation object itself is allocated on the heap. This may not affect overall runtime that much, but for time-sensitive applications, it may be reason enough to stay away.

The complexity of the code by itself is a drawback; this pattern is more involved than other C++ design patterns, and at first glance, it can be confusing to understand what is going on. Maintenance quickly becomes a factor as well. Having multiple classes that implement pImpl could become overwhelming without proper documentation and understanding of the code.

Debugging can be a nightmare as well since the class is split. Determining where an issue is occurring could be cumbersome, even with tools such as gdb. Testing might be a problem, although most testing would focus on the front-facing interface.

As with all patterns, even if implemented correctly, it can very easily be overused. This pattern is great for libraries that experience frequent updates that don’t affect their ABI; aside from that, the benefits are up to the discretion of the developer. Personally, I have worked with code that has unnecessarily implemented the pImpl pattern, and it can be overwhelming to read through and understand what the class should be doing.

If you enjoy learning about design patterns, or have suggestions for other discussions / examples, leave a comment below, and make sure to subscribe to email notifications for future articles.

The Singleton Pattern in C++

A singleton is a class that does not allow for dynamic instantiation. At runtime, exactly one instance of the class is statically initialized (in the example below, this happens the first time the accessor is called), and this instance cannot be copied or assigned. To use the object, an accessor method is provided within the class definition to allow another portion of the program to grab a reference. Here is an example class definition that uses this pattern:

// singleton.h
#pragma once

#include <string>

class Singleton {
 public:
  /// Deleted Singleton Copy Constructor
  Singleton(Singleton const&) = delete;

  /// Deleted Singleton Assignment Operator
  Singleton& operator=(Singleton const&) = delete;

  /// Static accessor method
  static Singleton& Instance() {
    static Singleton singleton;
    return singleton;
  }

  /// Simple mutator
  void SetValue(const std::string& str) {
    value_ = str;
  }

  /// Simple accessor
  std::string GetValue() const {
    return value_;
  }

 private:
  /// Privatizing the Singleton default constructor
  Singleton() = default;
  /// Dummy member field for demonstration
  std::string value_;
};

After defining a class with the singleton pattern, it can be used like so:

// main.cpp
#include <iostream>
#include "singleton.h"

int main() {
  // auto& binds a reference; plain auto would attempt a (deleted) copy
  auto& singleton = Singleton::Instance();
  singleton.SetValue("Hello World!");
  std::cout << singleton.GetValue() << std::endl;
  return 0;
}

The main.cpp file calls Singleton::Instance() which returns a reference to the statically initialized singleton object and assigns the reference to singleton. From there, we now have a reference to the static singleton object that we can use to access or mutate its member fields. Calling SetValue() allows us to pass in a std::string to mutate the singleton’s value_ member variable. Calling GetValue() returns the now-modified member variable and prints it to stdout.
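
To illustrate that every caller really does get the same object, here is a small sketch under a few assumptions: the `Config` class and the helper functions `WriteGreeting`, `ReadGreeting`, and `SameInstance` are hypothetical names made up for this example, and `Config` simply mirrors the Singleton class above. One function writes through its own reference, and another reads the value back through a reference it obtained separately.

```cpp
#include <string>

// Hypothetical minimal singleton mirroring the class above.
class Config {
 public:
  Config(Config const&) = delete;
  Config& operator=(Config const&) = delete;

  static Config& Instance() {
    static Config config;
    return config;
  }

  void SetValue(const std::string& str) { value_ = str; }
  std::string GetValue() const { return value_; }

 private:
  Config() = default;
  std::string value_;
};

// Writes through one reference...
void WriteGreeting() {
  Config& writer = Config::Instance();
  writer.SetValue("Hello World!");
}

// ...and reads through another reference, obtained separately.
std::string ReadGreeting() {
  Config& reader = Config::Instance();
  return reader.GetValue();
}

// Returns true when two calls hand back the very same object.
bool SameInstance() {
  return &Config::Instance() == &Config::Instance();
}
```

Calling `WriteGreeting()` followed by `ReadGreeting()` should hand back the string that was written, even though the two functions never share a variable; the only link between them is the static instance.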

Benefits

The singleton pattern is great for instances where multiple parts of a program need global access to a single resource. Such examples would include a global configuration service or a database management interface; there is hardly a necessity to have multiple copies of such classes, so it would be better to have one instance that is globally accessible to all portions that require its use.

This pattern can also prevent unnecessary memory allocation; by making it impossible to create more than one instance of a class that uses this pattern, it is guaranteed that memory will not have to be allocated more than once for the class (of course, this does not take into account possible memory allocations for dynamically resizable and allocatable member fields).

By ensuring one instance of a class, the program only has one interface for interacting with and accessing the resource. This alleviates having to deal with multiple interfaces or representations of a particular type; calling the Instance method for the class is enough.

Drawbacks

Shared, globally accessible resources can be great for single-threaded programs, but multi-threaded applications can quickly run into race condition issues. To counteract this, the singleton would need to be made thread-safe, but this can increase the complexity of the class design.
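
One possible way to make a singleton thread-safe is to guard its shared state with a mutex. The sketch below is only a minimal illustration, not a definitive design; the `SafeSingleton` class and the `RunWorkers` helper are hypothetical names invented for this example. Since C++11, the initialization of a function-local static is itself thread-safe, so only access to the member data needs the lock.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical thread-safe variant of the singleton shown earlier.
class SafeSingleton {
 public:
  SafeSingleton(SafeSingleton const&) = delete;
  SafeSingleton& operator=(SafeSingleton const&) = delete;

  // Function-local statics are initialized exactly once, even when
  // several threads call Instance() concurrently (guaranteed since C++11).
  static SafeSingleton& Instance() {
    static SafeSingleton instance;
    return instance;
  }

  // The lock serializes writers so that no increment is lost.
  void Increment() {
    std::lock_guard<std::mutex> lock(mutex_);
    ++counter_;
  }

  // Readers take the same lock so they never race with a writer.
  std::size_t GetCounter() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return counter_;
  }

 private:
  SafeSingleton() = default;

  mutable std::mutex mutex_;  // mutable so const readers can lock it
  std::size_t counter_ = 0;
};

// Hypothetical helper: spawns `threads` workers that each increment the
// shared counter `iterations` times, then returns how much the counter grew.
std::size_t RunWorkers(std::size_t threads, std::size_t iterations) {
  const std::size_t before = SafeSingleton::Instance().GetCounter();
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < threads; ++t) {
    workers.emplace_back([iterations] {
      for (std::size_t i = 0; i < iterations; ++i) {
        SafeSingleton::Instance().Increment();
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return SafeSingleton::Instance().GetCounter() - before;
}
```

The extra mutex member and the locking in every accessor are the added complexity mentioned above; without them, concurrent increments could be lost.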

Design patterns are great for easily replicable code, but some patterns can be overused or incorrectly used. The singleton pattern can oftentimes be overused; sometimes, a simple, standard class definition would suffice. It is critical that this pattern only be leveraged when it is deemed truly necessary.

Nonetheless, the singleton, if used correctly, can greatly improve the design and flow of a program. The benefits often outweigh the drawbacks, but as with any design pattern, it is up to the developer to analyze and determine its effectiveness and applicability to the task at hand.

Basic Enumerations

An enumeration, or enum in Rust, is a way to declare a custom data type that describes, or enumerates, possible variations of a certain category. In the example below, the data type is named Language, and it serves as the category of which all the named items within are a part. English, Spanish, etc. are all members of the Language enumeration. Each member is implicitly given an integer value corresponding with its ordinal position within the definition, starting with 0. Unless expressly overridden, the numerical value of each member is 1 greater than that of the member before it. These types of enumerations are classified as unit-only enums. Other types of enumerations are described in detail in the Advanced Enumerations section.

// Unit-only enumeration
enum Language {
  English,  // = 0
  Spanish,  // = 1
  Italian,  // = 2
  French,   // = 3
  German,   // = 4
}

// Overridden unit-only enumeration
enum Country {
  America = 4,
  Germany,  // = 5
  France,   // = 6
  Austria = 8,
}

Constants

The const keyword is similar to the let keyword: it allows the creation of an immutable variable. Constants can never be mutable, and they must always be set to a constant expression, i.e., something that has a definite value and that is not the result of a runtime calculation.

const GRAVITY: f32 = 9.801;
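const PI: f32 = 22.0 / 7.0;  // float literals are required: '22 / 7' is an integer expression and does not type-check as f32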
const mut NOT_ALLOWED: u32 = 5;  // err: the 'mut' keyword is not allowed with 'const'

Note: the f32 and u32 are type declarations for a 32-bit floating point and 32-bit unsigned integer, respectively. Type definitions are discussed in a separate post.

Mutable Variables

Making a variable mutable is simple with the mut keyword. Mutability means that the value of a variable can be overwritten. See the example below:

fn main() {
  // the `mut` keyword means that 'x' is now able to be overwritten
  let mut x = 1;
  println!("The value of x is {x}");  // prints 1 as the value of x
  x = 2;
  println!("The value of x is {x}");  // prints 2 as the value of x
}