Fast Customization of Log Messages

arookas

Treating an integer as its individual bits is useful for manipulating a vector of booleans, a trick known as bit flags or bit fields.

Generally when this is done, the integer is used solely for this purpose and bit fields are masked out, checked, and updated individually. However, there are still uses for reinterpreting the integer as its whole numerical value. I will go over a practical example.
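As a quick refresher, the usual per-bit operations can be sketched as follows. The names Option, HasOption, SetOption, and ClearOption are hypothetical and exist only for this illustration. The final static_assert reads the same integer as one whole number, which is the view the rest of this post relies on.

```cpp
#include <cstdint>

// Hypothetical flag names, used only for this illustration.
enum Option : std::uint32_t {
  OptionA = 1u << 0,  // binary 001
  OptionB = 1u << 1,  // binary 010
  OptionC = 1u << 2,  // binary 100
};

// Check a single flag by masking it out.
constexpr bool HasOption(std::uint32_t bits, Option option) {
  return (bits & option) != 0;
}

// Return a copy of bits with one flag set.
constexpr std::uint32_t SetOption(std::uint32_t bits, Option option) {
  return bits | option;
}

// Return a copy of bits with one flag cleared.
constexpr std::uint32_t ClearOption(std::uint32_t bits, Option option) {
  return bits & ~static_cast<std::uint32_t>(option);
}

// The same integer read as a whole number: OptionA and OptionC set is 5.
static_assert(SetOption(SetOption(0u, OptionA), OptionC) == 5u,
              "binary 101 is decimal 5");
```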

Premise

One might have a logging API with options to customize the output: flags to include source line, file, severity, etc. Different configurations might look like:

// file, line, message
"parse.cpp(8): null argument\n"
// severity, message
"warning: implicit cast is lossy\n"
// file, severity, code, message
"math.cpp: error 4015: division by zero\n"

It is useful to list out the settings we want. Here are the ones I will use:

Bit Usage
00001 File name
00010 Line number
00100 Row number
01000 Severity
10000 Warning/error code

We will try to implement this flexible logging schema ourselves, with the settings being a single unsigned integer whose bits are treated as the flags listed above. Imagine a report structure with the incoming data and corresponding flag enumeration:

struct Report {
  std::string Message;
  std::string FileName;
  std::string Severity;
  int         LineNumber;
  int         RowNumber;
  int         Code;
  unsigned    Flags;
};

enum Flag {
  FileName   = 0x01,
  LineNumber = 0x02,
  RowNumber  = 0x04,
  Severity   = 0x08,
  Code       = 0x10,
};

First Attempt

An ad hoc implementation might build the message piece by piece with C++'s std::stringstream or std::cout, or with C#'s StringBuilder class. Control statements and conditions decide which segments of the final message are emitted.

Start with the first setting listed, file name:

if (report.Flags & Flag::FileName) {
  std::cout << report.FileName;
  std::cout << ": ";
}

std::cout << report.Message << "\n";

// possible output:
// "foo.cpp: hello world\n"

Looks alright. Now, try to integrate line/row numbers:

if (report.Flags & Flag::FileName) {
  std::cout << report.FileName;
  std::cout << ": ";
}

std::cout << "(";

if (report.Flags & Flag::LineNumber) {
  std::cout << report.LineNumber;
  std::cout << ",";
}

if (report.Flags & Flag::RowNumber) {
  std::cout << report.RowNumber;
}

std::cout << "): ";
std::cout << report.Message << "\n";

// possible output:
// "foo.cpp: (0,0): hello world\n"

Obviously, this implementation is starting to run into some issues.

  1. There is an extra colon between the file name and parenthesis.
  2. If there is a line number but no row number, there will be a trailing comma.
  3. If there is no line or row number, empty parentheses will be printed.

Problem #3 hints at a larger issue of missing or optional data. This will be addressed gracefully later. For now, we will focus on the rising complexity for each log message to support the custom schema.
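To make that rising complexity concrete, here is one possible way to patch problems 1 to 3 by hand. This is only a sketch: FormatAdHoc is a hypothetical helper, it returns a string instead of writing to std::cout so it is easier to check, and it covers only the file, line, and row flags. Severity and code would add more branches of the same kind.

```cpp
#include <sstream>
#include <string>

struct Report {
  std::string Message;
  std::string FileName;
  std::string Severity;
  int         LineNumber;
  int         RowNumber;
  int         Code;
  unsigned    Flags;
};

enum Flag {
  FileName   = 0x01,
  LineNumber = 0x02,
  RowNumber  = 0x04,
  Severity   = 0x08,
  Code       = 0x10,
};

// Hypothetical helper: the first attempt with its punctuation fixed by hand.
std::string FormatAdHoc(const Report& report) {
  const bool hasFile = (report.Flags & Flag::FileName) != 0;
  const bool hasLine = (report.Flags & Flag::LineNumber) != 0;
  const bool hasRow  = (report.Flags & Flag::RowNumber) != 0;

  std::ostringstream out;
  if (hasFile) {
    out << report.FileName;
  }
  if (hasLine || hasRow) {  // problem 3: no empty parentheses
    out << "(";
    if (hasLine) out << report.LineNumber;
    if (hasLine && hasRow) out << ",";  // problem 2: no trailing comma
    if (hasRow) out << report.RowNumber;
    out << ")";
  }
  if (hasFile || hasLine || hasRow) {  // problem 1: exactly one colon
    out << ": ";
  }
  out << report.Message << "\n";
  return out.str();
}
```

Every new field means another boolean and more conditions on the punctuation around it, which is the growth the next section tries to avoid.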

Second Attempt

Conditionally building the punctuation around the report fields based on presence and settings is both tedious for the programmer and slow for the machine. Switching to modern C++20 output via std::format, we can achieve something much better.

Note: Before C++20, the library {fmt} may be used as a stand-in.

Similar to listing the settings, let us list the fields of the report structure that can be substituted into the message:

Index (n) Field
0 report.Message
1 report.FileName
2 report.LineNumber
3 report.RowNumber
4 report.Severity
5 report.Code

With each field substituted with {n}, write out the different combinations in a list:

/* 00000 */ "{0}\n"
/* 00001 */ "{1}: {0}\n"
/* 00010 */ "({2}): {0}\n"
/* 00011 */ "{1}({2}): {0}\n"
/* 00100 */ "({3}): {0}\n"
/* 00101 */ "{1}({3}): {0}\n"
/* 00110 */ "({2},{3}): {0}\n"
/* 00111 */ "{1}({2},{3}): {0}\n"
/* 01000 */ "{4}: {0}\n"
/* 01001 */ "{1}: {4}: {0}\n"
/* 01010 */ "({2}): {4}: {0}\n"
/* 01011 */ "{1}({2}): {4}: {0}\n"
/* 01100 */ "({3}): {4}: {0}\n"
/* 01101 */ "{1}({3}): {4}: {0}\n"
/* 01110 */ "({2},{3}): {4}: {0}\n"
/* 01111 */ "{1}({2},{3}): {4}: {0}\n"
/* 10000 */ "{5}: {0}\n"
/* 10001 */ "{1}: {5}: {0}\n"
/* 10010 */ "({2}): {5}: {0}\n"
/* 10011 */ "{1}({2}): {5}: {0}\n"
/* 10100 */ "({3}): {5}: {0}\n"
/* 10101 */ "{1}({3}): {5}: {0}\n"
/* 10110 */ "({2},{3}): {5}: {0}\n"
/* 10111 */ "{1}({2},{3}): {5}: {0}\n"
/* 11000 */ "{4} {5}: {0}\n"
/* 11001 */ "{1}: {4} {5}: {0}\n"
/* 11010 */ "({2}): {4} {5}: {0}\n"
/* 11011 */ "{1}({2}): {4} {5}: {0}\n"
/* 11100 */ "({3}): {4} {5}: {0}\n"
/* 11101 */ "{1}({3}): {4} {5}: {0}\n"
/* 11110 */ "({2},{3}): {4} {5}: {0}\n"
/* 11111 */ "{1}({2},{3}): {4} {5}: {0}\n"

Reinterpret each combination of bit flags as its entire numerical value. Sorted in ascending order of that value, the combinations form a lookup table. The idea is simple: treat the integer not as a set of bit flags but as a single number, and use that number as an index into the table.

Let's call the list of schemas above Schemas. The implementation collapses to tidy terms:

std::cout << std::vformat(
  Schemas[report.Flags],
  std::make_format_args(
    report.Message,    // {0}
    report.FileName,   // {1}
    report.LineNumber, // {2}
    report.RowNumber,  // {3}
    report.Severity,   // {4}
    report.Code        // {5}
  )
);

Note: With C++26, this may be simplified even further with std::print (available since C++23) and std::runtime_format.
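The 32 schema strings do not have to be typed by hand. As a rough sketch, they could be generated once at startup, following the same punctuation rules as the handwritten list. BuildSchema, BuildSchemas, and kSchemaCount are hypothetical names, and this is just one way to fill the table, not necessarily how it was produced for this post.

```cpp
#include <array>
#include <cstddef>
#include <string>

enum Flag {
  FileName   = 0x01,
  LineNumber = 0x02,
  RowNumber  = 0x04,
  Severity   = 0x08,
  Code       = 0x10,
};

// Five flags give 2^5 = 32 combinations, so the table has 32 entries.
constexpr std::size_t kSchemaCount = 1u << 5;

// Hypothetical helper: builds the format string for one flag combination.
std::string BuildSchema(unsigned flags) {
  const bool file     = (flags & Flag::FileName) != 0;
  const bool line     = (flags & Flag::LineNumber) != 0;
  const bool row      = (flags & Flag::RowNumber) != 0;
  const bool severity = (flags & Flag::Severity) != 0;
  const bool code     = (flags & Flag::Code) != 0;

  std::string schema;
  if (file) schema += "{1}";
  if (line || row) {
    schema += "(";
    if (line) schema += "{2}";
    if (line && row) schema += ",";
    if (row) schema += "{3}";
    schema += ")";
  }
  if (!schema.empty()) schema += ": ";  // closes the location part

  if (severity) schema += "{4}";
  if (severity && code) schema += " ";
  if (code) schema += "{5}";
  if (severity || code) schema += ": ";

  schema += "{0}\n";
  return schema;
}

// Fills the table so that the index equals the numerical value of the flags.
std::array<std::string, kSchemaCount> BuildSchemas() {
  std::array<std::string, kSchemaCount> schemas;
  for (unsigned flags = 0; flags < kSchemaCount; ++flags) {
    schemas[flags] = BuildSchema(flags);
  }
  return schemas;
}
```

The conditional punctuation still exists, but it runs once per table entry rather than once per log message.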

Handling Optional Data

As mentioned in problem #3, if the user allows a particular field to be reported yet that field is missing for this particular report, one might want to handle this case gracefully.

If this is desired, the simplest solution is to filter the flags integer before it is passed to lookup. Unset each flag if the corresponding field in the report is missing:

unsigned flags = report.Flags;

if (report.FileName.empty()) {
  flags &= ~Flag::FileName;
}

if (report.LineNumber <= 0) {
  flags &= ~Flag::LineNumber;
}

// ....

std::cout << std::vformat(
  Schemas[flags],
  // ....

This adds code, but the growth is linear: one check per field. The code remains flat and maintainable.
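Spelled out for all five fields, the filter might look like the sketch below. EffectiveFlags is a hypothetical helper. Treating an empty string or a non-positive number as "missing" is an assumption extended from the two checks shown above; the row number, severity, and code checks are guesses at a reasonable convention.

```cpp
#include <string>

struct Report {
  std::string Message;
  std::string FileName;
  std::string Severity;
  int         LineNumber;
  int         RowNumber;
  int         Code;
  unsigned    Flags;
};

enum Flag {
  FileName   = 0x01,
  LineNumber = 0x02,
  RowNumber  = 0x04,
  Severity   = 0x08,
  Code       = 0x10,
};

// Hypothetical helper: returns report.Flags with the flag cleared for
// every field that is missing from this particular report.
unsigned EffectiveFlags(const Report& report) {
  unsigned flags = report.Flags;

  if (report.FileName.empty()) {
    flags &= ~static_cast<unsigned>(Flag::FileName);
  }
  if (report.LineNumber <= 0) {
    flags &= ~static_cast<unsigned>(Flag::LineNumber);
  }
  if (report.RowNumber <= 0) {   // assumption: rows start at 1
    flags &= ~static_cast<unsigned>(Flag::RowNumber);
  }
  if (report.Severity.empty()) {
    flags &= ~static_cast<unsigned>(Flag::Severity);
  }
  if (report.Code <= 0) {        // assumption: 0 means "no code"
    flags &= ~static_cast<unsigned>(Flag::Code);
  }

  return flags;
}
```

The result can be used as the table index in place of report.Flags. Since the filter only ever clears bits, the index stays inside the table whenever the original flags did.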

Conclusion

Using a bit-field integer as an index is a fast way to obtain data based on the combination of bit flags currently set in the integer.

Here, it is used to select the format string for a log message in O(1); the remaining cost is the formatting itself. Another possible use is to calculate population counts.
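As an illustration of the population count idea, here is a minimal sketch that uses the same trick: four bits at a time are read as a number and used as an index into a table of precomputed counts. The names kNibblePopCount and PopCount are hypothetical, and real code would more likely reach for a compiler intrinsic or C++20's std::popcount.

```cpp
#include <array>
#include <cstdint>

// Number of set bits in every 4-bit value, indexed by the value itself.
constexpr std::array<std::uint8_t, 16> kNibblePopCount = {
  0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

// Counts the set bits of a 32-bit integer one nibble at a time.
inline unsigned PopCount(std::uint32_t value) {
  unsigned count = 0;
  while (value != 0) {
    count += kNibblePopCount[value & 0xF];  // low 4 bits as a table index
    value >>= 4;
  }
  return count;
}
```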