crush depth

That bin directory. No, not that one, the other one.

For about a week, I've been having DNS resolution issues on one server. The machine runs a tinydns server for publishing internal domain names, and it seemed that after roughly 24 hours of operation, the server would simply stop responding to DNS requests. After exhausting all of the obvious solutions, I restarted the jail that housed the daemon and everything mysteriously started working.

I checked the logs and suddenly realized that there were no messages in the log newer than about a week. I checked the process list for s6-log instances and noticed that no, there were no s6-log instances running in the jail. I checked /service/tinydns/log/run, which looked fine. I tried executing /service/tinydns/log/run and saw:

exec: /usr/local/sbin/s6-setuidgid: not found

OK. So...

# which s6-setuidgid
/usr/local/bin/s6-setuidgid

Apparently, at some point, the s6 binaries were moved from /usr/local/sbin to /usr/local/bin. This is not something I did! There was no indication of this happening in any recent port change entry nor anything in the s6 change log.

The "outage" was being caused by the way that logging is handled. The tinydns binary logs to stderr instead of using something like syslog, with the error messages being piped into a logging process in the manner of traditional UNIX pipes. This is normally a good thing, because syslog implementations haven't traditionally been very reliable. The problem occurs when the process that's reading from the standard error output of a preceding process stops reading. Sooner or later, any attempt made by the preceding process to write data to the output will block indefinitely (presumably, it doesn't happen immediately due to internal buffering by the operating system kernel). In a simple single-threaded design like that used by tinydns, this essentially means the process stops working as the write operation never completes and no other work can be performed in the mean time.

I'm currently going through all of the service entries to see if anything else has quietly broken. Perhaps I need process supervision for my process supervision.

Fighting fires

Outage

Had a minor outage yesterday apparently due to a bug in the older FreeBSD images that DigitalOcean provide. To their credit, they responded extremely quickly to my support request and provided a link to a GitHub repository with a fix that could be applied to live systems.

Glossaries And Modularization

A close friend of mine suggested that this blog might be more readable with a glossary, and I agreed. Computer science is fraught with terms that may have one of several similar-but-not-quite-the-same meanings depending upon the context in which they're used. I've added a glossary that I'll try to keep updated and will link to from the first uses of significant terms when they arise. Note that the glossary tends towards definitions from the perspective of this blog. That is, it lists only my intended definitions of terms as opposed to listing every possible accepted definition of each of them. I've gone back through old posts and tidied up the usages somewhat. I'll be making an effort to use more consistent terminology from now on.

I also took this opportunity to aggressively modularize the zeptoblog codebase and introduce a general API for implementing post generators. The glossary page is an implementation of a post generator that builds a static page from a database of definitions.

Primogenitor

The Maven POM files in io7m projects contain a fair amount of duplication with respect to each other. The (possibly not entirely rational) reason for this is that when I started moving projects to Maven about five years ago, I had an instinctive lack of trust for project inheritance after seeing what a disaster inheritance usually implies in object-oriented programming languages. Five years of experience, however, have taught me that in a non-Turing-complete description language such as Maven's POM, the problems usually caused by inheritance tend not to occur. I can't give any formal reasoning for this, it's purely anecdotal.

I've introduced inheritance in steps: The root POM of each project does most of the work, and then each of the POMs in the modules of the project specify the bare minimum extra information such as dependencies, plugin executions, etc. In practice, most modules just specify some dependencies and an OSGi manifest.

This is fine, but it does mean that the root POMs of all of the projects mostly contain the same few hundred lines of XML. If I make a change to the logic in one POM that I think would be useful in other projects, I have to make the same change in those projects too. Therefore, I'd like to complete the progression and move towards all projects inheriting from a common primogenitor POM. In addition I'd like to add in some extra information such as inserting the current Git revision into produced JAR files, and statically analyzing the bytecode of compiled files to ensure that semantic versioning rules are being followed. I had a fairly productive conversation with Curtis Rueden about some practical aspects of this (Curtis maintains a very large collection of projects that inherit from a common root POM) and I've made a first attempt at a new primogenitor POM. I'll try moving a few of the more recent leaf projects such as jregions to this new root and see how things work out.

Source Generation

Spent most of yesterday and a fairly decent amount of time today replacing the bulk of the hand-specialized jregions code with code generated from a template instead. I don't typically like templating as a rule: If I'm going to be producing text for a language (such as Java code, XML, etc) that's then going to be parsed and consumed by an external system (such as a compiler, an XHTML renderer, etc) then constructing the text step-by-step using an AST representation guarantees that the output will be well-formed. Templating systems don't provide any guarantees. In this case though, the output of the templating system is going to be immediately consumed by the Java compiler, which is then going to indicate any syntax and/or type errors before any other system has a chance to consume the code.

Of the templating systems I know about, only StringTemplate seems to have been developed with any kind of discipline: The primary concept is the strict separation of presentation and logic. A StringTemplate template does not contain any logic and simply defines formal parameters that must be supplied with values when the template is rendered. Note the must here: Failing to provide a value for a parameter causes an error to be raised at rendering time. Most template systems (notably Maven's resource filtering) tend to silently fail or insert garbage in this situation.

I used Kevin Birch's StringTemplate Maven plugin to generate sources as part of Maven's generate-sources phase. It appears to work well, but has some nasty failure modes when individual templates can't be found. Basically, if you tell the plugin to open a template file that doesn't exist, or if the name of the template doesn't match the name of the file within which it is defined, you'll get an unhelpful error like this:

[ERROR] Failed to execute goal com.webguys:string-template-maven-plugin:1.1:render (generate-D) on project com.io7m.jregions.core: Unable to execute template. -> [Help 1]

Even with Maven's -X switch (enables the display of exception stack traces and other debugging output) there was no useful information available. I ended up using the little-known mvnDebug executable (bundled with every Maven install but apparently undocumented) to step through the execution of the plugin and work out what was going wrong. For those that don't know, mvnDebug loads a Java agent into the JVM that causes Maven to wait until an external debugger (in my case, Intellij IDEA) connects before running the build. It turned out that the Maven plugin wasn't exactly at fault: StringTemplate's internal APIs indicate failure by returning null and maybe writing an error message to a provided mutable sequence. In my case, there were several mistakes:

  1. I specified the name of a template file including the suffix: Template.st. Internally, StringTemplate appends its own suffix, so the API was looking for Template.st.st, failing to find it, but not providing an error message.

  2. I specified the name of a template file that didn't exist at all. Same failure mode as the above.

  3. I specified the name of a template P. StringTemplate looked at the file P.st, found the file and parsed it, but it turned out that P.st actually contained a template called Q. Again, this resulted in StringTemplate being unable to find the template I'd named, but refusing to tell me about it.

When those mistakes were found and corrected, the rest of the error reporting was decent. Failing to provide template parameters, making syntax errors within templates, etc, all provided good error messages with line and column numbers.

The result of this is that a lot of the jregions source code (and test suite) is now generated from a common template. No more manual specialization. I also added float and BigDecimal specialized types to exercise the generation system during development. I suspect that I'm going to apply this same methodology to rewrite the jtensors codebase when Java gets value types.