crush depth

Maven Central Signing

A while ago I started having a problem with signature verification when trying to upload to Maven Central.

After three months of intense debugging and a rather long conversation with Sonatype, it turned out that the issue was actually with the signing key.

At the start of 2018, I'd switched to ed25519 signing keys. It turned out that the version of Nexus running on Maven Central didn't support ed25519. To work around this, I created a new RSA key solely intended for signing Maven packages. Bizarrely, this key didn't work either. Nobody could work out why the signatures were failing, and the problem was escalated twice to Sonatype's internal support people.

It turned out that the problem was an ed25519 signature on the new RSA key!

The moral of the story: If you want to deploy to Maven Central, use only RSA keys and make sure that the signatures on those RSA keys only come from other RSA keys. If you fail to do this, you won't get an actionable error message when you try to deploy packages, you'll just get a "Signature verification failed" message. Sonatype are updating their documentation to ensure that nobody else has to lose time to this.

Thanks to Joel Orlina for being patient during those three months and for handling the support teams.

Competing Module Systems

It's been about eight months since the release of Java 9.

I won't bore anyone with the details of everything it introduced as that information is available just about anywhere you care to look. The main thing it added, however, is the subject of this post: The Java Platform Module System.

The JPMS was introduced, and programmers reacted in the normal way that people working in a field that's supposed to demand a rational and critical mind react: They started frothing at the mouth, claiming that Oracle were trying to kill Java, and planning a mass exodus to other more hip languages. Kotlin or Rust, probably. Needless to say, Oracle weren't trying to kill Java. If Oracle wanted to kill Java, they could certainly find a way to do it that didn't require seven expensive years of intense, careful, and painful software engineering. So far, the exodus appears to have been quietly called off.

I'm guessing the people that complained the loudest are the sort of people that write a ton of fragile, error-prone, unsupported reflection hacks and then spew bile when some tiny aspect of the platform changes and their code breaks.

I have a ton of code, none of which I'd describe as legacy. I'm somewhat invested in OSGi for reasons I'll go into shortly.

Today: I'm conflicted and slightly nervous!

I decided when Java 9 came out that I was going to pursue full modularization for all of my projects. As I've said before, my code is already modular because I've been designing it around OSGi. However, in order for it to become modular in the JPMS sense, I'd have to write module descriptors for each project.

I'm developing a commercial (but open-source) game, where the game itself and third party modifications are handled by a well-specified, strongly-versioned module system. Consider something that (from the user's perspective) behaves a bit like the system in OpenTTD (except with the entire game delivered this way, not just third-party addons):

OpenTTD

I briefly considered stopping using OSGi and using the JPMS to power the whole system. That experiment fairly quickly ended, though. Let's look at some of the things that OSGi does or has that are important to me:

  1. OSGi has a full repository specification that gives applications the ability to download and install packages at run-time. You tell the system what packages you want, and it fetches those and all of their dependencies. Everything is standardized, from the API to the actual metadata served by repository implementations. Here's an example of a standard repository index (albeit styled with an XSL stylesheet).

  2. OSGi bundles have rich metadata associated with them, including very fine-grained and versioned dependency metadata. If you have a bundle, you can find out very easily what else you need in order to be able to use it. The metadata can also express things beyond simple package dependencies.

  3. OSGi can load and unload modules at run-time, and has standard APIs to do so (and standard APIs that application code can implement in order to react to modules being loaded and unloaded).

  4. OSGi has wide support; it's fairly rare these days to find a Java project that isn't also implicitly an OSGi project. Making something OSGi-compatible is often just a case of adding a single Maven plugin invocation (assuming that the code doesn't make a lot of unfounded assumptions about class loaders).

Let's see what the JPMS has (or hasn't), in comparison:

  1. The JPMS doesn't have anything like a repository specification. The closest analogy is probably fetching things from Maven Central, but there's no standard API for doing this. Essentially, obtaining modules is Someone Else's Problem.

  2. JPMS modules don't have much metadata associated with them. A module says what packages it exports and upon which other modules it depends. However, it doesn't say anything about which versions of those modules are required. Essentially, you're expected to use a system like Maven. This is Someone Else's Problem.

  3. The JPMS can, in some sense, load modules at run-time via ModuleLayers. There's no support for explicitly unloading a module. There are no standard APIs that code can use to be notified that a module has been loaded. Essentially, you're expected to build your own ad-hoc OSGi-like system on top of module layers. This is Someone Else's Problem.

  4. Most projects haven't even begun to modularize their code. Even in a well-staffed and extremely fundamental project like Google Protobuf, I'm still waiting after six months for them to make a harmless one-line JAR manifest update. I cannot begin to imagine what it would take to push full modularization through a system that slow-moving. If a project you depend on hasn't modularized, you're not going to be modularizing either.
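To make the third point concrete, here's roughly what "loading modules at run-time" looks like with ModuleLayers. This is a minimal sketch: the `mods` directory mentioned in the comments is hypothetical, and here an empty finder and an empty root set simply produce an empty child layer.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.util.List;

public class ModuleLayerDemo {
  public static void main(String[] args) {
    // The boot layer holds the modules the JVM was started with.
    ModuleLayer boot = ModuleLayer.boot();
    System.out.println("java.base present: "
        + boot.findModule("java.base").isPresent());

    // "Loading modules at run-time" means resolving a Configuration
    // and defining a new layer on top of an existing one. In a real
    // application the finder would point at a directory of modular
    // JARs, e.g. ModuleFinder.of(Path.of("mods")).
    Configuration cf = boot.configuration()
        .resolve(ModuleFinder.of(), ModuleFinder.of(), List.of());
    ModuleLayer child =
        boot.defineModulesWithOneLoader(cf, ClassLoader.getSystemClassLoader());
    System.out.println("child modules: " + child.modules().size());

    // Note the asymmetry with OSGi: there is no unload operation. A
    // layer only goes away when it becomes unreachable and is
    // garbage-collected, and there is no listener API to be notified
    // when layers appear or disappear.
  }
}
```

That last comment is the entire substance of point 3: the loading half of the lifecycle exists, the unloading and notification halves don't.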

The first three points mean that I'd have to invent my own OSGi-like system. I'd have to come up with a uniform, platform-agnostic metadata standard that could be inserted into JAR files in order to specify dependency information. Third party projects (at least those that aren't written specifically for my game engine) aren't going to be interested in adding metadata for the sake of one man's low-budget module system. I can't trust JAR files to contain Maven pom.xml files (although many do), as there are plenty of other build systems in use that don't include that information. Even then, you can't just parse a Maven pom.xml; you need to evaluate it. This would mean pulling in Maven APIs as a dependency, and those neither work in a JPMS modular context (because they're not modularized and never will be) nor an OSGi context (because they use their own incompatible module system called Plexus).

Let's assume by some miracle that I get a good percentage of the Java ecosystem to include dependency metadata in a format I can consume. I now need to come up with an API and tools to fetch modules from a repository. That's not hard, but it's still more than not having to do it at all. I'd want version ranges, so I'd also need to write code to solve the versioning problem, which is NP-complete. Tricky.

So let's assume by now that I can, given the name of a module, fetch the dependencies at run-time using my shiny new API and metadata. I still need to be able to load and unload that set of modules at run-time, and have code in other modules react properly to that occurring. There aren't any standard APIs to do this in the JPMS, and so I'd have to write my own. Right now, the JPMS is not actually capable of expressing the kind of things that are possible with OSGi bundles, so any system I built would be strictly less capable than the existing OSGi system. At least part of the problem appears to come down to missing APIs in the JPMS, so at least some of this might be less of an issue in the future. Still, it's work I'd have to do.

So let's assume that I can fetch, load, and unload modules at run-time, and code in modules can react to this happening. I still need modules to put in the system, and not many Java projects are modularized. I tried, for a few months, to use only projects that have either added Automatic-Module-Name entries to their JAR manifests, or have fully modularized. It was fairly painful. I'd be limiting myself to a tiny percentage of the vast Java ecosystem if I did this. This, however, is something that OSGi also went through in the early days and is no longer a problem. It's just a matter of time.

So why am I nervous and still conflicted?

The default effect suggests that, just because the JPMS is the default Java module system, the JPMS is the one that developers will turn to. Tool support for JPMS modules is already vastly better in IntelliJ IDEA than it is for OSGi. In Eclipse, the support is about equal.

The OSGi R7 JAR files will not contain Automatic-Module-Name entries, which means that if you have any dependency on any part of OSGi (including parts that are not even used at run-time, like the annotations), your code cannot be fully modularized. If developers are forced to choose between supporting that old module system that nobody uses or that shiny new default module system that everyone's talking (or possibly frothing) about, which one do you think they'll pick?

I'm also aware that this kind of dynamic module loading and unloading is not something that most applications need. For years people were happy with a static (although extremely error-prone) class path mechanism, and the majority of applications are just recompiled and re-deployed when updated. The JPMS can support this kind of deployment sufficiently. Previously, if developers wanted any kind of safe module system at all, they had to turn to OSGi. They might only use it in the context of a static application that is not dynamically updated, but they'd still use it. Why would those same developers still turn to OSGi when the JPMS exists?

Finally, I pay attention to the various new developments in the Java language and JVM and, fairly often, new features are proposed that have subtle (or not-so-subtle) dependencies on the new module system. There are various optimization strategies, for example, that can be enabled if you know that instances of a given type are completely confined to a single module. Unless the VM can be magically persuaded that OSGi bundles are JPMS modules (this is unlikely to happen), code running in OSGi is very much going to become a second-class citizen.

So, I'm nervous and conflicted. I don't want to build some sort of ad-hoc OSGi-lite system on the JPMS. I don't want to do the work, I don't want to maintain it, and I don't think the result would be very good anyway. I also, however, am unsure about continuing to base my code on a system that's going to have to work hard not to be considered irrelevant. I believe OSGi is the superior choice, but it's not the default choice, and I think that's going to matter more than it should.

I suspect I'm going to finish assisting any remaining projects that I've started helping to modularize, and not do any more. I had decided that I was going to push hard to move all of my projects to requiring Java 9 as part of the modularization effort, but unfortunately this would leave me unable to deploy code on FreeBSD as there's no JDK 9 there and likely won't be. With the new six-month release cycle (and 18-month long-term release cycle), porting efforts will likely be directed towards JDK 11. This means that I won't be able to deploy anything newer than Java 8 bytecode on FreeBSD for at least another six months.

IPv6 And Linux

I wrote a while back about issues with IPv6 on Linux. It turns out that most of the pain occurs for two reasons:

  1. Linux doesn't accept router advertisements by default. If you configure your router to tell everyone on the network that it has a nonstandard MTU, Linux will ignore the advertisements.

  2. Linux will act upon any Packet Too Big messages it receives and will in fact create a temporary route (visible via the ip route command) that has the correct reduced MTU but, for whatever reason, most programs won't use the new route without a restart. That is, if my mail client hangs due to a Packet Too Big message, it won't be able to send any mail until I restart the program.

The first point can be addressed by adding the following to /etc/sysctl.conf:

net.ipv6.conf.default.accept_ra=1
net.ipv6.conf.default.accept_ra_mtu=1

Usually, net.ipv6.conf.default.accept_ra_mtu is already set to 1, but it's worth being explicit about it.

I also add net.ipv6.conf.default.autoconf=0 because I statically assign addresses.
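Assuming the usual sysctl tooling, the settings can be applied without a reboot and then verified:

```
# Load the new settings from /etc/sysctl.conf (requires root):
sysctl -p /etc/sysctl.conf

# Confirm the values the kernel is actually using:
sysctl net.ipv6.conf.default.accept_ra \
       net.ipv6.conf.default.accept_ra_mtu \
       net.ipv6.conf.default.autoconf
```

Note that the net.ipv6.conf.default.* values only apply to interfaces configured after the change; already-configured interfaces use their per-interface net.ipv6.conf.&lt;iface&gt;.* values.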

The second point can be addressed by restarting the program, as sad as that is.

Module Chasing

https://github.com/io7m/modulechaser

modulechaser

I'm trying to get to the point where all of my projects are fully modularized as JPMS modules. My code already follows a very modular architecture thanks to designing it around OSGi, but it's necessary to write module descriptors to complete the picture. To write module descriptors, the projects upon which a project depends must first be modularized. This can either mean writing module descriptors for those projects, or it can simply mean assigning an Automatic-Module-Name. Writing a full module descriptor is better, because this means that the project can be used in combination with jlink to produce tiny custom JVM distributions.
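As an illustration of the difference between the two options (the module and package names here are entirely hypothetical), a full module descriptor is just a module-info.java at the root of the source tree:

```java
// module-info.java: a hypothetical descriptor for a library that
// exposes one API package and depends on the platform's XML module.
module com.example.quarry.core {
  requires java.xml;                    // versionless dependency
  exports com.example.quarry.core.api;  // everything else stays hidden
}
```

A project that merely sets Automatic-Module-Name in its JAR manifest gets a stable module name, but jlink refuses to link automatic modules, which is why full descriptors are the better end state.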

My problem is that I have rather a large number of dependencies across all of my projects, and I need to know the most efficient order in which to approach maintainers in order to get them to modularize their projects. If project A depends on project B, then project A can't be modularized before project B, so it's a waste of my time to go asking project A's maintainers before B is modularized.

I wrote a Maven plugin to assist with this problem. It produces reports like this.

The plugin tells me if the current version of a dependency is modularized, if the latest version of the dependency on Maven Central is modularized, and whether the dependency has been fully modularized or simply assigned a name. The table of dependencies is shown in reverse topological order: Start at the top of the table and work downwards, contacting each project maintainer as you go.

Some dependencies will, of course, never be modularized. Whoever published the javax.inject jar, for example, didn't even bother to create a manifest in the jar file. I'm not sure that even constitutes a valid jar file according to the specification. Some dependencies, like javaslang, were renamed (javaslang became vavr) and so code should move to using the newer names instead. Some projects can barely manage to publish to Maven Central (like Xerces) and still appear to use tools and processes from the previous century, so are unlikely to be modularizing any time soon.

Let's Encrypt For Woe And Loss

In a crippling bout of sinusitis, and after reading that Chrome is going to mark http sites as insecure, I decided to put good sense aside and deploy Let's Encrypt certificates on the io7m servers.

I've complained about the complexity of this before, so I started thinking about how to reduce the number of moving parts, and the number of protection boundary violations implied by the average ACME setup.

I decided the following invariants must hold:

  • The web server must not have write access to its own configuration, certificates, or logs. This is generally a given in any server setup. Logging is actually achieved by piping log messages to a log program such as svlogd which is running as a different user. In my case, I can actually go much further and state that the web server must not have write access to anything in the filesystem. This means that if the web server is compromised (by a buffer overflow in the TLS library, for example), it's not possible for an attacker to write to any data or code without first having to find a way to escalate privileges.

  • The web server must have read access to a challenge directory. This is the directory containing files created in response to a Let's Encrypt (LE) challenge.

  • The acme client must have read and write access to the certificates, and it must have write access to a challenge directory, but nothing else in the filesystem. Specifically, the client must not have write access to the LE account key or the directory that the web server is serving. This means that if the client is compromised, it can corrupt or reissue certificates but it can't substitute its own account key and request certificates on behalf of someone else.

  • The acme client must not have any kind of administrative access to the web server. I don't want the acme client helpfully restarting the web server because it thinks it's time to do so.

There are some contradictions here: The acme client must not be able to write to the directory that the web server is serving, and yet the web server must be able to serve challenge responses to the LE server. The acme client must not be able to restart the web server, and yet the web server must be restarted in order to pick up new certificates when they're issued.

I came up with the following:

  • An httpd-reloader service sends a SIGHUP signal to the web server every 30 minutes. This causes the web server to re-read its own configuration data and reload certificates, but does not kill any requests that are in the process of being served and specifically does not actually restart the web server.

  • The acme client writes to a private challenge directory, and a private certificates directory. It doesn't know anything about the web server and is prevented from accessing anything other than those directories via a combination of access controls and chrooting.

  • The web server reads from a read-only nullfs mount of a wwwdata directory, and the challenge directory above is placed in wwwdata/.well-known/acme-challenge via another read-only nullfs mount. The web server also reads from a read-only nullfs mount of the certificates directory above in order to access the certificates that the acme client creates.

  • The acme client is told to update certificates hourly, but the acme client itself decides if it's really necessary to update the certificates each time (based on the time remaining before the certificates expire).

  • The intrusion detection system has to be told that the web server's certificates are permitted to change. The account key is never permitted to change. I don't want notifications every ~90 days telling me the certificates have been modified.
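The httpd-reloader service itself can be very small. A sketch, assuming a daemontools/runit-style supervisor (consistent with the svlogd logging mentioned above); the pidfile path is an assumption and will differ per setup:

```
#!/bin/sh
# httpd-reloader run script: sleep for 30 minutes, send SIGHUP to
# the web server, and exit. The supervisor restarts this script,
# producing the periodic reload. The pidfile path is illustrative.
sleep 1800
exec kill -HUP "$(cat /var/run/httpd.pid)"
```

Because the script only ever sends a signal, it needs no access to the web server's configuration, certificates, or served files, which preserves the invariants above.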

Data flow

I set up all of the above and also took some time to harden the configuration by enabling various HTTP headers such as Strict-Transport-Security, Content-Security-Policy, Referrer-Policy, etc. I'm not going to require https to read io7m.com and I'm not going to automatically redirect traffic from the http site to the https site. As far as I'm concerned, it's up to the people reading the site to decide whether or not they want https. There are plenty of browser addons that can tell the browser to try https first, and I imagine Chrome is already doing this without addons.
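For reference, the header hardening amounts to a handful of directives. A hypothetical nginx-flavoured sketch (the server actually in use, and the right values for a given site, may well differ):

```
# Illustrative hardening headers; values are examples, not advice.
add_header Strict-Transport-Security "max-age=31536000" always;
add_header Content-Security-Policy  "default-src 'self'" always;
add_header Referrer-Policy          "no-referrer" always;
add_header X-Content-Type-Options   "nosniff" always;
```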

The Qualys SSL Labs result:

Result

Now we can all sleep soundly in the knowledge that a third party that we have no reason whatsoever to trust is telling us that io7m.com is safe.