It's that time of year again.
Fingerprint                                        | Comment
-------------------------------------------------- | ------------------
E362 BB4F 16A9 981D E781 2F6E 10E4 AAD0 B00D 6CDD  | 2024 personal
37A9 97D5 970E 145A B9DB 1409 A203 E72A D3BB E1CE  | 2024 maven-rsa-key
Keys are published to the keyservers as usual.
A while ago I got into a fight with jpackage. Long story short, I concluded that it wasn't possible to use jpackage to produce an application that runs in "module path" mode but that also has one or more automatic modules.
It turns out that it is possible, but it's not obvious from the documentation at all and requires some extra steps. The example project demonstrates this.
Assume I've got an application containing modules com.io7m.demo.m1, com.io7m.demo.m2, and com.io7m.demo.m3. The com.io7m.demo.m3 module is the module that contains the main class (at com.io7m.demo.m3/com.io7m.demo.m3.M3). In this example, assume that com.io7m.demo.m2 is actually an automatic module and therefore would cause jlink to fail if it tried to process it.
I first grab a JDK from Foojay and unpack it:
$ wget -O jdk.tar.gz -c 'https://api.foojay.io/disco/v3.0/ids/9604be3e0c32fe96e73a67a132a64890/redirect'
$ mkdir -p jdk
$ tar -x -v --strip-components=1 -f jdk.tar.gz --directory jdk
Then I grab a JRE from Foojay and unpack it:
$ wget -O jre.tar.gz -c 'https://api.foojay.io/disco/v3.0/ids/3981936b6f6b297afee4f3950c85c559/redirect'
$ mkdir -p jre
$ tar -x -v --strip-components=1 -f jre.tar.gz --directory jre
I could reuse the JDK from the first step, but the JRE is smaller, and so it keeps the application distribution smaller given that we won't be using jlink to strip out any unused modules.
I then build the application, and this produces a set of platform-independent modular jar files:
$ mvn clean package
I copy the jars into a jars directory:
$ cp ./m1/target/m1-20231111.jar jars
$ cp ./m2/target/m2-20231111.jar jars
$ cp ./m3/target/m3-20231111.jar jars
Then I call jpackage:
$ jpackage \
  --runtime-image jre \
  -t app-image \
  --module com.io7m.demo.m3 \
  --module-path jars \
  --name jpackagetest
The key argument that makes this work is the --runtime-image option. It effectively means "don't try to produce a reduced jlink runtime".
This produces an application that works correctly:
$ file jpackagetest/bin/jpackagetest
jpackagetest/bin/jpackagetest: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, not stripped

$ ./jpackagetest/bin/jpackagetest
M1: Module module com.io7m.demo.m1
M2: Module module m2
JRT: java.base
JRT: java.compiler
JRT: java.datatransfer
JRT: java.desktop
...
We can see from the first two lines of output that both com.io7m.demo.m1 and (the badly-named) m2 are on the module path and have not been placed on the class path. This means that any services declared in the module descriptors will actually work properly.
We can take a look at the internal configuration:
$ cat jpackagetest/lib/app/jpackagetest.cfg
[Application]
app.mainmodule=com.io7m.demo.m3/com.io7m.demo.m3.M3

[JavaOptions]
java-options=-Djpackage.app-version=20231111
java-options=--module-path
java-options=$APPDIR/mods
We can see that the internal configuration uses an (undocumented) $APPDIR variable that expands to the full path to a mods directory inside the application distribution. The mods directory contains the unmodified application jars:
$ ls jpackagetest/lib/app/mods/
m1-20231111.jar  m2-20231111.jar  m3-20231111.jar

$ sha256sum jars/*
f8de3acf245428576dcf2ea47f5eb46cf64bb1a5daf43281e9fc39179cb3154f  jars/m1-20231111.jar
6ad0f7357cf03dcc654a3f9b8fa8ce658826fc996436dc848165f6f92973bb90  jars/m2-20231111.jar
b5c4d7d858dad6f819d224dd056b9b54009896a02b0cd5c357cf463de0d9fdd2  jars/m3-20231111.jar

$ sha256sum jpackagetest/lib/app/mods/*
f8de3acf245428576dcf2ea47f5eb46cf64bb1a5daf43281e9fc39179cb3154f  jpackagetest/lib/app/mods/m1-20231111.jar
6ad0f7357cf03dcc654a3f9b8fa8ce658826fc996436dc848165f6f92973bb90  jpackagetest/lib/app/mods/m2-20231111.jar
b5c4d7d858dad6f819d224dd056b9b54009896a02b0cd5c357cf463de0d9fdd2  jpackagetest/lib/app/mods/m3-20231111.jar
Now to try to get this working on Windows with the elderly wix tools...
I've recently been looking into the allocation of device memory in Vulkan. It turns out there's a lot of complexity around balancing the various constraints that the API imposes.
Vulkan has the concepts of device memory and host memory. Host memory is the common memory that developers are accustomed to: It's the memory that is accessed and controlled by the host CPU on the current machine, and the memory that is returned by malloc() and friends. We won't talk much about host memory here, because there's nothing new or interesting to say about it, and it works the same way in Vulkan as in regular programming environments.
Device memory, on the other hand, is memory that is directly accessible to whatever the GPU is on the current system. Device memory is exposed to the Vulkan programmer via a number of independent heaps. For example, on a system (at the time of writing) with an AMD Radeon RX 5700 GPU, device memory is exposed as three different heaps:
The different heaps have different performance characteristics and different capabilities. For example, some heaps can be directly memory-mapped for reading and writing by the host computer using the vkMapMemory function (similar to the POSIX mmap function). Some heaps are solely and directly accessible to the GPU and therefore are the fastest memory for the GPU to read and write. In order for the host CPU to read and write this memory, explicit transfer commands must be executed using the Vulkan API. Reads and writes to memory that is not directly connected to the GPU must typically go over the PCI bus, and are therefore slower than reads and writes to directly GPU-connected memory in relative terms.
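For example, writing into a host-visible, host-coherent allocation can be as simple as the following sketch (the device, hostVisibleMemory, sourceData, and dataSize names are placeholders):

/* Map a host-visible allocation, write to it from the CPU, and unmap it.
   No explicit flush is needed if the memory type is HOST_COHERENT. */
void *mapped = NULL;
if (vkMapMemory(device, hostVisibleMemory, 0, dataSize, 0, &mapped) == VK_SUCCESS) {
  memcpy(mapped, sourceData, dataSize);
  vkUnmapMemory(device, hostVisibleMemory);
}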
Naturally, different types of GPU expose different sets of heaps. On systems with GPUs integrated into the CPU, there might only be a single heap. For example, on a fairly elderly laptop with an Intel HD 620 embedded GPU, there is simply one 12gb heap that is directly accessible by both the GPU and host CPU.
Vulkan also introduces the concept of memory types. A memory type is a rather vague concept, but it can be considered as a kind of access method for memory in a given heap. For example, a memory type for a given heap might advertise that it can be memory-mapped directly from the host CPU. A different memory type might advertise that there's CPU-side caching of the memory. Another memory type might advertise that it is incoherent, and therefore requires explicit flush operations in order to make any writes to the memory visible to the GPU.
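For concreteness, here's a minimal sketch that enumerates the heaps and memory types a device exposes (assuming a physicalDevice has already been selected):

/* Enumerate the heaps and memory types exposed by a device. */
VkPhysicalDeviceMemoryProperties memory;
vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memory);

for (uint32_t index = 0; index < memory.memoryHeapCount; ++index) {
  printf("heap %u: size %llu bytes, device-local: %s\n",
    index,
    (unsigned long long) memory.memoryHeaps[index].size,
    (memory.memoryHeaps[index].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT) ? "yes" : "no");
}

for (uint32_t index = 0; index < memory.memoryTypeCount; ++index) {
  const VkMemoryPropertyFlags flags = memory.memoryTypes[index].propertyFlags;
  printf("type %u: heap %u, host-visible: %s, host-coherent: %s, host-cached: %s\n",
    index,
    memory.memoryTypes[index].heapIndex,
    (flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) ? "yes" : "no",
    (flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) ? "yes" : "no",
    (flags & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) ? "yes" : "no");
}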
Implementations might require that certain types of GPU resources be allocated in heaps using specific memory types. For example, some NVIDIA GPUs strictly separate memory allocations made to hold color images, allocations made to hold arrays of structured data, and so on. These memory type requirements can exist even when allocations of different types are being made to the same heap.
As mentioned earlier, the differing performance characteristics between heaps mean that developers will want to place different kinds of resources in different heaps in order to take advantage of the properties of each heap. For example, if the programmer knows that a given texture is immutable, and that it will be loaded from disk once and then sampled repeatedly by the GPU when rendering, then it makes sense for this texture to be placed into the fastest, directly GPU-connected memory. On the other hand, consider what happens if a developer knows that the contents of a texture are mutable and will be updated on every frame: In one manner or another, the contents of that texture are almost certainly going to traverse the PCI bus each time it is updated (assuming a discrete GPU with a separate device-local heap). Therefore, it makes sense for that texture to be allocated in a heap that is directly CPU-accessible, with the GPU reading from that memory as needed. Directly GPU-connected memory tends to be a more precious and less abundant resource, so there's little to be gained by wasting it on a texture that will need to be transferred anew to the GPU on every frame anyway! The small 256mb heap mentioned at the start of this article is explicitly intended for those kinds of transfers: The CPU can quickly write data into that heap and then instruct Vulkan to perform a transfer from that heap into the main device-local heap. This is essentially a heap for staging buffers.
When allocating a resource, the developer must ask the Vulkan API what the memory requirements will be for the given resource. Essentially, the conversation goes like this:
Developer: I want to create a 256x256 RGBA8 texture, and I want the texture to be laid out in the most optimal form for fast access by the GPU. What are the memory requirements for this data?
Vulkan: You will need to allocate 262144 bytes of memory for this texture, using memory type T, and the allocated memory must be aligned to a 65536 byte boundary.
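In code, that conversation corresponds roughly to the following sketch (error checking omitted; device is assumed to be an already-created VkDevice):

/* Create a 256x256 RGBA8 texture with optimal tiling and ask for its
   memory requirements. */
VkImageCreateInfo imageInfo = {
  .sType         = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
  .imageType     = VK_IMAGE_TYPE_2D,
  .format        = VK_FORMAT_R8G8B8A8_UNORM,
  .extent        = { .width = 256, .height = 256, .depth = 1 },
  .mipLevels     = 1,
  .arrayLayers   = 1,
  .samples       = VK_SAMPLE_COUNT_1_BIT,
  .tiling        = VK_IMAGE_TILING_OPTIMAL,
  .usage         = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT,
  .sharingMode   = VK_SHARING_MODE_EXCLUSIVE,
  .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED
};

VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);

VkMemoryRequirements requirements;
vkGetImageMemoryRequirements(device, image, &requirements);
/* requirements.size           -> e.g. 262144 bytes                   */
/* requirements.alignment      -> e.g. 65536 bytes                    */
/* requirements.memoryTypeBits -> bitmask of usable memory type indices */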
The blissfully unaware developer then calls vkAllocateMemory, passing it a memory size of 262144 and a memory type T. The specification for vkAllocateMemory actually guarantees that whatever memory it returns will always obey the alignment restrictions for any kind of resource possible, so the developer doesn't need to worry about the 65536 byte alignment restriction above.
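A sketch of that naive call, reusing the requirements value from the previous example (chooseMemoryType is a hypothetical helper that picks a usable index out of requirements.memoryTypeBits; it is not part of the Vulkan API):

/* The naive approach: one vkAllocateMemory call per texture. */
VkMemoryAllocateInfo allocateInfo = {
  .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
  .allocationSize  = requirements.size,
  .memoryTypeIndex = chooseMemoryType(requirements.memoryTypeBits,
                                      VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)
};

VkDeviceMemory deviceMemory;
if (vkAllocateMemory(device, &allocateInfo, NULL, &deviceMemory) == VK_SUCCESS) {
  /* The returned memory satisfies any alignment requirement, so the image
     can simply be bound at offset 0. */
  vkBindImageMemory(device, image, deviceMemory, 0);
}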
This all works fine for a while, but after having allocated a hundred or so textures like this, vkAllocateMemory suddenly returns VK_ERROR_OUT_OF_DEVICE_MEMORY. The developer immediately starts doubting their own sanity; after all, their GPU has an 8gb heap, and they've only allocated around 26mb of textures. What's gone wrong?
Well, Vulkan imposes a limit on the number of active allocations that can exist at any given time. This is advertised to the developer in the maxMemoryAllocationCount field of the VkPhysicalDeviceLimits structure returned by the vkGetPhysicalDeviceProperties function. The Vulkan specification guarantees that this limit will be at least 4096, although it does give a laundry list of reasons why the limit in practice might be lower. In fact, in some situations, the limit can be much lower than this. To quote the Vulkan specification for vkAllocateMemory:
As a guideline, the Vulkan conformance test suite requires that at least 80 minimum-size allocations can exist concurrently when no other uses of protected memory are active in the system.
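The limit itself is trivial to query; a sketch, again assuming an already-selected physicalDevice:

/* Query the allocation count limit for a device. */
VkPhysicalDeviceProperties properties;
vkGetPhysicalDeviceProperties(physicalDevice, &properties);
printf("maxMemoryAllocationCount: %u\n",
  properties.limits.maxMemoryAllocationCount);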
In practical terms, this means that Vulkan developers are required to ask for a small number of large chunks of memory, and then manually sub-allocate that memory for use with resources. This is where the real complexity begins.
There are a practically unlimited number of possible ways to manage memory, and there are entire books on the subject. Vulkan developers wishing to sub-allocate memory must come up with algorithms that balance at least the following (often contradictory) requirements:
- The number of simultaneously active allocations must stay below the maxMemoryAllocationCount limit for vkAllocateMemory.
- The size of any individual allocation must stay below the maxMemoryAllocationSize value of the VkPhysicalDeviceMaintenance3Properties structure returned by the vkGetPhysicalDeviceProperties2 function (see the sketch after this list).
- Alignment requirements must be obeyed. The vkAllocateMemory function is guaranteed to return memory that is suitably aligned for any possible resource, but developers sub-allocating from one of these allocations must ensure that they place resources at correctly-aligned offsets relative to the start of that allocation.
- Allocations must be large enough to hold the largest individual resource contiguously. A 4096x1024 RGBA8 texture will require roughly 16mb of storage; if we divide the heap up into allocations no larger than 8mb, we will never be able to store a texture of this size. (Edit: Textures and buffers can use non-contiguous memory via sparse resources. Support for this cannot be relied upon.)
It is fairly difficult to come up with memory allocation algorithms that will meet all of these requirements.
Developers are expected to use large allocations in order to stay below the limit on the number of active allocations imposed by vkAllocateMemory, but at the same time they can't use allocations so large that they exceed the maxMemoryAllocationSize limit. Developers don't know what sizes and types of heaps they will be presented with, so allocation sizes must be decided by educated guesses, heuristics, and probing the heap sizes at application startup.
In order to obey alignment restrictions, reduce memory fragmentation and avoid wasting too much memory by having unused space between aligned objects, it's almost certainly necessary to bucket sub-allocations by size, and place them into separate regions of memory. If this bucketing is not performed, then large numbers of small sub-allocations within an allocation can result in there not being enough contiguous space for a larger sub-allocation, even if there is otherwise enough non-contiguous free space for it.
How should sub-allocations be bucketed, though? The Vulkan specification does provide some guarantees as to what the returned alignment restrictions will be, including (but not limited to):
The alignment member is a power of two.
If usage included VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, alignment must be an integer multiple of VkPhysicalDeviceLimits::minStorageBufferOffsetAlignment.
However, the alignment requirements can vary wildly between platforms. As an example, I wrote a small program that tried asking for the memory requirements for an RGBA8 texture in every combination of power-of-two sizes up to a maximum of 4096 (the largest texture width/height guaranteed to be supported by all Vulkan implementations). I specifically asked for textures using the tiling type VK_IMAGE_TILING_OPTIMAL, as there is very little reason to use the discouraged VK_IMAGE_TILING_LINEAR. Use of VK_IMAGE_TILING_LINEAR can relax storage/alignment restrictions at the cost of much slower rendering performance.
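The core of such a probe looks roughly like the following sketch (not the exact program; device is assumed, and error handling is trimmed):

/* Probe: query the memory requirements of an RGBA8, optimal-tiling image
   at every power-of-two size up to 4096x4096 and print them as CSV. */
for (uint32_t width = 1; width <= 4096; width *= 2) {
  for (uint32_t height = 1; height <= 4096; height *= 2) {
    VkImageCreateInfo info = {
      .sType         = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
      .imageType     = VK_IMAGE_TYPE_2D,
      .format        = VK_FORMAT_R8G8B8A8_UNORM,
      .extent        = { .width = width, .height = height, .depth = 1 },
      .mipLevels     = 1,
      .arrayLayers   = 1,
      .samples       = VK_SAMPLE_COUNT_1_BIT,
      .tiling        = VK_IMAGE_TILING_OPTIMAL,
      .usage         = VK_IMAGE_USAGE_SAMPLED_BIT,
      .sharingMode   = VK_SHARING_MODE_EXCLUSIVE,
      .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED
    };

    VkImage image;
    if (vkCreateImage(device, &info, NULL, &image) != VK_SUCCESS) {
      continue;
    }

    VkMemoryRequirements requirements;
    vkGetImageMemoryRequirements(device, image, &requirements);
    printf("%u,%u,%llu,%llu\n",
      width, height,
      (unsigned long long) requirements.size,
      (unsigned long long) requirements.alignment);

    vkDestroyImage(device, image, NULL);
  }
}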
I ran the program on a selection of platforms:
- LinuxIntelMesa: Linux, Intel(R) HD Graphics 620, Mesa driver
- LinuxAMDRADV: Linux, AMD Radeon RX 5700, RADV driver
- WindowsAMDProp: Windows, AMD Integrated, proprietary driver
- WindowsNVIDIAProp: Windows, NVIDIA GeForce GTX 1660 Ti, proprietary driver

The following graph shows the alignment requirements for every image size on every driver (click for full-size image):
Some observations can be made from this data:
- On the LinuxIntelMesa platform, the required alignment for image data is always 4096. This is almost certainly something to do with the fact that the GPU is integrated with the CPU, and simply expects image data to be aligned to the native page size of the platform.
- On the WindowsNVIDIAProp platform, the required alignment for image data is always 1024.
- On the LinuxAMDRADV platform, the required alignment for image data is either 4096 or 65536. Strangely, there appears to be no clear relation that explains why an image might require 65536 byte alignment instead of 4096 byte alignment. The first image size to require 65536 byte alignment is 128x128, which coincidentally requires 65536 bytes of storage. However, a smaller image size such as 256x64 also requires 65536 bytes of storage, but only has a reported alignment requirement of 4096 bytes.
- The WindowsAMDProp platform behaves similarly to the LinuxAMDRADV platform, except that it often allows for a smaller alignment of 256 bytes. Even some very large images such as 16x4096 can require a 256 byte alignment.

Similarly, the data for the storage requirements for each size of image (click for full-size image):
Some observations can be made from this data:
- A 128x16 image using 4 bytes per pixel should theoretically take 128 * 16 * 4 = 8192 bytes of storage space, but it actually requires 20504, 16384, or 8192 bytes depending on the target platform.
- On LinuxIntelMesa, images will always consume at least 8126 bytes. On LinuxAMDRADV, images will always consume at least 4096 bytes. On WindowsNVIDIAProp, images will always consume at least 512 bytes. On WindowsAMDProp, images will always consume at least 256 bytes.
- 2x4096 and 4x4096 sized images require the same amount of storage on all surveyed platforms. This is true for some platforms all the way up to 64x4096!
- On LinuxIntelMesa, storage sizes vary a lot, and are often not powers of two. On the other platforms, storage sizes are always a power of two.

The raw datasets are available:
With all of this data, it suffices to say that it is not possible for an allocator to use any kind of statically-determined, platform-independent size-based bucketing policy; the storage and alignment requirements for any given image differ wildly across platforms and seem to bear very little relation to the dimensions of the images.
However, textures are fairly complex in the sense of having lots of different properties such as format, number of layers, number of mipmaps, tiling mode, etc. We know that most GPUs have hardware specifically dedicated to texture operations, and so we can infer that a lot of the odd storage and alignment restrictions might be due to the idiosyncrasies of that hardware.
Vulkan developers also work with buffers, which can more or less be thought of as arrays that live in device memory. Do buffers also have the same storage and alignment oddities in practice? I wrote another program that requests memory requirements for a range of different sizes of buffer and ran it on the set of platforms above. I requested a buffer with a usage of type VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, although trying different usage flags didn't seem to change the numbers returned on any platform, so we can probably assume that the values will be fairly consistent for all usage flags.
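A buffer probe can be sketched along the same lines as the image probe (the range of sizes shown here is illustrative, not the exact range used):

/* Probe: query the memory requirements of storage buffers of increasing
   size and print them as CSV. */
for (VkDeviceSize bufferSize = 1; bufferSize <= 65536; bufferSize *= 2) {
  VkBufferCreateInfo info = {
    .sType       = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .size        = bufferSize,
    .usage       = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE
  };

  VkBuffer buffer;
  if (vkCreateBuffer(device, &info, NULL, &buffer) != VK_SUCCESS) {
    continue;
  }

  VkMemoryRequirements requirements;
  vkGetBufferMemoryRequirements(device, buffer, &requirements);
  printf("%llu,%llu,%llu\n",
    (unsigned long long) bufferSize,
    (unsigned long long) requirements.size,
    (unsigned long long) requirements.alignment);

  vkDestroyBuffer(device, buffer, NULL);
}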
The following graph shows the alignment requirements for a range of buffer sizes on every driver (click for full-size image):
Only one observation can be made from this data:
The following graph shows the storage requirements for a range of buffer sizes on every driver (click for full-size image):
Some observations can be made from this data:
- On LinuxAMDRADV and WindowsNVIDIAProp, requesting a buffer of less than 16 bytes simply results in an allocation of 16 bytes.
- Above roughly 1000 bytes (the threshold will almost certainly turn out to be 1024), the required storage size is exactly equal to the requested buffer size.

The raw datasets are available:
So, in practice, on these particular platforms, buffers do not appear to have such a wide range of storage and alignment requirements.
By combining some of the measurements we've seen so far, and by seeing what guarantees the Vulkan spec gives us, we can try to put together a set of assumptions that might help in designing a memory allocation system that can satisfy all of the fairly painful requirements Vulkan imposes.
Firstly, I believe that allocations for textures should be treated separately from allocations for buffers.
For textures: On the platforms we surveyed, textures have alignment requirements that fall within the integer powers of two in the range [2⁸, 2¹⁶]. We could therefore divide the heap into allocations based on alignment size and memory type. On the platforms we surveyed, this would effectively avoid creating too many allocations, because there were at most three different alignment values on a given platform.
When a texture is required that has an alignment size S and memory type T, we sub-allocate from an existing allocation that has been created for alignment size S and memory type T, or create a new one if either the existing allocations are full, or no allocations exist. Within an allocation, we can track blocks of size S. By working in terms of blocks of size S, we guarantee that sub-allocations always have the correct alignment. Additionally, by fixing S on a per-allocation basis, we reduce wasted space: There will be no occurrences of small, unaligned sub-allocations breaking up contiguous free space and preventing larger aligned sub-allocations from being created.
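A rough sketch of the data involved in such a scheme (the struct and helpers are illustrative, not a finished allocator):

/* Allocations are keyed by (alignment S, memory type T); space inside an
   allocation is handed out in whole blocks of size S. */
typedef struct {
  VkDeviceSize   block_size;   /* S: the alignment, and the block granularity */
  uint32_t       memory_type;  /* T: the Vulkan memory type index             */
  VkDeviceMemory memory;       /* the backing vkAllocateMemory allocation     */
  VkDeviceSize   size;         /* total size of the backing allocation        */
  uint8_t       *block_used;   /* one flag per block: is the block in use?    */
} allocation_t;

/* Round a requested resource size up to a whole number of S-sized blocks. */
static VkDeviceSize
blocks_required(VkDeviceSize resource_size, VkDeviceSize block_size)
{
  return (resource_size + block_size - 1) / block_size;
}

/* The byte offset of a block is always a multiple of S, so a resource
   placed at a block boundary automatically satisfies its alignment. */
static VkDeviceSize
block_offset(const allocation_t *allocation, VkDeviceSize block_index)
{
  return block_index * allocation->block_size;
}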
We could choose to also group allocations by texture size, so allocations would be created for a combination of alignment size S, memory type T, and texture size P. I think this would likely be a bad idea unless the application only used a single texture size; in applications that used a wide range of texture sizes, this would result in a large number of different allocations being created, and it's possible the allocation count limit could be reached.
In terms of sizing the allocations used for textures, we can simplify the situation further if we are willing to limit the maximum size of textures that the application will accept. We can see from the existing data that a 4096x4096 texture using four bytes per pixel will require just over 64mb of storage space. Many existing GPUs are capable of using textures at sizes of 8192x8192 and above. We could make the simplifying assumption that any textures over, say, 2048x2048 are classed as humongous and would therefore use a different allocation scheme. The Java virtual machine takes a similar approach for objects that have a size that is over a certain percentage of the heap size.

If we had an 8gb heap and divided it up into 32mb allocations, we could cover the entire heap in around 250 allocations, and each allocation would be able to store a 2048x2048 texture with room to spare. The same heap divided into 128mb allocations would need just over 62 allocations to cover the entire heap. A 128mb allocation would easily hold at least one 4096x4096 texture. However, the larger the individual allocations, the more likely it is that the entirety of the heap could be used up before allocations could be created for all the required combinations of S and T. We can derive a rough heuristic for the allocation size for a heap of size H, where the maximum allowed size for a resource is M:
∃d. H / d ≃ K, size = max(M, d)
That is, there's some allocation size d that will divide the heap into roughly K parts. The maximum allowed size of a resource is M. Therefore, the size used for allocations should be whichever of d or M is larger. If we choose K = 62 and are satisfied with resources that are at most 64mb, then size = max(M, d) = max(64000000, 133333333) = 133333333.
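Expressed as code, the heuristic is just the following sketch (using the same symbols as above):

/* Sizing heuristic: divide the heap of size H into roughly K allocations,
   but never make an allocation smaller than the largest single resource M
   that must fit contiguously. */
static VkDeviceSize
allocation_size_for_heap(VkDeviceSize heap_size,      /* H */
                         VkDeviceSize max_resource,   /* M */
                         VkDeviceSize target_count)   /* K */
{
  const VkDeviceSize d = heap_size / target_count;
  return (max_resource > d) ? max_resource : d;
}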
We could simplify matters further by requiring that the application provide up-front hints as to the range of texture sizes and formats that it is going to use (and the ways in which those textures are going to be used). This would be an impossibly onerous restriction for a general-purpose malloc(), but it's perfectly feasible for a typical rendering engine.
This would allow us to evaluate the memory requirements of all the combinations of S and T that are likely to be encountered when the application runs, and to try to arrange for an optimal set of allocations of sizes suitable for the system's heap size. Obviously, the ideal situation for this kind of allocator would be for the application to use exactly one size of texture, and to use those textures in exactly one way. This is rarely the case for real applications!
Within an allocation, we would take care to sub-allocate blocks using a best-fit algorithm in order to reduce fragmentation. Most best-fit algorithms run in O(N) time over the set of free spaces, but the size of this set can be kept small by merging adjacent free spaces when deallocating sub-allocations.
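A best-fit search over the free block runs within an allocation might look like this sketch (the structures are illustrative, and the merge-on-free step is not shown):

/* Best-fit sub-allocation over block-granular free ranges. */
typedef struct {
  VkDeviceSize first_block;   /* index of the first free block in the run */
  VkDeviceSize block_count;   /* number of contiguous free blocks         */
} free_range_t;

typedef struct {
  free_range_t *ranges;       /* kept sorted by first_block */
  size_t        range_count;
} free_list_t;

/* Best fit: an O(N) scan for the smallest free range that still fits. */
static int
find_best_fit(const free_list_t *list,
              VkDeviceSize blocks_needed,
              size_t *out_index)
{
  int found = 0;
  VkDeviceSize best = 0;
  for (size_t i = 0; i < list->range_count; ++i) {
    const VkDeviceSize count = list->ranges[i].block_count;
    if (count >= blocks_needed && (!found || count < best)) {
      best = count;
      *out_index = i;
      found = 1;
    }
  }
  return found;
}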
For humongous textures, the situation is slightly unclear. Unless the application is routinely texturing all of its models with massive images, then those humongous textures are likely to be render targets. If they aren't render targets, then the application likely has bigger problems! I suspect that the right thing to do in this case is to simply reject the allocation and tell the user "if you want to allocate a render target, then use this specific API for doing so". The render target textures can then be created as dedicated allocations and won't interfere with the rest of the texture memory allocation scheme.
For buffers: The situation appears to be much simpler. On all platforms surveyed, the alignment restrictions for buffers fall within a small range of small powers of two, and don't appear to change based on the buffer parameters at all. We can use the same kind of S and T based bucketing scheme, but be happy in the knowledge that all of our created allocations will probably have the same S value.
I'm going to start work on a Java API to try to codify all of the above. Ideally there would be an API to examine the current platform and suggest reasonable allocation defaults, and a separate API to actually manage the heap(s) according to the chosen values. The first API would work along the lines of "here's the size of my heap, here are the texture sizes and formats I'm going to use; give me what you think are sensible allocation sizes".
There'll also need to be some introspection tools to measure properties such as contiguous space usage, fragmentation, etc.
Compaction and defragmentation is a topic I've not covered. It doesn't really seem like there's much to it other than "take all allocations and then sort all sub-allocations by size to free up space at the ends of allocations". It's slightly harder to actually implement because there will be existing Vulkan resources referring to memory that will have "moved" after defragmentation. The difficulty is really just a matter of designing a safe API around it.
I'm shutting down my mail server and moving to fastmail. If everything works correctly, nothing should appear to be any different to the outside world; all old email addresses will continue to work.
Edit: This is wrong.
Wasted a day getting aggravated at the options for producing per-platform distributions for Java applications.
I have lots of applications. None of them are platform-specific in any way; the code is very much "write once, run anywhere". Additionally, they are fully modularized (and, indeed, probably only work correctly if everything is on the module path). They do, however, have some dependencies that are only automatic modules (but that nevertheless work correctly when placed on the module path).
Historically, Java applications were distributed either as a single executable jar file containing the entire bytecode of the application, or as a set of jar files in a directory such that the program is executed by running something equivalent to:
$ java -cp lib/* com.io7m.example.main
This has an obvious issue: You don't know which version of Java you're going to get when you run java. If you're not running the above on a command line, but via some kind of frontend script (or by double-clicking a desktop shortcut), and the Java version you have isn't compatible, you're likely just going to silently fail and do nothing. Users will burn you at the stake. Aside from this glaring problem, things are otherwise perfect:
Unfortunately, since the death of runtime delivery platforms such as Java Web Start, the only remaining way to deal with the "we don't know what runtime we might have" problem is to make your application distributions platform-specific and distribute a Java runtime along with your application.
Thankfully, there are APIs such as Foojay that allow for automatically downloading Java runtimes for platforms. These APIs can be used with, for example, the JDKs Maven Plugin to efficiently fetch Java runtimes and package them up with your application.
You can, therefore, have a Linux-specific distribution that contains your application's jar files in a lib subdirectory, and some kind of shell script that runs your included Java runtime with all of the jars in lib placed on the module or class path as necessary. You can obviously have a similar Windows-specific distribution that has the same arrangement but with a .bat file that serves the same purpose.
This maintains the advantage of the "historical" distribution method in that your build system remains completely platform independent. It does, however, gain the disadvantage that your build system no longer produces platform-independent artifacts. Despite Java being a very consciously platform-independent language, we're back to having to have platform-specific application distributions. I've decided I can live with this.
Worse, though, people want things like .exe files that can be double-clicked on Windows. Ideally, they want those .exe files to have nice icons and to show meaningful values when looking at the properties:
A .bat file won't give you that on Windows. Additionally, Windows has things like the Process Explorer that will show all of your applications as being java.exe with a generic Java icon. Great.
On Linux, the whole issue is somewhat less of a problem because at least one of the following will probably be true:
So let's assume that I'm willing to do the following:
- My main application project will continue to produce a completely platform-independent distribution; I'll put the jar files that make up the application into a zip file. There is a hard requirement that the build use no platform-specific tools, produce byte-for-byte identical outputs on any platform, and be possible to complete on a single machine. The code can be executed on multiple different platforms during the build for automated testing purposes, but one-and-exactly-one machine is responsible for producing the build artifacts that will be deployed to a repository somewhere. Users can, if they want, use this distribution directly on their own systems by running it with their installed java commands. It's their responsibility to use the right version, and to deal with the consequences of getting it wrong.
- I'll maintain separate projects that take those platform-independent artifacts and repackage them as necessary using platform-specific tools. It must be possible for the platform-specific build for platform P to conclude in a single build, on a single machine running platform P. These platform-specific distributions will treat users as being functionally illiterate and must mindlessly work correctly via a single double-clickable entry point of some kind.
Why do I insist on having each build run to completion and produce something useful on a single machine? Because coordinating distributed systems is hard, and trying to guarantee any kind of atomicity with regards to releasing and deploying code over them is a fool's errand. At least if the platform-specific builds happen in single shots on independent systems, we can make the independent releases and deployment of code on individual platforms somewhat atomic. I may not release Linux versions on the same day that I release Windows versions. Fine.
Additionally, I want the platform-specific distributions to feel like they actually belong on the platform they're running on. I want my application to look like MyApplication.exe in the Windows Process Explorer; I don't want to see java.exe and a Duke icon.
So what are the options?
Well, the OpenJDK people have been going on about jlink for ages now. Unfortunately, jlink falls over at the very first hurdle: It can't work with automatic modules. This means that all of my dependencies have to be modularized, and I know that at least some of them never will be. There are tools like moditect that claim to be able to somewhat automatically modularize existing jars, but the issue is that this takes the resulting bytecode artifacts further and further from the code that was actually tested during the platform-independent build; any rewriting of code introduces the potential for bugs that the test suite won't have the opportunity to find.
This is unacceptable. Ultimately, using jlink means that either your application runs in class path mode (which mine will not), or your application and all of its transitive dependencies have to be fully modularized and all of the modules have to be statically compiled into the resulting runtime image as jmod files. I've tried, but this isn't workable.
Moving on, the OpenJDK project now has the jpackage tool that claims to be capable of producing platform-specific application packages.
Unfortunately, it fails in a number of ways. It suffers from the exact same failure modes as jlink with regards to requiring complete modularization. Additionally, on Windows, it requires the installation of elderly versions of wix that are awkward to get working correctly in CI systems due to requiring PATH mangling and other magic. These issues ultimately made jpackage a dead end. Annoyingly, it looked like jpackage would have gotten me there, as it does produce executables that have the correct names in Process Explorer, and does apparently allow for setting icon resources and the like.
Other systems exist, such as winrun4j and launch4j. Unfortunately, these are more or less unmaintained. Additionally, they don't know anything about Java modules as they pre-date the JPMS by many years. They ultimately demand your application run in class path mode. So, those are out too.
I toyed around with creating a kind of launcher.exe that simply executed a non-jlinked Java runtime included with the platform distribution, with the right command-line arguments to put everything onto the module path.
This became a comedy of errors. I first attempted to write the launcher program in Java and AOT-compile it with GraalVM. This required the installation of Visual Studio, and after being unable to get it to actually find the Visual C++ compiler, I gave up. It became clear, anyway, that this wasn't going to be of any use, as the resulting executables don't allow for custom icons without using hacky closed-source third-party executable editor tools. There's a ticket about this that was closed without comment. Nice.
I then decided to try writing a launcher in C++ by simply downloading a portable development kit during the build and compiling a C++ program (with the correct resource script to include a nice icon and executable metadata). This didn't exactly fail, but it ultimately didn't achieve what I wanted: The launcher can execute a Java runtime, but then we're back to the original problem of the program appearing as java.exe with a generic icon in process listings (because that's exactly what it is; it's just the java executable from the included runtime).
Ultimately, I gave up.
What can be done about this?
I'd really like some options for jpackage that work like this:
$ jpackage \
  --type app-image \
  --name MyApplication \
  --icon icon.png \
  --use-jdk-directly some-platform-specific-jdk \
  --use-jars-on-module-path lib
This would:
- Use the runtime in some-platform-specific-jdk directly, without trying to minimize the included modules by using jlink to work out which are needed. Just give me all of them, I don't care.
- Produce an executable called MyApplication with a nice icon given in icon.png.
- Take the jar files in lib and include them in a directory in the resulting app image directly, without touching them. The executable should be configured internally to give the same results as if I had run java with -p lib.

I feel like the documentation almost suggests that this is already possible, but I just couldn't get it to work. The tool always tried to analyze the modules of my application, and would then loudly complain about automatic modules and fail, rather than just shutting up and producing an executable that placed the jar files on an included module path instead.
This would, presumably, work exactly as well on Windows as on Linux. I don't care about Mac support. It would also be great if it didn't use obsolete tools to produce executables.