Java Packaging HOWTO

Table of Contents

Authors and contributing

The authors of this document are:

  • Mikolaj Izdebski, Red Hat

  • Nicolas Mailhot, JPackage Project

  • Stanislav Ochotnicky, Red Hat

  • Ville Skyttä, JPackage Project

  • Michal Srb, Red Hat

  • Michael Simacek, Red Hat

  • Marián Konček, Red Hat

This document is free software; see the license for terms of use, distribution and / or modification.

The source code for this document is available in git repository. Instructions for building it from sources are available in README file.

This document is developed as part of Javapackages community project, which welcomes new contributors. Requests and comments related to this document should be reported at Red Hat Bugzilla.

Before contributing please read the README file, which among other things includes basic rules which should be followed when working on this document. You can send patches to the Red Hat Bugzilla. They should be in git format and should be prepared against master branch in git. Alternatively you can also send pull requests at Github repository.

Abstract

This document aims to help developers create and maintain Java packages in Fedora. It does not supersede or replace Java Packaging Guidelines, but rather aims to document tools and techniques used for packaging Java software on Fedora.

1. Introduction

Clean Java packaging has historically been a daunting task. Lack of any standard addressing the physical location of files on the system combined with the common use of licensing terms that only allow free redistribution of key components as a part of a greater ensemble has let to the systematic release of self-sufficient applications with built-in copies of external components.

As a consequence applications are only tested with the versions of the components they bundle, a complete Java system suffers from endless duplication of the same modules, and integrating multiple parts can be a nightmare since they are bound to depend on the same elements - only with different and subtly incompatible versions (different requirements, different bugs). Any security or compatibility upgrade must be performed for each of those duplicated elements.

This problem is compounded by the current practice of folding extensions in the JVM itself after a while; an element that could safely be embedded in a application will suddenly conflict with a JVM part and cause subtle failures.

It is not surprising then that complex Java systems tend to fossilize very quickly, with the cost of maintaining dependencies current growing too high so fast people basically give up on it.

This situation is incompatible with typical fast-evolving Linux platform. To attain the aim of user- and administrator-friendly RPM packaging of Java applications a custom infrastructure and strict packaging rules had to be evolved.

1.1. Basic introduction to packaging, reasons, problems, rationale

This section includes basic introduction to Java packaging world to people coming from different backgrounds. The goal is to understand language of all groups involved. If you are a Java developer coming into contact with RPM packaging for the first time start reading Java developer section. On the other hand if you are coming from RPM packaging background an introduction to Java world is probably a better starting point.

It should be noted that especially in this section we might sacrifice correctness for simplicity.

1.2. For Packagers

Java is a programming language which is usually compiled into bytecode for JVM (Java Virtual Machine). For more details about the JVM and bytecode specification see JVM documentation.

1.2.1. Example Java Project

To better illustrate various parts of Java packaging we will dissect simple Java Hello world application. Java sources are usually organized using directory hierarchies. Shared directory hierarchy creates a namespace called package in Java terminology. To understand naming mechanisms of Java packages see Java package naming conventions.

Let’s create a simple hello world application that will execute following steps when run:

  1. Ask for a name.

  2. Print out Hello World from and the name from previous step.

To illustrate certain points we artificially complicate things by creating:

  • Input class used only for input of text from terminal.

  • Output class used only for output on terminal.

  • HelloWorldApp class used as main application.

Directory listing of example project
$ find .
.
./Makefile
./src
./src/org
./src/org/fedoraproject
./src/org/fedoraproject/helloworld
./src/org/fedoraproject/helloworld/output
./src/org/fedoraproject/helloworld/output/Output.java
./src/org/fedoraproject/helloworld/input
./src/org/fedoraproject/helloworld/input/Input.java
./src/org/fedoraproject/helloworld/HelloWorld.java

In this project all packages are under src/ directory hierarchy.

HelloWorld.java listing
Unresolved directive in introduction_for_packagers.adoc - include::{EXAMPLE}java_project/src/org/fedoraproject/helloworld/HelloWorld.java[]
Java packages
org/fedoraproject/helloworld/input/Input.java
org/fedoraproject/helloworld/output/Output.java
org/fedoraproject/helloworld/HelloWorld.java

Although the directory structure of our package is hierarchical, there is no real parent-child relationship between packages. Each package is therefore seen as independent. The above example makes use of three separate packages:

  • org.fedoraproject.helloworld.input

  • org.fedoraproject.helloworld.output

  • org.fedoraproject.helloworld

Environment setup consists of two main parts:

  • Telling JVM which Java class contains main() method.

  • Adding required JAR files on JVM classpath.

Compiling our project

The sample project can be compiled to a bytecode by Java compiler. Java compiler can be typically invoked from command line by command javac.

javac $(find -name '*.java')

For every .java file corresponding .class file will be created. The .class files contain Java bytecode which is meant to be executed on JVM.

One could put invocation of javac to Makefile and simplify the compilation a bit. It might be sufficient for such a simple project, but it would quickly become hard to build more complex projects with this approach. Java world knows several high-level build systems which can highly simplify building of Java projects. Among others, probably the most known are Apache Maven and Apache Ant.

See also Maven and Ant sections.

JAR files

Having our application split across many .class files would not be very practical, so those .class files are assembled into ZIP files with specific layout and called JAR files. Most commonly these special ZIP files have .jar suffix, but other variations exist (.war, .ear). They contain:

  • Compiled bytecode of our project.

  • Additional metadata stored in META-INF/MANIFEST.MF file.

  • Resource files such as images or localisation data.

  • Optionaly the source code of our project (called source JAR then).

They can also contain additional bundled software which is something we do not want to have in packages. You can inspect the contents of given JAR file by extracting it. That can be done with following command:

jar -xf something.jar

The detailed description of JAR file format is in the JAR File Specification.

Classpath

The classpath is a way of telling JVM where to look for user classes and 3rd party libraries. By default, only current directory is searched, all other locations need to be specified explicitly by setting up CLASSPATH environment variable, or via -cp (-classpath) option of the Java Virtual Machine.

Setting the classpath
java -cp /usr/share/java/log4j.jar:/usr/share/java/junit.jar mypackage/MyClass.class
CLASSPATH=/usr/share/java/log4j.jar:/usr/share/java/junit.jar java mypackage/MyClass.class

Please note that two JAR files are separated by colon in a classpath definition.

See official documentation for more information about classpath.

Wrapper scripts

Classic compiled applications use dynamic linker to find dependencies (linked libraries), whereas dynamic languages such as Python, Ruby, Lua have predefined directories where they search for imported modules. JVM itself has no embedded knowledge of installation paths and thus no automatic way to resolve dependencies of Java projects. This means that all Java applications have to use wrapper shell scripts to setup the environment before invoking the JVM and running the application itself. Note that this is not necessary for libraries.

1.2.2. Build System Identification

The build system used by upstream can be usually identified by looking at their configuration files, which reside in project directory structure, usually in its root or in specialized directories with names such as build or make.

Maven

Build managed by Apache Maven is configured by an XML file that is by default named pom.xml. In its simpler form it usually looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.myproject</groupId>
  <artifactId>myproject</artifactId>
  <packaging>jar</packaging>
  <version>1.0</version>
  <name>myproject</name>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

It describes project’s build process in a declarative way, without explicitly specifying exact steps needed to compile sources and assemble pieces together. It also specifies project’s dependencies which are usually the main point of interest for packagers. Another important feature of Maven that packagers should know about are plugins. Plugins extend Maven with some particular functionality, but unfortunately some of them may get in the way of packaging and need to be altered or removed. There are RPM macros provided to facilitate modifying Maven dependencies and plugins.

Ant

Apache Ant is also configured by an XML file. It is by convention named build.xml and in its simple form it looks like this:

<project name="MyProject" default="dist" basedir=".">
  <property name="src" location="src"/>
  <property name="build" location="build"/>
  <property name="dist" location="dist"/>

  <target name="init" description="Create build directory">
    <mkdir dir="${build}"/>
  </target>

  <target name="compile" depends="init"
        description="Compile the source">
    <javac srcdir="${src}" destdir="${build}"/>
  </target>

  <target name="dist" depends="compile"
        description="Generate jar">
    <mkdir dir="${dist}/lib"/>

    <jar jarfile="${dist}/myproject.jar" basedir="${build}"/>
  </target>

  <target name="clean" description="Clean build files">
    <delete dir="${build}"/>
    <delete dir="${dist}"/>
  </target>
</project>

Ant build file consists mostly of targets, which are collections of steps needed to accomplish intended task. They usually depend on each other and are generally similar to Makefile targets. Available targets can be listed by invoking ant -p in project directory containing build.xml file. If the file is named differently than build.xml you have to tell Ant which file should be used by using -f option with the name of the actual build file.

Some projects that use Apache Ant also use Apache Ivy to simplify dependency handling. Ivy is capable of resolving and downloading artifacts from Maven repositories which are declaratively described in XML. Project usually contains one or more ivy.xml files specifying the module Maven coordinates and its dependencies. Ivy can also be used directly from Ant build files. To detect whether the project you wish to package is using Apache Ivy, look for files named ivy.xml or nodes in the ivy namespace in project’s build file.

Make

While unlikely, it is still possible that you encounter a project whose build is managed by plain old Makefiles. They contain a list of targets which consist of commands (marked with tab at the begining of line) and are invoked by make target or simply make to run the default target.

1.2.3. Quiz for Packagers

At this point you should have enough knowledge about Java to start packaging. If you are not able to answer following questions return back to previous sections or ask experienced packagers for different explanations of given topics.

  1. What is the difference between JVM and Java?

  2. What is a CLASSPATH environment variable and how can you use it?

  3. Name two typical Java build systems and how you can identify which one is being used

  4. What is the difference between java and javac comands?

  5. What are contents of a typical JAR file?

  6. What is a pom.xml file and what information it contains?

  7. How would you handle packaging software that contains lib/junit4.jar inside source tarball?

  8. Name at least three methods for bundling code in Java projects

1.3. For Java Developers

Packaging Java software has specifics which we will try to cover in this section aimed at Java developers who are already familiar with Java language, JVM, classpath handling, Maven, pom.xml file structure and dependencies.

Instead we will focus on basic packaging tools and relationships between Java and RPM world. One of the most important questions is: What is the reason to package software in RPM (or other distribution-specific formats). There are several reasons for it, among others:

  • Unified way of installation of software for users of distribution regardless of upstream projects

  • Verification of authenticity of software packages by signing them

  • Simplified software updates

  • Automatic handling of dependencies for users

  • Common filesystem layout across distribution enforced by packaging standards

  • Ability to administer, monitor and query packages installed on several machines through unified interfaces

  • Distribution of additional metadata with the software itself such as licenses used, homepage for the project, changelogs and other information that users or administrators can find useful

1.3.1. Example RPM Project

RPM uses spec files as recipes for building software packages. We will use it to package example project created in previous section. If you did not read it you do not need to; the file listing is available here and the rest is not necessary for this section.

Directory listing
Makefile
src
src/org
src/org/fedoraproject
src/org/fedoraproject/helloworld
src/org/fedoraproject/helloworld/output
src/org/fedoraproject/helloworld/output/Output.java
src/org/fedoraproject/helloworld/input
src/org/fedoraproject/helloworld/input/Input.java
src/org/fedoraproject/helloworld/HelloWorld.java

We packed the project directory into file helloworld.tar.gz.

Example spec file
Unresolved directive in introduction_for_developers.adoc - include::{EXAMPLE}rpm_project/helloworld.spec[]

RPM spec files contain several basic sections:

Header, which contains:
  • Package metadata such as its name, version, release, license, …​

  • A Summary with basic one-line summary of package contents.

  • Package source URLs denoted with Source0 to SourceN directives.

    • Source files can then be referenced by %SOURCE0 to %SOURCEn, which expand to actual paths to given files.

    • In practice, the source URL shouldn’t point to a file in our filesystem, but to an upstream release on the web.

  • Patches - using Patch0 to PatchN.

  • Project dependencies.

    • Build dependencies specified by BuildRequires, which need to be determined manually.

    • Run time dependencies will be detected automatically. If it fails, you have to specify them with Requires.

    • More information on this topic can be found in the dependency handling section.

%description
  • Few sentences about the project and its uses. It will be displayed by package management software.

%prep section
  • Unpacks the sources using setup -q or manually if needed.

  • If a source file doesn’t need to be extracted, it can be copied to build directory by cp -p %SOURCE0 ..

  • Apply patches with %patch X, where X is the number of patch you want to apply. (You usually need the patch index, so it would be: %patch 0 -p1).

%build section
  • Contains project compilation instructions. Usually consists of calling the projects build system such as Ant, Maven or Make.

Optional %check section
  • Runs projects integration tests. Unit test are usually run in %build section, so if there are no integration tests available, this section is omitted.

%install section
  • Copies built files that are supposed to be installed into %{buildroot} directory, which represents target filesystem’s root.

%files section
  • Lists all files, that should be installed from %{buildroot} to target system.

  • Documentation files are prefixed by %doc and are taken from build directory instead of buildroot.

  • Directories that this package should own are prefixed with %dir.

%changelog
  • Contains changes to this spec file (not upstream).

  • Has prescribed format. To prevent mistakes in format, it is advised to use tool such as rpmdev-bumpspec from package rpmdevtools to append new changelog entries instead of editing it by hand.

To build RPM from this spec file save it in your current directory and run rpmbuild:

$ rpmbuild -bb helloworld.spec

If everything worked OK, this should produce RPM file ~/rpmbuild/RPMS/x86_64/helloworld-1.0-1.fc18.x86_64.rpm. You can use rpm -i or dnf install commands to install this package and it will add /usr/share/java/helloworld.jar and /usr/bin/helloworld wrapper script to your system. Please note that this specfile is simplified and lacks some additional parts, such as license installation.

Paths and filenames might be slightly different depending on your architecture and distribution. Output of the commands will tell you exact paths.

As you can see to build RPM files you can use rpmbuild command. It has several other options, which we will cover later on.

Other than building binary RPMs (-bb), rpmbuild can also be used to:

  • build only source RPMs (SRPMs), the packages containing source files which can be later build to RPMs (option -bs)

  • build all, both binary and source RPMs (option -ba)

See rpmbuild 's manual page for all available options.

1.3.2. Querying repositories

Fedora comes with several useful tools which can provide great assistance in getting information from RPM repositories.

dnf repoquery is a tool for querying information from RPM repositories. Maintainers of Java packages might typically query the repository for information like "which package contains the Maven artifact groupId:artifactId".

Find out which package provides given artifact
$ dnf repoquery --whatprovides 'mvn(commons-io:commons-io)'
apache-commons-io-1:2.4-9.fc19.noarch

The example above shows that one can get to commons-io:commons-io artifact by installing apache-commons-io package.

By default, dnf repoquery uses all enabled repositories in DNF configuration, but it is possible to explicitly specify any other repository. For example following command will query only Rawhide repository:

$ dnf repoquery --available --disablerepo \* --enablerepo rawhide --whatprovides 'mvn(commons-io:commons-io)'
apache-commons-io-1:2.4-10.fc20.noarch

Sometimes it may be useful to just list all the artifacts provided by given package:

$ dnf repoquery --provides apache-commons-io
apache-commons-io = 1:2.4-9.fc19
jakarta-commons-io = 1:2.4-9.fc19
mvn(commons-io:commons-io) = 2.4
mvn(org.apache.commons:commons-io) = 2.4
osgi(org.apache.commons.io) = 2.4.0

Output above means that package apache-commons-io provides two Maven artifacts: previously mentioned commons-io:commons-io and org.apache.commons:commons-io. In this case the second one is just an alias for same JAR file. See section about artifact aliases for more information on this topic.

Another useful tool is rpm. It can do a lot of stuff, but most importantly it can replace dnf repoquery if one only needs to query local RPM database. Only installed packages, or local .rpm files, can be queried with this tool.

Common use case could be checking which Maven artifacts provide locally built packages.

$ rpm -qp --provides simplemaven-1.0-2.fc21.noarch.rpm
mvn(com.example:simplemaven) = 1.0
mvn(simplemaven:simplemaven) = 1.0
simplemaven = 1.0-2.fc21

1.3.3. Quiz for Java Developers

  1. How would you build a binary RPM if you were given a source RPM?

  2. What is most common content of Source0 spec file tag?

  3. What is the difference between Version and Release tags?

  4. How would you apply a patch in RPM?

  5. Where on filesystem should JAR files go?

  6. What is the format of RPM changelog or what tool would you use to produce it?

  7. How would you install an application that needs certain layout (think ANT_HOME) while honoring distribution filesystem layout guidelines?

  8. How would you generate script for running a application with main class org.project.MainClass which depends on commons-lang jar?

2. Java Specifics in Fedora for Users and Developers

This section contains information about default Java implementation in Fedora, switching between different Java runtime environments and about few useful tools which can be used during packaging / development.

2.1. Java implementation in Fedora

Fedora ships with an open-source reference implementation of Java Standard Edition called OpenJDK. OpenJDK provides Java Runtime Environment for Java applications and set of development tools for Java developers.

From users point of view, java command is probably the most interesting. It is a Java application launcher which spawns Java Virtual Machine (JVM), loads specified .class file and executes its main method.

Here is an example how to run sample Java project from section Example Java Project:

$ java org/fedoraproject/helloworld/HelloWorld.class

OpenJDK provides a lot of interesting tools for Java developers:

  • javac is a Java compiler which translates source files to Java bytecode, which can be later interpreted by JVM.

  • jdb is a simple command-line debugger for Java applications.

  • javadoc is a tool for generating Javadoc documentation.

  • javap can be used for disassembling Java class files.

2.1.1. Switching between different Java implementations

Users and developers may want to have multiple Java environments installed at the same time. It is possible in Fedora, but only one of them can be default Java environment in system. Fedora uses alternatives for switching between different installed JREs/JDKs.

# alternatives --config java

There are 3 programs which provide 'java'.

  Selection    Command
  -----------------------------------------------
   1           java-17-openjdk.x86_64 (/usr/lib/jvm/java-17-openjdk-17.0.2.0.8-1.fc35.x86_64/bin/java)
*+ 2           java-11-openjdk.x86_64 (/usr/lib/jvm/java-11-openjdk-11.0.14.1.1-5.fc35.x86_64/bin/java)
   3           java-latest-openjdk.x86_64 (/usr/lib/jvm/java-18-openjdk-18.0.1.0.10-1.rolling.fc35.x86_64/bin/java)

Enter to keep the