Every application deals with data, and within certain domains, the handling of data dominates everything. It doesn’t matter how fancy your “web-scale dependency-injected synergetic micro-service” [1] is if you don’t manage data correctly. Even so, dealing with data in Java turns out to be harder than you’d expect.
Let’s take a deep dive into a few of the available options.
Minimalism
The most basic way to store data is to use a plain old Java object (POJO) with the bare necessities:
class Flight {
    String airline;
    String id;
    String destination;
    Set<String> delays;
}
What is the problem with this approach? Simple classes like this are not equivalent to `struct`s in languages like C/Go/Rust; you can’t make very many assumptions about their behavior.
The most apparent issue is that it is quite error-prone to initialize this data structure. Let’s have a look at an example:
Flight flight = new Flight();
flight.airline = "BA";
flight.id = "1759";
flight.destination = "Birmingham";
Have we initialized all of the fields? If `delays` is empty (unlikely), should we set it to `null`, `Collections.emptySet()`, or a mutable set? If we add a field in the future, will we remember to set it? Does this result in a reasonable `.toString()`, `.hashCode()`, or `.equals()`?
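To make those questions concrete, here is a small sketch of what the bare class actually does. The wrapper class `BareFlightDemo` and the nested name `BareFlight` are made up for this illustration; the fields mirror the `Flight` class above.

```java
import java.util.Set;

final class BareFlightDemo {
    // Same shape as the Flight class above; the name is hypothetical.
    static class BareFlight {
        String airline;
        String id;
        String destination;
        Set<String> delays;
    }

    public static void main(String[] args) {
        BareFlight a = new BareFlight();
        a.airline = "BA";
        a.id = "1759";
        a.destination = "Birmingham";

        BareFlight b = new BareFlight();
        b.airline = "BA";
        b.id = "1759";
        b.destination = "Birmingham";

        // delays was never assigned, so it silently defaults to null.
        System.out.println(a.delays == null); // true

        // equals() is inherited from Object and compares identity, not fields.
        System.out.println(a.equals(b)); // false

        // toString() is inherited too: class name plus an identity hash.
        System.out.println(a);
    }
}
```

Two flights with identical fields are not equal, and the forgotten `delays` field only shows up later as a `NullPointerException` somewhere far from the place it should have been set.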
Encapsulation
Some common patterns have emerged to address these and related problems:
- Create a constructor to offer a place to put validation of — and constraints on — the data.
- Expose properties as “getters” à la `airline()` (or the older style `getAirline()`) so that properties can be mapped to/from different underlying fields.
- Hide the object constructor and expose factory methods that allow you to create instances in different ways.
- Make the fields immutable to reduce the API surface and aid with concurrent usage patterns.
- Create a reasonable `.toString()`/`.hashCode()`/`.equals()` implementation using all of the fields.
The resulting structure looks something like this:
class Flight {
    private final String airline;
    // ... other fields

    private Flight(String airline/*, other fields */) {
        this.airline = airline;
        // ... other fields
    }

    public String airline() { return airline; }
    // ... other fields

    @Override
    public String toString() {
        return "Flight{airline=" + airline /* + other fields */ + "}";
    }

    @Override
    public boolean equals(Object that) { /* ... */ }

    @Override
    public int hashCode() { /* ... */ }
}
This is a lot of boilerplate! The class is a lot safer to use, but there is a new burden on the developer to type out all of that code, not make any mistakes, and keep everything up to date when new fields are added or changed.
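For a sense of scale, here is one way the elided parts could be written out by hand for all four fields. This is only a sketch: the factory name `of`, the null checks, and the use of `Set.copyOf` (Java 10+) are my own choices for the illustration, not the only reasonable ones.

```java
import java.util.Objects;
import java.util.Set;

final class ValueFlightDemo {
    static final class Flight {
        private final String airline;
        private final String id;
        private final String destination;
        private final Set<String> delays;

        private Flight(String airline, String id, String destination, Set<String> delays) {
            this.airline = Objects.requireNonNull(airline, "airline");
            this.id = Objects.requireNonNull(id, "id");
            this.destination = Objects.requireNonNull(destination, "destination");
            // Unmodifiable defensive copy; throws NullPointerException on null.
            this.delays = Set.copyOf(delays);
        }

        // Hypothetical factory method; the constructor stays private.
        static Flight of(String airline, String id, String destination, Set<String> delays) {
            return new Flight(airline, id, destination, delays);
        }

        public String airline() { return airline; }
        public String id() { return id; }
        public String destination() { return destination; }
        public Set<String> delays() { return delays; }

        @Override
        public boolean equals(Object that) {
            if (this == that) return true;
            if (!(that instanceof Flight)) return false;
            Flight other = (Flight) that;
            return airline.equals(other.airline)
                && id.equals(other.id)
                && destination.equals(other.destination)
                && delays.equals(other.delays);
        }

        @Override
        public int hashCode() {
            return Objects.hash(airline, id, destination, delays);
        }

        @Override
        public String toString() {
            return "Flight{airline=" + airline + ", id=" + id
                + ", destination=" + destination + ", delays=" + delays + "}";
        }
    }

    public static void main(String[] args) {
        Flight a = Flight.of("BA", "1759", "Birmingham", Set.of());
        Flight b = Flight.of("BA", "1759", "Birmingham", Set.of());
        System.out.println(a.equals(b)); // true
        System.out.println(a); // Flight{airline=BA, id=1759, destination=Birmingham, delays=[]}
    }
}
```

Every new field has to be threaded through the constructor, the factory, the accessor, `equals`, `hashCode` and `toString`; forgetting any one of them compiles just fine.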
Code generation
There are several libraries to help solve this problem. Some of them use reflection or other “interesting” hacks to help out at runtime. Others are tools that run at compile time and generate all of the necessary boilerplate. This is both safer and more performant: the generated code is ordinary Java that the compiler checks, and there is no reflective machinery at runtime to cost time or hide bugs.
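To illustrate what the runtime flavor looks like, here is a rough sketch of a reflection-based `toString()` helper. It is not taken from any particular library; `describe` and the nested `Flight` class are hypothetical names for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.StringJoiner;

final class ReflectiveToStringDemo {
    // Builds a string from whatever instance fields the object's class declares.
    // All of this work happens on every call, at runtime.
    static String describe(Object value) {
        Class<?> type = value.getClass();
        StringJoiner joiner = new StringJoiner(", ", type.getSimpleName() + "{", "}");
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            try {
                field.setAccessible(true); // may be refused, e.g. across module boundaries
                joiner.add(field.getName() + "=" + field.get(value));
            } catch (ReflectiveOperationException | RuntimeException e) {
                joiner.add(field.getName() + "=<inaccessible>");
            }
        }
        return joiner.toString();
    }

    static final class Flight {
        private final String airline;
        private final String id;

        Flight(String airline, String id) {
            this.airline = airline;
            this.id = id;
        }
    }

    public static void main(String[] args) {
        // Field order is whatever getDeclaredFields() returns; the spec does not guarantee it.
        System.out.println(describe(new Flight("BA", "1759")));
    }
}
```

It works, but failures surface only when the code runs, which is exactly the class of problem that compile-time generation avoids.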
The libraries I have used in the past include:
Lombok is a compiler plugin for the various versions of `javac`. I’ve found it to be very unreliable/unstable, and it requires plugins for most of the major IDEs (like IntelliJ IDEA or Eclipse). As a result, I’ve always preferred not to use it for big projects.
`@AutoMatter` uses annotation processing on an interface and generates a few implementations of that interface (such as a builder and value class) that you can use in your code. The boilerplate above is thus reduced to simply:
@AutoMatter
public interface Flight {
    String airline();
    String id();
    String destination();
    Set<String> delays();
}
You can use the generated classes like so:
Flight flight = new FlightBuilder()
    .airline("SAS")
    .id("SK903")
    .destination("EWR")
    .build();
System.out.println("Airline: " + flight.airline());
System.out.println("ID: " + flight.id());
System.out.println("Destination: " + flight.destination());
This is in general quite nice, and removes a lot of the burden from the developer. However, there are still some problems:
- Closed world assumption: Since you are now using an `interface`, users are free to make many different implementations of `Flight`, so you can’t assume that if `f instanceof Flight`, it will also behave like a value type.
- Data invariants: You have no control over various constraints that should apply. It might be the case that if `airline` is `SAS`, the `id` must start with `SK`. It is nice to be able to encode that in the code somehow.
- API surface control: Since `@AutoMatter` generates all of the classes, you as a developer have no control of the object’s API surface. You are at the mercy of what the `@AutoMatter` library has chosen to implement.
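The first two points can be shown without any library at all. In the sketch below, `MutableFlight` is a hypothetical implementation that somebody else might write against a trimmed-down `Flight` interface; nothing in the type system stops them.

```java
final class OpenWorldDemo {
    // Trimmed-down version of the interface above.
    interface Flight {
        String airline();
        String id();
    }

    // A perfectly legal implementation that is mutable, skips validation,
    // and keeps Object's identity-based equals().
    static final class MutableFlight implements Flight {
        String airline;
        String id;

        MutableFlight(String airline, String id) {
            this.airline = airline;
            this.id = id;
        }

        @Override public String airline() { return airline; }
        @Override public String id() { return id; }
    }

    public static void main(String[] args) {
        Flight a = new MutableFlight("SAS", "SK903");
        Flight b = new MutableFlight("SAS", "SK903");

        System.out.println(a instanceof Flight); // true
        System.out.println(a.equals(b));         // false: no value semantics

        // Nothing enforces the "SAS ids start with SK" rule either.
        ((MutableFlight) a).id = "XX1";
        System.out.println(a.id());              // XX1
    }
}
```

Any code that receives a `Flight` has to either trust every implementation or defensively copy it into a known value class.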
`@AutoValue` solves these problems with the trade-off that a bit more boilerplate is needed:
@AutoValue
public abstract class Flight {
    Flight() {}

    public abstract String airline();
    public abstract String id();
    public abstract String destination();
    public abstract ImmutableSet<String> delays();

    public static Flight create(
            String airline,
            String id,
            String destination,
            ImmutableSet<String> delays) {
        if ("SAS".equals(airline) && !id.startsWith("SK")) {
            throw new IllegalArgumentException("...");
        }
        return new AutoValue_Flight(airline, id, destination, delays);
    }
}
Here are some interesting differences:
- By using an `abstract class` with a package-private constructor instead of an `interface`, it becomes possible to limit subclassing of the type.
- The generated class is not accessible to the user; you should create your own factory methods where it is possible to enforce invariants.
- You have complete control of the API surface. If you want a builder with a specific signature, you have to declare its interface.
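The pattern behind these points can be sketched in plain Java. Below, `HandWrittenFlight` is a hypothetical hand-written stand-in for the generated `AutoValue_Flight` class, reduced to two fields; the real generated code will differ in its details.

```java
import java.util.Objects;

final class HiddenSubclassDemo {
    // Public-facing type: abstract, with a package-private constructor so that
    // code outside this package cannot subclass it.
    abstract static class Flight {
        Flight() {}

        public abstract String airline();
        public abstract String id();

        // The only way to obtain an instance; invariants are checked here.
        static Flight create(String airline, String id) {
            Objects.requireNonNull(airline, "airline");
            Objects.requireNonNull(id, "id");
            if ("SAS".equals(airline) && !id.startsWith("SK")) {
                throw new IllegalArgumentException("SAS flight ids must start with SK: " + id);
            }
            return new HandWrittenFlight(airline, id);
        }
    }

    // Hypothetical stand-in for the generated class; hidden from callers.
    private static final class HandWrittenFlight extends Flight {
        private final String airline;
        private final String id;

        HandWrittenFlight(String airline, String id) {
            this.airline = airline;
            this.id = id;
        }

        @Override public String airline() { return airline; }
        @Override public String id() { return id; }

        @Override
        public boolean equals(Object that) {
            if (this == that) return true;
            if (!(that instanceof HandWrittenFlight)) return false;
            HandWrittenFlight other = (HandWrittenFlight) that;
            return airline.equals(other.airline) && id.equals(other.id);
        }

        @Override
        public int hashCode() { return Objects.hash(airline, id); }

        @Override
        public String toString() { return "Flight{airline=" + airline + ", id=" + id + "}"; }
    }

    public static void main(String[] args) {
        Flight ok = Flight.create("SAS", "SK903");
        System.out.println(ok); // Flight{airline=SAS, id=SK903}
        System.out.println(ok.equals(Flight.create("SAS", "SK903"))); // true
        try {
            Flight.create("SAS", "XX1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Callers only ever see `Flight` and its factory method, so every instance they can get hold of has passed the invariant check and has value semantics.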
`@AutoValue` is also, in general, more production-hardened than `@AutoMatter`: it has a plugin system, integrates with common libraries such as Guava, and handles corner cases such as generics/existential quantification/obscure primitive types quite well.
In summary, `@AutoValue` is my go-to tool for creating value types in Java, with `@AutoMatter` being useful in some cases when the boilerplate of `@AutoValue` becomes too much to bear.
[1] If you use any of those words unironically, this blog is not for you.