
Local Type Inference Cheat Sheet for Java 10 and beyond!

Written by:

Stuart Marks

April 26, 2018


Welcome to the first in a new series of cheat sheets that we’ll be running on the Snyk blog. We’ll be providing content for you to print and pin up to help you be a better developer. In our first edition, hot on the heels of Java 10, we’ll be focusing on the much-talked-about type inference for local variables. You can download the PDF version by clicking the image above or this link! The main premise behind the local type inference feature is pretty simple: replace the explicit type in the declaration with the new reserved type name ‘var’ and the variable’s type will be inferred. So we could replace:

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

with:

var outputStream = new ByteArrayOutputStream();

And the type of outputStream will be inferred as ByteArrayOutputStream. Hang on, are we saying that Java is now allowing dynamic typing? Absolutely not! ALL type inference occurs at compile time, and the inferred types are baked into the bytecode by the compiler. At runtime, Java is as static as it’s ever been. Given the usage is so simple, this cheat sheet will focus on the most important aspect of local type inference: its practical usage. It will give guidance on when you should use explicit typing and when you should consider type inference.
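To make the "still static" point concrete, here is a minimal sketch. The class and method names are made up for illustration; only ByteArrayOutputStream comes from the example above. The call to size() compiles only because the compiler has already fixed the variable's type as ByteArrayOutputStream.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class; the names are illustrative, not from the article.
class StaticVarDemo {

    static int bytesWritten() {
        var outputStream = new ByteArrayOutputStream(); // inferred: ByteArrayOutputStream
        outputStream.write(42);
        // size() is declared on ByteArrayOutputStream, not on OutputStream,
        // so this call relies on the statically inferred type.
        return outputStream.size();
        // outputStream = "text"; // would be rejected by the compiler: incompatible types
    }

    public static void main(String[] args) {
        System.out.println(bytesWritten()); // prints 1
    }
}
```

Uncommenting the reassignment should produce a compile-time error rather than a runtime failure, which is the difference between inference and dynamic typing.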

Since I first wanted to write this cheat sheet, Stuart Marks, a JDK engineer at Oracle, has written the perfect article giving both coding principles and guidance on the use of type inference. So, when I wanted to create a cheat sheet, I headed straight over to Stuart to see if we could include his thoughts and condense them into a cheat sheet for developers to pin up and use daily! I would highly recommend you read Stuart’s article in full. It really is worth your time!

Principles

1. Reading code > Writing code.

Whether it takes you 10 minutes or 10 days to write a line of code, you’ll almost certainly be reading it for many years to come. Code is only maintainable and understandable in the future if it’s clear, concise, and, most importantly, contains all the information needed to understand its purpose. The goal is maximizing understandability.

2. Code should be clear from local reasoning.

Bake as much information as you can into your code to avoid a reader having to look through different parts of the code base in order to understand what’s going on. This can be through method or variable naming.

3. Code readability shouldn’t depend on IDEs.

IDEs can be great. I mean really great! They can make a developer more productive or more accurate with their development. Even so, code must be readable and understandable without relying on an IDE. Code is often read outside an IDE, and IDEs differ in how much information they provide the reader. Code should be self-revealing. It should be understandable on its face, without the need for assistance from tools.

The decision is yours.

The choice of whether to give a variable an explicit type or to let the Java compiler work it out for itself is a trade-off. On one hand you want to reduce clutter, boilerplate, and ceremony. On the other hand you don’t want to impair understandability of the code. The type declaration isn’t the only way to convey information to the reader. Other means include the variable’s name and the initializer expression. We should take all the available channels into account when determining, for each variable, whether it’s OK to omit the explicit type.

Guidelines

1. Choose variable names that provide useful information.

This is good practice in general, but it’s much more important in the context of var. In a var declaration, information about the meaning and use of the variable can be conveyed using the variable’s name. Replacing an explicit type with var should often be accompanied by improving the variable name. Sometimes it might be useful to encode the variable’s type in its name. For example:

// ORIGINAL
List<Customer> x = dbconn.executeQuery(query);

// GOOD
var custList = dbconn.executeQuery(query);

2. Minimize the scope of local variables.

Limiting the scope of local variables is described in Effective Java (3rd edition), Item 57. It applies with extra force if var is in use. The problem occurs when the variable’s scope is large, meaning that there are many lines of code between the declaration of the variable and its usage. As the code is maintained, changes to the variable’s type may end up producing different behaviour. For example, moving from a List to a Set might look OK, but does your code rely on ordering later on in the same scope? While types are always set statically, subtle differences between implementations of the same interface may trip you up. Instead of simply avoiding var in these cases, one should change the code to reduce the scope of the local variables, and only then declare them with var. Consider the following code:

var items = new HashSet<Item>(...);
items.add(MUST_BE_PROCESSED_LAST);
for (var item : items) { ... }

This code has a bug: a HashSet doesn’t have a defined iteration order, so MUST_BE_PROCESSED_LAST is not guaranteed to come last in the loop. However, the programmer is likely to fix this bug immediately, as the uses of the items variable are adjacent to its declaration. Now suppose that this code is part of a large method, with a correspondingly large scope for the items variable:

var items = new HashSet<Item>(...);

// ... 100 lines of code ...

items.add(MUST_BE_PROCESSED_LAST);
for (var item : items) { ... }

This bug now becomes much harder to track down as the line which tries to add an item to the end of the set isn’t close enough to the type declaration to make the bug obvious.
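One way to keep the scope small is to pull the declaration, the add and the loop into a short method of their own. The sketch below is only an illustration under a few assumptions: it uses String instead of the article's Item type, a made-up sentinel value, and an ArrayList so that insertion order is actually preserved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Item is replaced by String to keep the sketch self-contained.
class ScopeDemo {

    // Made-up sentinel standing in for MUST_BE_PROCESSED_LAST from the example above.
    static final String MUST_BE_PROCESSED_LAST = "LAST";

    // The declaration of items and every use of it fit on one screen,
    // so a reader can see at a glance that the collection is ordered.
    static List<String> processingOrder(List<String> input) {
        var items = new ArrayList<String>(input); // ArrayList keeps insertion order
        items.add(MUST_BE_PROCESSED_LAST);

        var order = new ArrayList<String>();
        for (var item : items) {
            order.add(item); // stand-in for the real processing step
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(processingOrder(List.of("b", "a"))); // prints [b, a, LAST]
    }
}
```

If someone later swapped the ArrayList for a HashSet here, the change and the ordering assumption would sit only a couple of lines apart.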

3. Consider var when the initializer provides sufficient information to the reader.

Local variables are often initialized with constructors. The name of the class being constructed is often repeated as the explicit type on the left-hand side. If the type name is long, the use of var provides concision without loss of information:

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

var outputStream = new ByteArrayOutputStream();

It’s also reasonable to use var in cases where the initializer is a method call rather than a constructor, such as var reader = Files.newBufferedReader(…) or var stringList = List.of("a", "b", "c").
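Here is a small, runnable sketch of the method-call case. The temporary file and the class and method names are assumptions made purely so the example can run on its own. In each declaration the name of the method being called tells the reader roughly what type comes back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.List;

// Hypothetical demo class; the temp file exists only to make the sketch runnable.
class FactoryVarDemo {

    static String firstLineRoundTrip() {
        var stringList = List.of("a", "b", "c");                 // inferred: List<String>
        try {
            var path = Files.createTempFile("var-demo", ".txt"); // inferred: Path
            try {
                Files.write(path, stringList);                   // one element per line
                try (var reader = Files.newBufferedReader(path)) { // inferred: BufferedReader
                    return reader.readLine();
                }
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineRoundTrip()); // prints a
    }
}
```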

4. Use var to break up chained or nested expressions with local variables.

Consider code that takes a collection of strings and finds the string that occurs most often. This might look like the following:

return strings.stream()
              .collect(groupingBy(s -> s, counting()))
              .entrySet()
              .stream()
              .max(Map.Entry.comparingByValue())
              .map(Map.Entry::getKey);

This code is correct, but it could be more readable if split over multiple statements, as shown:

Map<String, Long> freqMap = strings.stream()
                                   .collect(groupingBy(s -> s, counting()));
Optional<Map.Entry<String, Long>> maxEntryOpt = freqMap.entrySet()
                                                       .stream()
                                                       .max(Map.Entry.comparingByValue());
return maxEntryOpt.map(Map.Entry::getKey);

The author probably resisted doing so because the explicit typing looks extremely messy, distracting from the important code. Using var allows us to express the code more naturally without paying the high price of explicitly declaring the types of the intermediate variables:

var freqMap = strings.stream()
                     .collect(groupingBy(s -> s, counting()));
var maxEntryOpt = freqMap.entrySet()
                         .stream()
                         .max(Map.Entry.comparingByValue());
return maxEntryOpt.map(Map.Entry::getKey);

One might legitimately prefer the first snippet with its single long chain of method calls. However, in some cases it’s better to break up long method chains.
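For anyone who wants to try the var version, here it is wrapped in a method with the imports it needs. The class and method names are made up for this sketch, and the Optional<String> return type is an assumption, since the article's snippets don't show the enclosing method.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical wrapper class around the snippet above.
class MostFrequentDemo {

    static Optional<String> mostFrequent(List<String> strings) {
        // freqMap is inferred as Map<String, Long>: each string mapped to its count.
        var freqMap = strings.stream()
                             .collect(groupingBy(s -> s, counting()));
        // maxEntryOpt is inferred as Optional<Map.Entry<String, Long>>.
        var maxEntryOpt = freqMap.entrySet()
                                 .stream()
                                 .max(Map.Entry.comparingByValue());
        return maxEntryOpt.map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        System.out.println(mostFrequent(List.of("a", "b", "a"))); // prints Optional[a]
    }
}
```

The variable names freqMap and maxEntryOpt carry the information that the long explicit types would otherwise have provided.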

5. Don’t worry too much about “programming to the interface” with local variables.

A common idiom in Java programming is to construct an instance of a concrete type but to assign it to a variable of an interface type. For example:

List<String> list = new ArrayList<>();

If var is used, however, the concrete type is inferred instead of the interface:

// Inferred type of list is ArrayList<String>.
var list = new ArrayList<String>();

Code that uses the list variable can now form dependencies on the concrete implementation. If the variable’s initializer were to change in the future, this might cause its inferred type to change, causing errors or bugs to occur in subsequent code that uses the variable.

This is less of a problem when adhering to guideline 2. When the scope of the local variable is small, the risk of the concrete implementation “leaking” into subsequent code is limited.
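To show what such a dependency on the concrete type can look like, here is a minimal sketch. The class and method names are hypothetical. The calls to ensureCapacity and trimToSize exist on ArrayList but not on the List interface, so they compile only because var inferred the concrete type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ConcreteTypeDemo {

    static int fill() {
        var list = new ArrayList<String>(); // inferred: ArrayList<String>, not List<String>
        list.ensureCapacity(16);            // ArrayList-only method
        list.add("a");
        list.trimToSize();                  // ArrayList-only method

        // Had the initializer been a LinkedList, the two ArrayList-only calls
        // above would stop compiling. With a short method the breakage is easy to spot.
        List<String> view = list;           // the interface type is still available when wanted
        return view.size();
    }

    public static void main(String[] args) {
        System.out.println(fill()); // prints 1
    }
}
```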

6. Take care when using var with diamond or generic methods.

Both var and the “diamond” feature allow you to omit explicit type information when it can be derived from information already present. However, if used together they might end up omitting all the useful information the compiler needs to correctly narrow down the type you wish to be inferred.

Consider the following:

// ORIGINAL
PriorityQueue<Item> itemQueue = new PriorityQueue<Item>();
// OK: both declare variables of type PriorityQueue<Item>
PriorityQueue<Item> itemQueue = new PriorityQueue<>();
var itemQueue = new PriorityQueue<Item>();

// DANGEROUS: infers as PriorityQueue<Object>
var itemQueue = new PriorityQueue<>();

Generic methods have also employed type inference so successfully that it’s quite rare for programmers to provide explicit type arguments. Inference for generic methods relies on the target type if there are no actual method arguments that provide sufficient type information. In a var declaration, there is no target type, so a similar issue can occur as with diamond. For example,

// DANGEROUS: infers as List<Object>
var list = List.of();

With both diamond and generic methods, additional type information can be provided by actual arguments to the constructor or method, allowing the intended type to be inferred. This does add an additional level of indirection, but is still predictable. Thus,

// OK: itemQueue infers as PriorityQueue<String>
Comparator<String> comp = ... ;
var itemQueue = new PriorityQueue<>(comp);
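The sketch below puts the comparator case into runnable form and adds one more option: an explicit type argument on the generic method call. The class and method names are made up, and the by-length comparator is just an assumed stand-in for the elided comp above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class; byLength stands in for the elided comparator.
class DiamondVarDemo {

    static String shortestFirst() {
        Comparator<String> byLength = Comparator.comparing(String::length);
        var itemQueue = new PriorityQueue<>(byLength); // inferred: PriorityQueue<String>
        itemQueue.add("ccc");
        itemQueue.add("a");
        itemQueue.add("bb");
        // Assigning to String without a cast works only because the element type is String.
        String head = itemQueue.peek();
        return head;
    }

    static int explicitTypeArgument() {
        var names = List.<String>of();           // explicit type argument: List<String>
        var copy = new ArrayList<String>(names); // explicit type argument instead of diamond
        copy.add("wombat");
        String first = copy.get(0);
        return first.length();
    }

    public static void main(String[] args) {
        System.out.println(shortestFirst());        // prints a
        System.out.println(explicitTypeArgument()); // prints 6
    }
}
```

Whether an explicit type argument or a plain explicit declaration reads better is a judgment call. Either way the type appears somewhere on the line.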

7. Take care when using var with literals.

It’s unlikely that using var with literals will provide much advantage, as the type names are generally short. However, var is sometimes useful, for example, to align variable names.

There is no issue with boolean, character, long, and string literals. The type inferred from these literals is precise, and so the meaning of var is unambiguous. Particular care should be taken when the initializer is a numeric value, especially an integer literal. With an explicit type on the left-hand side, the numeric value may be silently widened or narrowed to types other than int. With var, the value will be inferred as an int, which may be unintended.

// ORIGINAL
boolean ready = true;
char ch = '\ufffd';
long sum = 0L;
String label = "wombat";
byte flags = 0;
short mask = 0x7fff;
long base = 17;

// GOOD
var ready = true;
var ch    = '\ufffd';
var sum   = 0L;
var label = "wombat";

// DANGEROUS: all infer as int
var flags = 0;
var mask = 0x7fff;
var base = 17;
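If you want to check what a literal infers to, the following sketch prints the boxed type of each variable. The class and helper names are hypothetical. The casts and the L suffix are shown only to demonstrate how the inferred type follows the initializer, and keeping the explicit type on the left-hand side, as in the original declarations, is probably the clearer choice.

```java
// Hypothetical demo class; boxedTypeOf is a helper invented for this sketch.
class LiteralVarDemo {

    // Boxing the argument reveals which primitive type the compiler inferred.
    static String boxedTypeOf(Object value) {
        return value.getClass().getSimpleName();
    }

    static String report() {
        var base = 17;             // inferred: int
        var wideBase = 17L;        // inferred: long, thanks to the L suffix
        var flags = (byte) 0;      // inferred: byte, because of the cast
        var mask = (short) 0x7fff; // inferred: short, because of the cast
        return boxedTypeOf(base) + " " + boxedTypeOf(wideBase) + " "
                + boxedTypeOf(flags) + " " + boxedTypeOf(mask);
    }

    public static void main(String[] args) {
        System.out.println(report()); // prints Integer Long Byte Short
    }
}
```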

Conclusion

Using var for declarations can improve code by reducing clutter, thereby letting more important information stand out. On the other hand, applying var indiscriminately can make things worse. Used properly, var can help improve good code, making it shorter and clearer without compromising understandability. When using var, ask yourself if you’ve made the code more ambiguous, or whether it’s clearly understandable without too much investigation.

Download the cheat sheet now!
