How to Map Distinct Value Types Using Java Generics

Posted on March 4, 2015, in Common, Spotlight, with 6 comments

Occasionally a developer runs into a situation where values of arbitrary types have to be mapped within a single container. However, the Java collection API provides container-related parameterization only, which limits the type-safe usage of a HashMap, for example, to a single value type. But what if you want to mix apples and pears?

Luckily there is an easy design pattern that allows mapping distinct value types using Java generics, which Joshua Bloch describes as the typesafe heterogeneous container in his book Effective Java (second edition, Item 29).

Recently stumbling across some not altogether congenial solutions to this problem gave me the idea to explain the problem domain and elaborate on some implementation aspects in this post.

Map Distinct Value Types Using Java Generics

Consider, for the sake of example, that you have to provide some kind of application context that allows binding values of arbitrary types to certain keys. A simple non-type-safe implementation using String keys backed by a HashMap might look like this:

public class Context {

  private final Map<String,Object> values = new HashMap<>();

  public void put( String key, Object value ) {
    values.put( key, value );
  }

  public Object get( String key ) {
    return values.get( key );
  }

  [...]
}

The following snippet shows how this Context can be used in a program:

Context context = new Context();
Runnable runnable = ...
context.put( "key", runnable );

// several computation cycles later...
Runnable value = ( Runnable )context.get( "key" );

The drawback of this approach can be seen in the last line of the snippet, where a down cast is needed. Obviously this can lead to a ClassCastException if the key-value pair has been replaced by one with a different value type:

Context context = new Context();
Runnable runnable = ...
context.put( "key", runnable );

// several computation cycles later...
Executor executor = ...
context.put( "key", executor );

// even more computation cycles later...
Runnable value = ( Runnable )context.get( "key" ); // runtime problem

The cause of such problems can be difficult to trace as the related implementation steps might be spread wide apart in your application. To improve the situation, it seems reasonable to bind the value not only to its key but also to its type.

Common mistakes I saw in several solutions following this approach boil down more or less to the following Context variant:

public class Context {

  private final Map<String, Object> values = new HashMap<>();

  public <T> void put( String key, T value, Class<T> valueType ) {
    values.put( key, value );
  }

  public <T> T get( String key, Class<T> valueType ) {
    return ( T )values.get( key );
  }

  [...]
}

Again basic usage might look like this:

Context context = new Context();
Runnable runnable = ...
context.put( "key", runnable, Runnable.class );

// several computation cycles later...
Runnable value = context.get( "key", Runnable.class );

At first glance this code might give the illusion of being more type safe, as it avoids the down cast in the last line. But running the following snippet brings us back down to earth, as we still run into the ClassCastException scenario during the assignment in its last line:

Context context = new Context();
Runnable runnable = ...
context.put( "key", runnable, Runnable.class );

// several computation cycles later...
Executor executor = ...
context.put( "key", executor, Executor.class );

// even more computation cycles later...
Runnable value = context.get( "key", Runnable.class ); // runtime problem

So what went wrong?

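First of all, the down cast to T in Context#get is ineffective: type erasure replaces the unbounded type parameter T with Object, so the cast checks nothing at runtime and the compiler merely flags it as unchecked. The real cast to Runnable is inserted by the compiler at the call site, which is where the exception surfaces. But more importantly, the implementation does not use the type information provided to Context#put as part of the key. At most it serves as a superfluous cosmetic effect.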
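To see where the exception actually originates, the following minimal sketch may help. It mirrors the flawed Context variant from above; the names UncheckedContext, ErasureDemo and whereCastFails are hypothetical and chosen for illustration only. The helper reports the method in which the ClassCastException was raised.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;

// Hypothetical name; mirrors the flawed Context variant shown above.
class UncheckedContext {

  private final Map<String, Object> values = new HashMap<>();

  public <T> void put( String key, T value, Class<T> valueType ) {
    values.put( key, value ); // valueType is silently ignored
  }

  @SuppressWarnings( "unchecked" )
  public <T> T get( String key, Class<T> valueType ) {
    // After erasure T is Object, so this cast checks nothing at runtime.
    return ( T )values.get( key );
  }
}

class ErasureDemo {

  // Returns the name of the method in which the ClassCastException was raised.
  static String whereCastFails() {
    UncheckedContext context = new UncheckedContext();
    Runnable runnable = () -> {};
    Executor executor = Runnable::run; // an Executor that is not a Runnable
    context.put( "key", runnable, Runnable.class );
    context.put( "key", executor, Executor.class );
    try {
      // The compiler inserts the real cast to Runnable at this assignment.
      Runnable value = context.get( "key", Runnable.class );
      return "no exception for " + value;
    } catch( ClassCastException e ) {
      return e.getStackTrace()[ 0 ].getMethodName();
    }
  }

  public static void main( String[] args ) {
    System.out.println( whereCastFails() );
  }
}
```

The topmost stack frame belongs to the calling method rather than to get, which shows that the cast inside get never had a chance to fail.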

Typesafe Heterogeneous Container

Although the last Context variant did not work out very well, it points in the right direction. The question is how to properly parameterize the key. To answer this, take a look at a stripped-down implementation according to the typesafe heterogeneous container pattern described by Bloch.

The idea is to use the class type itself as key. Since Class is a parameterized type, it enables us to make the methods of Context type safe without resorting to an unchecked cast to T. A Class object used in this fashion is called a type token.

public class Context {

  private final Map<Class<?>, Object> values = new HashMap<>();

  public <T> void put( Class<T> key, T value ) {
    values.put( key, value );
  }

  public <T> T get( Class<T> key ) {
    return key.cast( values.get( key ) );
  }

  [...]
}

Note how the down cast within the Context#get implementation has been replaced with an effective dynamic variant. And this is how the context can be used by clients:

Context context = new Context();
Runnable runnable = ...
context.put( Runnable.class, runnable );

// several computation cycles later...    
Executor executor = ...
context.put( Executor.class, executor );

// even more computation cycles later...
Runnable value = context.get( Runnable.class );

This time the client code works without class cast problems, as it is impossible to replace a certain key-value pair with one of a different value type.

Where there is light, there must be shadow, where there is shadow there must be light. There is no shadow without light and no light without shadow…. Haruki Murakami

Bloch mentions two limitations to this pattern. ‘First, a malicious client could easily corrupt the type safety […] by using a class object in its raw form.’ To ensure the type invariant at runtime, a dynamic cast can be used within Context#put:

public <T> void put( Class<T> key, T value ) {
  values.put( key, key.cast( value ) );
}
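A short sketch of what this buys us, assuming a client that deliberately uses a raw Class key. The names CheckedContext, RawKeyDemo and rejectsMismatchedValue are hypothetical; the put and get methods are the ones from the text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name; Context variant with the dynamic cast in put.
class CheckedContext {

  private final Map<Class<?>, Object> values = new HashMap<>();

  public <T> void put( Class<T> key, T value ) {
    // Class#cast verifies the value against the key at runtime.
    values.put( key, key.cast( value ) );
  }

  public <T> T get( Class<T> key ) {
    return key.cast( values.get( key ) );
  }
}

class RawKeyDemo {

  // True if put rejects a value that does not match the raw key.
  @SuppressWarnings( { "unchecked", "rawtypes" } )
  static boolean rejectsMismatchedValue() {
    CheckedContext context = new CheckedContext();
    Class rawKey = Runnable.class; // raw type: the compiler only warns
    try {
      context.put( rawKey, "not a Runnable" );
      return false;
    } catch( ClassCastException expected ) {
      return true; // fails fast in put instead of much later in get
    }
  }

  public static void main( String[] args ) {
    System.out.println( rejectsMismatchedValue() );
  }
}
```

The point of the design is that the failure happens where the wrong value is stored, not several computation cycles later where it is read.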

The second limitation is that the pattern cannot be used on non-reifiable types (see Item 25, Effective Java). This means you can store value types like Runnable or Runnable[], but not List<Runnable>, in a type-safe manner.

This is because there is no particular class object for List<Runnable>. All parameterizations of List share the same List.class object. Bloch points out that there is no satisfactory workaround for this kind of limitation.
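The shared class object can be observed directly. The following is a minimal sketch; the names ErasedTypeDemo and shareClassObject are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ErasedTypeDemo {

  // True if both lists report the very same Class object at runtime.
  static boolean shareClassObject() {
    List<Runnable> runnables = new ArrayList<>();
    List<String> strings = new ArrayList<>();
    return runnables.getClass() == strings.getClass();
  }

  public static void main( String[] args ) {
    // Class<List<Runnable>> key = List<Runnable>.class; // does not compile
    System.out.println( shareClassObject() );
  }
}
```

Since both lists yield the same type token, a container keyed by Class cannot tell a List<Runnable> from a List<String>.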

But what if you need to store two entries of the same value type? While creating new type extensions just for the purpose of storing them in the type-safe container might be imaginable, it does not sound like the best design decision. Using a custom key implementation might be a better approach.

Multiple Container Entries of the Same Type

To be able to store multiple container entries of the same type we could change the Context class to use a custom key. Such a key has to provide the type information we need for the type-safe behaviour and an identifier for the distinction of the actual value objects.

A naive key implementation using a String instance as identifier might look like this:

public class Key<T> {

  final String identifier;
  final Class<T> type;

  public Key( String identifier, Class<T> type ) {
    this.identifier = identifier;
    this.type = type;
  }
}

Again we use the parameterized Class as hook to the type information. And the adjusted Context now uses the parameterized Key instead of Class:

public class Context {

  private final Map<Key<?>, Object> values = new HashMap<>();

  public <T> void put( Key<T> key, T value ) {
    values.put( key, value );
  }

  public <T> T get( Key<T> key ) {
    return key.type.cast( values.get( key ) );
  }

  [...]
}

A client would use this version of Context like this:

Context context = new Context();

Runnable runnable1 = ...
Key<Runnable> key1 = new Key<>( "id1", Runnable.class );
context.put( key1, runnable1 );

Runnable runnable2 = ...
Key<Runnable> key2 = new Key<>( "id2", Runnable.class );
context.put( key2, runnable2 );

// several computation cycles later...
Runnable actual = context.get( key1 );

assertThat( actual ).isSameAs( runnable1 );

Although this snippet works, the implementation is still flawed. The Key instance is used as lookup parameter in Context#get. Because Key inherits the identity-based equals and hashCode from Object, using two distinct instances of Key initialized with the same identifier and class, one with put and the other with get, would return null on get, which is not what we want.

Luckily this can be solved easily with an appropriate equals and hashCode implementation of Key, which allows the HashMap lookup to work as expected. Finally, one might provide a factory method for key creation to minimize boilerplate (useful in combination with static imports):

public static <T> Key<T> key( String identifier, Class<T> type ) {
  return new Key<>( identifier, type );
}
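Putting the pieces together, the equals and hashCode implementation might be sketched as follows. This is one possible shape, not the only one: using java.util.Objects, placing the factory method on Key itself, and the KeyDemo class are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class Key<T> {

  final String identifier;
  final Class<T> type;

  Key( String identifier, Class<T> type ) {
    this.identifier = identifier;
    this.type = type;
  }

  // Factory method from the text; hosting it on Key is an assumption.
  static <T> Key<T> key( String identifier, Class<T> type ) {
    return new Key<>( identifier, type );
  }

  @Override
  public boolean equals( Object obj ) {
    if( this == obj ) {
      return true;
    }
    if( !( obj instanceof Key ) ) {
      return false;
    }
    Key<?> other = ( Key<?> )obj;
    // Two keys are equal if both identifier and type token match.
    return Objects.equals( identifier, other.identifier )
        && Objects.equals( type, other.type );
  }

  @Override
  public int hashCode() {
    return Objects.hash( identifier, type );
  }
}

class Context {

  private final Map<Key<?>, Object> values = new HashMap<>();

  public <T> void put( Key<T> key, T value ) {
    values.put( key, value );
  }

  public <T> T get( Key<T> key ) {
    return key.type.cast( values.get( key ) );
  }
}

class KeyDemo {

  public static void main( String[] args ) {
    Context context = new Context();
    Runnable runnable = () -> {};
    context.put( Key.key( "id1", Runnable.class ), runnable );
    // A second, equal Key instance now finds the stored value.
    System.out.println( context.get( Key.key( "id1", Runnable.class ) ) == runnable );
  }
}
```

Including the type token in equals keeps keys with the same identifier but different types apart, so the cast in Context#get cannot be handed a value of the wrong type through an equal-looking key.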

Conclusion

‘The normal use of generics, exemplified by the collection APIs, restricts you to a fixed number of type parameters per container. You can get around this restriction by placing the type parameter on the key rather than the container. You can use Class objects as keys for such typesafe heterogeneous containers’ (Joshua Bloch, Item 29, Effective Java).

Given these closing remarks, there is nothing left to be added except for wishing you good luck mixing apples and pears successfully…

Frank Appel

6 Comments so far:

  1. Dominic Fox says:

    This technique is the basis for Octarine, in which type-bearing keys also act as lenses which can be composed to create pointers to values in nested containers. See http://github.com/poetix/octarine for details.

    • Frank Appel says:

      Thanks for sharing this link. Octarine seems pretty cool at first sight. Will have an in-depth look ASAP and see how it feels working with it ;-)

  2. Thanks for sharing this Frank!

    We’ve often run into similar situations as you and over time, very similar patterns emerged. For the first example (if you don’t need multiple values), we’re often using ClassToInstanceMap from Guava as it effectively solves that problem. It even offers mutable and immutable variants of that map type which comes in handy.

    For the use case of storing parameterized types, we’ve played around with different techniques and essentially came up with a combination of different approaches depending on which technologies are used. For very simple cases, we’ve used a combination of the ClassToInstanceMap with a custom Type key (instead of Class) which stores both raw and component type. In other cases when we had access to Guice and Guava, we’ve been using TypeTokens and TypeLiterals to solve this. In the end, it heavily depends on your coding style and how/where you use those constructs, as the usage of TypeLiterals requires an anonymous class to resolve the component type due to type erasure (see example in http://google.github.io/guice/api-docs/latest/javadoc/index.html?com/google/inject/TypeLiteral.html). We usually hide those type literals behind methods/constants to make the code more readable.

    Looking forward to the next blog posts, always a pleasure to read!

    • Frank Appel says:

      Hi Benny, thanks for your praise ;-) Guava’s ClassToInstanceMap even allows restricting the content of a map instance to subclasses of a particular type. This is because it has a single type parameter representing the upper bound on the types managed by the map. This means instead of mixing apples and pears, you may restrain the container to all types of apples only…

      Regarding type literals, it seems to be the technique Bloch refers to as ‘super type tokens’. However, he points out that there are limitations to this approach, described by Neal Gafter (http://gafter.blogspot.de/2007/05/limitation-of-super-type-tokens.html). It seems that the correctness of this approach can only be enforced at runtime (but not at compile time), which is probably why Bloch did not dwell on it in his chapter about generics.

    • Holger says:

      From my point of view the MultiTypedMap has a major drawback. To ensure type safety you can’t use the standard Map interface. Looking at the implementation there are several new get methods but the original put. This means the get is type safe while the put isn’t type safe.

      The fact that you need to declare new get/put methods to ensure type safety makes the use of the Map interface senseless.