java generics project-valhalla

Projected generic specialization in Java 9 or later, vs List<int>: how will .remove() work?


Generic specialization, along with value types, is a projected feature of future JVMs; see the Valhalla project page.

Now, from what I understand, it would then become possible to declare a:

final List<int> myList = new ArrayList<>(); // for instance

However, in addition to the remove(Object) method inherited from the Collection interface, List defines another remove() overload that takes an int: the index of the element to remove. That is why, currently, the content of list in the example below:

final List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
list.remove(2);

will be [1, 2] and not [1, 3]: the argument is an int, so the remove(int index) overload applies without boxing and is chosen over remove(Object).
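To make the current behaviour concrete, here is a minimal sketch contrasting the two overloads on a List<Integer>. The class and method names (RemoveOverloadDemo, removeAtIndex2, removeValue2) are made up for illustration; only List and ArrayList come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class RemoveOverloadDemo {

    // Builds the same three-element list used in the question.
    private static List<Integer> sample() {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        return list;
    }

    // An int argument selects remove(int index): it applies without boxing.
    static List<Integer> removeAtIndex2() {
        List<Integer> list = sample();
        list.remove(2);
        return list;
    }

    // An Integer argument selects remove(Object o), which removes the first equal element.
    static List<Integer> removeValue2() {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(2));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeAtIndex2()); // [1, 2]
        System.out.println(removeValue2());   // [1, 3]
    }
}
```

Today the static type of the argument (int versus Integer) is what disambiguates the call; with a List<int> that distinction would disappear, which is the crux of the question.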

However, if in the future we are able to declare a List<int>, we have a problem: which overload of the remove method will be chosen?


Solution

  • This answer is based on this paper by Brian Goetz, dated December 2014. This is the latest I could find on the subject; note however that the paper is an "informal sketch" so there's nothing definitive yet regarding your question.

    First, a List<int> would not be a subtype of List<Integer> (Subtyping):

    Initially, it might also seem sensible that Box<int> could be a subtype of raw Box. But, given our translation strategy, the Box class cannot be a superclass of whatever class represents Box<int>, as then Box<int> would have a field t of type Object, whereas t should be of type int. So Box<int> cannot be a subtype of raw Box. (And for the same reason, Box<int> cannot be a subtype of Box<Integer>.)

    ...

    Since generics are invariant, it is not surprising that List<int> is not a subtype of List<Integer>. The slightly surprising thing here is that a specialized type cannot interoperate with its raw counterpart. However, this is not an unreasonable restriction; not only are raw types discouraged (having been introduced solely for the purpose of supporting the gradual migration from non-generic code to generic code), but it is still possible to write fully generic code using generic methods -- see "Generic Methods".

    This paper also lists "Migration challenges", and reference-primitive overloadings (the problem in your question) is one of them:

    Some overloadings that are valid today would become problematic under specialization. For example, these methods would have a problem if specialized with T=int:

    public void remove(int position);
    public void remove(T element);
    

    Such overloads would be problematic both on the specialization side (what methods to generate) and on the overload selection side (which method to invoke.)

    A proposed solution is referred to as the "peeling" technique:

    Consider the overload pair in a List-like class:

    interface ListLike<T> {
        public void remove(int position);
        public void remove(T element);
    }
    

    Existing uses of ListLike will all involve reference instantiations, since those are the only instantiations currently allowed in a pre-specialization world. Note that while compatibility requires that reference instantiations have both of these methods, it requires nothing of non-reference instantiations (since none currently exist.)

    The intuition behind peeling is to observe that, while we're used to thinking of the existing ListLike as the generic type, that it might really be the union of a type that is generic across all instantiations, and a type that is generic across only reference instantiations.

    If we were writing this class from scratch in a post-specialization world, we might have written it as:

    interface ListLike<any T> {
        void removeByIndex(int position);
        void removeByValue(T element);
    }
    

    But, such a change now would not be either source-compatible or binary-compatible. However, peeling allows us to add these methods to the generic layer, and implement them in a reference-specific layer, without requiring them in the specializations, restoring compatibility:

    interface ListLike<any T> {
        // New methods added to the generic layer
        void removeByValue(T element);
        void removeByIndex(int pos);
    
        layer<ref T> {
            // Abstract methods that exist only in the ref layer
            void remove(int pos);
            void remove(T element);
    
            // Default implementations of the new generic methods
            default void removeByIndex(int pos) { remove(pos); }
            default void removeByValue(T t) { remove(t); }
        }
    }
    

    Now, reference instantiations have remove(T), and remove(int) (as well as the new methods removeByIndex and removeByValue), ensuring compatibility, and specializations have the nonproblematic overloads of removeByValue(T) and removeByIndex(int). Existing implementations of ListLike would continue to compile since the new methods have a default implementation for reference specializations (which simply bridges to the existing remove methods). For value instantiations, removeByIndex and removeByValue are seen as abstract and must be provided, but remove does not exist at all.

    This technique also enables the "implementation by parts" technique; it is possible to declare a method abstract in the generic layer and provide concrete implementations in both the value and reference layers. If we allowed layers for T=int, it would also enable the "specializing the specializations" technique.

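    With this technique, backward compatibility would be preserved: under the proposal, reference instantiations such as List<Integer> keep both remove overloads, while a List<int> would expose only the new, unambiguous removeByValue and removeByIndex methods.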
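The `any T` and `layer<ref T>` syntax from the paper does not exist in any released Java, so the following is only a rough approximation in today's language of what the two layers would look like from a caller's point of view. Every type here (RefListLike, RefArrayList, IntListLike, IntArrayList, PeelingSketch) is a hypothetical, hand-written stand-in and not something the paper or the JDK defines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the "ref layer": the existing overloads plus default bridges to the new names.
interface RefListLike<T> {
    void remove(int position);
    void remove(T element);

    default void removeByIndex(int position) { remove(position); }
    default void removeByValue(T element) { remove(element); }
}

// Existing-style implementation: it only implements the two remove overloads.
class RefArrayList<T> implements RefListLike<T> {
    final List<T> items = new ArrayList<>();

    @Override public void remove(int position) { items.remove(position); }
    @Override public void remove(T element) { items.remove(element); }
}

// Stand-in for a specialized ListLike<int>: only the unambiguous names exist.
interface IntListLike {
    void add(int element);
    void removeByIndex(int position);
    void removeByValue(int element);
    int[] toArray();
}

class IntArrayList implements IntListLike {
    private int[] data = new int[8];
    private int size = 0;

    @Override public void add(int element) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = element;
    }

    @Override public void removeByIndex(int position) {
        if (position < 0 || position >= size) {
            throw new IndexOutOfBoundsException("index " + position + ", size " + size);
        }
        // Shift the tail left by one slot to close the gap.
        System.arraycopy(data, position + 1, data, position, size - position - 1);
        size--;
    }

    @Override public void removeByValue(int element) {
        // Remove the first occurrence only, mirroring List.remove(Object).
        for (int i = 0; i < size; i++) {
            if (data[i] == element) {
                removeByIndex(i);
                return;
            }
        }
    }

    @Override public int[] toArray() { return Arrays.copyOf(data, size); }
}

class PeelingSketch {
    // Reference instantiation: the default method bridges to the old remove(T).
    static List<Integer> refRemoveByValue() {
        RefArrayList<Integer> list = new RefArrayList<>();
        list.items.addAll(Arrays.asList(1, 2, 3));
        list.removeByValue(2);
        return list.items;
    }

    static int[] intRemoveByValue() {
        IntArrayList list = new IntArrayList();
        list.add(1); list.add(2); list.add(3);
        list.removeByValue(2);
        return list.toArray();
    }

    static int[] intRemoveByIndex() {
        IntArrayList list = new IntArrayList();
        list.add(1); list.add(2); list.add(3);
        list.removeByIndex(2);
        return list.toArray();
    }

    public static void main(String[] args) {
        System.out.println(refRemoveByValue());                  // [1, 3]
        System.out.println(Arrays.toString(intRemoveByValue())); // [1, 3]
        System.out.println(Arrays.toString(intRemoveByIndex())); // [1, 2]
    }
}
```

In the actual proposal both layers would live in a single ListLike<any T> declaration; splitting them into two interfaces here is purely a workaround for the missing syntax. The point of the sketch is that the int version never has to decide between an index and a value, because the method name carries that choice.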