swift generics existential-type

Is there a penalty for changing a generic function's argument based on `MyProtocol` to the existential `any MyProtocol` or the opaque `some MyProtocol`?


Discussions of some usually focus on return types. This question is specifically about using any or some in argument lists.

As an example, in the Swift documentation for String, you have this initializer...

init<T>(_ value: T, radix: Int = 10, uppercase: Bool = false)
where T : BinaryInteger

In Swift 5.6, they introduced the any keyword to let us work with existential types more easily. With that change, I understand you theoretically can rewrite the above like so...

init(_ value: any BinaryInteger, radix: Int = 10, uppercase: Bool = false)

Of course there's also this version based on the some keyword, which also works...

init(_ value: some BinaryInteger, radix: Int = 10, uppercase: Bool = false)

My question is: which one makes the most sense? Is there a downside to using the existential type instead of the generic like that? What about the some version? I originally thought the generic version was best because the compiler can determine at compile time what's being passed to it. But then again, so can the any and some versions, since none of them will compile if you don't pass a BinaryInteger, and I'm not quite sure how to write a test to check this.


Solution

  • When passing a value of a protocol type P to a method in Swift, there are 4 possible spellings you can currently use:

    1. func f<T: P>(_ value: T) ("generic")
    2. func f(_ value: some P) ("opaque parameter")
    3. func f(_ value: P) ("'bare' existential")
    4. func f(_ value: any P) ("'explicit' existential")

    Of these spellings, (1) and (2) are synonyms, and currently, (3) and (4) are synonyms. Between using (1) and (2), there is no difference, and between (3) and (4) there is currently* no difference; but between (1)/(2) and (3)/(4), there is a difference.
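
As a minimal sketch of the four spellings side by side: the protocol P below and the Widget type are hypothetical stand-ins, not from any library, and the some P form assumes Swift 5.7 or later. All four functions accept the same argument and return the same result; the difference lies in how the value is passed.

```swift
// Hypothetical protocol and conforming type, for illustration only.
protocol P {
    func describe() -> String
}

struct Widget: P {
    func describe() -> String { "widget" }
}

// (1) Generic.
func f1<T: P>(_ value: T) -> String { value.describe() }

// (2) Opaque parameter; shorthand for (1). Assumes Swift 5.7 or later.
func f2(_ value: some P) -> String { value.describe() }

// (3) "Bare" existential.
func f3(_ value: P) -> String { value.describe() }

// (4) "Explicit" existential; same meaning as (3).
func f4(_ value: any P) -> String { value.describe() }

let w = Widget()
let results = [f1(w), f2(w), f3(w), f4(w)]
print(results) // ["widget", "widget", "widget", "widget"]
```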

    1. f<T: P>(_: T) is the traditional way of taking a parameter of a concrete type T which is guaranteed to conform to protocol P. This way of taking a parameter:
      • Gives you access to the concrete type T both at compile time and at runtime, so you can perform operations on T itself
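      • Has little to no overhead: the type T is known at compile time at each call site, so when the compiler specializes the function it knows the size and layout of the value and can set up the stack/registers appropriately; it can pass whatever parameter was given to it directly to the method. (When the function is not specialized, for example across module boundaries, the compiler instead passes type metadata and a witness table for T alongside the value.)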
      • Can only be called when the type of the argument is statically known (at compile time); but as such, can be called with protocol types with Self- or associatedtype requirements
    2. Introduced in SE-0341 (Opaque Parameter Declarations) and available from Swift 5.7, the version of a method which takes some P is exactly equivalent to the generic version spelled with angle brackets. I'll avoid repeating the content of the proposal, but its Introduction section explains the motivation: a lighter-weight syntax for spelling generic parameters
    3. f(_: P) is the traditional way of taking a parameter of an existential type which is guaranteed to conform to protocol P. This way of taking a parameter:
      • Does not give access to the concrete underlying type of the parameter at compile time; though this is accessible dynamically at runtime via type(of:)
      • Has runtime overhead, both when passing an argument to the method and when accessing the value inside it. Because the type of the argument might not be known statically (while the compiler still needs to know how to set up the stack and registers in order to call the method), the parameter must be boxed up in an "existential box", which has the interface of P and dynamically forwards method calls to the underlying concrete value. This costs a consistently-sized and -laid-out container for the value at runtime (with a heap allocation when the value is too large to be stored inline in the box), plus an indirection, as method calls on the box must dynamically dispatch to the underlying type
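      • Can be called whether or not the type of the argument is known statically. In Swift 5.6 and earlier, it cannot be used with protocol types with Self- or associatedtype requirements; Swift 5.7 (SE-0309) lifted that restriction, though members that mention Self or associated types are only partly available on the existential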
    4. Introduced in SE-0335 (Existential Any), the any keyword preceding a protocol type is a marker that helps indicate that the type is being used as an existential. It is exactly equivalent right now to using the bare name of the protocol (i.e. any P == P), but there has been some discussion about eventually using the bare name of the protocol to mean some P instead
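
The difference between the two families can be observed directly. The sketch below assumes a hypothetical Shape protocol and Square type: the generic function sees the concrete type as T, while the existential function can only recover it at runtime through type(of:). It also compares MemoryLayout sizes to show that the existential container is larger than the value it wraps; the exact sizes depend on the platform, so only the comparison is relied on here.

```swift
// Hypothetical protocol and conforming type, for illustration only.
protocol Shape {
    var area: Double { get }
}

struct Square: Shape {
    var side: Double
    var area: Double { side * side }
}

// Generic: the concrete type is available as T at compile time.
func staticTypeName<T: Shape>(_ value: T) -> String {
    String(describing: T.self)
}

// Existential: the static type is `any Shape`; the concrete type
// is only reachable dynamically via type(of:).
func dynamicTypeName(_ value: any Shape) -> String {
    String(describing: type(of: value))
}

let square = Square(side: 2)
let genericName = staticTypeName(square)
let existentialName = dynamicTypeName(square)
print(genericName, existentialName) // Square Square

// The existential container carries storage for the value plus type
// metadata and a witness table, so it is larger than a bare Square.
let concreteSize = MemoryLayout<Square>.size
let boxedSize = MemoryLayout<any Shape>.size
print(concreteSize < boxedSize) // true
```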

    So to address your specific example:

    init(_ value: some BinaryInteger, radix: Int = 10, uppercase: Bool = false)
    

    is exactly equivalent to the original

    init<T>(_ value: T, radix: Int = 10, uppercase: Bool = false)
    where T : BinaryInteger
    

    while

    init(_ value: any BinaryInteger, radix: Int = 10, uppercase: Bool = false)
    

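    is not. And, because BinaryInteger has associatedtype requirements, in Swift 5.6 you also cannot use the any version, as the existential type would not provide access to the underlying associated types. (If you try, you'll get the classic error: protocol 'BinaryInteger' can only be used as a generic constraint because it has Self or associated type requirements.) Swift 5.7 lifted this restriction (SE-0309), so any BinaryInteger is accepted there, but it is still an existential with the costs and limitations described above.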
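
To illustrate the equivalence of the generic and some spellings, here is a small sketch using two hypothetical free functions, hexGeneric and hexOpaque, that wrap the String initializer from the question. They are not part of the standard library, and the some form assumes Swift 5.7 or later.

```swift
// Hypothetical helpers wrapping String.init(_:radix:uppercase:).

// Generic spelling with a where clause, as in the documentation.
func hexGeneric<T>(_ value: T) -> String where T: BinaryInteger {
    String(value, radix: 16, uppercase: true)
}

// Opaque parameter spelling; equivalent to the generic version.
func hexOpaque(_ value: some BinaryInteger) -> String {
    String(value, radix: 16, uppercase: true)
}

let fromInt = hexGeneric(255)         // T is inferred as Int
let fromUInt8 = hexOpaque(UInt8(255)) // the opaque type is UInt8
print(fromInt, fromUInt8) // FF FF
```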


    In general, generics are preferable to existential types when possible because of the lack of overhead and greater flexibility; but, they require knowing the static type of their parameter, which is not always possible.

    Existentials are more accepting of input, but are significantly more limited in functionality and come at a cost.
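
One place where the trade-off shows up is heterogeneous collections. The sketch below uses a hypothetical Animal protocol with Dog and Cat types: the existential version accepts a mix of concrete types, while the generic version needs every element to share a single static type.

```swift
// Hypothetical protocol and conforming types, for illustration only.
protocol Animal {
    var sound: String { get }
}

struct Dog: Animal { let sound = "woof" }
struct Cat: Animal { let sound = "meow" }

// Existential: each element may have a different concrete type.
func chorus(_ animals: [any Animal]) -> String {
    animals.map { $0.sound }.joined(separator: " ")
}

// Generic: every element must have the same concrete type T.
func chorusGeneric<T: Animal>(_ animals: [T]) -> String {
    animals.map { $0.sound }.joined(separator: " ")
}

let mixed = chorus([Dog(), Cat()])
let uniform = chorusGeneric([Dog(), Dog()])
print(mixed)   // woof meow
print(uniform) // woof woof

// chorusGeneric([Dog(), Cat()]) is expected to be rejected by the
// compiler, since no single concrete T fits both elements.
```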

    Between preferring <T: P> and some P, or P and any P, the choice is currently subjective, but:

    1. As part of SE-0335, the plan at the time of writing was for Swift 6 to require the any keyword to denote existential protocol usage (that requirement has since been deferred and is available as the opt-in ExistentialAny upcoming feature); the counterpart to any is some, so if you want to future-proof your code and stay consistent, it might be a good idea to start migrating to any and some
    2. Between opaque parameter syntax and generic syntax, the choice is up to you, but opaque parameters cannot currently cover all of the use cases that generics can, especially with more complicated constraints. Whether sticking with generics everywhere or preferring opaque parameters and using generics only when necessary is a code style choice that you'll need to make
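
One example of a constraint that needs the angle-bracket form is requiring two parameters to share a type, or naming that type in the return position. In the sketch below, the hypothetical functions add and bitWidths show the contrast: each some BinaryInteger introduces its own independent type, so it cannot express "both arguments are the same T".

```swift
// Generic: a and b must be the same concrete type T, and T can be
// named in the return type.
func add<T: BinaryInteger>(_ a: T, _ b: T) -> T {
    a + b
}

// Opaque parameters: each `some BinaryInteger` is a separate,
// unnamed type, so a and b may differ (and `a + b` would not compile).
func bitWidths(_ a: some BinaryInteger, _ b: some BinaryInteger) -> [Int] {
    [a.bitWidth, b.bitWidth]
}

let total = add(2, 3)
let widths = bitWidths(Int8(1), Int32(1))
print(total)  // 5
print(widths) // [8, 32]
```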