^KT-65637 Fixed
24 KiB
Partially Constrained Lambda Analysis
Prerequisites: That document is built under the assumption that the reader is more or less aware of how inference works for calls without lambdas or with lambdas for which all input-types are proper at the point where they should be analyzed. Also, some terms that are used here described at Common inference terms definition
See also: Kotlin Spec: Builder-style type inference
The most basic scenario where PCLA is used is calls to builder-like functions:
fun <E> buildList(builderAction: MutableList<E>.() -> Unit): List<E> = ...
fun main() {
buildList {
add("")
} // E is inferred to `String`
}
There in the buildList call there is no other source of information what type should be inferred for E beside the body of the lambda.
That is the case where partially constrained lambda analysis (PCLA) starts.
Glossary
Regular lambda analysis
Currently, we start lambda analysis only during call completion (ConstraintSystemCompleter.runCompletion).
Let's call "regular lambda analysis" situation when we start body resolution of the anonymous function when all input types do not contain
uninferred TVs.
PCLA (Partially Constrained Lambda Analysis)
PCLA is the way of lambda analysis during call completion that happens when some input types are not properly inferred, and there are no other sources of constraints.
PCLA lambda = lambda analyzed with PCLA
Main call/candidate is the call that contains PCLA lambda
Nested call/candidate is a call being resolved in other than ContextDependent mode (usually statement-level) that belongs to PCLA lambda
- Postponed nested calls are nested calls which we decided not to complete in the FULL mode immediately
Shared CS is a constraint system being shared between postponed nested calls inside PCLA lambda
Outer CS
- Defined only for nested postponed candidates (for shared CS)
- It consists of type variables of the containing PCLA lambdas and all the type variables of already processed inner candidates
- For inner CS, all the outer CS-related type variables are always coming at the beginning of the list of all variables
Outer Type Variable (TV)
- In the context of shared CS, it's a subset of type variables that are brought from the outer CS
- In the list of all variables (
NewConstraintSystemImpl.allTypeVariables) outer TV always comes in the beginning.- See
NewConstraintSystemImpl.outerSystemVariablesPrefixSize
- See
Entry-point to PCLA
The conditions to run a lambda resolution in PCLA mode is the following:
- Completion mode for the current call tree is
FULL - There are no other ways to infer some new constraints for any type variable.
- There is some lambda that among its input types has at least one with a not fixed TV being used as type argument
- So, if one of the input types is
MutableList<Ev>that lambda suits for PCLA - But if some of the input types are proper and other are top-level type variables (
Tv,Tv?,Tv & Any, etc.), it doesn't suit
- So, if one of the input types is
References in code:
ConstraintSystemCompleter.tryToCompleteWithPCLA
During PCLA, there is a quite specific behavior of how actually lambda body resolution happens, and describing it is actually a goal of the following parts document.
Algorithm backbone
The basic idea is that we mostly run regular lambda body analysis, with special treatment for all calls inside. Namely,
- They use shared CS: adding their new variables there and constraints related to both outer CS and their own variables
- They are not being fully completed (leaving them postponed)
After lambda's body traversal, the whole shared CS is added to the CS of the main candidate, the resulting CS should be resolved as usual, and after that at completion results writing phase all type variables are replaced with their result types.
More details
-
NB: This algorithm doesn't use any stub types, just regular type variables everywhere
-
Once we see a need to start PCLA (see Entry-point to PCLA), we create
FirPCLAInferenceSession(introduced atLambdaAnalyzerImpl.analyzeAndGetLambdaReturnArguments)- Shared CS (see
FirPCLAInferenceSession.currentCommonSystemproperty)- It's mostly a copy of outer CS (the CS of the containing candidate), but we mark all the variables as outer
- See
FirInferenceSession.Companion.prepareSharedBaseSystem
-
After that, we run analysis inside the lambda
- For nested candidates that are really trivial (no receivers with TV, no lambdas, no value arguments using outer CS), we just regularly complete them
- For other calls
- Instead of empty initial CS, we supply a shared CS
- So, before introducing its own TVs, CS already contains of outer TV
- Run candidate resolution within this CS
- Do not run full completion, just gather all the constraints and variable back to the shared CS
-
In the end, the root CS (aka CS for the Main candidate) contains TVs for all incomplete nested candidates.
-
And may just continue
FULLcompletion process on the Main, i.e., fixing variables for which we now have new proper constraints, and analyze other lambdas. -
In the end, beside completion results writing for the main call, we also write results (inferred type arguments/expression type update) for all incomplete/postponed calls (see
FirCallCompletionResultsWriterTransformer.runPCLARelatedTasksForCandidate).
Example 1
fun foo() {}
fun main() {
val x = listOf("")
buildList /* Main candidate/call */ {
foo() // Irrelevant call that doesn't refer PCLA variables, might be fully completed
// Uses shared CS
// Adds String <: Ev constraint
// Pushes all constraints back to the shared CS
add("") // Postponed nested call
// Uses shared CS
// Candidate declaration: fun <T, R, C : MutableCollection<in R>> Iterable<T>.mapTo(destination: C, transform: (T) -> R): C
// Adds Tv, Rv, Cv type variables
// And constraints
// String <: Tv [from List<String> <: Iterable<Tv>]
// MutableList<Ev> <: Cv
// Rv <: Ev [incorporation from MutableList<Ev> <: MutableCollection<in Rv>
// Fixing Tv := String
// Starting lamda analysis
// Having
// Rv <: String constraint
//
// Then add all the new constraints to the shared CS
x.mapTo(this) { it } // Postponed nested call
}
// And start looking for CS solution that would be sound
// Ev := String
// Rv := String
// Cv := MutableList<String>
Example 2
fun <A, B, C> twoSteps(first: Inv<A>.() -> Unit, b: B, second: (B) -> C): Triple<A, B, C> {
val a = Inv<A>().apply(first)
val c = second(value)
return Triple(a, value, c)
}
fun main() {
twoSteps(
first = { set(1) },
b = "b",
second = { it }
)
}
Here, during the call completion of the call to twoSteps function we would:
- Fix
Bv <:> String, from theString <: Bvconstraint. - Analyze
secondlambda, since it has enough type info,Bv <:> Stringis known now.firstlambda still hasAvin its input types:/* input type */ Inv<Av>.() -> /* output type */ Unitsecondlambda input types are known:(/* input type */ Bv=String) -> /*output type*/ Cv- Get additional type constraint
Cv <: Stringfrom the usage of it
- Fix
Cv <:> String, from the newly inferred lambda type info. - Realize, that the
firstlambda still doesn't have enough type info and its input types are still unknown. - tryToCompleteWithPCLA
- Check if the
firstlambda type shape is suitable for PCLA - Analyze the
firstlambda in PCLA mode
- Check if the
Shared CS definition
- It's been initialized at
FirPCLAInferenceSession.currentCommonSystem. - Initially, it's an empty CS with the main candidate's system being added as an outer CS.
Inference session callbacks
Currently, there are three implementations of inference session
Defaultthat effectively does nothingFirPCLAInferenceSession(being used during PCLA lambda resolution)FirDelegatedPropertyInferenceSession(being used for delegated operator resolution)
baseConstraintStorageForCandidate
It returns shared CS if the candidate is considered applicable (non-trivial/postponed nested).
That callback is supposed to be called for each created candidate of all nested calls, and the content of returned CS should be integrated into the candidate CS.
See org.jetbrains.kotlin.fir.resolve.calls.Candidate.getSystem.
The shared CS is not supposed to be modified during candidate resolution (because there might be more than one of them for each call, while only a single one should be chosen).
customCompletionModeInsteadOfFull
Being called after the single successful candidate is chosen just before the completion begins.
Currently, it returns PCLA_POSTPONED_CALL if the candidate is postponed.
See more details on PCLA_POSTPONED_CALL completion mode in the section below.
processPartiallyResolvedCall
This callback is assumed to be called after PARTIAL or PCLA_POSTPONED_CALL completion terminates.
Mostly, it collects all postponed candidates and integrates their CS content into the shared CS.
Also, here we substitute the type of that given call expression with substitutor replacing already fixed type variables
with their result types.
Otherwise, using fixed TVs as expression types further might lead to TypeCheckerStateForConstraintInjector.fixedTypeVariable
throwing an exception.
fun <S> id(e: S) = e
fun main() {
buildList {
// `id(this.size) ` type is `Sv`.
// But we actually fixed `Sv` to Int because it's resolved in independent context and `Sv` is not related
// to the outer CS.
//
// But since we don't run completion writing results, no one would replace the expression type of the call
// with new resulting type.
// So, we do it at `processPartiallyResolvedCall`
val x = id(this.size)
add(x)
}
}
Note that this situation with fixed TVs can't happen outside PCLA context because
- For partial completion, we avoid fixing TVs that are used inside return types
- For FULL completion, we would run completion results writing, so there would be no candidates and type variables inside return types.
And the last part is fixing TVs for receiver (see the relevant section for details).
runLambdaCompletion
This part is rather controversial and hacky, hope we could get rid of it at some point.
When resolving a lambda at some nested call inside PCLA lambda, like
buildList {
myOtherCallWithLambda {
add("")
Unit
}
}
After completion for add("") call, we haven't yet called processPartiallyResolvedCall for myOtherCallWithLambda { ... },
because it's not yet completed, and we are already going to call processPartiallyResolvedCall for add(""), that would contribute some
constraints to buildList type variable Ev in shared CS.
And after that we finally call processPartiallyResolvedCall for myOtherCallWithLambda { ... }, but it doesn't contain constraints
on Ev from add("""), thus overwriting existing information.
That might be workarounded with some kind of merging constraints for the same TV instead of just replacement
(see NewConstraintSystemImpl.addOtherSystem), but that seems to be hacky too.
So, the current approach is the following: during nested lambda resolution we assume that currently shared CS is equal to the CS of the
myOtherCallWithLambda candidate. Thus, the constraints from add call are being merged into the myOtherCallWithLambda's CS and then
finally into regular shared CS.
Hacky place: the proper solution might be just replacing instance of shared CS with current candidate's CS, but that needs some further investigation.
Also, there's some special handling of overload resolution by lambda return type, but they are quite expected when we don't have a final
candidate yet (see analyzeLambdaAndReduceNumberOfCandidatesRegardingOverloadResolutionByLambdaReturnType)
runCallableReferenceResolution
TODO: Mostly, the idea as the same as for lambdas, but for callable references
addSubtypeConstraintIfCompatible
In some situations, some constraints might be originated not just from calls, but from other kinds of expression, like from variable assignments:
var <F> MutableList<F>.firstElement
get() = get(0)
set(f: F) {
set(0, f)
}
fun main() {
buildList {
firstElement = ""
}
}
We don't specially resolve the call to the setter, thus to declare that String <: Fv, thus String <: Ev, and finally Ev := String,
there's a need to somehow send this information to the inference session.
And that's how addSubtypeConstraintIfCompatible might be used.
Hacky place: In general, this approach is quite fragile, and we might've forgotten some places where this method should be triggered.
One of the ideas particularly for assignment is that they should be resolved via setter call, thus the necessary constraint would be
introduced naturally when string literal would be an argument for Fv value parameter.
PCLA_POSTPONED_CALL completion mode
This mode is assumed to be used for postponed nested calls inside PCLA lambdas instead of FULL mode (i.e., mostly for top-level calls).
The set of the type variables that might be considered for fixation are obtained from the call tree of the supplied nested call itself, i.e., it would be a set of fresh type variables of the candidate itself, type variable of its arguments and from the return statements of lambdas. That part works the same way as for regular calls outside PCLA context.
Example
fun foo(x: List<String>) {
buildList {
x.mapTo(this) { // `mapTo` analyzed in PCLA_POSTPONED_CALL
add("") // analyzed in PCLA_POSTPONED_CALL
it.length // analyzed in PCLA_POSTPONED_CALL
}
println("") // analyzed in FULL mode (because it's a trivial call, irrelevant to outer CS)
add(x.get(0) /* analyzed in PARTIAL */) // analyzed in PCLA_POSTPONED_CALL
}
}
Short description
- Type variables are gathered from the call tree of the supplied nested call.
- Variable fixation is not allowed if the variable is deeply related to some TV from the outer CS.
- All the lambdas belonging to the supplied call need to be analyzed during this completion.
- We don't run completion writing in this mode.
Variable fixation limitation
The only limitation that we've got for PCLA_POSTPONED_CALL is that we don't allow fixing TVs that are deeply related to some outer TV
(for definition, see the relevant part at inference.md).
References in code: TypeVariableFixationReadiness.OUTER_TYPE_VARIABLE_DEPENDENCY
See the following example,
interface A
object B : A
object C : A
class GenericController<T> {
fun yield(t: T) {}
}
fun <K> GenericController<K>.yieldAll(s: Collection<K>) {}
fun <S> generate(g: suspend GenericController<S>.() -> Unit): S = TODO()
fun foo(x: Collection<B>) {
generate {
// GenericController<K> <: GenericController<S> => K := S
// Collection<B> <: Collection<K>
// B <: K
// Let's imagine we fix K to B, then it would lead to the constraint S := B
yieldAll(x)
// And to constraint error here: C <: B
yield(C)
}
}
While in fact, Sv might be easily inferred to A as a common supertype. To make it happen we just postpone fixation of the Sv as one
having deep
Forcing lambda analysis
Unlike FULL mode, it doesn't require fixing all the variables, in that sense it works more like PARTIAL.
But unlike PARTIAL, if forces analysis of all the lambdas even if some input types are not properly inferred yet.
It's crucial because otherwise, contract-affecting information gained would stop working across different statements.
For lambdas, there might be three kinds of situations:
- There's some lambda which input types do not contain any not fixed type variables => might be analyzed in a regular way
- For some lambda, some of the input types satisfy PCLA requirement (see Entry-point to PCLA) => might be analyzed in recursive PCLA mode (see Nested PCLA)
- For all lambdas, all the input types are either proper or a top-level type variable (most complicated).
In the third (and sometimes in the second) situation, we start regular lambda analysis, but with the following modifications:
- If the lambda parameter has a type of some not fixed TV, we just leave it there.
- If the receiver type of lambda is some type variable on demand. One of the options would be just leaving TV not fixed as for value parameter, but that seems meaningless because almost any call inside the lambda might require the member scope of the receiver, thus need to fix its result type.
On demand variable fixation
Unlike non-PCLA context, here there might be legit situations when someone needs to look into member scope of some type variable.
For example, see the following case:
buildList {
// MutableList<Ev> <: Tv
// T can't be fixed, because it's deeply related to Ev
this.let { it -> // it: Tv
it.add("") // requesting member scope of Tv
}
}
There, we've got it used as a receiver in a call it.add(""), thus the first question we need to answer to is which member scope it has.
And the answer is that before starting the selector (add("")), we simply try to fix necessary type variable with the following steps:
- Don't do anything in case the type variable belongs, to outer CS
- Temporary override what proper type means
- Type variables from outer CS might be used as a type argument for a proper type (e.g.,
MutableList<Ev>is proper). - Even type variables from outer CS can't be used as top-level proper type (e.g, we can't fix
TvtoEv). - See
ConstraintSystemCompletionContext.withTypeVariablesThatAreCountedAsProperTypesandTypeSystemInferenceExtensionContext.isProperTypeForFixationextension.
- Type variables from outer CS might be used as a type argument for a proper type (e.g.,
- If there is some proper type for that TV, run regular result type computation
ResultType. - Add constraint
Tv := ResultTypeand useResultTypeas a member scope - NB: we don't actually fix the variable via
NewConstraintSystemImpl.fixVariable, because there's an assumption that resulting type should not contain other type variables that might be not satisfied during on-demand fixation. ButfixVariablewould anyway be called during final completion when all other type variables types would be properly substituted in the given equality constraint.
As follows, from the rules above, there are some scenarios where the variable can't be fixed. In those cases, we implicitly forbid using them as receivers:
- By assuming that we've got their member scope empty
- Reporting a diagnostic at
org.jetbrains.kotlin.fir.resolve.calls.TypeVariablesInExplicitReceiversresolution stage (currentlyBUILDER_INFERENCE_STUB_RECEIVERbeing used).
fun <T> myLet(t: T, b: (T) -> Unit) {}
buildList {
get(0).toString() // BUILDER_INFERENCE_STUB_RECEIVER is reported with PCLA
myLet(get(0)) { it ->
get(0).toString() // BUILDER_INFERENCE_STUB_RECEIVER is reported with PCLA
}
}
See the implementations details at FirPCLAInferenceSession.fixVariablesForMemberScope.
CST computation
When computing the result type for TV referencing other variables from outer CS there might be a situation when we need to compute
CommonSuperType(Xv, ...):
fun <S> selectL(x: S, y: S, l: (S) -> Unit) {}
fun foo(x: MutableList<String>) {
buildList {
// Adding the constraints
// MutableList<String> <: Sv
// MutableList<Ev <: Sv
// But to analyze lambda we fix Sv to CST(MutableList<String>, MutableList<Ev>) = MutableList<CST(String, Ev)>
// The question is what is CST(String, Ev)
selectL(x, this) { x ->
x.add(<!ARGUMENT_TYPE_MISMATCH!>1<!>)
x.add("")
}
} // inferred to List<String>
}
References: org.jetbrains.kotlin.resolve.calls.inference.components.ResultTypeResolver.prepareLowerConstraints
In general, such a problem existed and be resolved before PCLA:
- On the fixation stage, type variable should have at least one proper constraint, but other might be not-proper, but still being used
- For that case, we do a controversial thing, namely
- Substituting all TVs with special stub types
- Compute CST with those substituted types
- For those stub types, there’s effectively a rule that for any type
Tvand stubXStub:CST(Tv, XStub) = Tv
- For those stub types, there’s effectively a rule that for any type
- Intuitively, such an approach seems correct because it just chooses some result and incorporates it with other TVs, at least not hiding any contradictions
- In the example above, we would get
Sv := MutableList<String>and during incorporation we would assert the subtypingCollection<S> <: Collection<String>, thus having constraintS <: Stringmaking the whole CS sound
- In the example above, we would get
But unlike a non-PCLA case, our TV might have a single constraint containing a reference to some other (outer) TV, so we need to make some tweaks:
- Because we've overridden what "proper" means here, but
prepareLowerConstraintsuses it in a sense of the usage some TV inside the type, for PCLA context, we unconditionally perform stub substitution (a bit of dirty place) - And after that we might need to substitute "local" stubs back to TVs (impossible in non-PCLA, because in that case, at least one constraint is "proper" in regular sense, i.e., doesn't contain any TVs)
Nested PCLA
buildList {
add("")
addAll(buildSet l2@{
add(1)
})
}
For nested PCLA lambdas, everything works pretty straightforward:
- We create own
FirPCLAInferenceSessionfor it - Run analysis of the nested lambda with that session just the same way as for an outer
- But the main candidate for
l2automatically uses the shared CS forl1(thus, it's used as an outer CS forl2)
- But the main candidate for
- But we don't run completion writing results at this stage yet
- Just applying the system for
l2to the candidate of nested candidate owning the lambda (addAllin the example), and after that it automatically propagates to the shared CS duringprocessPartiallyResolvedCall.
Thus, in the end, we would get the CS for the root call containing all variables from the whole PCLA-tree.
Completion results writing for nested lambdas happen recursively when it starts for the root call:
- In the example we start with
buildList. - Go to
FirCallCompletionResultsWriterTransformer.runPCLARelatedTasksForCandidate. - Find the postponed call
addAll. - Recursively run
runPCLARelatedTasksForCandidatefor it at some point. - And while iterating the postponed call for the latter will write results for
add(1)call.
Nested Delegation inference
buildList {
val x by lazy {
get(0)
}
add("")
add(x)
}
Inference for delegated properties inside PCLA works in the following way:
- Delegate expression is being resolved as a regular postponed nested call under the same PCLA session.
- All the operator calls are being resolved under specially created delegate
FirDelegatedPropertyInferenceSessioninPARTIALmode. - After that we run completion for delegate expression via
PCLA_POSTPONED_CALLmode. - Then, we integrate delegation related CS into PCLA shared session and all the delegation related calls as regular postponed ones of the PCLA session.
For more details, read delegated_property_inference.md,
FirDelegatedPropertyInferenceSession.completeSessionOrPostponeIfNonRoot and FirPCLAInferenceSession.integrateChildSession.