FIR. Document current builder inference algorithm

This commit is contained in:
Simon Ogorodnik
2023-05-05 18:38:48 +02:00
committed by Space Team
parent 987e8c25dc
commit 4d1ea06c17
3 changed files with 219 additions and 0 deletions
+210
View File
@@ -0,0 +1,210 @@
## FIR/Builder inference
See also: [Kotlin Spec: Builder-style type inference](https://kotlinlang.org/spec/type-inference.html#builder-style-type-inference)
### Glossary
#### CS = Constraint system
An instance of `org.jetbrains.kotlin.resolve.calls.inference.model.NewConstraintSystemImpl`
#### Call-tree
A tree of calls, in which constraint systems are joined and solved(completed) together
#### Postponed type variable
Type-variable, that was used to build a stub type on, and fixation of which is postponed until the end of lambda analysis.
Selection of type variables to postpone happens during [Initiation](#initiation) phase
#### Stub type
Type, that is equal to anything from the perspective of subtyping.
Carries information about its base postponed type variable
Such types are used during [analysis of lambda body](#lambda-body-analysis) inplace of postponed type variables to resolve calls,
after end of analysis occurrences of such types are replaced with corresponding type inference result for its base type variable
Ex:
```kotlin
buildList { /* this: MutableList<Stub(T)> -> */
val self/*: MutableList<Stub(T)> */ = this
self.add("") // String <: Stub(T) from argument
}
```
#### Proper constraint
A constraint that doesn't reference any type variables
### Builder inference algorithm
The algorithm consists of the following phases, all of which happen during constraint system completion
- [Initiation](#initiation) - deciding if lambda is suitable for builder inference
- [Lambda analysis preparation](#lambda-analysis-preparation) - stub type and builder inference session creating
- [Lambda body analysis](#lambda-body-analysis) - collecting calls inside lambda body, which references stub-types among input types
- [Lambda analysis finalization](#lambda-analysis-finalization) - integrating collected information
- [Result write](#result-write) - writing of builder inference result to FIR tree of lambda (and implicit receiver stack update)
- [Result backpropagation](#result-backpropagation) - propagation of builder inference result to CS of call-tree
### Detailed description
#### Initiation
Before, we try to fix all type variables that have enough type info to be fixed and analyze all lambda arguments,
whose input types depend on such type variables.
If there are still not analyzed lambda arguments left, we try to perform builder inference
Entrypoint to builder inference is the function `org.jetbrains.kotlin.fir.resolve.inference.ConstraintSystemCompleter.tryToCompleteWithBuilderInference`
*It happens only during full completion*
Then, all lambda arguments, that weren't analyzed and satisfy the following criteria are considered:
- lambda argument has a non-empty list of input types
- any of the input types contain a non-fixed type variable in arguments (If the input type is a type-variable itself, it is **NOT** considered)
Such type-variables are marked as postponed and analysis of the lambda argument is performed
#### Lambda analysis preparation
First, `org.jetbrains.kotlin.fir.resolve.inference.PostponedArgumentsAnalyzer.analyzeLambda`
- creates stub types for all postponed type variables
- substitutes type-variable-type -> stub-type in a receiver, context receiver, parameters, and return types
Then, before the actual transforming of the lambda body, an instance of `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession`
is created and set as the current inference session
#### Lambda body analysis
During body analysis of a lambda, for each call (inside lambda body) that is completed in the FULL mode, we ask the inference session if it can
be completed
See `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession.shouldRunCompletion`
It is needed in cases, where the result of builder inference can help with further system completion
Example:
```kotlin
fun <M> materialize(): M = null!!
fun foo() = buildList {
addAll(materialize()) // (1) To be able to infer type for materialize, we need to know builder inference result
add("str") // (2) This call already has complete type info, so we allow it to complete
}
```
In situation 1, we mark the call as an incomplete, see `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession.addPartiallyResolvedCall`
In situation 2, we mark the call as a completed, see `org.jetbrains.kotlin.fir.resolve.inference.FirInferenceSession.addCompletedCall`
Note: Calls that were analyzed in PARTIAL mode will not be considered, as it always analyzed during FULL completion of the outer call.
Such calls will not be added to incomplete/completed call lists
The criteria to mark a call as an incomplete is **ALL** following:
1. There's no contradiction in the constraint system of the call
2. The candidate (call) is suitable for builder inference based on **ANY** of the following conditions:
- The dispatch receiver's type contains a stub type
- The extension receiver's type contains a stub type and the corresponding symbol has a builder inference annotation in the session
3. The candidate doesn't have any not analyzed postponed atoms
4. Any type variable in call CS doesn't have a proper constraint, and it's not a postponed variable
If call is marked incomplete its constraint system wouldn't be solved as usual in `org.jetbrains.kotlin.fir.resolve.inference.FirCallCompleter.completeCall`, completion result writing will not be performed.
It is postponed until [lambda analysis finalization](#lambda-analysis-finalization)
#### Lambda analysis finalization
See `org.jetbrains.kotlin.fir.resolve.inference.PostponedArgumentsAnalyzer.applyResultsOfAnalyzedLambdaToCandidateSystem`
Once the lambda body was analyzed return arguments are added into the call-tree
Then, we perform inference of postponed type variables
See `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession.inferPostponedVariables`
After such steps, the constraint system of the call-tree contains all output-type constraints
While systems for the completed and incomplete calls inside the lambda body contain all internal type constraints in the form of
constraints with stub types
See `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession.buildCommonSystem`
To find a solution for postponed type variables we need to integrate all aforementioned constraints.
We do so, by constructing a new constraint system in the following way:
- Register all not-fixed type variables from CS of call-tree
- Re-apply all initial constraints from CS of call-tree
- Substitute Stub(PostponedTV) with TypeVariableType(PostponedTV)
- Optimization: Only constraints, where at least one of the types contains TypeVariableType from integration system are applied
- Constraints, that originate from builder inference results of other lambda arguments of the call-tree are skipped (See `org.jetbrains.kotlin.resolve.calls.inference.model.BuilderInferencePosition`)
- Re-apply all initial constraints of all completed calls that were collected during lambda body analysis
- Using the same algorithm as above
- Re-apply all initial constraints of all incomplete calls
- Using the same algorithm as above
- Also, register all fixed type variables and bring its fixation result as an equality constraint
Note: Such a system shouldn't contain stub types from current builder inference session in constraints anymore
If the system is empty, we bail out and skip to the [result write](#result-write)
After such a constraint system is constructed, we call the constraint system completer to solve the system
Note: As incomplete calls couldn't contain postponed arguments in the call-tree (as defined in [lambda body analysis](#lambda-body-analysis)), we don't analyze any postponed arguments here
#### Result write
See `org.jetbrains.kotlin.fir.resolve.inference.FirBuilderInferenceSession.updateCalls`
##### Resulting substitutor:
Solution(TV) - result type, that was inferred for a type variable TV in integration CS
Substitution will be the following:
- Stub(PostponedTV) -> Solution(PostponedTV) ||
- TypeVariableType(TV) -> Solution(TV) ||
- Stub(*) -> ERROR ||
- TypeVariableType(*) -> ERROR
Note: Effectively, it means that if we have an inference result for a postponed type variable, we replace stub/type variable types inside with
an error types
Unclear: If the constraint system wasn't empty we apply the resulting substitutor to fixed type variables from CS of the call-tree
We apply the resulting substitutor to all type references inside of lambda, recursively
After that, we apply the same substitution to all the implicit receivers, which originated from lambdas. It is needed to have full type information
during the analysis of lambda arguments inside of return arguments in the current lambda
(since those arguments may be analyzed after builder inference completion)
Then, we call the completion results writer to store completion results for incomplete calls with the result substitutor
#### Result backpropagation
See `org.jetbrains.kotlin.fir.resolve.inference.PostponedArgumentsAnalyzer.applyResultsOfAnalyzedLambdaToCandidateSystem`
Once the result for postponed variables is inferred and stored, we need to propagate results to the CS of call-tree
since its completion isn't finished yet
To do so, we inject subtype constraint with result type from postponed variable inference
After that, we mark postponed type variables as no longer postponed
Lambda analysis is now finished, and we return to the [Initiation](#initiation) to process other lambda arguments of the call-tree
### Potential problems
#### Unclear naming
In fact, builder inference doesn't only apply to "builders". It's a general algorithm that applies to lambda arguments of particular shape
Ex:
```kotlin
suspendCoroutine/* <String> inferred by 'builder inference' */ {
it.resume("")
}
// ...
fun <T> consumeIterator(input: Iterator<T>.() -> Unit) {}
fun takeString(s: String) {}
//...
consumeIterator/* <String> inferred by 'builder inference' */ { takeString(next()) }
```
#### Incomplete criteria to perform builder-inference
Criteria to perform builder inference doesn't consider lambdas, which input types contain only type-variable-types
#### Incomplete criteria to mark calls as incomplete during lambda analysis
- Criteria to mark calls as incomplete depends on builder inference annotation, however it shouldn't be used anymore
- In case of present postponed arguments somewhere in the call-tree it will change behavior, failing to infer types
- Ad-hoc check for the presence of proper constraints
#### Lost diagnostics
The constraint system that is used to infer results for postponed type variables isn't checked for inconsistencies
#### Unclear substitution of fixed type variables
During [result writing](#result-write), we update fixed type variables of the main call-tree, reasons for that aren't clear
#### Potentially exponential complexity
- During [lambda analysis finalization](#lambda-analysis-finalization) we copy initial constraints from call-tree to integration system, it
can lead to exponential complexity (no concrete example here)
- During [result writing](#result-write) we traverse body of lambda argument recursively, in case if there are multiple nested lambdas that are
subjects to builder inference, it will be exponential
#### Separate constraint system creation and result backpropagation
During [lambda analysis finalization](#lambda-analysis-finalization) we create and solve separate CS.
However, we need to communicate results to the CS of the call-tree, during [result backpropagation](#result-backpropagation) we add subtype constraints between
type variable from the call-tree and solution from integration CS, loosing information about actual constraints.
It causes unsound solutions in the call-tree CS
#### Resulting substitution is unclear
Its unclear why we mix stub type substitutor with type-variable-type substitutor, and why we need to manually handle substitution to error
types