Dmitriy Novozhilov 990da9fa1a [FIR] Part 5. Introduce paired common checkers for expect classes

There are some cases when we want to run some platform checker not from
  platform session but from common session. All such cases appear when
  we check some `expect` class

```kotlin
// MODULE: common
expect interface A
expect class B : A

class C : A

// MODULE: platform()()(common)
actual interface A {
    fun foo()
}

actual class B : A {
    override fun foo() {}
}
```

In this example we want to report "abstract foo not implemented" on
  class `C`, but we don't want to report it on `expect class B` (as
  its supertype is always `expect A`, never `actual A`)

So to cover such cases some platform checkers were split into two parts:
- `Regular`, which is platform checkers and runs for everything except
  expect declaration
- `ForExpectClass`, which is common checkers and runs only for expect
  declarations

^KT-58881 Fixed
^KT-64187 Fixed

2024-01-24 10:45:00 +02:00


FIR Checkers

Checkers structure

There are six kinds of checkers:

The first three kinds are typed and may be restricted to checking only a specific type of declaration, expression, or type ref. To simplify working with checkers for different FIR elements, there are a number of typealiases:

Declarations: FirDeclarationCheckerAliases.kt
Expressions: FirExpressionCheckerAliases.kt
Type refs: FirTypeCheckerAliases.kt

The next kind, FirLanguageVersionSettingsChecker, checks language version settings independently of any particular piece of code.

The last kind, FirControlFlowChecker, is for checkers that perform Control Flow Analysis (CFA). It is supposed to work with every declaration that has its own Control Flow Graph (CFG)

Checkers contracts

All checkers are supposed to satisfy the following contracts:

checkers are stateless
checkers are independent
checkers are as specific as possible
checkers should avoid traversing the subtree of the element they check
checkers should not rely on the syntax

Those contracts imply the following:

Usually a checker is an object without any state
Each checker should work correctly even if all other checkers are disabled
If a checker is meant to check only simple functions, there is no need to parameterize it with FirDeclaration and check if the declaration is a FirSimpleFunction. Just parameterize the checker itself with FirSimpleFunction
- this is needed not only for simplification of code, but also for the sake of performance. Typed checkers are run only on elements with a suitable type. So if you declared a FirRegularClassChecker it will never be run for a FirAnonymousObject
If a checker is supposed to check anonymous initializers, it's better to create a FirAnonymousInitializerChecker which will be separately run for each init block in the class rather than creating a FirClassChecker which will manually iterate over each init block in this class. There are several reasons for that:
- the diagnostic suppression mechanism is implemented in the checkers dispatcher, so reporting something on a sub-element may cause false-positive diagnostics if there is a @Suppress annotation between the root element (passed to the checker) and the sub-element. There is a mechanism to work around this, but using it is not recommended
- checkers with smaller scope increase IDE performance because they require fewer elements to be resolved in order to check something
The FIR compiler is syntax-agnostic and can work with different parsers and syntax trees (at the moment it already supports PSI and LightTree syntax trees), so checkers should not rely on any syntax implementation details. Instead, checkers should use positioning strategies to position diagnostics more precisely on specific elements (e.g. a strategy allows rendering a diagnostic on the class name using the source of the whole class). The only exceptions to this rule are inheritors of FirSyntaxChecker, which work directly with a syntax tree (and must support several implementations for different ASTs)
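The "stateless" and "as specific as possible" contracts can be illustrated with a small self-contained model. This is only a sketch: every `Toy*` name and `runAll` are hypothetical stand-ins invented for this example and are not part of the compiler. It shows a checker that is an object without state and is parameterized by the exact element type, so the dispatcher, not the checker, filters out unsuitable elements.

```kotlin
// Hypothetical stand-ins for FIR declarations; the real hierarchy is far richer.
sealed interface ToyDeclaration { val name: String }
data class ToySimpleFunction(override val name: String, val hasBody: Boolean) : ToyDeclaration
data class ToyRegularClass(override val name: String) : ToyDeclaration

// A typed checker names the element type it handles in its type parameter.
abstract class ToyChecker<D : ToyDeclaration>(private val elementType: Class<D>) {
    abstract fun check(declaration: D, report: (String) -> Unit)

    // Dispatcher side: skip elements of an unsuitable type before calling check().
    fun runIfSuitable(element: ToyDeclaration, report: (String) -> Unit) {
        if (elementType.isInstance(element)) check(elementType.cast(element), report)
    }
}

// Stateless: an object without fields; all input arrives through parameters.
object ToyMissingBodyChecker : ToyChecker<ToySimpleFunction>(ToySimpleFunction::class.java) {
    override fun check(declaration: ToySimpleFunction, report: (String) -> Unit) {
        // No `is ToySimpleFunction` test is needed: the type parameter guarantees it.
        if (!declaration.hasBody) report("function '${declaration.name}' has no body")
    }
}

// Runs every checker on every element and gathers the reported messages.
fun runAll(checkers: List<ToyChecker<*>>, elements: List<ToyDeclaration>): List<String> {
    val messages = mutableListOf<String>()
    for (element in elements) {
        for (checker in checkers) checker.runIfSuitable(element) { messages += it }
    }
    return messages
}
```

In this model, passing a `ToyRegularClass` to `runAll` never reaches `ToyMissingBodyChecker.check`, which mirrors the statement that a typed checker runs only on elements of a suitable type.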

Checkers pipeline

All checkers are collected in special containers named DeclarationCheckers, ExpressionCheckers and TypeCheckers. Each container has fields holding a set of checkers for every checker type of the corresponding kind

There are several container groups:

Common checkers, which always run on any platform
Checkers for each specific platform (located in the corresponding :compiler:fir:checkers:checkers.platform modules)
- JVM:
- JS:
  - JsDeclarationCheckers
  - JsExpressionCheckers
- Native:
  - NativeDeclarationCheckers
  - NativeExpressionCheckers
Extended checkers. Those checkers are disabled by default and can be enabled with the -Xuse-fir-extended-checkers compiler flag. This group includes experimental and not very performant checkers, which are not crucial for regular compilation

At the beginning of compilation, in the initialization phase, all required checker containers are collected inside a session component named CheckersComponent. When the checker phase begins, the compiler creates an instance of AbstractDiagnosticCollector, which is responsible for running all checkers. The collector traverses the whole given FIR tree, builds up a CheckerContext during this traversal, and runs on each element all checkers that suit its type.
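The traversal described above can be sketched with a toy tree. Everything here is an assumption made for illustration: `ToyNode`, `ToyContext`, `ToyNodeChecker`, `ToyCheckers`, `ToyCollector` and `toyNestedClassChecker` are hypothetical names, and element types are modeled as plain strings rather than FIR classes.

```kotlin
// Hypothetical, simplified tree node; not the real FIR element API.
data class ToyNode(val kind: String, val name: String, val children: List<ToyNode> = emptyList())

// Immutable context: entering a node yields a new context instead of mutating the old one.
data class ToyContext(val containingDeclarations: List<String> = emptyList()) {
    fun entering(node: ToyNode) = copy(containingDeclarations = containingDeclarations + node.name)
}

fun interface ToyNodeChecker {
    fun check(node: ToyNode, context: ToyContext, report: (String) -> Unit)
}

// Container: sets of checkers keyed by the kind of element they apply to.
class ToyCheckers(val byKind: Map<String, Set<ToyNodeChecker>>)

class ToyCollector(private val checkers: ToyCheckers) {
    val diagnostics = mutableListOf<String>()

    fun collect(node: ToyNode, context: ToyContext = ToyContext()) {
        // Run only the checkers registered for this element kind.
        for (checker in checkers.byKind[node.kind].orEmpty()) {
            checker.check(node, context) { diagnostics += it }
        }
        // Children see a context that includes the current node.
        val inner = context.entering(node)
        node.children.forEach { collect(it, inner) }
    }
}

// Example checker that only reads the context it is given.
val toyNestedClassChecker = ToyNodeChecker { node, context, report ->
    val parent = context.containingDeclarations.lastOrNull()
    if (parent != null) report("class '${node.name}' is declared inside '$parent'")
}
```

The checker never walks the tree itself; it is invoked once per matching node and relies on the context that the collector accumulated on the way down.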

Checker Context

CheckerContext contains all information which can be used by checkers, including

session and scopeSession
the list of containingDeclarations
various information about the body which is analyzed
the stack of implicit receivers
information about suppressed diagnostics

CheckerContext is meant to be read-only for checkers

Diagnostic reporting

All diagnostics which can be reported by the compiler are stored within the FirErrors, FirJvmErrors, FirJsErrors and FirNativeErrors objects. Those diagnostics are auto-generated based on the diagnostic descriptions in one of the diagnostic lists in checkers-component-generator.

The generation is needed, because Analysis API (AA), which is used in IDE, generates a separate class for each compiler diagnostic with proper conversions of arguments for parametrized diagnostics. And the goal of the code generator is to automatically generate those classes and conversions. To run the diagnostics generation use the Generators -> Generate FIR Checker Components and FIR/IDE Diagnostics run configuration.

Diagnostic messages must be added manually to FirErrorsDefaultMessages, FirJvmErrorsDefaultMessages, FirJsErrorsDefaultMessages and FirNativeErrorsDefaultMessages respectively. Guidelines for diagnostic messages are described in the header of FirErrorsDefaultMessages

To report diagnostics, each checker takes an instance of DiagnosticReporter as a parameter. To reduce the boilerplate needed to instantiate a diagnostic from a given factory, and to ensure a diagnostic is not missed because of reporting on a null source, one should use the utilities from KtDiagnosticReportHelpers
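The boilerplate that such helpers remove can be seen in a miniature model. This is a sketch under stated assumptions, not the real helper API: `ToySource`, `ToyDiagnostic`, `ToyFactory1`, `ToyReporter` and this `reportOn` extension are hypothetical, and the sketch assumes that a null source simply means nothing is reported, which may differ from what the real utilities do.

```kotlin
// Hypothetical miniature of a source element, a diagnostic, and a one-argument factory.
data class ToySource(val file: String, val offset: Int)
data class ToyDiagnostic(val factoryName: String, val source: ToySource, val argument: String)

class ToyFactory1(val name: String) {
    fun on(source: ToySource, argument: String) = ToyDiagnostic(name, source, argument)
}

class ToyReporter {
    val reported = mutableListOf<ToyDiagnostic>()
    fun report(diagnostic: ToyDiagnostic) { reported += diagnostic }
}

// Helper: one call instead of "null-check, instantiate, report" at every call site.
fun ToyReporter.reportOn(source: ToySource?, factory: ToyFactory1, argument: String) {
    // Assumption of this sketch: an element without a source cannot carry a diagnostic.
    if (source == null) return
    report(factory.on(source, argument))
}
```

A checker in this model would call `reporter.reportOn(element.source, SOME_FACTORY, "argument")` and leave the null handling to the single helper.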

FIR contracts at checker stage

In CLI mode the compiler runs checkers only after it has analyzed the whole world up to the final FIR phase (BODY_RESOLVE). But the IDE uses lazy resolve, so there can be a situation when some files have been analyzed to BODY_RESOLVE and other files have not been analyzed at all. This means that in a checker one can not rely on the fact that some FIR elements should have been resolved to some specific phase. The only exception is the following: If some element was passed directly to the checker then it is guaranteed that this element is already resolved to the BODY_RESOLVE phase. If some declaration is received somewhere from outside (from a type, a symbol provider or a scope), then it could have been resolved up to an arbitrary phase.

So, to avoid possible problems with accessing some information from FIR elements which was not yet calculated in the AA mode, there are the following restrictions and recommendations:

Access to FirBasedSymbol<*>.fir is prohibited. One can not extract any FIR element from the corresponding symbol
Instead of that, if some information about the declaration is needed (e.g., the list of supertypes for some class symbol), special accessors from that symbol should be used (they are declared as members of symbols). Those accessors call lazy resolution to the least required phase and after that extract the required information from FIR
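The idea of accessors that resolve lazily "to the least required phase" can be modeled as follows. The names `ToyPhase`, `ToyClassFir`, `ToyClassSymbol` and `resolvedSupertypes` are hypothetical and exist only in this sketch; the real phases and symbol accessors are defined in the compiler.

```kotlin
// Hypothetical phases; the real compiler has many more resolve phases.
enum class ToyPhase { RAW, SUPER_TYPES, BODY_RESOLVE }

// Stand-in for a FIR declaration whose data becomes available phase by phase.
class ToyClassFir(val name: String, private val declaredSupertypes: List<String>) {
    var phase = ToyPhase.RAW
        private set
    var supertypes: List<String>? = null
        private set

    fun resolveTo(target: ToyPhase) {
        if (phase >= target) return // already resolved far enough
        if (target >= ToyPhase.SUPER_TYPES) supertypes = declaredSupertypes
        phase = target
    }
}

// Stand-in for a symbol: checkers see accessors, never the FIR element itself.
class ToyClassSymbol(private val fir: ToyClassFir) {
    val resolvedSupertypes: List<String>
        get() {
            // Resolve only as far as this piece of information requires.
            fir.resolveTo(ToyPhase.SUPER_TYPES)
            return fir.supertypes.orEmpty()
        }

    val currentPhase: ToyPhase get() = fir.phase
}
```

Because `fir` is private, a checker in this model cannot read `supertypes` before they are computed; it can only go through the accessor, which triggers the required resolution first.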

Resolution diagnostics

While all checkers are run after resolution of the code is finished, some diagnostics can be actually detected only during resolution, such as

inference errors (type mismatch, no information for type parameter)
call resolution errors (overload resolution ambiguity)
type resolution errors (cycle in supertypes)
visibility errors (invisible reference)
etc

And at the same time, there is a contract that FIR resolution is side effect free (not very formal but still) and produces only a resolved FIR tree. So diagnostics can not be reported from resolution directly.

To support such diagnostics, there is the following mechanism:

some FIR nodes (mostly with the word Error in the name, like FirResolvedErrorReference) have a property that contains a ConeDiagnostic
ConeDiagnostic is an indicator that something went wrong during resolution
- there are many kinds of ConeDiagnostic covering the possible problems, see ConeDiagnostics.kt
A ConeDiagnostic is saved in the FIR tree, and then a special checker component (ErrorNodeDiagnosticCollectorComponent) checks all FIR nodes and reports proper diagnostics based on the ConeDiagnostic found
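The two-step mechanism above (store a marker during resolution, report it later) can be sketched in a few lines. All names here (`ToyConeDiagnostic`, `ToyUnresolvedName`, `ToyAmbiguity`, `ToyReference`, `toyResolve`, `collectErrors`) are hypothetical, and the message strings are illustrative only.

```kotlin
// Hypothetical markers that resolution stores in the tree instead of reporting.
sealed interface ToyConeDiagnostic
data class ToyUnresolvedName(val name: String) : ToyConeDiagnostic
data class ToyAmbiguity(val name: String, val candidates: Int) : ToyConeDiagnostic

sealed interface ToyReference
data class ToyResolvedReference(val target: String) : ToyReference
data class ToyErrorReference(val diagnostic: ToyConeDiagnostic) : ToyReference

// Side-effect-free "resolution": it returns a node and reports nothing.
// `declared` maps a name to the number of candidates found for it.
fun toyResolve(name: String, declared: Map<String, Int>): ToyReference =
    when (val count = declared[name] ?: 0) {
        0 -> ToyErrorReference(ToyUnresolvedName(name))
        1 -> ToyResolvedReference(name)
        else -> ToyErrorReference(ToyAmbiguity(name, count))
    }

// Later pass: inspect the nodes and turn stored markers into user-facing messages.
fun collectErrors(nodes: List<ToyReference>): List<String> =
    nodes.filterIsInstance<ToyErrorReference>().map { node ->
        when (val d = node.diagnostic) {
            is ToyUnresolvedName -> "unresolved reference: ${d.name}"
            is ToyAmbiguity -> "ambiguity: ${d.name} (${d.candidates} candidates)"
        }
    }
```

Resolution stays a pure function from input to tree, and the only place that produces messages is the later collection pass.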

Platform and Common checkers

In the MPP compilation, the same type may be resolved to different classes depending on the use-site session, if this type is based on the expect classifier. This implies that the same checker may produce different results depending on the use-site session:

```kotlin
// MODULE: common
expect interface A

class B : A

// MODULE: platform()()(common)
actual interface A {
    fun foo()
}
```

In this example class B is located in the common module, and from this module's point of view there are no problems with this class. But after actualization, supertype A resolves to actual interface A, which brings the abstract fun foo() into scope, so class B becomes incorrect, as it doesn't implement this abstract function.

To cover this problem, all checkers are split into two groups: Common and Platform (see the MppCheckerKind enum)

MppCheckerKind.Common means that the checker should run from the session to which the corresponding declaration belongs
MppCheckerKind.Platform means that in MPP compilation the checker should run with the session of the leaf platform module for the sources of all modules

So the author of each new checker should decide in which session this checker should run and properly set the MppCheckerKind in the checker declaration. There are some hints that may help to decide:

if the checker is not interested in the scope of some class, acquired from some type/scope/provider, it should be Common
if the checker is interested in class symbol of some type, but there is no difference for it how this class/typealias can be expanded, it most likely should be Common
if the checker is interested in the scope of some type, it should be carefully considered how the actualization of the scope may affect the checker
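How the checker kind selects a session, and why the result can differ, can be sketched with a toy model of the example above. `ToyCheckerKind` only mirrors the two values described for MppCheckerKind; `ToySession`, `ToyClassInfo`, `sessionFor` and `notImplemented` are hypothetical names, and a "session" is reduced to its view of the abstract members of interface A.

```kotlin
// Hypothetical mirror of the two kinds described above (the real enum is MppCheckerKind).
enum class ToyCheckerKind { Common, Platform }

// A "session" here is just a name plus its view of which abstract members A has.
data class ToySession(val name: String, val abstractMembersOfA: List<String>)

// A class that has A as a supertype, the module it lives in, and what it implements.
data class ToyClassInfo(val name: String, val module: String, val implemented: List<String>)

// Pick the session that a checker of the given kind should run from.
fun sessionFor(
    kind: ToyCheckerKind,
    declaration: ToyClassInfo,
    sessions: Map<String, ToySession>,
    leafPlatform: String
): ToySession = when (kind) {
    ToyCheckerKind.Common -> sessions.getValue(declaration.module) // the declaration's own session
    ToyCheckerKind.Platform -> sessions.getValue(leafPlatform)     // the leaf platform session
}

// The same check yields different results depending on the session it runs from.
fun notImplemented(declaration: ToyClassInfo, session: ToySession): List<String> =
    session.abstractMembersOfA.filter { it !in declaration.implemented }
```

With a common session where A has no members and a platform session where A declares `foo`, the same `notImplemented` check is silent for class B under `Common` and finds `foo` under `Platform`.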

Checkers for expect classes

```kotlin
// MODULE: common
expect interface A
expect class B : A

class C : A

// MODULE: platform()()(common)
actual interface A {
    fun foo()
}

actual class B : A {
    override fun foo() {}
}
```

In this example we want to report "abstract foo not implemented" on class C, but we don't want to report it on expect class B (as its supertype is always expect A, never actual A)

So to cover such cases, it's worth splitting the platform checker into two parts:

Regular, which is a platform checker and runs for everything except expect declarations
ForExpectClass, which is a common checker and runs only for expect declarations

As an example, see the implementation of FirImplementationMismatchChecker
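The shape of such a split can be sketched with a toy model. This is not the code of FirImplementationMismatchChecker; `ToyClassDecl`, `ToyView`, `ToyImplementationChecker` and `runBoth` are hypothetical names, and the two halves share one check while differing only in which declarations they accept and which session they run from.

```kotlin
// Hypothetical model of a class declaration that may be an `expect` declaration.
data class ToyClassDecl(val name: String, val isExpect: Boolean, val implemented: Set<String>)

// Abstract members of the supertype as seen from one session.
data class ToyView(val sessionName: String, val abstractMembers: Set<String>)

// Shared logic; each half decides which declarations it is responsible for.
sealed class ToyImplementationChecker(val runsFromPlatformSession: Boolean) {
    abstract fun accepts(declaration: ToyClassDecl): Boolean

    fun check(declaration: ToyClassDecl, view: ToyView): List<String> {
        if (!accepts(declaration)) return emptyList()
        return (view.abstractMembers - declaration.implemented)
            .map { "${declaration.name}: abstract $it not implemented" }
    }

    // Platform half: everything except expect declarations.
    object Regular : ToyImplementationChecker(runsFromPlatformSession = true) {
        override fun accepts(declaration: ToyClassDecl) = !declaration.isExpect
    }

    // Common half: only expect declarations.
    object ForExpectClass : ToyImplementationChecker(runsFromPlatformSession = false) {
        override fun accepts(declaration: ToyClassDecl) = declaration.isExpect
    }
}

// Runs both halves, giving each one the view of the session it belongs to.
fun runBoth(declaration: ToyClassDecl, common: ToyView, platform: ToyView): List<String> =
    listOf(ToyImplementationChecker.Regular, ToyImplementationChecker.ForExpectClass).flatMap { checker ->
        checker.check(declaration, if (checker.runsFromPlatformSession) platform else common)
    }
```

In this model class C is checked against the platform view and gets the "not implemented" report, while expect class B is checked only against the common view, where A has no abstract members.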
