When we inline an anonymous object which captures something such as
crossinline values or reified parameters, we copy and transform its
metadata in `AnonymousObjectTransformer.transformMetadata`. Basically we
read the metadata of the original class, add a minor protobuf extension
and write it to the new class.
This also includes copying the string table. We read the string table
into `JvmNameResolver` (a representation of string table used in
deserialization), then construct a `JvmStringTable` (a representation
used in _serialization_) and then write it back.
There's a few optimizations in the string table representation in JVM
metadata which allow to store less strings and thus take less space. See
`StringTableTypes.Record` in `jvm_metadata.proto` for more information.
One of the optimizations `Record.range` allows to avoid storing the same
record many times in a sequence. For example, if we have N different
strings in the string table but none of them require any operation (such
as substring, char replacement, etc.), then we only store the record
with all default values (no operation, no predefined string, etc.) and
set its `range` to N. Upon reading such optimized record list in
`JvmNameResolver`, we "expand" it back to normal, so that we could index
it quickly and figure out what operation needs to be performed on each
string from the string table.
The problem was that when we expanded this list, we didn't set the range
of the expanded record entry to 1. So each record in
`JvmNameResolver.records` still has its original range. It doesn't cause
any problems most of the time because the range in this expanded list is
almost unused. However, when copying/transforming metadata for anonymous
objects, we mistakenly passed this expanded list with incorrect ranges
to `JvmStringTable`. So the metadata in the copied anonymous object
ended up being incorrect: each record now was present the number of
times equal to its range. Copying such metadata once again led to
another multiplication of the record list size. Multiple copies resulted
in exponential increase in memory consumption and quickly led to OOM.
For the fix, we now take the original, unexpanded list of records when
creating `JvmStringTable` out of `JvmNameResolver` for transformation of
anonymous object metadata.
Note that another possible fix would be to make range for each record in
`JvmNameResolver.records` equal to 1. This is undesirable though, since
then we'd need to copy each `JvmProtoBuf.StringTableTypes.Record`
instance, of which there could be many, and use some memory for no
apparent gain (since ranges in that expanded list are now not used at
all).
#KT-38197 Fixed
The fields containing crossinline lambdas should be package-private to
avoid generating synthetic accessors, which break object regeneration.
Note that the inline methods cannot actually be called, as call sites
will attempt to read the captured lambda from a field through a *copy*
of the local containing the object, so these reads will not be inlined,
causing an exception at runtime:
inline fun f(crossinline g: () -> Unit) = object : I {
inline fun h() = g()
// effectively `val tmp = this; return tmp.$g()`:
override fun run() = h()
}
f {}.run() // NoSuchFieldError: $g
This particular example can be fixed by reusing locals for receiver
parameters in IrInlineCodegen, but explicitly assigning `this` to
another variable and calling an inline method on it will break it again.
(This is only applicable to the JVM_IR backend, as the non-IR one fails
to generate `f` at all for some other reason.)
As for SAM wrappers, the bytecode sequence
new A
dup
new B
dup
invokespecial B.<init>
invokespecial A.<init>
breaks the inliner, so instead we do
new B
dup
invokespecial B.<init>
store x
new A
dup
load x
invokespecial A.<init>
Otherwise, the cached instances cannot be reused for different wrapped
types. Also, if the wrapped type is regenerated during inlining, the
inliner would produce a call to a nonexistent constructor that takes the
regenerated type as an argument.
To avoid bytecode sequences like
new _1Kt$sam$i$java_lang_Runnable$0
dup
new _1Kt$f$1
dup
invokespecial _1Kt$f$1.<init>()V
invokespecial _1Kt$sam$i$java_lang_Runnable$0.<init>(...)V
as the different order of `new` and `<init>` confuses the inliner.
Namely, anonymous objects defined in lambdas that have all captured
variables as loose fields instead of a single reference to the parent.
The question is, when a lambda inside an inline function defines an
anonymous object, and that object is not regenerated during codegen for
the inline function itself, but then has to be regenerated at call site
anyway, do we use an outer `this` or loose capture fields? For example,
before KT-28064:
inline fun f1(g: () -> Unit) = object { g() }
// -> f1$1 { $g: () -> Unit }
inline fun f2(g: () -> Unit) = f1 { object { g() } }
// -> f2$$inlined$f1$1 { $g: () -> Unit }
// f2$$inlined$f1$1$lambda$1 { this$0: f2$$inlined$f1$1 }
inline fun f3(g: () -> Unit) = f2 { object { g() } }
// -> f3$$inlined$f2$1 { $g: () -> Unit }
// f3$$inlined$f2$1$1 { this$0: f3$$inlined$f2$1 }
// f3$$inlined$f2$1$1$lambda$1 { this$0: f3$$inlined$f2$1$1 }
After KT-28064:
inline fun f2(g: () -> Unit) = f1 { object { g() } }
// -> f2$$inlined$f1$1 { $g: () -> Unit }
// f2$1$1 { $g: () -> Unit }
inline fun f3(g: () -> Unit) = f2 { object { g() } }
// -> f3$$inlined$f2$1 { $g: () -> Unit }
// f3$$inlined$f2$2 { ??? }
// f3$1$1 { $g: () -> Unit }
Should `???` be `this$0: f3$$inlined$f2$1` or `$g: () -> Unit`? This
commit chooses the latter for KT-28064 bytecode and keeps `this$0` when
inlining the old bytecode.
This patch mutes the following test categories:
* Tests with java dependencies (System class,
java stdlib, jvm-oriented annotations etc).
* Coroutines tests.
* Reflection tests.
* Tests with an inheritance from the standard
collections.
The problem was that anonymous classes wasn't regenerated
although they capture another anonymous class that is a subject
for regeneration
#KT-8689 Fixed