This doesn't reduce the quality of tests, because the flags are still
printed for declarations themselves. We only omit them in references.
However, this makes the tests more compatible with non-JVM backends
(see KT-58605), because flags of referenced stdlib declarations may
differ among target platforms.
But not substitution overrides! This is important if the method called
is an intersection override where one of the intersected types is a
subtype of a generic type.
In tests merged from boxAgainstJava in 29b96aa1, some directories were
named slightly differently compared to box, e.g. "property" vs
"properties", "varargs" vs "vararg". This change renames these, moves
some of the tests to more fitting directories, and also renames
"visibility" to "javaVisibility" because it's about Java visibilities
specifically.