RFR(XS): 8215042: Move serviceability/sa tests from tier1 to tier3.

Discussion:

Leonid Mesnik

2018-12-08 05:53:06 UTC

Hi

Could you please review the following fix, which moves the SA tests from tier1 to tier3? There are some bugs which can cause intermittent failures of any SA test. Tests that fail intermittently are not stable enough for tier1.
However, the failures are not very frequent. Also, I don't think that putting all the tests in ProblemList.txt is a very good idea, because it would leave SA without any testing at all.
So now all SA tests are included in the hotspot_tier3_runtime group.

webrev: http://cr.openjdk.java.net/~lmesnik/8215042/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8215042

Leonid

JC Beyler

2018-12-08 06:07:18 UTC

Hi Leonid,

I cannot comment on whether it is a good idea to put the tests in tier3 but
the webrev does look good to achieve that :) So LGTM as far as it seems to
do what you intended :)

Thanks,
Jc


g***@oracle.com

2018-12-08 11:05:59 UTC

Looks OK to me.


David Holmes

2018-12-08 11:18:34 UTC

Hi Leonid,

My concern here, if we care about keeping the SA operational, is that in
tier3 these tests will not be covered by the jdk/submit testing process.

David


David Holmes

2018-12-10 07:29:16 UTC

Hi Leonid,

If there are specific unstable SA tests then they should be
problem-listed and/or excluded from tier1 and put in a later tier. Let's
not remove all SA testing from tier1 just because of a handful of issues.

Thanks,
David

David, Jini
I understand your concerns. But the original idea of tiered testing is
that tier1 failures are treated as urgent issues to be resolved. [1]
Here is the list of test failures for 1000 runs of tier1 tests in Mach5. (I
am not able to provide a link here.) Please note that all SA tests are
already excluded on Solaris and Mac OS X.
1 compiler/aot/calls/fromAot/AotInvokeSpecial2AotTest.java
2 serviceability/sa/ClhsdbFindPC.java
3 serviceability/sa/TestPrintMdo.java
4 serviceability/sa/ClhsdbJstack.java
5 serviceability/sa/ClhsdbJdis.java
6 compiler/c2/Test8004741.java
7 runtime/handshake/HandshakeWalkSuspendExitTest.java
8 runtime/handshake/HandshakeWalkSuspendExitTest.java
9 compiler/aot/calls/fromAot/AotInvokeVirtual2AotTest.java
10 runtime/handshake/HandshakeWalkExitTest.java
11 runtime/handshake/HandshakeWalkSuspendExitTest.java
12 serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java
13 serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java
14 compiler/aot/calls/fromAot/AotInvokeVirtual2AotTest.java
The failures in 'runtime/handshake/' are related to
https://bugs.openjdk.java.net/browse/JDK-8214174 but should also be
fixed/excluded. The SA tests are also unstable and there are no plans to fix
them soon.
So it means that tier1 is going to be unstable for a long time.
One possible way to make tier1 more stable would be to run only some
very basic SA sanity tests in tier1. Perhaps we could develop a new sanity
test which has some failover for existing SA bugs.
Leonid
[1] http://mail.openjdk.java.net/pipermail/jdk9-dev/2015-March/001991.html
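
For readers who want the breakdown of the failure list above by test area, here is a minimal sketch that tallies the 14 entries. The test paths are copied from the list; the grouping by the first two path components and the names FAILURES, area and tally are assumptions made for illustration, not part of any JDK tooling.

```python
from collections import Counter

# The 14 failures reported for 1000 tier1 runs, copied from the list above.
FAILURES = [
    "compiler/aot/calls/fromAot/AotInvokeSpecial2AotTest.java",
    "serviceability/sa/ClhsdbFindPC.java",
    "serviceability/sa/TestPrintMdo.java",
    "serviceability/sa/ClhsdbJstack.java",
    "serviceability/sa/ClhsdbJdis.java",
    "compiler/c2/Test8004741.java",
    "runtime/handshake/HandshakeWalkSuspendExitTest.java",
    "runtime/handshake/HandshakeWalkSuspendExitTest.java",
    "compiler/aot/calls/fromAot/AotInvokeVirtual2AotTest.java",
    "runtime/handshake/HandshakeWalkExitTest.java",
    "runtime/handshake/HandshakeWalkSuspendExitTest.java",
    "serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java",
    "serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java",
    "compiler/aot/calls/fromAot/AotInvokeVirtual2AotTest.java",
]


def area(path):
    """Return the first two path components (hypothetical grouping key)."""
    return "/".join(path.split("/")[:2])


def tally(failures):
    """Count failures per test area."""
    return Counter(area(p) for p in failures)


if __name__ == "__main__":
    for name, count in tally(FAILURES).most_common():
        print(f"{name}: {count}")
```

Grouped this way, serviceability/sa accounts for 6 of the 14 failures, runtime/handshake for 4, and the compiler tests for the remaining 4.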

Post by Jini
Hi Leonid,
I agree with David. I am also concerned about us not detecting SA
breakages (which could happen along with hotspot changes) soon enough.
(Which was the primary reason to get these tests in).
Thank you,
Jini.


Leonid Mesnik

2018-12-10 07:42:52 UTC

Hi

Here is a summary of such "non test-specific" bugs.

The following bugs affect all (or almost all) tests:
JDK-8202884 <https://bugs.openjdk.java.net/browse/JDK-8202884> SA: Attach/detach might fail on Linux if debugee application create/destroy threads during attaching
JDK-8204994 <https://bugs.openjdk.java.net/browse/JDK-8204994> SA might fail to attach to process with "Windbg Error: WaitForEvent failed"
JDK-8197591 <https://bugs.openjdk.java.net/browse/JDK-8197591> Tests failing with App waiting timeout
JDK-8203364 <https://bugs.openjdk.java.net/browse/JDK-8203364> Some serviceability/sa/ tests intermittently fail with java.io.IOException: LingeredApp terminated with non-zero exit code 3
and the similar bug for Solaris-SPARC:
JDK-8193639 <https://bugs.openjdk.java.net/browse/JDK-8193639> tests failing intermittently with Error attaching to process: Can't create thread_db agent!

SA might fail to connect to any process because of JDK-8202884 and JDK-8204994. Also, all tests relying on LingeredApp might be affected by JDK-8197591 and JDK-8203364.

You have seen a lot of failures caused by https://bugs.openjdk.java.net/browse/JDK-8202884 because it is a Linux-specific bug which is often reproduced on multi-core hosts.

Leonid

Post by Jini
Hi Leonid,
It looks like the SA failures here are all due to https://bugs.openjdk.java.net/browse/JDK-8202884. Do let me know if I am mistaken. We will work on fixing that issue faster.
Thanks,
Jini.
