Discussion:
RFR: JDK-8212028: Use run-test makefile framework for testing in Oracle's Mach5
Erik Joelsson
2018-10-11 22:29:38 UTC
Permalink
Hello,

(adding serviceability-dev and hotspot-dev for test changes)

Bug: https://bugs.openjdk.java.net/browse/JDK-8212028

Webrev: http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html
(From ihse-runtestprebuilt-branch in jdk-sandbox)

In order to fully adopt the new run-test framework, we need to switch
over the automated and distributed testing system at Oracle to the new
framework. To get this to work, there are number of issues that needed
to be fixed. Here follows a brief explanation, see bug for more details.

For RunTest.gmk and related makefiles there are a number of minor tweaks
to support all the necessary control variables that are currently used
for the old test makefiles, as well as correcting some test setup settings.

In addition to that, some tests also needed to be modified:

Timeouts
The current default timeoutFactor in the makefiles is 4. However, the
old Mach5 executor overrides that to 10. I don't think it should dabble
with such things and leave it to the makefiles, the user, or a specific
job definition, so with the new run-test executor, it no longer does.
This means many tests now have a much shorter effective timeout. Because
of this, we need to increase the timeout on some that are now prone to
timing out. I have run tier1-5 a few times to try and find these and
added /timeout=300 (which will result in the same effective timeout as
before) when specific tests seemed problematic.

test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the attach
api, by looking for a unique token in the command line string of the
spawned JVM. The problem is that the command line string it gets from
the attach api is truncated and the token is last on the command line.
This normally works well, but the arguments before it are 3 files, with
full absolute paths inside the jtreg work directory. With Mach5 we have
pretty deep work directories, and with run-test, we make them even
deeper. This unfortunately trips the limit and the test fails. I have
fixed this by reordering the arguments to the child process.

/Erik
David Holmes
2018-10-11 22:40:34 UTC
Permalink
Hi Erik,
Post by Erik Joelsson
Hello,
(adding serviceability-dev and hotspot-dev for test changes)
Bug: https://bugs.openjdk.java.net/browse/JDK-8212028
Webrev: http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html
(From ihse-runtestprebuilt-branch in jdk-sandbox)
In order to fully adopt the new run-test framework, we need to switch
over the automated and distributed testing system at Oracle to the new
framework. To get this to work, there are number of issues that needed
to be fixed. Here follows a brief explanation, see bug for more details.
For RunTest.gmk and related makefiles there are a number of minor tweaks
to support all the necessary control variables that are currently used
for the old test makefiles, as well as correcting some test setup settings.
Timeouts
The current default timeoutFactor in the makefiles is 4. However, the
old Mach5 executor overrides that to 10. I don't think it should dabble
with such things and leave it to the makefiles, the user, or a specific
job definition, so with the new run-test executor, it no longer does.
This means many tests now have a much shorter effective timeout. Because
of this, we need to increase the timeout on some that are now prone to
timing out. I have run tier1-5 a few times to try and find these and
added /timeout=300 (which will result in the same effective timeout as
before) when specific tests seemed problematic.
This should be fixed in the tier job definitions not the individual
tests. We have moved away from putting explicit timeouts on individual
tests and instead rely on the framework timeout being set appropriately.

David
-----
Post by Erik Joelsson
test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the attach
api, by looking for a unique token in the command line string of the
spawned JVM. The problem is that the command line string it gets from
the attach api is truncated and the token is last on the command line.
This normally works well, but the arguments before it are 3 files, with
full absolute paths inside the jtreg work directory. With Mach5 we have
pretty deep work directories, and with run-test, we make them even
deeper. This unfortunately trips the limit and the test fails. I have
fixed this by reordering the arguments to the child process.
/Erik
Magnus Ihse Bursie
2018-10-12 08:37:07 UTC
Permalink
Hi Erik,

Thank you for preserving through this, so we finally can move to 100%
run-test!
Post by Erik Joelsson
Hello,
(adding serviceability-dev and hotspot-dev for test changes)
Bug: https://bugs.openjdk.java.net/browse/JDK-8212028
Webrev: http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html
(From ihse-runtestprebuilt-branch in jdk-sandbox)
Build changes look good. And I agree with the solution to add longer
timeout to individual tests.
Post by Erik Joelsson
In order to fully adopt the new run-test framework, we need to switch
over the automated and distributed testing system at Oracle to the new
framework. To get this to work, there are number of issues that needed
to be fixed. Here follows a brief explanation, see bug for more details.
For RunTest.gmk and related makefiles there are a number of minor
tweaks to support all the necessary control variables that are
currently used for the old test makefiles, as well as correcting some
test setup settings.
Timeouts
The current default timeoutFactor in the makefiles is 4. However, the
old Mach5 executor overrides that to 10. I don't think it should
dabble with such things and leave it to the makefiles, the user, or a
specific job definition, so with the new run-test executor, it no
longer does. This means many tests now have a much shorter effective
timeout. Because of this, we need to increase the timeout on some that
are now prone to timing out. I have run tier1-5 a few times to try and
find these and added /timeout=300 (which will result in the same
effective timeout as before) when specific tests seemed problematic.
test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the
attach api, by looking for a unique token in the command line string
of the spawned JVM. The problem is that the command line string it
gets from the attach api is truncated and the token is last on the
command line. This normally works well, but the arguments before it
are 3 files, with full absolute paths inside the jtreg work directory.
With Mach5 we have pretty deep work directories, and with run-test, we
make them even deeper. This unfortunately trips the limit and the test
fails. I have fixed this by reordering the arguments to the child
process.
... but I'm not sure I understand this. Is it a command-line length
we're hitting? If so, how can reordering the arguments solve anything? I
understand why this can preserve the token, but will not one of the
paths be cut of instead?

/Magnus
Post by Erik Joelsson
/Erik
Erik Joelsson
2018-10-12 16:16:43 UTC
Permalink
Post by David Holmes
Hi Erik,
Thank you for preserving through this, so we finally can move to 100%
run-test!
Post by Erik Joelsson
Hello,
(adding serviceability-dev and hotspot-dev for test changes)
Bug: https://bugs.openjdk.java.net/browse/JDK-8212028
http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html (From
ihse-runtestprebuilt-branch in jdk-sandbox)
Build changes look good. And I agree with the solution to add longer
timeout to individual tests.
Thanks!
Post by David Holmes
Post by Erik Joelsson
test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the
attach api, by looking for a unique token in the command line string
of the spawned JVM. The problem is that the command line string it
gets from the attach api is truncated and the token is last on the
command line. This normally works well, but the arguments before it
are 3 files, with full absolute paths inside the jtreg work
directory. With Mach5 we have pretty deep work directories, and with
run-test, we make them even deeper. This unfortunately trips the
limit and the test fails. I have fixed this by reordering the
arguments to the child process.
... but I'm not sure I understand this. Is it a command-line length
we're hitting? If so, how can reordering the arguments solve anything?
I understand why this can preserve the token, but will not one of the
paths be cut of instead?
I will try to be clearer. The command line is fine for running the child
process. The truncation happens when the attach api is used to try to
find the child process. It basically calls something that lists all JVMs
on system, iterates through them and looks at the "description" field.
This field happens to contain the first X characters of the command line
so the test looks for the unique token it knows it put there. (I don't
know the exact value of X, but could find out). Until now, X was big
enough to fit the whole command line, but with the 3 absolute path files
in there now growing long enough, the last argument is pushed out of the
description field. This means the test process is unable to locate the
child process in the list of JVMs. What then happens is the test keeps
on looking, taking new snapshots of all JVMs on the system until the
test finally times out. By reordering the arguments, the token is less
likely to be cut out of the description field.

/Erik
Magnus Ihse Bursie
2018-10-12 17:07:08 UTC
Permalink
Post by Erik Joelsson
Post by David Holmes
Hi Erik,
Thank you for preserving through this, so we finally can move to 100% run-test!
Post by Erik Joelsson
Hello,
(adding serviceability-dev and hotspot-dev for test changes)
Bug: https://bugs.openjdk.java.net/browse/JDK-8212028
Webrev: http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html (From ihse-runtestprebuilt-branch in jdk-sandbox)
Build changes look good. And I agree with the solution to add longer timeout to individual tests.
Thanks!
Post by David Holmes
Post by Erik Joelsson
test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the attach api, by looking for a unique token in the command line string of the spawned JVM. The problem is that the command line string it gets from the attach api is truncated and the token is last on the command line. This normally works well, but the arguments before it are 3 files, with full absolute paths inside the jtreg work directory. With Mach5 we have pretty deep work directories, and with run-test, we make them even deeper. This unfortunately trips the limit and the test fails. I have fixed this by reordering the arguments to the child process.
... but I'm not sure I understand this. Is it a command-line length we're hitting? If so, how can reordering the arguments solve anything? I understand why this can preserve the token, but will not one of the paths be cut of instead?
I will try to be clearer. The command line is fine for running the child process. The truncation happens when the attach api is used to try to find the child process. It basically calls something that lists all JVMs on system, iterates through them and looks at the "description" field. This field happens to contain the first X characters of the command line so the test looks for the unique token it knows it put there. (I don't know the exact value of X, but could find out). Until now, X was big enough to fit the whole command line, but with the 3 absolute path files in there now growing long enough, the last argument is pushed out of the description field. This means the test process is unable to locate the child process in the list of JVMs. What then happens is the test keeps on looking, taking new snapshots of all JVMs on the system until the test finally times out. By reordering the arguments, the token is less likely to be cut out of the description field.
I see. Then your fix makes totally sense.

/Magnus
Post by Erik Joelsson
/Erik
Loading...