Discussion:
RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false
(too old to reply)
g***@oracle.com
2018-05-26 10:50:39 UTC
Permalink
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.

  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/


-------- Forwarded Message --------
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
From: Gary Adams <***@oracle.com>
Reply-To: ***@oracle.com





The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" <location>
<prompt>

the log contained
"Breakpoint hit:" <prompt> <location>

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
s***@oracle.com
2018-05-29 18:43:11 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
This looks good to me.<br>
But I've added Tim Bell to the mailing list to know his opinion.<br>
<br>
Hi Tim,<br>
<br>
Could you, please, take a look at this?<br>
What do you thinks about this approach to fix test issues?<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 5/26/18 03:50, <a class="moz-txt-link-abbreviated" href="mailto:***@oracle.com">***@oracle.com</a> wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com">
<meta http-equiv="Context-Type" content="text/html; charset=utf-8">
This is a review request for a previously closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR:
Cannot find boolVar with expected value: false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com"
moz-do-not-send="true">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</body>
</html>
Chris Plummer
2018-06-01 22:54:26 UTC
Permalink
Hi Gary,

I'm a bit unclear on one point. According to your description, should we
be seeing two prompts for commands like "cont", and "step"? When you say
we are expecting:

  "Breakpoint hit:" <location>
   <prompt>

Isn't it really:

   <prompt>
  "Breakpoint hit:" <location>
   <prompt>

I figure this must be the case, because your fix removes the printing of
a prompt, and we still need one prompt, so there must have been two to
start with. I think in the above the last <prompt> is being produced
sequentially right after the "Breakpoint hit:" output, so is never an
issue. The first <prompt> comes from the printPrompt() call after
issuing the "cont" command, and it's possible to hit the breakpoint and
start producing more output before printPrompt() has completed, thus
causing the problem you are seeing.

There seems to be some purpose to printing this first prompt and I'm not
so sure eliminating it is the right thing to do. I think it might be
there to indicate that the command has been initiated, because there
could be some delay before the breakpoint is hit. So the issue then is
how to make sure this prompt appears in the input stream before any of
the "Step completed:" or "Breakpoint hit:" output. Possibly event
processing should be blocked until executeCommand() completes.

thanks,

Chris
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot
find boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:" <location>
<prompt>
the log contained
"Breakpoint hit:" <prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
g***@oracle.com
2018-06-02 11:02:55 UTC
Permalink
Post by s***@oracle.com
Hi Gary,
I'm a bit unclear on one point. According to your description, should
we be seeing two prompts for commands like "cont", and "step"? When
  "Breakpoint hit:" <location>
   <prompt>
   <prompt>
  "Breakpoint hit:" <location>
   <prompt>
Let me stretch out the log a little bit
The <prompt> is main[1] or Thread-1[1], etc.
and not distinguished for command or event completion.

<prompt>
   stop at <location1>
<prompt-after-stop-command>
   stop at <location2>
<prompt-after-stop-command>
   cont
<prompt=after-cont-command>
   cont
   "Breakpoint hit:" <location1>
<prompt-after-cont-command>
  "Breakpoint hit:"
<prompt-after-breakpoint-event-location1>
  <location2>
<prompt-after-breakpoint-event-location2>

The source of the command or event prompts
was verified with a stacktrace in the printPrompt()
method.
Post by s***@oracle.com
I figure this must be the case, because your fix removes the printing
of a prompt, and we still need one prompt, so there must have been two
to start with. I think in the above the last <prompt> is being
produced sequentially right after the "Breakpoint hit:" output, so is
never an issue. The first <prompt> comes from the printPrompt() call
after issuing the "cont" command, and it's possible to hit the
breakpoint and start producing more output before printPrompt() has
completed, thus causing the problem you are seeing.
printPrompt() is a very quick operation and is not being interrupted.

But the output from the event handler is produced incrementally.
   - event header, e.g. "Breakpont hit:", "Step completed:", etc.
   - from vmInterrupted()
     - printCurrentLocation()
     - monitor command output (if any)
     - printPrompt()
Post by s***@oracle.com
There seems to be some purpose to printing this first prompt and I'm
not so sure eliminating it is the right thing to do. I think it might
be there to indicate that the command has been initiated, because
there could be some delay before the breakpoint is hit. So the issue
then is how to make sure this prompt appears in the input stream
before any of the "Step completed:" or "Breakpoint hit:" output.
Possibly event processing should be blocked until executeCommand()
completes.
The primary purposes of prompts is for "user interactive" nudge to type
another command.
But the failures are from tests that are using the presence of a prompt
as an ACK of the
last command. This "prompt-protocol", does not include any sequence
numbers ,
does not pair prompt to command or event, and does not block during
any multipart message formation.

To prevent the error we are observing we would have to buffer all of the
event output

   "Breakpoint hit:" <location> <monitored-commands>
    <prompt for breakpoint event>

Another way to look at the current failure is to say the event handler
output
has been interrupted by a command sent prematurely. This would never
be seen at user speeds typing in commands.

I should also mention any other output from the user application can also
interfere with the command and event output streams. I have seen
stray prompts in the logs from System.out.println(). That was the
reason earlier I was looking at changing the tests to look for
specific prompts before continuing to the next command.

\gra
Post by s***@oracle.com
thanks,
Chris
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Cannot find boolVar with expected value: false
Date:     Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
   "Breakpoint hit:" <location>
    <prompt>
the log contained
   "Breakpoint hit:" <prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
  redefine001, caught_exception002, locals002, eval001, next001,
  stop_at003, step002, print002, trace001, step_up001, read001,
  clear004, kill001, set001
Chris Plummer
2018-06-06 18:20:38 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">If that were the case, shouldn't the
delay be added to sendCommand() instead?<br>
<br>
I'm still a bit unsure about how this sequence is suppose to work
in the first place:<br>
<br>
         sendCommand(command);
<br>
         return receiveReply(startPos, compoundPromptOnly, count);
<br>
<br>
I guess the assumption is that the first prompt will always be a
simple one? Is that the case? If so, then the expectation is that
we will see:<br>
<br>
main[1] cont
<br>
&gt;<br>
"Breakpoint hit:" &lt;location&gt;
<br>
main[1]
<br>
<br>
But then we can still see, even after your fix:<br>
<br>
main[1] cont
<br>
"Breakpoint hit:" &gt; &lt;location&gt;
<br>
main[1]
<br>
<br>
However, I think maybe the issue is different for step commands,
which seem to result in two compound prompts. Remember your
example:<br>
<br>
<blockquote type="cite"> Here's another recent failure with
eval001<br>
 1. cont and Break hit<br>
 2. step and Step completed (partial reply)<br>
 3. eval and error with second half of Step completed<br>
<br>
<b>====================================</b><br>
Sending command: cont<br>
<br>
reply[0]: &gt; <br>
reply[1]: <br>
Breakpoint hit: "thread=main",
nsk.jdb.eval.eval001.eval001a.lastBreak(), line=23 bci=0<br>
reply[2]: <br>
23        static void lastBreak () {}<br>
reply[3]: <br>
<br>
reply[4]: <b><br>
====================================<br>
<br>
</b>main[1] <br>
Sending command: step<br>
<br>
reply[0]: <br>
reply[1]: <br>
Step completed: main[1] <br>
<b>====================================</b><br>
Sending command: eval
nsk.jdb.eval.eval001.eval001a.myStaticField<br>
<b>reply[0]: "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53</b><br>
reply[1]: <br>
36            }<br>
reply[2]: <br>
<br>
reply[3]: <br>
<br>
<b>====================================</b><br>
main[1] <br>
# ERROR: jdb failed to report value of expression:
nsk.jdb.eval.eval001.eval001a.myStaticField<br>
&lt;snip&gt;<br>
# ERROR: expected : -2147483648 ;<br>
<b># ERROR: reported: "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53</b><br>
<br>
<br>
Finally the sequence from the jdb.session file. Notice the
main[1] prompt<br>
embedded in the Step completed reply.<br>
<br>
<b>main[1]</b> cont<br>
&gt; <br>
Breakpoint hit: "thread=main",
nsk.jdb.eval.eval001.eval001a.lastBreak(), line=23 bci=0<br>
23        static void lastBreak () {}<br>
<br>
<b>main[1] </b>step<br>
<br>
Step completed: <b>main[1]</b> "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53<br>
36            }<br>
<br>
<b>main[1]</b> eval nsk.jdb.eval.eval001.eval001a.myStaticField</blockquote>
The part that uses "cont" is fine. The problems started when the
step command was issued. Rather than receiving:<br>
<br>
Step completed: "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53<br>
<br>
We got:<br>
<br>
Step completed: main[1] <br>
<b>====================================</b><br>
Sending command: eval nsk.jdb.eval.eval001.eval001a.myStaticField<br>
reply[0]: "thread=main", nsk.jdb.eval.eval001.eval001a.runIt(),
line=36 bci=53<br>
<br>
I don't see your change addressing this. The step command was
considered to be done as soon as the main[] prompt arrived, even
though it came right in the middle of the full "Step completed:
..." message. Your delay won't change this. Yes, it will make sure
the rest of the reply has arrived, but the part after main[1] will
not be considered part of the step reply because it will terminate
at the main[1] prompt.<br>
<br>
Chris<br>
<br>
On 6/6/18 10:44 AM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">On
6/6/18, 1:26 PM, Chris Plummer wrote:
<br>
<blockquote type="cite">Hi Gary,
<br>
<br>
I thought the issue was having 1 command like "cont" or "step"
producing two prompts, and the first prompt ending up in the
middle of the output for the breakpoint. So instead of:
<br>
<br>
main[1] cont
<br>
main[1]
<br>
"Breakpoint hit:" &lt;location&gt;
<br>
main[1]
<br>
<br>
We sometimes get:
<br>
<br>
main[1] cont
<br>
"Breakpoint hit:" main[1] &lt;location&gt;
<br>
main[1]
<br>
</blockquote>
If you look at the commands sent :
<br>
<br>
stop at location1
<br>
stop at location2
<br>
cont
<br>
cont
<br>
<br>
the expectation is that breakpoint will be encountered and then
<br>
the test will continue til another breakpoint is encountered.
<br>
<br>
The problem only occurs because the commands are sent
<br>
so quickly a race condition is introduced between the
<br>
command processing and the event handling in the debuggee
<br>
process. Since prompts are written both from the command
<br>
thread and the event handler, the output can end up interspersed.
<br>
<br>
<blockquote type="cite">
<br>
I don't think a delay on the receiving end like you have added
can effect this. You need to make sure the main[1] produced for
the "cont" command is in the output stream before handling the
breakpoint event. Dan's solution was a bit different in that it
actually delayed reaching the breakpoint. I don't think your
solution does the same.
<br>
</blockquote>
My goal is to delay the sending of the next command,
<br>
because the debuggee is not actually ready for the command.
<br>
<blockquote type="cite">
<br>
Chris
<br>
<br>
On 6/6/18 10:07 AM, Gary Adams wrote:
<br>
<blockquote type="cite">It may be sufficient when the command
prompt is delivered
<br>
to introduce a brief pause to allow an asynchronous event to
<br>
be processed before another command can be sent. e.g.
<br>
throttling the stream of commands being sent
<br>
<br>
diff --git
a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
<br>
--- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
<br>
+++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
<br>
@@ -348,7 +348,17 @@
<br>
<br>
         int startPos = stdoutBuffer.length();
<br>
         sendCommand(command);
<br>
-        return receiveReply(startPos, compoundPromptOnly,
count);
<br>
+        String[] reply = receiveReply(startPos,
compoundPromptOnly, count);
<br>
+
<br>
+        if (asyncCommand(command)) {
<br>
+            // delay to allow debuggee time to process async
command
<br>
+            try {
<br>
+                Thread.sleep(100);
<br>
+            } catch (InterruptedException e) {
<br>
+                // ignored
<br>
+            }
<br>
+        }
<br>
+        return reply;
<br>
     }
<br>
<br>
<br>
On 6/6/18, 11:51 AM, Daniel D. Daugherty wrote:
<br>
<blockquote type="cite">The two prompts are for two different
things so I don't think
<br>
that removing one of the prompts is the right solution
either.
<br>
<br>
In the case of an asynchronous command like 'cont', you get
<br>
the '&lt;prompt=after-cont-command&gt;' right away because
the command
<br>
processor is ready for another command. You will also get
the
<br>
'&lt;prompt-after-breakpoint-event-location1&gt;' prompt
when the
<br>
breakpoint is hit because that identifies the thread that
posted
<br>
the breakpoint event. If it takes a long time to hit the
breakpoint,
<br>
then it will be a long time before the breakpoint prompt is
seen.
<br>
Likewise, if the breakpoint is hit quickly, then the
breakpoint
<br>
prompt will be seen more quickly and will sometimes
interfere with
<br>
the command processor prompt. That's normal for an async
command.
<br>
<br>
One possible way to fix this kind of issue is to structure
the
<br>
test flow to slightly delay the target thread so it takes
longer
<br>
to get to the breakpoint.
'&lt;prompt=after-cont-command&gt;' will have
<br>
a better chance to be printed before (and not colliding
with) the
<br>
'&lt;prompt-after-breakpoint-event-location1&gt;' prompt.
<br>
<br>
So if the test program has:
<br>
<br>
&lt;some line of code&gt;;  // breakpoint here
<br>
<br>
change it to something like:
<br>
<br>
    try { Thread.sleep(100); } catch (InterruptedException
ie) {};
<br>
&lt;some line of code&gt;;  // breakpoint here
<br>
<br>
Of course, I'm assuming here that the test driver code is
looking
<br>
for the breakpoint to be hit (and not for
'&lt;prompt=after-cont-command&gt;').
<br>
Jerry and I did find a few jdb based tests looking for the
wrong prompt...
<br>
<br>
Dan
<br>
<br>
<br>
On 6/2/18 7:02 AM, <a class="moz-txt-link-abbreviated" href="mailto:***@oracle.com">***@oracle.com</a> wrote:
<br>
<blockquote type="cite">On 6/1/18 6:54 PM, Chris Plummer
wrote:
<br>
<blockquote type="cite">Hi Gary,
<br>
<br>
I'm a bit unclear on one point. According to your
description, should we be seeing two prompts for
commands like "cont", and "step"? When you say we are
expecting:
<br>
<br>
  "Breakpoint hit:" &lt;location&gt;
<br>
&lt;prompt&gt;
<br>
<br>
Isn't it really:
<br>
<br>
&lt;prompt&gt;
<br>
  "Breakpoint hit:" &lt;location&gt;
<br>
&lt;prompt&gt;
<br>
</blockquote>
Let me stretch out the log a little bit
<br>
The &lt;prompt&gt; is main[1] or Thread-1[1], etc.
<br>
and not distinguished for command or event completion.
<br>
<br>
&lt;prompt&gt;
<br>
   stop at &lt;location1&gt;
<br>
&lt;prompt-after-stop-command&gt;
<br>
   stop at &lt;location2&gt;
<br>
&lt;prompt-after-stop-command&gt;
<br>
   cont
<br>
&lt;prompt=after-cont-command&gt;
<br>
   cont
<br>
   "Breakpoint hit:" &lt;location1&gt;
<br>
&lt;prompt-after-cont-command&gt;
<br>
  "Breakpoint hit:"
<br>
&lt;prompt-after-breakpoint-event-location1&gt;
<br>
&lt;location2&gt;
<br>
&lt;prompt-after-breakpoint-event-location2&gt;
<br>
<br>
The source of the command or event prompts
<br>
was verified with a stacktrace in the printPrompt()
<br>
method.
<br>
<blockquote type="cite">
<br>
I figure this must be the case, because your fix removes
the printing of a prompt, and we still need one prompt,
so there must have been two to start with. I think in
the above the last &lt;prompt&gt; is being produced
sequentially right after the "Breakpoint hit:" output,
so is never an issue. The first &lt;prompt&gt; comes
from the printPrompt() call after issuing the "cont"
command, and it's possible to hit the breakpoint and
start producing more output before printPrompt() has
completed, thus causing the problem you are seeing.
<br>
</blockquote>
printPrompt() is a very quick operation and is not being
interrupted.
<br>
<br>
But the output from the event handler is produced
incrementally.
<br>
   - event header, e.g. "Breakpont hit:", "Step
completed:", etc.
<br>
   - from vmInterrupted()
<br>
     - printCurrentLocation()
<br>
     - monitor command output (if any)
<br>
     - printPrompt()
<br>
<blockquote type="cite">
<br>
There seems to be some purpose to printing this first
prompt and I'm not so sure eliminating it is the right
thing to do. I think it might be there to indicate that
the command has been initiated, because there could be
some delay before the breakpoint is hit. So the issue
then is how to make sure this prompt appears in the
input stream before any of the "Step completed:" or
"Breakpoint hit:" output. Possibly event processing
should be blocked until executeCommand() completes.
<br>
</blockquote>
The primary purposes of prompts is for "user interactive"
nudge to type another command.
<br>
But the failures are from tests that are using the
presence of a prompt as an ACK of the
<br>
last command. This "prompt-protocol", does not include any
sequence numbers ,
<br>
does not pair prompt to command or event, and does not
block during
<br>
any multipart message formation.
<br>
<br>
To prevent the error we are observing we would have to
buffer all of the event output
<br>
<br>
   "Breakpoint hit:" &lt;location&gt;
&lt;monitored-commands&gt;
<br>
&lt;prompt for breakpoint event&gt;
<br>
<br>
Another way to look at the current failure is to say the
event handler output
<br>
has been interrupted by a command sent prematurely. This
would never
<br>
be seen at user speeds typing in commands.
<br>
<br>
I should also mention any other output from the user
application can also
<br>
interfere with the command and event output streams. I
have seen
<br>
stray prompts in the logs from System.out.println(). That
was the
<br>
reason earlier I was looking at changing the tests to look
for
<br>
specific prompts before continuing to the next command.
<br>
<br>
\gra
<br>
<blockquote type="cite">thanks,
<br>
<br>
Chris
<br>
<br>
On 5/26/18 3:50 AM, <a class="moz-txt-link-abbreviated" href="mailto:***@oracle.com">***@oracle.com</a> wrote:
<br>
<blockquote type="cite">This is a review request for a
previously closed test bug.
<br>
The test was recently moved to the open repos, and the
<br>
proposed fix is in the open code.
<br>
<br>
  Webrev:
<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/~gadams/8169718/webrev/">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a>
<br>
<br>
<br>
-------- Forwarded Message --------
<br>
Subject:     RFR: JDK-8169718:
nsk/jdb/locals/locals002: ERROR: Cannot find boolVar
with expected value: false
<br>
Date:     Fri, 25 May 2018 11:35:10 -0400
<br>
From:     Gary Adams <a class="moz-txt-link-rfc2396E" href="mailto:***@oracle.com">&lt;***@oracle.com&gt;</a>
<br>
Reply-To:     <a class="moz-txt-link-abbreviated" href="mailto:***@oracle.com">***@oracle.com</a>
<br>
<br>
<br>
<br>
<br>
<br>
The jdb tests use stdin to send commands to a jdb
process
<br>
and parses the stdout to determine if a command was
<br>
successful and when the process is prompting for new
commands
<br>
to be sent.
<br>
<br>
Some commands are synchronous, so when the command is
completed
<br>
a new prompt is sent back immediately.
<br>
<br>
Some commands are asynchronous, so there could be a
delay
<br>
until a breakpoint is reached. The event handler then
sends a prompt
<br>
when the application thread is stopped and new jdb
commands can be sent.
<br>
<br>
The problem causing the intermittent failures was a
corruption in the
<br>
output stream when prompts were being sent at the
wrong times.
<br>
<br>
Instead of receiving
<br>
   "Breakpoint hit:" &lt;location&gt;
<br>
&lt;prompt&gt;
<br>
<br>
the log contained
<br>
   "Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;
<br>
<br>
Once out of sync, jdb commands were being sent
prematurely
<br>
and the wrong values were being compared against
expected behavior.
<br>
The simple fix proposed here recognizes that commands
like "cont",
<br>
"step" and "next" are asynchronous commands and should
not send back
<br>
a prompt immediately. Instead. the event handler will
deliver the next prompt
<br>
when the next "Breakpoint hit:" or "Step completed:"
state change occurs.
<br>
<br>
The bulk of the testing was done on windows-x64-debug
builds where the
<br>
intermittent failures were observed in ~5 in 1000
testruns. The fix has
<br>
also been tested on linux-x64-debug,
solaris-sparcv9-debug,
<br>
and macosx-x64-debug, even though the failures have
never been reported
<br>
against those platforms.
<br>
<br>
Failures have been observed in many of the nsk/jdb
tests with similar corrupted
<br>
output streams, but never directly associated with
this issue before.
<br>
<br>
  redefine001, caught_exception002, locals002,
eval001, next001,
<br>
  stop_at003, step002, print002, trace001, step_up001,
read001,
<br>
  clear004, kill001, set001
<br>
<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>
Gary Adams
2018-06-06 18:52:27 UTC
Permalink
If that were the case, shouldn't the delay be added to sendCommand()
instead?
I'm still a bit unsure about how this sequence is suppose to work in
sendCommand(command);
return receiveReply(startPos, compoundPromptOnly, count);
Send command "cont"
Receive the reply "main[1]"

At this point the delay allows the event handler to run in the debuggee
process.
I guess the assumption is that the first prompt will always be a
simple one? Is that the case? If so, then the expectation is that we
main[1] cont
"Breakpoint hit:" <location>
main[1]
main[1] cont
"Breakpoint hit:" > <location>
main[1]
I'm not clear on what you are saying in this example.
However, I think maybe the issue is different for step commands, which
Here's another recent failure with eval001
1. cont and Break hit
2. step and Step completed (partial reply)
3. eval and error with second half of Step completed
*====================================*
Sending command: cont
reply[0]: >
Breakpoint hit: "thread=main",
nsk.jdb.eval.eval001.eval001a.lastBreak(), line=23 bci=0
23 static void lastBreak () {}
reply[4]: *
====================================
*main[1]
Sending command: step
Step completed: main[1]
*====================================*
Sending command: eval nsk.jdb.eval.eval001.eval001a.myStaticField
*reply[0]: "thread=main", nsk.jdb.eval.eval001.eval001a.runIt(),
line=36 bci=53*
36 }
*====================================*
main[1]
nsk.jdb.eval.eval001.eval001a.myStaticField
<snip>
# ERROR: expected : -2147483648 ;
*# ERROR: reported: "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53*
Finally the sequence from the jdb.session file. Notice the main[1] prompt
embedded in the Step completed reply.
*main[1]* cont
Breakpoint hit: "thread=main",
nsk.jdb.eval.eval001.eval001a.lastBreak(), line=23 bci=0
23 static void lastBreak () {}
*main[1] *step
Step completed: *main[1]* "thread=main",
nsk.jdb.eval.eval001.eval001a.runIt(), line=36 bci=53
36 }
*main[1]* eval nsk.jdb.eval.eval001.eval001a.myStaticField
The part that uses "cont" is fine. The problems started when the step
Step completed: "thread=main", nsk.jdb.eval.eval001.eval001a.runIt(),
line=36 bci=53
Step completed: main[1]
*====================================*
Sending command: eval nsk.jdb.eval.eval001.eval001a.myStaticField
reply[0]: "thread=main", nsk.jdb.eval.eval001.eval001a.runIt(),
line=36 bci=53
I don't see your change addressing this. The step command was
considered to be done as soon as the main[] prompt arrived, even
though it came right in the middle of the full "Step completed: ..."
message. Your delay won't change this. Yes, it will make sure the rest
of the reply has arrived, but the part after main[1] will not be
considered part of the step reply because it will terminate at the
main[1] prompt.
With the proposed change to delay after receiving a reply after command,
we would see


*====================================*
Sending command: cont

reply[0]: >
*==================================== delay after command*
Breakpoint hit: "thread=main",
nsk.jdb.eval.eval001.eval001a.lastBreak(), line=23 bci=0
23 static void lastBreak () {}*

*main[1]
*====================================*
Sending command: step

main[1]
*==================================== delay after command*
Step completed: *"thread=main", nsk.jdb.eval.eval001.eval001a.runIt(),
line=36 bci=53*
36 }

*====================================*
Sending command: eval nsk.jdb.eval.eval001.eval001a.myStaticField

main[1]
*==================================== delay after command*

...

I'll try some testing over night.
Chris
Post by s***@oracle.com
Hi Gary,
I thought the issue was having 1 command like "cont" or "step"
producing two prompts, and the first prompt ending up in the middle
main[1] cont
main[1]
"Breakpoint hit:" <location>
main[1]
main[1] cont
"Breakpoint hit:" main[1] <location>
main[1]
stop at location1
stop at location2
cont
cont
the expectation is that breakpoint will be encountered and then
the test will continue til another breakpoint is encountered.
The problem only occurs because the commands are sent
so quickly a race condition is introduced between the
command processing and the event handling in the debuggee
process. Since prompts are written both from the command
thread and the event handler, the output can end up interspersed.
Post by s***@oracle.com
I don't think a delay on the receiving end like you have added can
effect this. You need to make sure the main[1] produced for the
"cont" command is in the output stream before handling the
breakpoint event. Dan's solution was a bit different in that it
actually delayed reaching the breakpoint. I don't think your
solution does the same.
My goal is to delay the sending of the next command,
because the debuggee is not actually ready for the command.
Post by s***@oracle.com
Chris
It may be sufficient when the command prompt is delivered
to introduce a brief pause to allow an asynchronous event to
be processed before another command can be sent. e.g.
throttling the stream of commands being sent
diff --git a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
--- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
+++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
@@ -348,7 +348,17 @@
int startPos = stdoutBuffer.length();
sendCommand(command);
- return receiveReply(startPos, compoundPromptOnly, count);
+ String[] reply = receiveReply(startPos,
compoundPromptOnly, count);
+
+ if (asyncCommand(command)) {
+ // delay to allow debuggee time to process async command
+ try {
+ Thread.sleep(100);
+ } catch (InterruptedException e) {
+ // ignored
+ }
+ }
+ return reply;
}
The two prompts are for two different things so I don't think
that removing one of the prompts is the right solution either.
In the case of an asynchronous command like 'cont', you get
the '<prompt=after-cont-command>' right away because the command
processor is ready for another command. You will also get the
'<prompt-after-breakpoint-event-location1>' prompt when the
breakpoint is hit because that identifies the thread that posted
the breakpoint event. If it takes a long time to hit the breakpoint,
then it will be a long time before the breakpoint prompt is seen.
Likewise, if the breakpoint is hit quickly, then the breakpoint
prompt will be seen more quickly and will sometimes interfere with
the command processor prompt. That's normal for an async command.
One possible way to fix this kind of issue is to structure the
test flow to slightly delay the target thread so it takes longer
to get to the breakpoint. '<prompt=after-cont-command>' will have
a better chance to be printed before (and not colliding with) the
'<prompt-after-breakpoint-event-location1>' prompt.
<some line of code>; // breakpoint here
try { Thread.sleep(100); } catch (InterruptedException ie) {};
<some line of code>; // breakpoint here
Of course, I'm assuming here that the test driver code is looking
for the breakpoint to be hit (and not for
'<prompt=after-cont-command>').
Jerry and I did find a few jdb based tests looking for the wrong
prompt...
Dan
Post by g***@oracle.com
Post by s***@oracle.com
Hi Gary,
I'm a bit unclear on one point. According to your description,
should we be seeing two prompts for commands like "cont", and
"Breakpoint hit:" <location>
<prompt>
<prompt>
"Breakpoint hit:" <location>
<prompt>
Let me stretch out the log a little bit
The <prompt> is main[1] or Thread-1[1], etc.
and not distinguished for command or event completion.
<prompt>
stop at <location1>
<prompt-after-stop-command>
stop at <location2>
<prompt-after-stop-command>
cont
<prompt=after-cont-command>
cont
"Breakpoint hit:" <location1>
<prompt-after-cont-command>
"Breakpoint hit:"
<prompt-after-breakpoint-event-location1>
<location2>
<prompt-after-breakpoint-event-location2>
The source of the command or event prompts
was verified with a stacktrace in the printPrompt()
method.
Post by s***@oracle.com
I figure this must be the case, because your fix removes the
printing of a prompt, and we still need one prompt, so there
must have been two to start with. I think in the above the last
<prompt> is being produced sequentially right after the
"Breakpoint hit:" output, so is never an issue. The first
<prompt> comes from the printPrompt() call after issuing the
"cont" command, and it's possible to hit the breakpoint and
start producing more output before printPrompt() has completed,
thus causing the problem you are seeing.
printPrompt() is a very quick operation and is not being
interrupted.
But the output from the event handler is produced incrementally.
- event header, e.g. "Breakpont hit:", "Step completed:", etc.
- from vmInterrupted()
- printCurrentLocation()
- monitor command output (if any)
- printPrompt()
Post by s***@oracle.com
There seems to be some purpose to printing this first prompt and
I'm not so sure eliminating it is the right thing to do. I think
it might be there to indicate that the command has been
initiated, because there could be some delay before the
breakpoint is hit. So the issue then is how to make sure this
prompt appears in the input stream before any of the "Step
completed:" or "Breakpoint hit:" output. Possibly event
processing should be blocked until executeCommand() completes.
The primary purposes of prompts is for "user interactive" nudge
to type another command.
But the failures are from tests that are using the presence of a
prompt as an ACK of the
last command. This "prompt-protocol", does not include any
sequence numbers ,
does not pair prompt to command or event, and does not block during
any multipart message formation.
To prevent the error we are observing we would have to buffer all
of the event output
"Breakpoint hit:" <location> <monitored-commands>
<prompt for breakpoint event>
Another way to look at the current failure is to say the event
handler output
has been interrupted by a command sent prematurely. This would never
be seen at user speeds typing in commands.
I should also mention any other output from the user application can also
interfere with the command and event output streams. I have seen
stray prompts in the logs from System.out.println(). That was the
reason earlier I was looking at changing the tests to look for
specific prompts before continuing to the next command.
\gra
Post by s***@oracle.com
thanks,
Chris
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Cannot find boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:" <location>
<prompt>
the log contained
"Breakpoint hit:" <prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with
similar corrupted
output streams, but never directly associated with this issue before.
redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
Gary Adams
2018-06-07 12:48:52 UTC
Permalink
It's theoretically possible that the corrupted output
could be produced with a single step command
because of the separate event handler and command
processing threads, although the failures that have
been recorded were all observed later in a test scenario.

So let's look at a third possible way to fix the issue.

- we need a mechanism that will protect executeCommand and handleEvent
- to ensure that neither produces a partial output

A few key snippets from the code below.

The method executeCommand() is called from both
the command processing thread and the event handler
thread. It is also called recursively, so a lock may not be
appropriate

The eventHandler output is produce incrementally
- handleEvent produces the header, e.g. "Set completed:"
- vmInterrupted() produces
printCurrentLocation()
monitor commands output (if any) by calling executeCommand()
printPrompt()

I don't have concrete fix to propose, yet, but I'm leaning towards
having executeCommand buffer it's output and return it as a single
string. It would involve a getPrompt() instead of printPrompt, so
the text could be composed. It would also require aggregating any
output that is currently sent directly to MessageOutput.

It might also be addressed by enhancing MessageOutput to
include some sort of buffering for the current event or current command.

It's unlikely this would be ready before the jdk11 repos are cloned, so
it may be best to target these bugs for the next release.

src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/TTY.java:
...
209 public void vmInterrupted() {
210 Thread.yield(); // fetch output
211 printCurrentLocation();
212 for (String cmd : monitorCommands) {
213 StringTokenizer t = new StringTokenizer(cmd);
214 t.nextToken(); // get rid of monitor number
* 215 executeCommand(t);*
216 }
217 MessageOutput.printPrompt();
218 }
...
* 391 void executeCommand(StringTokenizer t) {*
392 String cmd = t.nextToken().toLowerCase();
393 // Normally, prompt for the next command after this
one is done
394 boolean showPrompt = true;
395
...
589 if (showPrompt) {
590 MessageOutput.printPrompt();
591 }
592 }

src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/EventHandler.java:
...
66 public void run() {
67 EventQueue queue = Env.vm().eventQueue();
68 while (connected) {
69 try {
70 EventSet eventSet = queue.remove();
71 boolean resumeStoppedApp = false;
72 EventIterator it = eventSet.eventIterator();
73 while (it.hasNext()) {
* 74 resumeStoppedApp |=
!handleEvent(it.nextEvent());*
75 }
76
77 if (resumeStoppedApp) {
78 eventSet.resume();
79 } else if (eventSet.suspendPolicy() ==
EventRequest.SUSPEND_ALL) {
80 setCurrentThread(eventSet);
* 81 notifier.vmInterrupted();*
82 }
83 } catch (InterruptedException exc) {
84 // Do nothing. Any changes will be seen at
top of loop.
85 } catch (VMDisconnectedException discExc) {
86 handleDisconnectedException();
87 break;
88 }
89 }
90 synchronized (this) {
91 completed = true;
92 notifyAll();
93 }
94 }

96 private boolean handleEvent(Event event) {
97 notifier.receivedEvent(event);
98
99 if (event instanceof ExceptionEvent) {
100 return exceptionEvent(event);
101 } else if (event instanceof BreakpointEvent) {
102 return breakpointEvent(event);
103 } else if (event instanceof WatchpointEvent) {
104 return fieldWatchEvent(event);
105 } else if (event instanceof StepEvent) {
106 return stepEvent(event);
107 } else if (event instanceof MethodEntryEvent) {
108 return methodEntryEvent(event);
109 } else if (event instanceof MethodExitEvent) {
110 return methodExitEvent(event);
111 } else if (event instanceof ClassPrepareEvent) {
112 return classPrepareEvent(event);
113 } else if (event instanceof ClassUnloadEvent) {
114 return classUnloadEvent(event);
115 } else if (event instanceof ThreadStartEvent) {
116 return threadStartEvent(event);
117 } else if (event instanceof ThreadDeathEvent) {
118 return threadDeathEvent(event);
119 } else if (event instanceof VMStartEvent) {
120 return vmStartEvent(event);
121 } else {
122 return handleExitEvent(event);
123 }
124 }




On 6/6/18, 3:27 PM, Chris Plummer wrote:
...
The point I'm trying to get across is I don't think the issue is with
the rate of commands being sent. I think the issue can be isolated to
a single step command that has the first prompt appearing in the
middle of the "Step completed: ..." output.
command thread: receive and execute "step" command
event thread: receive "step" breakpoint. print "Step completed: "
command thread: print "main[1]"
event thread: print text that comes after "Step completed: "
event thread: print "main[1]"
This has to be fixed in the sender of the output, not the receiver of
the output (or the sender of the command).
Chris
...
Chris Plummer
2018-06-07 23:21:19 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I'm not sure how buffering helps. When is the buffered output
printed? It can't be deferred until the event handler, because the
command is async and the prompt is suppose to be printed out right
away, not deferred until the event comes in. w.r.t. this
particular issue, isn't the prompt the only thing that
executeCommand() is printing? How would buffering help in that
case.<br>
<br>
I understand your concerns about using a lock, but I still think
it's probably the way to go. We can possibly work around the two
pitfalls you point out.  <br>
<br>
The call to executeCommand() from the event handler due to monitor
commands should not be an issue since it is on a separate thread.
In fact synchronization should end up working just the way we
want. The monitor commands will be block from execution until the
executeCommand() from the command thread has finished printing the
prompt. And if the monitor command that is executed has async
completion of its own, the event handler won't see it until it's
done with all monitor commands it's currently executing from
vmInterrupted().<br>
<br>
I'm not to sure what the recursive use of executeCommand() is. Can
you point it out to me?<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 6/7/18 5:48 AM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
It's theoretically possible that the corrupted output <br>
could be produced with a single step command <br>
because of the separate event handler and command <br>
processing threads, although the failures that have <br>
been recorded were all observed later in a test scenario.<br>
<br>
So let's look at a third possible way to fix the issue.<br>
<br>
   - we need a mechanism that will protect executeCommand and
handleEvent<br>
   - to ensure that neither produces a partial output<br>
<br>
A few key snippets from the code below.<br>
<br>
The method executeCommand() is called from both <br>
the command processing thread and the event handler <br>
thread. It is also called recursively, so a lock may not be <br>
appropriate<br>
<br>
The eventHandler output is produce incrementally<br>
    - handleEvent produces the header, e.g. "Set completed:"<br>
    - vmInterrupted() produces<br>
          printCurrentLocation()<br>
          monitor commands output (if any) by calling
executeCommand()<br>
          printPrompt()<br>
<br>
I don't have concrete fix to propose, yet, but I'm leaning towards
<br>
having executeCommand buffer it's output and return it as a single
<br>
string. It would involve a getPrompt() instead of printPrompt, so<br>
the text could be composed. It would also require aggregating any<br>
output that is currently sent directly to MessageOutput.<br>
<br>
It might also be addressed by enhancing MessageOutput to <br>
include some sort of buffering for the current event or current
command.<br>
<br>
It's unlikely this would be ready before the jdk11 repos are
cloned, so<br>
it may be best to target these bugs for the next release.<br>
<br>
<tt>src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/TTY.java:<br>
...<br>
   209        public void vmInterrupted() {<br>
   210            Thread.yield();  // fetch output<br>
   211            printCurrentLocation();<br>
   212            for (String cmd : monitorCommands) {<br>
   213                StringTokenizer t = new
StringTokenizer(cmd);<br>
   214                t.nextToken();  // get rid of monitor
number<br>
<b>   215                executeCommand(t);</b><br>
   216            }<br>
   217            MessageOutput.printPrompt();<br>
   218        }<br>
...<br>
<b>   391        void executeCommand(StringTokenizer t) {</b><br>
   392            String cmd = t.nextToken().toLowerCase();<br>
   393            // Normally, prompt for the next command after
this one is done<br>
   394            boolean showPrompt = true;<br>
   395    <br>
...<br>
   589            if (showPrompt) {<br>
   590                MessageOutput.printPrompt();<br>
   591            }<br>
   592        }<br>
<br>
src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/EventHandler.java:<br>
...<br>
    66        public void run() {<br>
    67            EventQueue queue = Env.vm().eventQueue();<br>
    68            while (connected) {<br>
    69                try {<br>
    70                    EventSet eventSet = queue.remove();<br>
    71                    boolean resumeStoppedApp = false;<br>
    72                    EventIterator it =
eventSet.eventIterator();<br>
    73                    while (it.hasNext()) {<br>
<b>    74                        resumeStoppedApp |=
!handleEvent(it.nextEvent());</b><br>
    75                    }<br>
    76    <br>
    77                    if (resumeStoppedApp) {<br>
    78                        eventSet.resume();<br>
    79                    } else if (eventSet.suspendPolicy() ==
EventRequest.SUSPEND_ALL) {<br>
    80                        setCurrentThread(eventSet);<br>
<b>    81                        notifier.vmInterrupted();</b><br>
    82                    }<br>
    83                } catch (InterruptedException exc) {<br>
    84                    // Do nothing. Any changes will be
seen at top of loop.<br>
    85                } catch (VMDisconnectedException discExc)
{<br>
    86                    handleDisconnectedException();<br>
    87                    break;<br>
    88                }<br>
    89            }<br>
    90            synchronized (this) {<br>
    91                completed = true;<br>
    92                notifyAll();<br>
    93            }<br>
    94        }<br>
<br>
    96        private boolean handleEvent(Event event) {<br>
    97            notifier.receivedEvent(event);<br>
    98    <br>
    99            if (event instanceof ExceptionEvent) {<br>
   100                return exceptionEvent(event);<br>
   101            } else if (event instanceof BreakpointEvent) {<br>
   102                return breakpointEvent(event);<br>
   103            } else if (event instanceof WatchpointEvent) {<br>
   104                return fieldWatchEvent(event);<br>
   105            } else if (event instanceof StepEvent) {<br>
   106                return stepEvent(event);<br>
   107            } else if (event instanceof MethodEntryEvent)
{<br>
   108                return methodEntryEvent(event);<br>
   109            } else if (event instanceof MethodExitEvent) {<br>
   110                return methodExitEvent(event);<br>
   111            } else if (event instanceof ClassPrepareEvent)
{<br>
   112                return classPrepareEvent(event);<br>
   113            } else if (event instanceof ClassUnloadEvent)
{<br>
   114                return classUnloadEvent(event);<br>
   115            } else if (event instanceof ThreadStartEvent)
{<br>
   116                return threadStartEvent(event);<br>
   117            } else if (event instanceof ThreadDeathEvent)
{<br>
   118                return threadDeathEvent(event);<br>
   119            } else if (event instanceof VMStartEvent) {<br>
   120                return vmStartEvent(event);<br>
   121            } else {<br>
   122                return handleExitEvent(event);<br>
   123            }<br>
   124        }</tt><br>
<br>
<br>
  <br>
<br>
On 6/6/18, 3:27 PM, Chris Plummer wrote:<br>
...<br>
<blockquote
cite="mid:40d44ef3-97dc-ead0-158e-***@oracle.com"
type="cite"><br>
The point I'm trying to get across is I don't think the issue is
with the rate of commands being sent. I think the issue can be
isolated to a single step command that has the first prompt
appearing in the middle of the "Step completed: ..." output.<br>
<br>
command thread: receive and execute "step" command<br>
event thread: receive "step" breakpoint. print "Step completed:
"<br>
command thread: print "main[1]"<br>
event thread: print text that comes after "Step completed: "<br>
event thread: print "main[1]"<br>
<br>
This has to be fixed in the sender of the output, not the
receiver of the output (or the sender of the command).<br>
<br>
Chris</blockquote>
...<br>
</blockquote>
<p><br>
</p>
</body>
</html>
Chris Plummer
2018-06-06 17:06:35 UTC
Permalink
The two prompts are for two different things so I don't think
that removing one of the prompts is the right solution either.
In the case of an asynchronous command like 'cont', you get
the '<prompt=after-cont-command>' right away because the command
processor is ready for another command. You will also get the
'<prompt-after-breakpoint-event-location1>' prompt when the
breakpoint is hit because that identifies the thread that posted
the breakpoint event. If it takes a long time to hit the breakpoint,
then it will be a long time before the breakpoint prompt is seen.
Likewise, if the breakpoint is hit quickly, then the breakpoint
prompt will be seen more quickly and will sometimes interfere with
the command processor prompt. That's normal for an async command.
That was the point I was trying to make. I also felt that printing both
prompts was appropriate.
One possible way to fix this kind of issue is to structure the
test flow to slightly delay the target thread so it takes longer
to get to the breakpoint. '<prompt=after-cont-command>' will have
a better chance to be printed before (and not colliding with) the
'<prompt-after-breakpoint-event-location1>' prompt.
    <some line of code>;  // breakpoint here
    try { Thread.sleep(100); } catch (InterruptedException ie) {};
    <some line of code>;  // breakpoint here
Of course, I'm assuming here that the test driver code is looking
for the breakpoint to be hit (and not for '<prompt=after-cont-command>').
Jerry and I did find a few jdb based tests looking for the wrong
prompt...
The issue is really with the delivery of the first prompt. It can be
intermingled with the output from the breakpoint. Yes, your solution
might work most of the time (still not 100% guaranteed) when dealing
with breakpoints, but is not a solution when dealing with this same
problem occurring while single stepping.

My suggestion was to have processing of breakpoints and single stepping
events block until executeCommand() completed and had printed the first
prompt. I'm not sure if this is feasible or practical.

thanks,

Chris
Dan
Post by g***@oracle.com
Post by s***@oracle.com
Hi Gary,
I'm a bit unclear on one point. According to your description,
should we be seeing two prompts for commands like "cont", and
  "Breakpoint hit:" <location>
   <prompt>
   <prompt>
  "Breakpoint hit:" <location>
   <prompt>
Let me stretch out the log a little bit
The <prompt> is main[1] or Thread-1[1], etc.
and not distinguished for command or event completion.
<prompt>
   stop at <location1>
<prompt-after-stop-command>
   stop at <location2>
<prompt-after-stop-command>
   cont
<prompt=after-cont-command>
   cont
   "Breakpoint hit:" <location1>
<prompt-after-cont-command>
  "Breakpoint hit:"
<prompt-after-breakpoint-event-location1>
  <location2>
<prompt-after-breakpoint-event-location2>
The source of the command or event prompts
was verified with a stacktrace in the printPrompt()
method.
Post by s***@oracle.com
I figure this must be the case, because your fix removes the
printing of a prompt, and we still need one prompt, so there must
have been two to start with. I think in the above the last <prompt>
is being produced sequentially right after the "Breakpoint hit:"
output, so is never an issue. The first <prompt> comes from the
printPrompt() call after issuing the "cont" command, and it's
possible to hit the breakpoint and start producing more output
before printPrompt() has completed, thus causing the problem you are
seeing.
printPrompt() is a very quick operation and is not being interrupted.
But the output from the event handler is produced incrementally.
   - event header, e.g. "Breakpont hit:", "Step completed:", etc.
   - from vmInterrupted()
     - printCurrentLocation()
     - monitor command output (if any)
     - printPrompt()
Post by s***@oracle.com
There seems to be some purpose to printing this first prompt and I'm
not so sure eliminating it is the right thing to do. I think it
might be there to indicate that the command has been initiated,
because there could be some delay before the breakpoint is hit. So
the issue then is how to make sure this prompt appears in the input
stream before any of the "Step completed:" or "Breakpoint hit:"
output. Possibly event processing should be blocked until
executeCommand() completes.
The primary purposes of prompts is for "user interactive" nudge to
type another command.
But the failures are from tests that are using the presence of a
prompt as an ACK of the
last command. This "prompt-protocol", does not include any sequence
numbers ,
does not pair prompt to command or event, and does not block during
any multipart message formation.
To prevent the error we are observing we would have to buffer all of
the event output
   "Breakpoint hit:" <location> <monitored-commands>
    <prompt for breakpoint event>
Another way to look at the current failure is to say the event
handler output
has been interrupted by a command sent prematurely. This would never
be seen at user speeds typing in commands.
I should also mention any other output from the user application can also
interfere with the command and event output streams. I have seen
stray prompts in the logs from System.out.println(). That was the
reason earlier I was looking at changing the tests to look for
specific prompts before continuing to the next command.
\gra
Post by s***@oracle.com
thanks,
Chris
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Cannot find boolVar with expected value: false
Date:     Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
   "Breakpoint hit:" <location>
    <prompt>
the log contained
   "Breakpoint hit:" <prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
  redefine001, caught_exception002, locals002, eval001, next001,
  stop_at003, step002, print002, trace001, step_up001, read001,
  clear004, kill001, set001
Gary Adams
2018-07-20 19:11:24 UTC
Permalink
Here's another attempt to clear up the overlapping output from
the command processing and event handler in the jdb tests.

The fundamental problem is observed when "prompts"
are produced interleaved with command and event output.

This attempts to fix the issue by buffering the output and
printing it fully assembled.

Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot
find boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:"<location>
<prompt>
the log contained
"Breakpoint hit:"<prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
Chris Plummer
2018-07-27 23:27:45 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due to
having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated by
the multiple entrypoints into executeCommand(), and there being
various places where output produced and we care about the order.
I think I have it all mapped out now and a pretty good
understanding of how it all works.<br>
<br>
My first impression is that it seems like a large number of
changes and a lot of passing around of a StringBuffer object to
fix a problem which I think could be resolved with some strategic
wait/notify code. For example, it might be as simple as doing
something like this:<br>
<br>
<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure out which
other commands need to ensure an event has been processed before
printing the prompt, and set waitForEventCompletion = true for
them. Is there a reason not to go with something like this?<br>
<br>
As for you current changes, just a couple of things I've noticed,
but I have not given it a thorough review (for example, making
sure that the StringBuffer is always eventually printed, and done
so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the printing
of sb when handleEvent() or handleDisconnectedException() are
called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se, StringBuilder sb)  {<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there? What does the
comment mean by "fetch output"? Just seems like it has something
to do with the ordering of output, but it seems bad that the code
ever relied on a yield to make sure the output was ordered
properly.<br>
<br>
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with MessageOutput.format().
lnprint() adds a newline to the start of the line. That appears to
be missing in your code above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
Here's another attempt to clear up the overlapping output from <br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com" moz-do-not-send="true">***@oracle.com</a>
wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite">
<meta http-equiv="content-type" content="text/html;
charset=utf-8">
This is a review request for a previously closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Subject:
</th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR:
Cannot find boolVar with expected value: false</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Date:
</th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">From:
</th>
<td>Gary Adams <a moz-do-not-send="true"
class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Reply-To:
</th>
<td><a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com">***@oracle.com</a></td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>
Chris Plummer
2018-07-30 05:05:56 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi, cont, and
next. Some issues turned up when testing. A number of tests were
timing out occasionally, I believe because sometimes the event was
handled (and the notify() call already completed) before we ever
got to the handler.wait() call. Although I never confirmed this
was the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should really
always be printed before any event handling. Currently that seems
to be the case, probably due to the Thread.yield() I mention below
in event handling code like stepEvent(), although I don't think
it's necessarily guaranteed. However, with my changes prompt was
always printed after the event was done being handled, and this
proved to be a problem for some tests.<br>
<br>
When executing the "cont" command, it seems this prompt is
expected to be "&gt; ", which is the prompt printed if the VM is
not currently suspended. Some tests count on this, mostly because
they call jdb.receiveReplyFor(), and pass in the number of prompts
that are expected. jdb.receiveReplyFor() does not count "&gt; ".
My change ended up making the printPrompt() in
TTY:executeCommand() print as "main[1]" after a "cont" command,
because the added synchronizing was making it not print until the
thread was suspended. This caused some tests to fail because the
expected reply was not yet received, even after
jdb.receiveReplyFor() had returned (it was basically returning one
prompt too soon).<br>
<br>
I started thinking that the synchronized approach probably needs a
large synchronized block in TTY:executeCommand(). But then I
figured out a much easier solution.  Thinking more about the
"prompt after cont" issue, it occurred to me that maybe we just
need to print out the prompt before executing the command, so it's
not possible for events to come in before the prompt is printed
(or worse yet, in the middle of event handling output). Doing this
ends up looking much like your original suggestion to set
printPrompt = false for "cont", "step", "stepi", and "next",
except I also added a call printPrompt() before doing this (Dan an
I had rejected your proposal to simply remove the prompt, since it
did server a purpose). For example:<br>
<br>
                        } else if (cmd.equals("cont")) {<br>
                            MessageOutput.printPrompt();<br>
                            showPrompt = false;<br>
                            evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that the prompt
from "cont" is expected (by a number of tests) to be "&gt; ", but
since the above code is executed while the vm is suspended, it
ends up being main[1]. I fixed this by adding a "simple" argument
to printPrompt() so we can force it to be "&gt; " when needed.
This will maintain compatibility with anyone making assumptions
about what the prompt should look like.<br>
<br>
Here's the webrev, which seems to be working. I've run about 1000
iterations on all platforms. So far it seems to be working. There
have been some failures, but for the most part I've also seen
those with clean repos, and the ones that I haven't I don't
believe are related to my changes.<br>
<br>
<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set showPrompt =
false:<br>
<br>
+ showPrompt = false; // asynchronous command
<br>
<br>
I'm not sure if the "asynchronous command" comment was intentional
or just the result of the copy-and-paste. I just want to clarify
what is meant by "asynchronous command"
in the other places where this comment appears. There are a few
commands that are executed async on a different thread. If you
look in Commands.java, you'll see how these commands use the
AsyncExecution class to handle execution of the command. I'm not
sure why this is done, because all these commands look like ones
that are inherently synchronous, yet they are always run on a
different thread, and as a result may not be complete when
execution returns to TTY::executeCommand(). These commands also
all set showPrompt = false, but the prompt is printed by
AsyncExecution when complete. So, the point is the commands which
are inherently async, like "step" and "cont", are not async w.r.t.
these "asynchronous command" comments or the AsyncExecution class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-***@oracle.com">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due to
having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated by
the multiple entrypoints into executeCommand(), and there being
various places where output produced and we care about the
order. I think I have it all mapped out now and a pretty good
understanding of how it all works.<br>
<br>
My first impression is that it seems like a large number of
changes and a lot of passing around of a StringBuffer object to
fix a problem which I think could be resolved with some
strategic wait/notify code. For example, it might be as simple
as doing something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure out
which other commands need to ensure an event has been processed
before printing the prompt, and set waitForEventCompletion =
true for them. Is there a reason not to go with something like
this?<br>
<br>
As for you current changes, just a couple of things I've
noticed, but I have not given it a thorough review (for example,
making sure that the StringBuffer is always eventually printed,
and done so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the printing
of sb when handleEvent() or handleDisconnectedException() are
called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se, StringBuilder sb) 
{<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there? What does the
comment mean by "fetch output"? Just seems like it has something
to do with the ordering of output, but it seems bad that the
code ever relied on a yield to make sure the output was ordered
properly.<br>
<br>
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with MessageOutput.format().
lnprint() adds a newline to the start of the line. That appears
to be missing in your code above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
<meta content="text/html; charset=utf-8"
http-equiv="Content-Type">
Here's another attempt to clear up the overlapping output from <br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com" moz-do-not-send="true">***@oracle.com</a>
wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite">
<meta http-equiv="content-type" content="text/html;
charset=utf-8">
This is a review request for a previously closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Subject:
</th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR:
Cannot find boolVar with expected value: false</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Date:
</th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">From:
</th>
<td>Gary Adams <a moz-do-not-send="true"
class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Reply-To:
</th>
<td><a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com">***@oracle.com</a></td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</body>
</html>
s***@oracle.com
2018-07-30 07:47:04 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Chris,<br>
<br>
Just one quick simple comment below.<br>
<br>
On 7/29/18 22:05, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi, cont, and
next. Some issues turned up when testing. A number of tests were
timing out occasionally, I believe because sometimes the event
was handled (and the notify() call already completed) before we
ever got to the handler.wait() call.</div>
</blockquote>
<br>
I agree as I had this exact concern when was looking at the
webrev.0.<br>
Need more time to understand all the details.<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix"> Although I never confirmed this was
the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should really
always be printed before any event handling. Currently that
seems to be the case, probably due to the Thread.yield() I
mention below in event handling code like stepEvent(), although
I don't think it's necessarily guaranteed. However, with my
changes prompt was always printed after the event was done being
handled, and this proved to be a problem for some tests.<br>
<br>
When executing the "cont" command, it seems this prompt is
expected to be "&gt; ", which is the prompt printed if the VM is
not currently suspended. Some tests count on this, mostly
because they call jdb.receiveReplyFor(), and pass in the number
of prompts that are expected. jdb.receiveReplyFor() does not
count "&gt; ". My change ended up making the printPrompt() in
TTY:executeCommand() print as "main[1]" after a "cont" command,
because the added synchronizing was making it not print until
the thread was suspended. This caused some tests to fail because
the expected reply was not yet received, even after
jdb.receiveReplyFor() had returned (it was basically returning
one prompt too soon).<br>
<br>
I started thinking that the synchronized approach probably needs
a large synchronized block in TTY:executeCommand(). But then I
figured out a much easier solution.  Thinking more about the
"prompt after cont" issue, it occurred to me that maybe we just
need to print out the prompt before executing the command, so
it's not possible for events to come in before the prompt is
printed (or worse yet, in the middle of event handling output).
Doing this ends up looking much like your original suggestion to
set printPrompt = false for "cont", "step", "stepi", and "next",
except I also added a call printPrompt() before doing this (Dan
an I had rejected your proposal to simply remove the prompt,
since it did server a purpose). For example:<br>
<br>
                        } else if (cmd.equals("cont")) {<br>
                            MessageOutput.printPrompt();<br>
                            showPrompt = false;<br>
                            evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that the prompt
from "cont" is expected (by a number of tests) to be "&gt; ",
but since the above code is executed while the vm is suspended,
it ends up being main[1]. I fixed this by adding a "simple"
argument to printPrompt() so we can force it to be "&gt; " when
needed. This will maintain compatibility with anyone making
assumptions about what the prompt should look like.<br>
<br>
Here's the webrev, which seems to be working. I've run about
1000 iterations on all platforms. So far it seems to be working.
There have been some failures, but for the most part I've also
seen those with clean repos, and the ones that I haven't I don't
believe are related to my changes.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set showPrompt =
false:<br>
<br>
+ showPrompt = false; // asynchronous command <br>
<br>
I'm not sure if the "asynchronous command" comment was
intentional or just the result of the copy-and-paste. I just
want to clarify what is meant by "asynchronous command" in the
other places where this comment appears. There are a few
commands that are executed async on a different thread. If you
look in Commands.java, you'll see how these commands use the
AsyncExecution class to handle execution of the command. I'm not
sure why this is done, because all these commands look like ones
that are inherently synchronous, yet they are always run on a
different thread, and as a result may not be complete when
execution returns to TTY::executeCommand(). These commands also
all set showPrompt = false, but the prompt is printed by
AsyncExecution when complete. So, the point is the commands
which are inherently async, like "step" and "cont", are not
async w.r.t. these "asynchronous command" comments or the
AsyncExecution class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due to
having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated
by the multiple entrypoints into executeCommand(), and there
being various places where output produced and we care about
the order. I think I have it all mapped out now and a pretty
good understanding of how it all works.<br>
<br>
My first impression is that it seems like a large number of
changes and a lot of passing around of a StringBuffer object
to fix a problem which I think could be resolved with some
strategic wait/notify code. For example, it might be as simple
as doing something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure out
which other commands need to ensure an event has been
processed before printing the prompt, and set
waitForEventCompletion = true for them. Is there a reason not
to go with something like this?<br>
<br>
As for you current changes, just a couple of things I've
noticed, but I have not given it a thorough review (for
example, making sure that the StringBuffer is always
eventually printed, and done so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the
printing of sb when handleEvent() or
handleDisconnectedException() are called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se, StringBuilder
sb)  {<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step
completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there? What does
the comment mean by "fetch output"? Just seems like it has
something to do with the ordering of output, but it seems bad
that the code ever relied on a yield to make sure the output
was ordered properly.<br>
<br>
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with MessageOutput.format().
lnprint() adds a newline to the start of the line. That
appears to be missing in your code above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
Here's another attempt to clear up the overlapping output from
<br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com" moz-do-not-send="true">***@oracle.com</a>
wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite"> This is a review request for a previously
closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002:
ERROR: Cannot find boolVar with expected value:
false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com"
moz-do-not-send="true">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</body>
</html>
Chris Plummer
2018-07-30 16:46:35 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 7/30/18 12:47 AM,
<a class="moz-txt-link-abbreviated" href="mailto:***@oracle.com">***@oracle.com</a> wrote:<br>
</div>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div class="moz-cite-prefix">Hi Chris,<br>
<br>
Just one quick simple comment below.<br>
<br>
On 7/29/18 22:05, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi, cont,
and next. Some issues turned up when testing. A number of
tests were timing out occasionally, I believe because
sometimes the event was handled (and the notify() call already
completed) before we ever got to the handler.wait() call.</div>
</blockquote>
<br>
I agree as I had this exact concern when was looking at the
webrev.0.<br>
Need more time to understand all the details.<br>
</blockquote>
Hopefully it's a moot point since my 2nd version doesn't rely on any
synchronization.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
I think a properly working synchronized version would require a
large synchronized block in TTY::executeCommand() to block out any
event handling until the prompt has been printed. I think this is
safe even when you have a monitor command in place. Although that
results in TTY::executeCommand() being called from the event
handler, and it may temporarily block if the command thread is
already in TTY::executeCommand(), eventually the command thread will
exit the synchronized block allowing the event thread to make
progress. Having said that, I just assume stick with the simpler
webrev above if no one see issues with it.<br>
<br>
thanks,<br>
<br>
Chris<br>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com"> <br>
Thanks,<br>
Serguei<br>
<br>
<br>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix"> Although I never confirmed this
was the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should
really always be printed before any event handling. Currently
that seems to be the case, probably due to the Thread.yield()
I mention below in event handling code like stepEvent(),
although I don't think it's necessarily guaranteed. However,
with my changes prompt was always printed after the event was
done being handled, and this proved to be a problem for some
tests.<br>
<br>
When executing the "cont" command, it seems this prompt is
expected to be "&gt; ", which is the prompt printed if the VM
is not currently suspended. Some tests count on this, mostly
because they call jdb.receiveReplyFor(), and pass in the
number of prompts that are expected. jdb.receiveReplyFor()
does not count "&gt; ". My change ended up making the
printPrompt() in TTY:executeCommand() print as "main[1]" after
a "cont" command, because the added synchronizing was making
it not print until the thread was suspended. This caused some
tests to fail because the expected reply was not yet received,
even after jdb.receiveReplyFor() had returned (it was
basically returning one prompt too soon).<br>
<br>
I started thinking that the synchronized approach probably
needs a large synchronized block in TTY:executeCommand(). But
then I figured out a much easier solution.  Thinking more
about the "prompt after cont" issue, it occurred to me that
maybe we just need to print out the prompt before executing
the command, so it's not possible for events to come in before
the prompt is printed (or worse yet, in the middle of event
handling output). Doing this ends up looking much like your
original suggestion to set printPrompt = false for "cont",
"step", "stepi", and "next", except I also added a call
printPrompt() before doing this (Dan an I had rejected your
proposal to simply remove the prompt, since it did server a
purpose). For example:<br>
<br>
                        } else if (cmd.equals("cont")) {<br>
                            MessageOutput.printPrompt();<br>
                            showPrompt = false;<br>
                            evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that the
prompt from "cont" is expected (by a number of tests) to be
"&gt; ", but since the above code is executed while the vm is
suspended, it ends up being main[1]. I fixed this by adding a
"simple" argument to printPrompt() so we can force it to be
"&gt; " when needed. This will maintain compatibility with
anyone making assumptions about what the prompt should look
like.<br>
<br>
Here's the webrev, which seems to be working. I've run about
1000 iterations on all platforms. So far it seems to be
working. There have been some failures, but for the most part
I've also seen those with clean repos, and the ones that I
haven't I don't believe are related to my changes.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set showPrompt =
false:<br>
<br>
+ showPrompt = false; // asynchronous command <br>
<br>
I'm not sure if the "asynchronous command" comment was
intentional or just the result of the copy-and-paste. I just
want to clarify what is meant by "asynchronous command" in the
other places where this comment appears. There are a few
commands that are executed async on a different thread. If you
look in Commands.java, you'll see how these commands use the
AsyncExecution class to handle execution of the command. I'm
not sure why this is done, because all these commands look
like ones that are inherently synchronous, yet they are always
run on a different thread, and as a result may not be complete
when execution returns to TTY::executeCommand(). These
commands also all set showPrompt = false, but the prompt is
printed by AsyncExecution when complete. So, the point is the
commands which are inherently async, like "step" and "cont",
are not async w.r.t. these "asynchronous command" comments or
the AsyncExecution class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due
to having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated
by the multiple entrypoints into executeCommand(), and there
being various places where output produced and we care about
the order. I think I have it all mapped out now and a pretty
good understanding of how it all works.<br>
<br>
My first impression is that it seems like a large number of
changes and a lot of passing around of a StringBuffer object
to fix a problem which I think could be resolved with some
strategic wait/notify code. For example, it might be as
simple as doing something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure out
which other commands need to ensure an event has been
processed before printing the prompt, and set
waitForEventCompletion = true for them. Is there a reason
not to go with something like this?<br>
<br>
As for you current changes, just a couple of things I've
noticed, but I have not given it a thorough review (for
example, making sure that the StringBuffer is always
eventually printed, and done so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the
printing of sb when handleEvent() or
handleDisconnectedException() are called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se, StringBuilder
sb)  {<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step
completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there? What does
the comment mean by "fetch output"? Just seems like it has
something to do with the ordering of output, but it seems
bad that the code ever relied on a yield to make sure the
output was ordered properly.<br>
<br>
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with
MessageOutput.format(). lnprint() adds a newline to the
start of the line. That appears to be missing in your code
above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
Here's another attempt to clear up the overlapping output
from <br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com" moz-do-not-send="true">***@oracle.com</a>
wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite"> This is a review request for a previously
closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002:
ERROR: Cannot find boolVar with expected value:
false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com"
moz-do-not-send="true">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>
s***@oracle.com
2018-07-31 08:32:34 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 7/30/18 09:46, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:8a256bfc-1ff1-da31-ce31-***@oracle.com">
<div class="moz-cite-prefix">On 7/30/18 12:47 AM, <a
class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a> wrote:<br>
</div>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com">
<div class="moz-cite-prefix">Hi Chris,<br>
<br>
Just one quick simple comment below.<br>
<br>
On 7/29/18 22:05, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi, cont,
and next. Some issues turned up when testing. A number of
tests were timing out occasionally, I believe because
sometimes the event was handled (and the notify() call
already completed) before we ever got to the handler.wait()
call.</div>
</blockquote>
<br>
I agree as I had this exact concern when was looking at the
webrev.0.<br>
Need more time to understand all the details.<br>
</blockquote>
Hopefully it's a moot point since my 2nd version doesn't rely on
any synchronization.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
I think a properly working synchronized version would require a
large synchronized block in TTY::executeCommand() to block out any
event handling until the prompt has been printed. I think this is
safe even when you have a monitor command in place. Although that
results in TTY::executeCommand() being called from the event
handler, and it may temporarily block if the command thread is
already in TTY::executeCommand(), eventually the command thread
will exit the synchronized block allowing the event thread to make
progress. Having said that, I just assume stick with the simpler
webrev above if no one see issues with it.<br>
</blockquote>
<br>
Thank you for this explanation.<br>
It looks reasonable but I feel more meditation is needed.<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<blockquote type="cite"
cite="mid:8a256bfc-1ff1-da31-ce31-***@oracle.com">
thanks,<br>
<br>
Chris<br>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com"> <br>
Thanks,<br>
Serguei<br>
<br>
<br>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix"> Although I never confirmed this
was the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should
really always be printed before any event handling.
Currently that seems to be the case, probably due to the
Thread.yield() I mention below in event handling code like
stepEvent(), although I don't think it's necessarily
guaranteed. However, with my changes prompt was always
printed after the event was done being handled, and this
proved to be a problem for some tests.<br>
<br>
When executing the "cont" command, it seems this prompt is
expected to be "&gt; ", which is the prompt printed if the
VM is not currently suspended. Some tests count on this,
mostly because they call jdb.receiveReplyFor(), and pass in
the number of prompts that are expected.
jdb.receiveReplyFor() does not count "&gt; ". My change
ended up making the printPrompt() in TTY:executeCommand()
print as "main[1]" after a "cont" command, because the added
synchronizing was making it not print until the thread was
suspended. This caused some tests to fail because the
expected reply was not yet received, even after
jdb.receiveReplyFor() had returned (it was basically
returning one prompt too soon).<br>
<br>
I started thinking that the synchronized approach probably
needs a large synchronized block in TTY:executeCommand().
But then I figured out a much easier solution.  Thinking
more about the "prompt after cont" issue, it occurred to me
that maybe we just need to print out the prompt before
executing the command, so it's not possible for events to
come in before the prompt is printed (or worse yet, in the
middle of event handling output). Doing this ends up looking
much like your original suggestion to set printPrompt =
false for "cont", "step", "stepi", and "next", except I also
added a call printPrompt() before doing this (Dan an I had
rejected your proposal to simply remove the prompt, since it
did server a purpose). For example:<br>
<br>
                        } else if (cmd.equals("cont")) {<br>
                            MessageOutput.printPrompt();<br>
                            showPrompt = false;<br>
                            evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that the
prompt from "cont" is expected (by a number of tests) to be
"&gt; ", but since the above code is executed while the vm
is suspended, it ends up being main[1]. I fixed this by
adding a "simple" argument to printPrompt() so we can force
it to be "&gt; " when needed. This will maintain
compatibility with anyone making assumptions about what the
prompt should look like.<br>
<br>
Here's the webrev, which seems to be working. I've run about
1000 iterations on all platforms. So far it seems to be
working. There have been some failures, but for the most
part I've also seen those with clean repos, and the ones
that I haven't I don't believe are related to my changes.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set showPrompt =
false:<br>
<br>
+ showPrompt = false; // asynchronous command <br>
<br>
I'm not sure if the "asynchronous command" comment was
intentional or just the result of the copy-and-paste. I just
want to clarify what is meant by "asynchronous command" in
the other places where this comment appears. There are a few
commands that are executed async on a different thread. If
you look in Commands.java, you'll see how these commands use
the AsyncExecution class to handle execution of the command.
I'm not sure why this is done, because all these commands
look like ones that are inherently synchronous, yet they are
always run on a different thread, and as a result may not be
complete when execution returns to TTY::executeCommand().
These commands also all set showPrompt = false, but the
prompt is printed by AsyncExecution when complete. So, the
point is the commands which are inherently async, like
"step" and "cont", are not async w.r.t. these "asynchronous
command" comments or the AsyncExecution class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due
to having the separate event thread whose output needs to
be co-ordinated with command thread. Also made more
complicated by the multiple entrypoints into
executeCommand(), and there being various places where
output produced and we care about the order. I think I
have it all mapped out now and a pretty good understanding
of how it all works.<br>
<br>
My first impression is that it seems like a large number
of changes and a lot of passing around of a StringBuffer
object to fix a problem which I think could be resolved
with some strategic wait/notify code. For example, it
might be as simple as doing something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure
out which other commands need to ensure an event has been
processed before printing the prompt, and set
waitForEventCompletion = true for them. Is there a reason
not to go with something like this?<br>
<br>
As for you current changes, just a couple of things I've
noticed, but I have not given it a thorough review (for
example, making sure that the StringBuffer is always
eventually printed, and done so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the
printing of sb when handleEvent() or
handleDisconnectedException() are called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se, StringBuilder
sb)  {<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step
completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there? What
does the comment mean by "fetch output"? Just seems like
it has something to do with the ordering of output, but it
seems bad that the code ever relied on a yield to make
sure the output was ordered properly.<br>
<br>
Also in the above code snippet, you need to be careful
when replacing MessageOutput.lnprint() with
MessageOutput.format(). lnprint() adds a newline to the
start of the line. That appears to be missing in your code
above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite"
cite="mid:***@oracle.com"> Here's another
attempt to clear up the overlapping output from <br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a> wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite"> This is a review request for a previously
closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002:
ERROR: Cannot find boolVar with expected
value: false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com"
moz-do-not-send="true">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</body>
</html>
Gary Adams
2018-08-06 16:54:21 UTC
Permalink
Post by Chris Plummer
Post by s***@oracle.com
Hi Chris,
Just one quick simple comment below.
Post by s***@oracle.com
Hi Gary,
I updated my changes to do the wait() for step, stepi, cont, and
next. Some issues turned up when testing. A number of tests were
timing out occasionally, I believe because sometimes the event was
handled (and the notify() call already completed) before we ever got
to the handler.wait() call.
I agree as I had this exact concern when was looking at the webrev.0.
Need more time to understand all the details.
Hopefully it's a moot point since my 2nd version doesn't rely on any
synchronization.
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
I think a properly working synchronized version would require a large
synchronized block in TTY::executeCommand() to block out any event
handling until the prompt has been printed. I think this is safe even
when you have a monitor command in place. Although that results in
TTY::executeCommand() being called from the event handler, and it may
temporarily block if the command thread is already in
TTY::executeCommand(), eventually the command thread will exit the
synchronized block allowing the event thread to make progress. Having
said that, I just assume stick with the simpler webrev above if no one
see issues with it.
I like the approach used here - to print the prompt immediately
and skip the the prompt at the bottom of the command processing.

The original error though was an interspersed prompt disrupting
the event handler output, which came from another event
being processed. e.g.
"Breakpoint hit:"
prompt from previous event
printCurrentLocation()

Let me try a few test runs with your patch
along with a diagnostic in printPrompt()
so we can identify the printPrompt caller.

new Exception().printStackTrace()

...
Post by Chris Plummer
thanks,
Chris
Post by s***@oracle.com
Thanks,
Serguei
Post by s***@oracle.com
Although I never confirmed this was the cause, it certainly seems
feasible.
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should really
always be printed before any event handling. Currently that seems to
be the case, probably due to the Thread.yield() I mention below in
event handling code like stepEvent(), although I don't think it's
necessarily guaranteed. However, with my changes prompt was always
printed after the event was done being handled, and this proved to
be a problem for some tests.
When executing the "cont" command, it seems this prompt is expected
to be "> ", which is the prompt printed if the VM is not currently
suspended. Some tests count on this, mostly because they call
jdb.receiveReplyFor(), and pass in the number of prompts that are
expected. jdb.receiveReplyFor() does not count "> ". My change ended
up making the printPrompt() in TTY:executeCommand() print as
"main[1]" after a "cont" command, because the added synchronizing
was making it not print until the thread was suspended. This caused
some tests to fail because the expected reply was not yet received,
even after jdb.receiveReplyFor() had returned (it was basically
returning one prompt too soon).
I started thinking that the synchronized approach probably needs a
large synchronized block in TTY:executeCommand(). But then I figured
out a much easier solution. Thinking more about the "prompt after
cont" issue, it occurred to me that maybe we just need to print out
the prompt before executing the command, so it's not possible for
events to come in before the prompt is printed (or worse yet, in the
middle of event handling output). Doing this ends up looking much
like your original suggestion to set printPrompt = false for "cont",
"step", "stepi", and "next", except I also added a call
printPrompt() before doing this (Dan an I had rejected your proposal
to simply remove the prompt, since it did server a purpose). For
} else if (cmd.equals("cont")) {
MessageOutput.printPrompt();
showPrompt = false;
evaluator.commandCont();
But this does cause the issue I brought up above that the prompt
from "cont" is expected (by a number of tests) to be "> ", but since
the above code is executed while the vm is suspended, it ends up
being main[1]. I fixed this by adding a "simple" argument to
printPrompt() so we can force it to be "> " when needed. This will
maintain compatibility with anyone making assumptions about what the
prompt should look like.
Here's the webrev, which seems to be working. I've run about 1000
iterations on all platforms. So far it seems to be working. There
have been some failures, but for the most part I've also seen those
with clean repos, and the ones that I haven't I don't believe are
related to my changes.
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
+ showPrompt = false; // asynchronous command
I'm not sure if the "asynchronous command" comment was intentional
or just the result of the copy-and-paste. I just want to clarify
what is meant by "asynchronous command" in the other places where
this comment appears. There are a few commands that are executed
async on a different thread. If you look in Commands.java, you'll
see how these commands use the AsyncExecution class to handle
execution of the command. I'm not sure why this is done, because all
these commands look like ones that are inherently synchronous, yet
they are always run on a different thread, and as a result may not
be complete when execution returns to TTY::executeCommand(). These
commands also all set showPrompt = false, but the prompt is printed
by AsyncExecution when complete. So, the point is the commands which
are inherently async, like "step" and "cont", are not async w.r.t.
these "asynchronous command" comments or the AsyncExecution class.
Chris
Post by s***@oracle.com
Hi Gary,
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due to
having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated by the
multiple entrypoints into executeCommand(), and there being various
places where output produced and we care about the order. I think I
have it all mapped out now and a pretty good understanding of how
it all works.
My first impression is that it seems like a large number of changes
and a lot of passing around of a StringBuffer object to fix a
problem which I think could be resolved with some strategic
wait/notify code. For example, it might be as simple as doing
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/
This only covers the step command. You'd have to figure out which
other commands need to ensure an event has been processed before
printing the prompt, and set waitForEventCompletion = true for
them. Is there a reason not to go with something like this?
As for you current changes, just a couple of things I've noticed,
but I have not given it a thorough review (for example, making sure
that the StringBuffer is always eventually printed, and done so in
the right order).
In EventHandler.run(), it's unclear who is handling the printing of
sb when handleEvent() or handleDisconnectedException() are called.
124 public void stepEvent(StepEvent se, StringBuilder sb) {
125 Thread.yield(); // fetch output
126 sb.append(MessageOutput.format("Step completed:"));
127 }
Do you understand why the Thread.yield() is there? What does the
comment mean by "fetch output"? Just seems like it has something to
do with the ordering of output, but it seems bad that the code ever
relied on a yield to make sure the output was ordered properly.
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with MessageOutput.format().
lnprint() adds a newline to the start of the line. That appears to
be missing in your code above.
thanks,
Chris
Post by Gary Adams
Here's another attempt to clear up the overlapping output from
the command processing and event handler in the jdb tests.
The fundamental problem is observed when "prompts"
are produced interleaved with command and event output.
This attempts to fix the issue by buffering the output and
printing it fully assembled.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Cannot find boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:"<location>
<prompt>
the log contained
"Breakpoint hit:"<prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
Gary Adams
2018-09-12 15:43:21 UTC
Permalink
Here's an updated webrev including Chris's proposal for early
prompt print outs for commands that could be interrupted
by event processing before command prompt delivery.

Issue: https://bugs.openjdk.java.net/browse/JDK-8169718
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.02/

Changes include
- removing the yield in vmInterrupted, to allow the event thread
output to be less interspersed
- improved the command prompts for cont, step, stepi and next
to be less likely to be interspersed with event output
- avoid NPE in printCurrentLocation if current thread is invalidated
Post by Gary Adams
Post by Chris Plummer
Post by s***@oracle.com
Hi Chris,
Just one quick simple comment below.
Post by s***@oracle.com
Hi Gary,
I updated my changes to do the wait() for step, stepi, cont, and
next. Some issues turned up when testing. A number of tests were
timing out occasionally, I believe because sometimes the event was
handled (and the notify() call already completed) before we ever
got to the handler.wait() call.
I agree as I had this exact concern when was looking at the webrev.0.
Need more time to understand all the details.
Hopefully it's a moot point since my 2nd version doesn't rely on any
synchronization.
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
I think a properly working synchronized version would require a large
synchronized block in TTY::executeCommand() to block out any event
handling until the prompt has been printed. I think this is safe even
when you have a monitor command in place. Although that results in
TTY::executeCommand() being called from the event handler, and it may
temporarily block if the command thread is already in
TTY::executeCommand(), eventually the command thread will exit the
synchronized block allowing the event thread to make progress. Having
said that, I just assume stick with the simpler webrev above if no
one see issues with it.
I like the approach used here - to print the prompt immediately
and skip the the prompt at the bottom of the command processing.
The original error though was an interspersed prompt disrupting
the event handler output, which came from another event
being processed. e.g.
"Breakpoint hit:"
prompt from previous event
printCurrentLocation()
Let me try a few test runs with your patch
along with a diagnostic in printPrompt()
so we can identify the printPrompt caller.
new Exception().printStackTrace()
...
Post by Chris Plummer
thanks,
Chris
Post by s***@oracle.com
Thanks,
Serguei
Post by s***@oracle.com
Although I never confirmed this was the cause, it certainly seems
feasible.
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should really
always be printed before any event handling. Currently that seems
to be the case, probably due to the Thread.yield() I mention below
in event handling code like stepEvent(), although I don't think
it's necessarily guaranteed. However, with my changes prompt was
always printed after the event was done being handled, and this
proved to be a problem for some tests.
When executing the "cont" command, it seems this prompt is expected
to be "> ", which is the prompt printed if the VM is not currently
suspended. Some tests count on this, mostly because they call
jdb.receiveReplyFor(), and pass in the number of prompts that are
expected. jdb.receiveReplyFor() does not count "> ". My change
ended up making the printPrompt() in TTY:executeCommand() print as
"main[1]" after a "cont" command, because the added synchronizing
was making it not print until the thread was suspended. This caused
some tests to fail because the expected reply was not yet received,
even after jdb.receiveReplyFor() had returned (it was basically
returning one prompt too soon).
I started thinking that the synchronized approach probably needs a
large synchronized block in TTY:executeCommand(). But then I
figured out a much easier solution. Thinking more about the
"prompt after cont" issue, it occurred to me that maybe we just
need to print out the prompt before executing the command, so it's
not possible for events to come in before the prompt is printed (or
worse yet, in the middle of event handling output). Doing this ends
up looking much like your original suggestion to set printPrompt =
false for "cont", "step", "stepi", and "next", except I also added
a call printPrompt() before doing this (Dan an I had rejected your
proposal to simply remove the prompt, since it did server a
} else if (cmd.equals("cont")) {
MessageOutput.printPrompt();
showPrompt = false;
evaluator.commandCont();
But this does cause the issue I brought up above that the prompt
from "cont" is expected (by a number of tests) to be "> ", but
since the above code is executed while the vm is suspended, it ends
up being main[1]. I fixed this by adding a "simple" argument to
printPrompt() so we can force it to be "> " when needed. This will
maintain compatibility with anyone making assumptions about what
the prompt should look like.
Here's the webrev, which seems to be working. I've run about 1000
iterations on all platforms. So far it seems to be working. There
have been some failures, but for the most part I've also seen those
with clean repos, and the ones that I haven't I don't believe are
related to my changes.
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
+ showPrompt = false; // asynchronous command
I'm not sure if the "asynchronous command" comment was intentional
or just the result of the copy-and-paste. I just want to clarify
what is meant by "asynchronous command" in the other places where
this comment appears. There are a few commands that are executed
async on a different thread. If you look in Commands.java, you'll
see how these commands use the AsyncExecution class to handle
execution of the command. I'm not sure why this is done, because
all these commands look like ones that are inherently synchronous,
yet they are always run on a different thread, and as a result may
not be complete when execution returns to TTY::executeCommand().
These commands also all set showPrompt = false, but the prompt is
printed by AsyncExecution when complete. So, the point is the
commands which are inherently async, like "step" and "cont", are
not async w.r.t. these "asynchronous command" comments or the
AsyncExecution class.
Chris
Post by s***@oracle.com
Hi Gary,
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due to
having the separate event thread whose output needs to be
co-ordinated with command thread. Also made more complicated by
the multiple entrypoints into executeCommand(), and there being
various places where output produced and we care about the order.
I think I have it all mapped out now and a pretty good
understanding of how it all works.
My first impression is that it seems like a large number of
changes and a lot of passing around of a StringBuffer object to
fix a problem which I think could be resolved with some strategic
wait/notify code. For example, it might be as simple as doing
http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/
This only covers the step command. You'd have to figure out which
other commands need to ensure an event has been processed before
printing the prompt, and set waitForEventCompletion = true for
them. Is there a reason not to go with something like this?
As for you current changes, just a couple of things I've noticed,
but I have not given it a thorough review (for example, making
sure that the StringBuffer is always eventually printed, and done
so in the right order).
In EventHandler.run(), it's unclear who is handling the printing
of sb when handleEvent() or handleDisconnectedException() are called.
124 public void stepEvent(StepEvent se, StringBuilder sb) {
125 Thread.yield(); // fetch output
126 sb.append(MessageOutput.format("Step completed:"));
127 }
Do you understand why the Thread.yield() is there? What does the
comment mean by "fetch output"? Just seems like it has something
to do with the ordering of output, but it seems bad that the code
ever relied on a yield to make sure the output was ordered properly.
Also in the above code snippet, you need to be careful when
replacing MessageOutput.lnprint() with MessageOutput.format().
lnprint() adds a newline to the start of the line. That appears to
be missing in your code above.
thanks,
Chris
Post by Gary Adams
Here's another attempt to clear up the overlapping output from
the command processing and event handler in the jdb tests.
The fundamental problem is observed when "prompts"
are produced interleaved with command and event output.
This attempts to fix the issue by buffering the output and
printing it fully assembled.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/
Post by g***@oracle.com
This is a review request for a previously closed test bug.
The test was recently moved to the open repos, and the
proposed fix is in the open code.
Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
-------- Forwarded Message --------
Cannot find boolVar with expected value: false
Date: Fri, 25 May 2018 11:35:10 -0400
The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:"<location>
<prompt>
the log contained
"Breakpoint hit:"<prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
Chris Plummer
2018-09-12 19:46:06 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I think you should leave the yield() and NPE code out. I don't
think either are bug fixes, but just partial work arounds for
potential issues, and neither is addressing the core issues
brought up in this CR. They were part of your experiments for
JDK-8208473, so let's keep them as part of the discussion for
addressing that CR (which I still need to look into why an NPE is
happening there).<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 9/12/18 8:43 AM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:***@oracle.com">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
Here's an updated webrev including Chris's proposal for early <br>
prompt print outs for commands that could be interrupted <br>
by event processing before command prompt delivery.<br>
<br>
  Issue: <a class="moz-txt-link-freetext"
href="https://bugs.openjdk.java.net/browse/JDK-8169718"
moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8169718</a><br>
  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.02/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.02/</a><br>
<br>
Changes include<br>
   - removing the yield in vmInterrupted, to allow the event
thread <br>
      output to be less interspersed<br>
   - improved the command prompts for cont, step, stepi and next<br>
      to be less likely to be interspersed with event output<br>
   - avoid NPE in printCurrentLocation if current thread is
invalidated<br>
 <br>
On 8/6/18, 12:54 PM, Gary Adams wrote:
<blockquote cite="mid:***@oracle.com" type="cite">
<meta content="text/html; charset=utf-8"
http-equiv="Content-Type">
On 7/30/18, 12:46 PM, Chris Plummer wrote:
<blockquote
cite="mid:8a256bfc-1ff1-da31-ce31-***@oracle.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=utf-8">
<div class="moz-cite-prefix">On 7/30/18 12:47 AM, <a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com">***@oracle.com</a>
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com">
<meta http-equiv="Content-Type" content="text/html;
charset=utf-8">
<div class="moz-cite-prefix">Hi Chris,<br>
<br>
Just one quick simple comment below.<br>
<br>
On 7/29/18 22:05, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi,
cont, and next. Some issues turned up when testing. A
number of tests were timing out occasionally, I believe
because sometimes the event was handled (and the
notify() call already completed) before we ever got to
the handler.wait() call.</div>
</blockquote>
<br>
I agree as I had this exact concern when was looking at the
webrev.0.<br>
Need more time to understand all the details.<br>
</blockquote>
Hopefully it's a moot point since my 2nd version doesn't rely
on any synchronization.<br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
I think a properly working synchronized version would require
a large synchronized block in TTY::executeCommand() to block
out any event handling until the prompt has been printed. I
think this is safe even when you have a monitor command in
place. Although that results in TTY::executeCommand() being
called from the event handler, and it may temporarily block if
the command thread is already in TTY::executeCommand(),
eventually the command thread will exit the synchronized block
allowing the event thread to make progress. Having said that,
I just assume stick with the simpler webrev above if no one
see issues with it.<br>
</blockquote>
<br>
I like the approach used here - to print the prompt immediately
<br>
and skip the the prompt at the bottom of the command processing.<br>
<br>
The original error though was an interspersed prompt disrupting<br>
the event handler output, which came from another event <br>
being processed. e.g.<br>
  "Breakpoint hit:"<br>
  prompt from previous event<br>
  printCurrentLocation()<br>
<br>
Let me try a few test runs with your patch<br>
along with a diagnostic in printPrompt()<br>
so we can identify the printPrompt caller.<br>
<br>
  new Exception().printStackTrace()<br>
<br>
...<br>
<blockquote
cite="mid:8a256bfc-1ff1-da31-ce31-***@oracle.com"
type="cite"> <br>
thanks,<br>
<br>
Chris<br>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-***@oracle.com">
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-***@oracle.com">
<div class="moz-cite-prefix"> Although I never confirmed
this was the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it
should really always be printed before any event
handling. Currently that seems to be the case, probably
due to the Thread.yield() I mention below in event
handling code like stepEvent(), although I don't think
it's necessarily guaranteed. However, with my changes
prompt was always printed after the event was done being
handled, and this proved to be a problem for some tests.<br>
<br>
When executing the "cont" command, it seems this prompt
is expected to be "&gt; ", which is the prompt printed
if the VM is not currently suspended. Some tests count
on this, mostly because they call jdb.receiveReplyFor(),
and pass in the number of prompts that are expected.
jdb.receiveReplyFor() does not count "&gt; ". My change
ended up making the printPrompt() in
TTY:executeCommand() print as "main[1]" after a "cont"
command, because the added synchronizing was making it
not print until the thread was suspended. This caused
some tests to fail because the expected reply was not
yet received, even after jdb.receiveReplyFor() had
returned (it was basically returning one prompt too
soon).<br>
<br>
I started thinking that the synchronized approach
probably needs a large synchronized block in
TTY:executeCommand(). But then I figured out a much
easier solution.  Thinking more about the "prompt after
cont" issue, it occurred to me that maybe we just need
to print out the prompt before executing the command, so
it's not possible for events to come in before the
prompt is printed (or worse yet, in the middle of event
handling output). Doing this ends up looking much like
your original suggestion to set printPrompt = false for
"cont", "step", "stepi", and "next", except I also added
a call printPrompt() before doing this (Dan an I had
rejected your proposal to simply remove the prompt,
since it did server a purpose). For example:<br>
<br>
                        } else if (cmd.equals("cont")) {<br>
                            MessageOutput.printPrompt();<br>
                            showPrompt = false;<br>
                            evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that
the prompt from "cont" is expected (by a number of
tests) to be "&gt; ", but since the above code is
executed while the vm is suspended, it ends up being
main[1]. I fixed this by adding a "simple" argument to
printPrompt() so we can force it to be "&gt; " when
needed. This will maintain compatibility with anyone
making assumptions about what the prompt should look
like.<br>
<br>
Here's the webrev, which seems to be working. I've run
about 1000 iterations on all platforms. So far it seems
to be working. There have been some failures, but for
the most part I've also seen those with clean repos, and
the ones that I haven't I don't believe are related to
my changes.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set
showPrompt = false:<br>
<br>
+ showPrompt = false; // asynchronous command <br>
<br>
I'm not sure if the "asynchronous command" comment was
intentional or just the result of the copy-and-paste. I
just want to clarify what is meant by "asynchronous
command" in the other places where this comment appears.
There are a few commands that are executed async on a
different thread. If you look in Commands.java, you'll
see how these commands use the AsyncExecution class to
handle execution of the command. I'm not sure why this
is done, because all these commands look like ones that
are inherently synchronous, yet they are always run on a
different thread, and as a result may not be complete
when execution returns to TTY::executeCommand(). These
commands also all set showPrompt = false, but the prompt
is printed by AsyncExecution when complete. So, the
point is the commands which are inherently async, like
"step" and "cont", are not async w.r.t. these
"asynchronous command" comments or the AsyncExecution
class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-***@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to
fully understand all the various code flows, which is
tricky due to having the separate event thread whose
output needs to be co-ordinated with command thread.
Also made more complicated by the multiple entrypoints
into executeCommand(), and there being various places
where output produced and we care about the order. I
think I have it all mapped out now and a pretty good
understanding of how it all works.<br>
<br>
My first impression is that it seems like a large
number of changes and a lot of passing around of a
StringBuffer object to fix a problem which I think
could be resolved with some strategic wait/notify
code. For example, it might be as simple as doing
something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to
figure out which other commands need to ensure an
event has been processed before printing the prompt,
and set waitForEventCompletion = true for them. Is
there a reason not to go with something like this?<br>
<br>
As for you current changes, just a couple of things
I've noticed, but I have not given it a thorough
review (for example, making sure that the StringBuffer
is always eventually printed, and done so in the right
order).<br>
<br>
In EventHandler.run(), it's unclear who is handling
the printing of sb when handleEvent() or
handleDisconnectedException() are called.<br>
<br>
In the following EventHandler code:<br>
<br>
 123     @Override<br>
 124     public void stepEvent(StepEvent se,
StringBuilder sb)  {<br>
 125         Thread.yield();  // fetch output<br>
 126         sb.append(MessageOutput.format("Step
completed:"));<br>
 127     }<br>
<br>
Do you understand why the Thread.yield() is there?
What does the comment mean by "fetch output"? Just
seems like it has something to do with the ordering of
output, but it seems bad that the code ever relied on
a yield to make sure the output was ordered properly.<br>
<br>
Also in the above code snippet, you need to be careful
when replacing MessageOutput.lnprint() with
MessageOutput.format(). lnprint() adds a newline to
the start of the line. That appears to be missing in
your code above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite"
cite="mid:***@oracle.com"> Here's another
attempt to clear up the overlapping output from <br>
the command processing and event handler in the jdb
tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event
output.<br>
<br>
This attempts to fix the issue by buffering the output
and<br>
printing it fully assembled.<br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a
class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a>
wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-***@oracle.com"
type="cite"> This is a review request for a
previously closed test bug.<br>
The test was recently moved to the open repos, and
the <br>
proposed fix is in the open code.<br>
<p>  Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718:
nsk/jdb/locals/locals002: ERROR: Cannot
find boolVar with expected value: false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a
class="moz-txt-link-rfc2396E"
href="mailto:***@oracle.com"
moz-do-not-send="true">&lt;***@oracle.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.

Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.

Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.

The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.

Instead of receiving
"Breakpoint hit:" &lt;location&gt;
&lt;prompt&gt;

the log contained
"Breakpoint hit:" &lt;prompt&gt; &lt;location&gt;

Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.

The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.

Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.

 redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001


</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>

Loading...