Discussion:
RFR (M) 8201655: Add thread-enabled support for the Heap Sampling
JC Beyler
2018-10-26 17:48:14 UTC
Permalink
Hi all,

When working on the heap sampling, I had promised to do the per thread
event so here it is!

Could I get a review for this:
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655

I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.

I've done a bit of performance testing and on the Dacapo benchmark I see no
change in performance when turned off (logical, any code change is behind a
flag check already in place) and when turned on it is comparable to the
current performance.

(More information is: I see a very slight degradation if we are doing 512k
sampling but no degradation at 2MB).

Thanks,
Jc
JC Beyler
2018-11-05 18:18:38 UTC
Permalink
Hi all,

Any chance for a review? :)
Jc
Post by JC Beyler
Hi all,
When working on the heap sampling, I had promised to do the per thread
event so here it is!
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo benchmark I see
no change in performance when turned off (logical, any code change is
behind a flag check already in place) and when turned on it is comparable
to the current performance.
(More information is: I see a very slight degradation if we are doing 512k
sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
s***@oracle.com
2018-11-05 20:56:34 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Jc,<br>
<br>
I'm looking at it now.<br>
Need to double-check something.<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 10/26/18 10:48, JC Beyler wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAF9BGBzLKyAe1eHBN7=F3xoA1-5U-***@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi all,</div>
<div dir="ltr"><br>
</div>
<div>When working on the heap sampling, I had promised to do
the per thread event so here it is! </div>
<div><br>
</div>
<div dir="ltr">Could I get a review for this:<br>
<div>Webrev: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/"
moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/</a><br>
</div>
<div>Bug: <a
href="https://bugs.openjdk.java.net/browse/JDK-8201655"
moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8201655</a></div>
<div><br>
</div>
<div>I was thinking of adding GC-dev for the memAllocator
change once I get favorable reviews for the rest of the
change.</div>
<div><br>
</div>
<div>I've done a bit of performance testing and on the
Dacapo benchmark I see no change in performance when
turned off (logical, any code change is behind a flag
check already in place) and when turned on it is
comparable to the current performance.</div>
<div><br>
</div>
<div>(More information is: I see a very slight degradation
if we are doing 512k sampling but no degradation at 2MB). </div>
<div><br>
</div>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">Thanks,
<div>Jc</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>
s***@oracle.com
2018-11-05 22:24:57 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Jc,<br>
<br>
It looks good in general but I have some comments below.<br> <br> <br> <a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html</a><br> <br> <pre><span class="new">+static bool thread_enabled_for_one_jvmti_env() {</span> <span class="new">+ JavaThread *thread = JavaThread::current();</span> <span class="new">+ JvmtiThreadState *state = thread-&gt;jvmti_thread_state();</span> <span class="new">+ if (state == NULL) {</span> <span class="new">+ return false;</span> <span class="new">+ }</span> <span class="new">+</span> <span class="new">+ JvmtiEnvThreadStateIterator it(state);</span> <span class="new">+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {</span> <span class="new">+ if (ets-&gt;is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {</span>
<span class="new">+ return true;</span>
<span class="new">+ }</span>
<span class="new">+ }</span>
<span class="new">+</span>
<span class="new">+ return false;</span>
<span class="new">+}</span>
<span class="new">+</span>
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());

if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

<span class="new">+ // Sampling is enabled for at least one thread, is it this one?</span>
<span class="new">+ if (!thread_enabled_for_one_jvmti_env()) {</span>
<span class="new">+ return;</span>
<span class="new">+ }</span>
<span class="new">+

I don't think you need this change as this condition already does it:
</span> if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

Please, look at the following line in the jvmtiEventController.cpp:
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled &amp; SAMPLED_OBJECT_ALLOC_BIT) != 0);

I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?


<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html</a>

A couple of places where err is declared as int instead of jvmtiError:
<span class="new">
</span> 714 int err;
<span class="changed"> 742 int err;

Should not be silent in a case of JVMTI error: </span> 744 err = (*jvmti)-&gt;GetThreadInfo(jvmti, thread, &amp;info);
745 if (err != JVMTI_ERROR_NONE) {
<span class="changed"> 746 return;</span>
<span class="changed"></span><span class="new"></span></pre>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 10/26/18 10:48, JC Beyler wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAF9BGBzLKyAe1eHBN7=F3xoA1-5U-***@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi all,</div>
<div dir="ltr"><br>
</div>
<div>When working on the heap sampling, I had promised to do
the per thread event so here it is! </div>
<div><br>
</div>
<div dir="ltr">Could I get a review for this:<br>
<div>Webrev: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/"
moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/</a><br>
</div>
<div>Bug: <a
href="https://bugs.openjdk.java.net/browse/JDK-8201655"
moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8201655</a></div>
<div><br>
</div>
<div>I was thinking of adding GC-dev for the memAllocator
change once I get favorable reviews for the rest of the
change.</div>
<div><br>
</div>
<div>I've done a bit of performance testing and on the
Dacapo benchmark I see no change in performance when
turned off (logical, any code change is behind a flag
check already in place) and when turned on it is
comparable to the current performance.</div>
<div><br>
</div>
<div>(More information is: I see a very slight degradation
if we are doing 512k sampling but no degradation at 2MB). </div>
<div><br>
</div>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">Thanks,
<div>Jc</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>
JC Beyler
2018-11-05 23:14:27 UTC
Permalink
Hi Serguei,

First off, thanks as always for looking at this :-) Let me inline my
answers:

I actually "struggled" with this part of the change. My change is correct
semantically and if you care about performance for when sampling a given
thread.
Your change will work semantically but the performance is the same as the
global sampling.

What I mean by working semantically is that that the tests and the code
will work. However, this means that all threads will be doing the sampling
work but when the code will post the event here:
->
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html

(which is why your suggestion works, the change in jvmtiExport basically
ensures only the threads requested are posting events)

The code will check that we actually only post for threads we care about.
The code above ensures that only threads that were requested to be sampling
are the ones that are sampling internally.

Note: I REALLY prefer your suggestion for two reasons:
- We do not change the runtime/GC code at all, it remains "simple"
- The overhead in the general case goes away and this is a NOP for my
actual use-case from a performance point of view (sampling every thread)

But:
- Then sampling per thread really is just telling the system don't
pollute the callbacks, though internally you are doing all the work anyway.

Let me know which you prefer :)
Post by s***@oracle.com
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java would fail since it checks that.
Post by s***@oracle.com
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html <http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html>
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Done and done, I added a fprintf on stderr saying the GetThreadInfo failed
and the test is ignoring the add count.

Thanks again for looking and let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
It looks good in general but I have some comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {+ JavaThread *thread = JavaThread::current();+ JvmtiThreadState *state = thread->jvmti_thread_state();+ if (state == NULL) {+ return false;+ }++ JvmtiEnvThreadStateIterator it(state);+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+ return true;+ }+ }++ return false;+}+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?+ if (!thread_enabled_for_one_jvmti_env()) {+ return;+ }+
I don't think you need this change as this condition already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Thanks,
Serguei
Hi all,
When working on the heap sampling, I had promised to do the per thread
event so here it is!
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo benchmark I see
no change in performance when turned off (logical, any code change is
behind a flag check already in place) and when turned on it is comparable
to the current performance.
(More information is: I see a very slight degradation if we are doing 512k
sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
s***@oracle.com
2018-11-06 02:51:52 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Jc,<br>
<br>
Okay, I see your point: the change in memAllocator.cpp is for
performance.<br>
Do you have any measurements showing a performance difference?<br>
Also, do you need me to submit a mach5 test run?<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 11/5/18 15:14, JC Beyler wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAF9BGBxNkdoKmMWHx2a3H+***@mail.gmail.com">
<div dir="ltr">Hi Serguei,<br>
<div><br>
</div>
<div>First off, thanks as always for looking at this :-) Let me
inline my answers:</div>
<br>
<div class="gmail_quote">
<div>I actually "struggled" with this part of the change. My
change is correct semantically and if you care about
performance for when sampling a given thread. </div>
<div>Your change will work semantically but the performance is
the same as the global sampling. </div>
<div><br>
</div>
<div>What I mean by working semantically is that that the
tests and the code will work. However, this means that all
threads will be doing the sampling work but when the code
will post the event here:</div> <div>   -&gt; <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html</a></div>
<div><br>
</div>
<div>(which is why your suggestion works, the change in
jvmtiExport basically ensures only the threads requested are
posting events)</div>
<div><br>
</div>
<div>The code will check that we actually only post for
threads we care about. The code above ensures that only
threads that were requested to be sampling are the ones that
are sampling internally.</div>
<div><br>
</div>
<div>Note: I REALLY prefer your suggestion for two reasons:</div>
<div>  - We do not change the runtime/GC code at all, it
remains "simple"</div>
<div>  - The overhead in the general case goes away and this
is a NOP for my actual use-case from a performance point of
view (sampling every thread)</div>
<div><br>
</div>
<div>But:</div>
<div>  - Then sampling per thread really is just telling the
system don't pollute the callbacks, though internally you
are doing all the work anyway.</div>
<div><br>
</div>
<div>Let me know which you prefer :)<br>
</div>
<span class="gmail-im">
<div><br>
</div>
<div> <br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-cite-prefix">
<pre> Also, do you see that enabling the sampling events globally still works?</pre>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>Yes, otherwise HeapMonitorThreadTest.java would fail
since it checks that.</div>
<span class="gmail-im">
<div> <br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-cite-prefix">
<pre><a class="gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html" target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html</a>

A couple of places where err is declared as int instead of jvmtiError:
<span class="gmail-m_5462246381949391541gmail-m_-3385713941130615224new">
</span> 714 int err;
<span class="gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"> 742 int err;

Should not be silent in a case of JVMTI error: </span> 744 err = (*jvmti)-&gt;GetThreadInfo(jvmti, thread, &amp;info);
745 if (err != JVMTI_ERROR_NONE) {
<span class="gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"> 746 return;</span>
<span class="gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"></span><span class="gmail-m_5462246381949391541gmail-m_-3385713941130615224new"></span></pre>
<br>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>Done and done, I added a fprintf on stderr saying the
GetThreadInfo failed and the test is ignoring the add count.</div>
<div><br>
</div>
<div>Thanks again for looking and let me know what you think,</div>
<div>Jc</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Mon, Nov 5, 2018 at 2:25 PM <a
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a> &lt;<a
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote">
<div>
<div class="m_-3385713941130615224moz-cite-prefix">Hi Jc,<br>
<br>
It looks good in general but I have some comments below.<br>
<br>
<br>
<a class="m_-3385713941130615224moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html</a><br> <br> <pre><span class="m_-3385713941130615224new">+static bool thread_enabled_for_one_jvmti_env() {</span> <span class="m_-3385713941130615224new">+ JavaThread *thread = JavaThread::current();</span> <span class="m_-3385713941130615224new">+ JvmtiThreadState *state = thread-&gt;jvmti_thread_state();</span> <span class="m_-3385713941130615224new">+ if (state == NULL) {</span> <span class="m_-3385713941130615224new">+ return false;</span> <span class="m_-3385713941130615224new">+ }</span> <span class="m_-3385713941130615224new">+</span> <span class="m_-3385713941130615224new">+ JvmtiEnvThreadStateIterator it(state);</span> <span class="m_-3385713941130615224new">+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {</span> <span class="m_-3385713941130615224new">+ if (ets-&gt;is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {</span>
<span class="m_-3385713941130615224new">+ return true;</span>
<span class="m_-3385713941130615224new">+ }</span>
<span class="m_-3385713941130615224new">+ }</span>
<span class="m_-3385713941130615224new">+</span>
<span class="m_-3385713941130615224new">+ return false;</span>
<span class="m_-3385713941130615224new">+}</span>
<span class="m_-3385713941130615224new">+</span>
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());

if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

<span class="m_-3385713941130615224new">+ // Sampling is enabled for at least one thread, is it this one?</span>
<span class="m_-3385713941130615224new">+ if (!thread_enabled_for_one_jvmti_env()) {</span>
<span class="m_-3385713941130615224new">+ return;</span>
<span class="m_-3385713941130615224new">+ }</span>
<span class="m_-3385713941130615224new">+

I don't think you need this change as this condition already does it:
</span> if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

Please, look at the following line in the jvmtiEventController.cpp:
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled &amp; SAMPLED_OBJECT_ALLOC_BIT) != 0);

I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?


<a class="m_-3385713941130615224moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html" target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html</a>

A couple of places where err is declared as int instead of jvmtiError:
<span class="m_-3385713941130615224new">
</span> 714 int err;
<span class="m_-3385713941130615224changed"> 742 int err;

Should not be silent in a case of JVMTI error: </span> 744 err = (*jvmti)-&gt;GetThreadInfo(jvmti, thread, &amp;info);
745 if (err != JVMTI_ERROR_NONE) {
<span class="m_-3385713941130615224changed"> 746 return;</span>
<span class="m_-3385713941130615224changed"></span><span class="m_-3385713941130615224new"></span></pre>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 10/26/18 10:48, JC Beyler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi all,</div>
<div dir="ltr"><br>
</div>
<div>When working on the heap sampling, I had promised
to do the per thread event so here it is! </div>
<div><br>
</div>
<div dir="ltr">Could I get a review for this:<br>
<div>Webrev: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/</a><br>
</div>
<div>Bug: <a
href="https://bugs.openjdk.java.net/browse/JDK-8201655"
target="_blank" moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8201655</a></div>
<div><br>
</div>
<div>I was thinking of adding GC-dev for the
memAllocator change once I get favorable reviews
for the rest of the change.</div>
<div><br>
</div>
<div>I've done a bit of performance testing and on
the Dacapo benchmark I see no change in
performance when turned off (logical, any code
change is behind a flag check already in place)
and when turned on it is comparable to the current
performance.</div>
<div><br>
</div>
<div>(More information is: I see a very slight
degradation if we are doing 512k sampling but no
degradation at 2MB). </div>
<div><br>
</div>
<div dir="ltr"
class="m_-3385713941130615224gmail_signature">
<div dir="ltr">Thanks,
<div>Jc</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div><br>
</div>
Thanks,
<div>Jc</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>
JC Beyler
2018-11-06 17:42:14 UTC
Permalink
Hi Serguei,

Yes exactly it was an optimization. When using a 512k sampling rate, I
don't see a no real difference (the overhead is anyway low for that
sampling rate), I imagine there would be a difference if trying to sample
every allocation or with a low sampling interval. But because you are right
and it is an optimization of the system and not a functional need, I've
reverted it and now the webrev is updated here:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655

The incremental webrev is here:
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/

Let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
Okay, I see your point: the change in memAllocator.cpp is for performance.
Do you have any measurements showing a performance difference?
Also, do you need me to submit a mach5 test run?
Thanks,
Serguei
Hi Serguei,
First off, thanks as always for looking at this :-) Let me inline my
I actually "struggled" with this part of the change. My change is correct
semantically and if you care about performance for when sampling a given
thread.
Your change will work semantically but the performance is the same as the
global sampling.
What I mean by working semantically is that that the tests and the code
will work. However, this means that all threads will be doing the sampling
->
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
(which is why your suggestion works, the change in jvmtiExport basically
ensures only the threads requested are posting events)
The code will check that we actually only post for threads we care about.
The code above ensures that only threads that were requested to be sampling
are the ones that are sampling internally.
- We do not change the runtime/GC code at all, it remains "simple"
- The overhead in the general case goes away and this is a NOP for my
actual use-case from a performance point of view (sampling every thread)
- Then sampling per thread really is just telling the system don't
pollute the callbacks, though internally you are doing all the work anyway.
Let me know which you prefer :)
Post by s***@oracle.com
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java would fail since it checks that.
Post by s***@oracle.com
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Done and done, I added a fprintf on stderr saying the GetThreadInfo failed
and the test is ignoring the add count.
Thanks again for looking and let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
It looks good in general but I have some comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {+ JavaThread *thread = JavaThread::current();+ JvmtiThreadState *state = thread->jvmti_thread_state();+ if (state == NULL) {+ return false;+ }++ JvmtiEnvThreadStateIterator it(state);+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+ return true;+ }+ }++ return false;+}+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?+ if (!thread_enabled_for_one_jvmti_env()) {+ return;+ }+
I don't think you need this change as this condition already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Thanks,
Serguei
Hi all,
When working on the heap sampling, I had promised to do the per thread
event so here it is!
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo benchmark I see
no change in performance when turned off (logical, any code change is
behind a flag check already in place) and when turned on it is comparable
to the current performance.
(More information is: I see a very slight degradation if we are doing
512k sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
s***@oracle.com
2018-11-07 00:29:29 UTC
Permalink
Hi Jc,

Not sure, I understand a motivation for this change:

- if (JvmtiExport::should_post_sampled_object_alloc()) {
+ {

Also, I'm not sure this is still needed:

+#include "prims/jvmtiEventController.inline.hpp"
+#include "prims/jvmtiThreadState.inline.hpp"

I expected you'd just revert all the changes in the memAllocator.cpp.

Also, it is up to you to make a decision if these performance-related
fix is needed or not.

But it needs to be well tested so that both global+thread event
management works correctly.

Thanks,
Serguei
Post by JC Beyler
Hi Serguei,
Yes exactly it was an optimization. When using a 512k sampling rate, I
don't see a no real difference (the overhead is anyway low for that
sampling rate), I imagine there would be a difference if trying to
sample every allocation or with a low sampling interval. But because
you are right and it is an optimization of the system and not a
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
<http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.03/>
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
<http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02_03/>
Let me know what you think,
Jc
Hi Jc,
Okay, I see your point: the change in memAllocator.cpp is for performance.
Do you have any measurements showing a performance difference?
Also, do you need me to submit a mach5 test run?
Thanks,
Serguei
Post by JC Beyler
Hi Serguei,
First off, thanks as always for looking at this :-) Let me inline
I actually "struggled" with this part of the change. My change is
correct semantically and if you care about performance for when
sampling a given thread.
Your change will work semantically but the performance is the
same as the global sampling.
What I mean by working semantically is that that the tests and
the code will work. However, this means that all threads will be
   ->
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
<http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html>
(which is why your suggestion works, the change in jvmtiExport
basically ensures only the threads requested are posting events)
The code will check that we actually only post for threads we
care about. The code above ensures that only threads that were
requested to be sampling are the ones that are sampling internally.
  - We do not change the runtime/GC code at all, it remains "simple"
  - The overhead in the general case goes away and this is a NOP
for my actual use-case from a performance point of view (sampling
every thread)
  - Then sampling per thread really is just telling the system
don't pollute the callbacks, though internally you are doing all
the work anyway.
Let me know which you prefer :)
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java would fail since it checks that.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err;
742 int err; Should not be silent in a case of JVMTI error: 744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) {
746 return;
Done and done, I added a fprintf on stderr saying the
GetThreadInfo failed and the test is ignoring the add count.
Thanks again for looking and let me know what you think,
Jc
Hi Jc,
It looks good in general but I have some comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {
+ JavaThread *thread = JavaThread::current();
+ JvmtiThreadState *state = thread->jvmti_thread_state();
+ if (state == NULL) {
+ return false;
+ }
+
+ JvmtiEnvThreadStateIterator it(state);
+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL;
ets = it.next(ets)) {
+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?
+ if (!thread_enabled_for_one_jvmti_env()) {
+ return;
+ }
+ I don't think you need this change as this condition
already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err;
742 int err; Should not be silent in a case of JVMTI error: 744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) {
746 return;
Thanks,
Serguei
Post by JC Beyler
Hi all,
When working on the heap sampling, I had promised to do the
per thread event so here it is!
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
<http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/>
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change
once I get favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo
benchmark I see no change in performance when turned off
(logical, any code change is behind a flag check already in
place) and when turned on it is comparable to the current
performance.
(More information is: I see a very slight degradation if we
are doing 512k sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
JC Beyler
2018-11-07 04:03:12 UTC
Permalink
Hi Serguei,

You are right, I should have reverted the memAllocator.cpp file, sorry
about that.

Here is the new webrev:
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/

I think we are good by testing standards, like I
said HeapMonitorThreadTest.java tests multiple threads. I did test an
example with a thousand threads and I get the samples from 1000 threads so
it seems to work there too.

Per thread is tested via the new HeapMonitorThreadDisabledTest.java so I
think we are good there too.

I would recommend a mach-5 testing just in case for this one if you can, it
will be better to have that reinsurance.

Thanks for your help,
Jc
Post by s***@oracle.com
Hi Jc,
- if (JvmtiExport::should_post_sampled_object_alloc()) {+ {
+#include "prims/jvmtiEventController.inline.hpp"+#include "prims/jvmtiThreadState.inline.hpp"
I expected you'd just revert all the changes in the memAllocator.cpp.
Also, it is up to you to make a decision if these performance-related fix
is needed or not.
But it needs to be well tested so that both global+thread event management
works correctly.
Thanks,
Serguei
Hi Serguei,
Yes exactly it was an optimization. When using a 512k sampling rate, I
don't see a no real difference (the overhead is anyway low for that
sampling rate), I imagine there would be a difference if trying to sample
every allocation or with a low sampling interval. But because you are right
and it is an optimization of the system and not a functional need, I've
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
Let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
Okay, I see your point: the change in memAllocator.cpp is for performance.
Do you have any measurements showing a performance difference?
Also, do you need me to submit a mach5 test run?
Thanks,
Serguei
Hi Serguei,
First off, thanks as always for looking at this :-) Let me inline my
I actually "struggled" with this part of the change. My change is correct
semantically and if you care about performance for when sampling a given
thread.
Your change will work semantically but the performance is the same as the
global sampling.
What I mean by working semantically is that that the tests and the code
will work. However, this means that all threads will be doing the sampling
->
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
(which is why your suggestion works, the change in jvmtiExport basically
ensures only the threads requested are posting events)
The code will check that we actually only post for threads we care about.
The code above ensures that only threads that were requested to be sampling
are the ones that are sampling internally.
- We do not change the runtime/GC code at all, it remains "simple"
- The overhead in the general case goes away and this is a NOP for my
actual use-case from a performance point of view (sampling every thread)
- Then sampling per thread really is just telling the system don't
pollute the callbacks, though internally you are doing all the work anyway.
Let me know which you prefer :)
Post by s***@oracle.com
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java would fail since it checks that.
Post by s***@oracle.com
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Done and done, I added a fprintf on stderr saying the GetThreadInfo
failed and the test is ignoring the add count.
Thanks again for looking and let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
It looks good in general but I have some comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {+ JavaThread *thread = JavaThread::current();+ JvmtiThreadState *state = thread->jvmti_thread_state();+ if (state == NULL) {+ return false;+ }++ JvmtiEnvThreadStateIterator it(state);+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+ return true;+ }+ }++ return false;+}+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?+ if (!thread_enabled_for_one_jvmti_env()) {+ return;+ }+
I don't think you need this change as this condition already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Thanks,
Serguei
Hi all,
When working on the heap sampling, I had promised to do the per thread
event so here it is!
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo benchmark I see
no change in performance when turned off (logical, any code change is
behind a flag check already in place) and when turned on it is comparable
to the current performance.
(More information is: I see a very slight degradation if we are doing
512k sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
s***@oracle.com
2018-11-07 06:41:00 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Jc,<br>
<br>
Thank you for the update!<br>
It looks good.<br>
It is great that testing on your side is Okay.<br>
<br>
I'll submit a mach5 job soon (today or tomorrow).<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 11/6/18 20:03, JC Beyler wrote:<br>
</div>
<blockquote type="cite"
cite="mid:***@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi Serguei,
<div><br>
</div>
<div>You are right, I should have reverted the
memAllocator.cpp file, sorry about that.</div>
<div><br>
</div>
<div>Here is the new webrev:</div>
<div><a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.04/"
moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/</a><br>
</div>
<div><br>
</div>
<div>I think we are good by testing standards, like I
said HeapMonitorThreadTest.java tests multiple threads. I
did test an example with a thousand threads and I get the
samples from 1000 threads so it seems to work there too.</div>
<div><br>
</div>
<div>Per thread is tested via the new
HeapMonitorThreadDisabledTest.java so I think we are good
there too.</div>
<div><br>
</div>
<div>I would recommend a mach-5 testing just in case for
this one if you can, it will be better to have that
reinsurance.</div>
<div><br>
</div>
<div>Thanks for your help,</div>
<div>Jc</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Tue, Nov 6, 2018 at 4:29 PM &lt;<a
href="mailto:***@oracle.com"
moz-do-not-send="true">***@oracle.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote">
<div> Hi Jc,<br>
<br>
Not sure, I understand a motivation for this change:<br>
<pre><span class="m_4822938364167385442removed">- if (JvmtiExport::should_post_sampled_object_alloc()) {</span>
<span class="m_4822938364167385442new">+ {</span></pre>
Also, I'm not sure this is still needed:<br>
<pre><span class="m_4822938364167385442new">+#include "prims/jvmtiEventController.inline.hpp"</span>
<span class="m_4822938364167385442new">+#include "prims/jvmtiThreadState.inline.hpp"</span></pre>
I expected you'd just revert all the changes in the
memAllocator.cpp.<br>
<br>
Also, it is up to you to make a decision if these
performance-related fix is needed or not.<br>
<br>
But it needs to be well tested so that both global+thread
event management works correctly.<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
<div class="m_4822938364167385442moz-cite-prefix">On 11/6/18
9:42 AM, JC Beyler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi Serguei,<br>
<div><br>
</div>
<div>Yes exactly it was an optimization. When
using a 512k sampling rate, I don't see a no
real difference (the overhead is anyway low for
that sampling rate), I imagine there would be a
difference if trying to sample every allocation
or with a low sampling interval. But because you
are right and it is an optimization of the
system and not a functional need, I've reverted
it and now the webrev is updated here:</div>
<div><br>
</div>
<div>Webrev: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.03/"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/</a></div>
<div>Bug: <a
href="https://bugs.openjdk.java.net/browse/JDK-8201655"
target="_blank" moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8201655</a></div>
<div><br>
</div>
<div>The incremental webrev is here: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02_03/"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/</a></div>
<div><br>
</div>
<div>Let me know what you think,</div>
<div>Jc<br>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Mon, Nov 5, 2018 at 6:51 PM <a
href="mailto:***@oracle.com"
target="_blank" moz-do-not-send="true">***@oracle.com</a>
&lt;<a href="mailto:***@oracle.com"
target="_blank" moz-do-not-send="true">***@oracle.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554moz-cite-prefix">Hi
Jc,<br>
<br>
Okay, I see your point: the change in
memAllocator.cpp is for performance.<br>
Do you have any measurements showing a performance
difference?<br>
Also, do you need me to submit a mach5 test run?<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 11/5/18 15:14, JC Beyler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Serguei,<br>
<div><br>
</div>
<div>First off, thanks as always for looking at
this :-) Let me inline my answers:</div>
<br>
<div class="gmail_quote">
<div>I actually "struggled" with this part of
the change. My change is correct
semantically and if you care about
performance for when sampling a given
thread. </div>
<div>Your change will work semantically but
the performance is the same as the global
sampling. </div>
<div><br>
</div>
<div>What I mean by working semantically is
that that the tests and the code will work.
However, this means that all threads will be
doing the sampling work but when the code
will post the event here:</div> <div>   -&gt; <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html</a></div>
<div><br>
</div>
<div>(which is why your suggestion works, the
change in jvmtiExport basically ensures only
the threads requested are posting events)</div>
<div><br>
</div>
<div>The code will check that we actually only
post for threads we care about. The code
above ensures that only threads that were
requested to be sampling are the ones that
are sampling internally.</div>
<div><br>
</div>
<div>Note: I REALLY prefer your suggestion for
two reasons:</div>
<div>  - We do not change the runtime/GC code
at all, it remains "simple"</div>
<div>  - The overhead in the general case goes
away and this is a NOP for my actual
use-case from a performance point of view
(sampling every thread)</div>
<div><br>
</div>
<div>But:</div>
<div>  - Then sampling per thread really is
just telling the system don't pollute the
callbacks, though internally you are doing
all the work anyway.</div>
<div><br>
</div>
<div>Let me know which you prefer :)<br>
</div>
<span
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-im">
<div><br>
</div>
<div> <br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-cite-prefix">
<pre> Also, do you see that enabling the sampling events globally still works?</pre>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>Yes, otherwise HeapMonitorThreadTest.java
would fail since it checks that.</div>
<span
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-im">
<div> <br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-cite-prefix">
<pre><a class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html" target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html</a>

A couple of places where err is declared as int instead of jvmtiError:
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224new">
</span> 714 int err;
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"> 742 int err;

Should not be silent in a case of JVMTI error: </span> 744 err = (*jvmti)-&gt;GetThreadInfo(jvmti, thread, &amp;info);
745 if (err != JVMTI_ERROR_NONE) {
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"> 746 return;</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224changed"></span><span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail-m_5462246381949391541gmail-m_-3385713941130615224new"></span></pre>
<br>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>Done and done, I added a fprintf on
stderr saying the GetThreadInfo failed and
the test is ignoring the add count.</div>
<div><br>
</div>
<div>Thanks again for looking and let me know
what you think,</div>
<div>Jc</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Mon, Nov 5, 2018 at 2:25 PM <a
href="mailto:***@oracle.com"
target="_blank" moz-do-not-send="true">***@oracle.com</a>
&lt;<a
href="mailto:***@oracle.com"
target="_blank" moz-do-not-send="true">***@oracle.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote">
<div>
<div
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224moz-cite-prefix">Hi
Jc,<br>
<br>
It looks good in general but I have some
comments below.<br>
<br>
<br>
<a
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html"
target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html</a><br> <br> <pre><span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+static bool thread_enabled_for_one_jvmti_env() {</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ JavaThread *thread = JavaThread::current();</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ JvmtiThreadState *state = thread-&gt;jvmti_thread_state();</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ if (state == NULL) {</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ return false;</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ }</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ JvmtiEnvThreadStateIterator it(state);</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {</span> <span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ if (ets-&gt;is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ return true;</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ }</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ }</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ return false;</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+}</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+</span>
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());

if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ // Sampling is enabled for at least one thread, is it this one?</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ if (!thread_enabled_for_one_jvmti_env()) {</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ return;</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+ }</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">+

I don't think you need this change as this condition already does it:
</span> if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}

Please, look at the following line in the jvmtiEventController.cpp:
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled &amp; SAMPLED_OBJECT_ALLOC_BIT) != 0);

I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?


<a class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224moz-txt-link-freetext" href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html" target="_blank" moz-do-not-send="true">http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html</a>

A couple of places where err is declared as int instead of jvmtiError:
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new">
</span> 714 int err;
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224changed"> 742 int err;

Should not be silent in a case of JVMTI error: </span> 744 err = (*jvmti)-&gt;GetThreadInfo(jvmti, thread, &amp;info);
745 if (err != JVMTI_ERROR_NONE) {
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224changed"> 746 return;</span>
<span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224changed"></span><span class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224new"></span></pre>
<br>
Thanks,<br>
Serguei<br>
<br>
<br>
On 10/26/18 10:48, JC Beyler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Hi all,</div>
<div dir="ltr"><br>
</div>
<div>When working on the heap
sampling, I had promised to do the
per thread event so here it is! </div>
<div><br>
</div>
<div dir="ltr">Could I get a review
for this:<br>
<div>Webrev: <a
href="http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/"
target="_blank"
moz-do-not-send="true">http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/</a><br>
</div>
<div>Bug: <a
href="https://bugs.openjdk.java.net/browse/JDK-8201655"
target="_blank"
moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8201655</a></div>
<div><br>
</div>
<div>I was thinking of adding GC-dev
for the memAllocator change once I
get favorable reviews for the rest
of the change.</div>
<div><br>
</div>
<div>I've done a bit of performance
testing and on the Dacapo
benchmark I see no change in
performance when turned off
(logical, any code change is
behind a flag check already in
place) and when turned on it is
comparable to the current
performance.</div>
<div><br>
</div>
<div>(More information is: I see a
very slight degradation if we are
doing 512k sampling but no
degradation at 2MB). </div>
<div><br>
</div>
<div dir="ltr"
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554m_-3385713941130615224gmail_signature">
<div dir="ltr">Thanks,
<div>Jc</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<div><br>
</div>
-- <br>
<div dir="ltr"
class="m_4822938364167385442m_7443063805677570592m_-7679167297582474554gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div><br>
</div>
Thanks,
<div>Jc</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<div><br>
</div>
-- <br>
<div dir="ltr"
class="m_4822938364167385442m_7443063805677570592gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div><br>
</div>
Thanks,
<div>Jc</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div><br>
</div>
Thanks,
<div>Jc</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>
JC Beyler
2018-11-12 17:53:12 UTC
Permalink
Hi Serguei,

Thanks for the update and thanks for testing mach5. Serguei sent me that
the testing passed mach5 testing, could I get another review to be able to
push it?

Thanks!
Jc
Post by s***@oracle.com
Hi Jc,
Thank you for the update!
It looks good.
It is great that testing on your side is Okay.
I'll submit a mach5 job soon (today or tomorrow).
Thanks,
Serguei
Hi Serguei,
You are right, I should have reverted the memAllocator.cpp file, sorry
about that.
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/
I think we are good by testing standards, like I
said HeapMonitorThreadTest.java tests multiple threads. I did test an
example with a thousand threads and I get the samples from 1000 threads so
it seems to work there too.
Per thread is tested via the new HeapMonitorThreadDisabledTest.java so I
think we are good there too.
I would recommend a mach-5 testing just in case for this one if you can,
it will be better to have that reinsurance.
Thanks for your help,
Jc
Post by s***@oracle.com
Hi Jc,
- if (JvmtiExport::should_post_sampled_object_alloc()) {+ {
+#include "prims/jvmtiEventController.inline.hpp"+#include "prims/jvmtiThreadState.inline.hpp"
I expected you'd just revert all the changes in the memAllocator.cpp.
Also, it is up to you to make a decision if these performance-related fix
is needed or not.
But it needs to be well tested so that both global+thread event
management works correctly.
Thanks,
Serguei
Hi Serguei,
Yes exactly it was an optimization. When using a 512k sampling rate, I
don't see a no real difference (the overhead is anyway low for that
sampling rate), I imagine there would be a difference if trying to sample
every allocation or with a low sampling interval. But because you are right
and it is an optimization of the system and not a functional need, I've
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
Let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
Okay, I see your point: the change in memAllocator.cpp is for performance.
Do you have any measurements showing a performance difference?
Also, do you need me to submit a mach5 test run?
Thanks,
Serguei
Hi Serguei,
First off, thanks as always for looking at this :-) Let me inline my
I actually "struggled" with this part of the change. My change is
correct semantically and if you care about performance for when sampling a
given thread.
Your change will work semantically but the performance is the same as
the global sampling.
What I mean by working semantically is that that the tests and the code
will work. However, this means that all threads will be doing the sampling
->
http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
(which is why your suggestion works, the change in jvmtiExport basically
ensures only the threads requested are posting events)
The code will check that we actually only post for threads we care
about. The code above ensures that only threads that were requested to be
sampling are the ones that are sampling internally.
- We do not change the runtime/GC code at all, it remains "simple"
- The overhead in the general case goes away and this is a NOP for my
actual use-case from a performance point of view (sampling every thread)
- Then sampling per thread really is just telling the system don't
pollute the callbacks, though internally you are doing all the work anyway.
Let me know which you prefer :)
Post by s***@oracle.com
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java would fail since it checks that.
Post by s***@oracle.com
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Done and done, I added a fprintf on stderr saying the GetThreadInfo
failed and the test is ignoring the add count.
Thanks again for looking and let me know what you think,
Jc
Post by s***@oracle.com
Hi Jc,
It looks good in general but I have some comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {+ JavaThread *thread = JavaThread::current();+ JvmtiThreadState *state = thread->jvmti_thread_state();+ if (state == NULL) {+ return false;+ }++ JvmtiEnvThreadStateIterator it(state);+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+ return true;+ }+ }++ return false;+}+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?+ if (!thread_enabled_for_one_jvmti_env()) {+ return;+ }+
I don't think you need this change as this condition already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
714 int err; 742 int err;
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) { 746 return;
Thanks,
Serguei
Hi all,
When working on the heap sampling, I had promised to do the per thread
event so here it is!
Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
I was thinking of adding GC-dev for the memAllocator change once I get
favorable reviews for the rest of the change.
I've done a bit of performance testing and on the Dacapo benchmark I
see no change in performance when turned off (logical, any code change is
behind a flag check already in place) and when turned on it is comparable
to the current performance.
(More information is: I see a very slight degradation if we are doing
512k sampling but no degradation at 2MB).
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
--
Thanks,
Jc
Loading...