Discussion:
The failure
s***@oracle.com
2018-10-23 07:21:14 UTC
Permalink
Hi,

I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)

On 10/22/18 22:14, Leonid Mesnik wrote:
> Hi
>
> It seems the latest version also crashes with 2 other different symptoms:
> http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
>
> It also might hang, with the stack attached. It seems the test might be
> blocked because it invokes 2 JVMTI methods. Can a JVMTI agent invoke
> JVMTI methods from different threads?

Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread
is being suspended.
Both are blocked at a safepoint, which is okay in general but not
okay if they hold any lock.
For instance, thread #152 is holding the monitor JvmtiThreadState_lock.
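
(For illustration - a minimal sketch of the two concurrent JVMTI calls
involved here; the function names are hypothetical, not the actual
libJvmtiStressModule.c code. Both calls are legal from different
threads, but note that each one has to execute a VM operation. The
sketch assumes the can_suspend and can_generate_single_step_events
capabilities were requested in Agent_OnLoad.)

  #include <jvmti.h>

  static jvmtiEnv* jvmti;  // assumed to be saved in Agent_OnLoad

  // Runs on a JVMTI agent thread started via RunAgentThread,
  // like agent_sampler in the thread #136 stack below.
  static void JNICALL sampler(jvmtiEnv* env, JNIEnv* jni, void* arg) {
    jint count = 0;
    jthread* threads = nullptr;
    if (env->GetAllThreads(&count, &threads) == JVMTI_ERROR_NONE && count > 0) {
      env->SuspendThread(threads[0]);  // executes a suspend VM operation
      env->ResumeThread(threads[0]);
      env->Deallocate(reinterpret_cast<unsigned char*>(threads));
    }
  }

  // Runs on a different (Java) thread, like enable_events in the
  // thread #152 stack below.
  static void enable_single_stepping() {
    jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_SINGLE_STEP,
                                    nullptr /* all threads */);
    // enabling single stepping executes a VM_GetCurrentLocation VM
    // operation; see JvmtiEnvThreadState::reset_current_location below
  }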

Also, I see a couple more threads that are interesting as well:

Thread 159 (Thread 0x2ae40b78f700 (LWP 27962)):
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park (this=***@entry=0x2ae3984c9100) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=***@entry=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state (this=***@entry=0x2ae40b78eb30) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method (task=***@entry=0x2ae48800ff40) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner (this=***@entry=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6

Thread 158 (Thread 0x2ae40b890700 (LWP 27963)):
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park (this=***@entry=0x2ae3984cbb00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=***@entry=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state (this=***@entry=0x2ae40b88fb30) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method (task=***@entry=0x2ae49c00a670) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner (this=***@entry=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6

Thread 51 (Thread 0x2ae49549b700 (LWP 29678)):
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park (this=***@entry=0x2ae460061c00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=***@entry=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start (thread=***@entry=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6

These two threads are blocked on the monitor JvmtiThreadState_lock in
the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.

Now, the question is: why can this safepoint not start?
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?

I think this safepoint operation is waiting for all the threads that
are blocked on the JvmtiThreadState_lock.

Conclusion:

The deadlock is:

Thread #152:
  - grabbed the monitor JvmtiThreadState_lock
  - blocked in the VM_GetCurrentLocation VM operation in the function
JvmtiEnvThreadState::reset_current_location()

Many other threads:
  - blocked on the monitor JvmtiThreadState_lock
  - cannot reach the blocked-at-a-safepoint state (all threads have
to reach this state for this safepoint to happen)

It seems to me this is a bug which has to be filed.

My guess is that this will stop reproducing if you turn off
single stepping for thread #152.
Please let me know about the results.
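
(For reference, turning single stepping off from the agent is one
SetEventNotificationMode call; a minimal sketch, assuming a saved
jvmtiEnv* named jvmti:

  jvmtiError err = jvmti->SetEventNotificationMode(
      JVMTI_DISABLE, JVMTI_EVENT_SINGLE_STEP, nullptr /* all threads */);
  // err is JVMTI_ERROR_NONE on success

With single stepping disabled, enabling events should avoid the
VM_GetCurrentLocation path shown in the thread #152 stack.)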

> Assuming that the crashes look like VM bugs, I think it makes sense
> to integrate the JVMTI changes but *not* enable the JVMTI module by
> default.

This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.

> And add more tests with JVMTI enabled, so anyone could easily run
> them to reproduce the crashes. These tests would be kept out of CI so
> as not to introduce any bugs. Does that make sense?
>
> Considering the hang - I think it might be a product bug, since I
> don't see any locking on my monitors. But I am not sure. Is it
> possible that my JVMTI agent code prevents the VM from getting to a
> safepoint?
> Could we discuss it tomorrow or this week when you have time?

Yes, of course.
Let's find some time tomorrow.

> Any suggestion on how to diagnose the deadlock would be great.

Analysis of the stack traces is needed.
It is non-trivial in this particular case, as there are so many
threads executing at the same time.

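(One way to capture this kind of per-thread picture from a hung
process or a core file is gdb, for example:

  gdb -p <pid>              # or: gdb <path-to>/bin/java core.<pid>
  (gdb) info threads
  (gdb) thread apply all bt

and then look for threads parked in Monitor::ILock/IWait and for the
locks they hold.)
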
<blockquote type="cite"
cite="mid:6FF8CDAA-B83B-4BFC-8352-***@oracle.com">
<div dir="auto" class="">
<div dir="auto" class="">
<div class="">Part of stack trace with 2 my threads only:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">Thread 136 (Thread 0x2ae494100700 (LWP
28023)):</div>
<div class="">#0  0x00002ae3927b5945 in
<a class="moz-txt-link-abbreviated" href="mailto:pthread_cond_wait@@GLIBC_2.3.2">pthread_cond_wait@@GLIBC_2.3.2</a> () from
/lib64/libpthread.so.0</div>
<div class="">#1  0x00002ae393ba8d63 in
os::PlatformEvent::park (this=***@entry=0x2ae454005800)
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897</div>
<div class="">#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399</div>
<div class="">#3  Monitor::IWait
(this=***@entry=0x2ae398023c10,
Self=***@entry=0x2ae454004800, timo=***@entry=0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\</div>
<div class="">8</div>
<div class="">#4  0x00002ae393b51f2e in Monitor::wait
(this=***@entry=0x2ae398023c10,
no_safepoint_check=&lt;optimized out&gt;,
timeout=***@entry=0,
as_suspend_equivalent=***@en\</div>
<div class="">try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106</div>
<div class="">#5  0x00002ae393de7867 in VMThread::execute
(op=***@entry=0x2ae4940ffb10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657</div>
<div class="">#6  0x00002ae393d6a3bd in
JavaThread::java_suspend (this=***@entry=0x2ae3985f2000)
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321</div>
<div class="">#7  0x00002ae3939ad7e1 in
JvmtiSuspendControl::suspend
(java_thread=***@entry=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:8\</div>
<div class="">47</div>
<div class="">#8  0x00002ae3939887ae in
JvmtiEnv::SuspendThread (this=***@entry=0x2ae39801b270,
java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiE\</div>
<div class="">nv.cpp:955</div>
<div class="">#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles\</div>
<div class="">/jvmtiEnter.cpp:527</div>
<div class="">#10 0x00002ae394d973ee in agent_sampler
(jvmti=0x2ae39801b270, env=&lt;optimized out&gt;,
p=&lt;optimized out&gt;) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitc\</div>
<div class="">hensink/process/stress/modules/libJvmtiStressModule.c:274</div>
<div class="">#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85</div>
<div class="">#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=&lt;optimized
out&gt;) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79</div>
<div class="">#13 0x00002ae393d7338a in
JavaThread::thread_main_inner
(this=***@entry=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795</div>
<div class="">#14 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775</div>
<div class="">#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698</div>
<div class="">#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0</div>
<div class="">#17 0x00002ae392cc234d in clone () from
/lib64/libc.so.6</div>
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">Thread 152 (Thread 0x2ae427060700 (LWP
27995)):</div>
<div class="">#0  0x00002ae3927b5945 in
<a class="moz-txt-link-abbreviated" href="mailto:pthread_cond_wait@@GLIBC_2.3.2">pthread_cond_wait@@GLIBC_2.3.2</a> () from
/lib64/libpthread.so.0</div>
<div class="">#1  0x00002ae393ba8d63 in
os::PlatformEvent::park (this=***@entry=0x2ae3985e7400)
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897</div>
<div class="">#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399</div>
<div class="">#3  Monitor::IWait
(this=***@entry=0x2ae398023c10,
Self=***@entry=0x2ae3985e6000, timo=***@entry=0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\</div>
<div class="">8</div>
<div class="">#4  0x00002ae393b51f2e in Monitor::wait
(this=***@entry=0x2ae398023c10,
no_safepoint_check=&lt;optimized out&gt;,
timeout=***@entry=0,
as_suspend_equivalent=***@en\</div>
<div class="">try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106</div>
<div class="">#5  0x00002ae393de7867 in VMThread::execute
(op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657</div>
<div class="">#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
(this=***@entry=0x2ae6bc000d80,
event_type=***@entry=JVMTI_EVENT_SINGLE_STEP,
enabled=***@entry=tr\</div>
<div class="">ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312</div>
<div class="">#7  0x00002ae393997acf in
recompute_env_thread_enabled (state=0x2ae6bc000cd0,
ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventControlle\</div>
<div class="">r.cpp:490</div>
<div class="">#8
 JvmtiEventControllerPrivate::recompute_thread_enabled
(state=***@entry=0x2ae6bc000cd0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp\</div>
<div class="">:523</div>
<div class="">#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598</div>
<div class="">#10 0x00002ae39399a244 in set_user_enabled
(enabled=true, event_type=JVMTI_EVENT_SINGLE_STEP,
thread=0x0, env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/sha\</div>
<div class="">re/prims/jvmtiEventController.cpp:818</div>
<div class="">#11 JvmtiEventController::set_user_enabled
(env=0x2ae39801b270, thread=0x0,
event_type=JVMTI_EVENT_SINGLE_STEP, enabled=&lt;optimized
out&gt;) at /scratch/lmesnik/ws/hs-bigapps/open/src/\</div>
<div class="">hotspot/share/prims/jvmtiEventController.cpp:963</div>
</div>
<div class="">
<div class="">#12 0x00002ae393987d2d in
JvmtiEnv::SetEventNotificationMode
(this=***@entry=0x2ae39801b270,
mode=***@entry=JVMTI_ENABLE,
event_type=***@entry=JVMTI_EVENT_SINGLE_STEP, eve\</div>
<div class="">nt_thread=***@entry=0x0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543</div>
<div class="">#13 0x00002ae3939414eb in
jvmti_SetEventNotificationMode (env=0x2ae39801b270,
mode=***@entry=JVMTI_ENABLE,
event_type=***@entry=JVMTI_EVENT_SINGLE_STEP,
event_thread=event_\</div>
<div class="">***@entry=0x0) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389</div>
<div class="">#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519</div>
<div class="">#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=&lt;optimized out&gt;, this=&lt;optimized out&gt;) at
/scratch/lmesnik/ws/h\</div>
<div class="">s-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697</div>
<div class="">#16 0x00002ae3a43ef257 in ?? ()</div>
<div class="">#17 0x00002ae3a43eede1 in ?? ()</div>
<div class="">#18 0x00002ae42705f878 in ?? ()</div>
<div class="">#19 0x00002ae40ad334e0 in ?? ()</div>
<div class="">#20 0x00002ae42705f8e0 in ?? ()</div>
<div class="">#21 0x00002ae40ad33c68 in ?? ()</div>
<div class="">#22 0x0000000000000000 in ?? ()</div>
</div>
</div>
</div>
</blockquote>
<br>

Thanks,
Serguei

<blockquote type="cite"
cite="mid:6FF8CDAA-B83B-4BFC-8352-***@oracle.com">
<div dir="auto" class="">
<div dir="auto" class="">
<div class=""> </div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<div dir="auto" class="">
<div dir="auto" class="">
<div class=""><br class="">
</div>
<div class="">Leonid</div>
<div class=""><br class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Oct 9, 2018, at 4:52 PM, <a
href="mailto:***@oracle.com" class=""
moz-do-not-send="true">***@oracle.com</a>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">Hi Leonid,<br class="">
<br class="">
There is an existing bug:<br class="">
   <a
href="https://bugs.openjdk.java.net/browse/JDK-8043571"
class="" moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8043571</a><br
class="">
<br class="">
Thanks,<br class="">
Serguei<br class="">
<br class="">
<br class="">
On 10/9/18 16:11, Leonid Mesnik wrote:<br class="">
<blockquote type="cite" class="">Hi<br class="">
<br class="">
During fixing kitchensink I get<br class="">
assert(_cur_stack_depth == count_frames()) failed:
cur_stack_depth out of sync<br class="">
<br class="">
Do you know if i might be bug in my jvmti agent?<br
class="">
<br class="">
Leonid<br class="">
<br class="">
<br class="">
#<br class="">
# A fatal error has been detected by the Java
Runtime Environment:<br class="">
#<br class="">
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962<br class="">
#  assert(_cur_stack_depth == count_frames())
failed: cur_stack_depth out of sync<br class="">
#<br class="">
# JRE version: Java(TM) SE Runtime Environment
(12.0) (fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)<br
class="">
# Java VM: Java HotSpot(TM) 64-Bit Server VM
(fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps,
mixed mode, tiered, compressed oops, g1 gc,
linux-amd64)<br class="">
# Core dump will be written. Default location:
Core dumps may be processed with
"/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
%P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)<br
class="">
#<br class="">
# If you would like to submit a bug report, please
visit:<br class="">
#   <a
href="http://bugreport.java.com/bugreport/crash.jsp"
class="" moz-do-not-send="true">http://bugreport.java.com/bugreport/crash.jsp</a><br
class="">
#<br class="">
<br class="">
---------------  S U M M A R Y ------------<br
class="">
<br class="">
Command Line: -XX:MaxRAMPercentage=2
-XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr
-XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener
-XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties<br
class="">
<br class="">
Host: <a href="http://scaaa118.us.oracle.com"
class="" moz-do-not-send="true">scaaa118.us.oracle.com</a>,
Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 32
cores, 235G, Oracle Linux Server release 7.3<br
class="">
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time:
31 seconds (0d 0h 0m 31s)<br class="">
<br class="">
---------------  T H R E A D  ---------------<br
class="">
<br class="">
Current thread (0x00002af3dc6ac800):  VMThread "VM
Thread" [stack:
0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0<br class="">
<br class="">
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k<br
class="">
Native frames: (J=compiled Java code, A=aot
compiled Java code, j=interpreted, Vv=VM code,
C=native code)<br class="">
V  [libjvm.so+0x18c4923]
 VMError::report_and_die(int, char const*, char
const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned
long)+0x2c3<br class="">
V  [libjvm.so+0x18c56ef]
 VMError::report_and_die(Thread*, void*, char
const*, int, char const*, char const*,
__va_list_tag*)+0x2f<br class="">
V  [libjvm.so+0xb55aa0]  report_vm_error(char
const*, int, char const*, char const*, ...)+0x100<br
class="">
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e<br
class="">
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27<br
class="">
V  [libjvm.so+0x119af99]
 VM_UpdateForPopTopFrame::doit()+0xb9<br class="">
V  [libjvm.so+0x1908982]
 VM_Operation::evaluate()+0x132<br class="">
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*)
[clone .constprop.51]+0x18e<br class="">
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0<br
class="">
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3<br
class="">
V  [libjvm.so+0x14e8300]
 thread_native_entry(Thread*)+0x100<br class="">
<br class="">
VM_Operation (0x00002af4d8502910):
UpdateForPopTopFrame, mode: safepoint, requested
by thread 0x00002af4dc008800<br class="">
</blockquote>
<br class="">
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>
David Holmes
2018-10-23 07:43:16 UTC
Permalink
Hi Serguei,

The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
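
(Roughly, in HotSpot terms - a sketch of the common pattern rather than
a quote of the actual sources:

  // Acquiring a Monitor with the default safepoint check parks a
  // blocked JavaThread in the _thread_blocked state, which the
  // safepoint protocol already counts as stopped.
  MutexLocker ml(JvmtiThreadState_lock);

so a contended acquirer of this lock cannot be the thread the safepoint
is waiting for.)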

David
Post by s***@oracle.com
I think this safepoint operation is waiting for all the threads that
are blocked on the JvmtiThreadState_lock.
[...]
s***@oracle.com
2018-10-23 07:58:11 UTC
Permalink
Hi David,

You are right, thanks.
It means this deadlock needs more analysis.
For completeness, the stack traces are in attachments.

Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hangs with  stack attached.  Seems that test might be
blocked because it invoke 2 jvmti methods. Can jvmti agent invoke
jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread is
being suspended.
Both are blocked at a safepoint which is Okay in general but not Okay
if they hold any lock.
For instance, the thread #152 is holding the monitor JvmtiThreadState.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two thread are blocked on the monitor JvmtiThreadState_lock in
the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or in reverse, what thread this safepoint
is waiting for?
I think, this safepoint operation is waiting for all threads that are
blocked on the JvmtiThreadState_lock.
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
   - blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked at a safepoint state (all threads have
to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop to reproduce after if you turn off
the single stepping for thread #152.
Please, let me know about the results.
Assuming that crashes look like VM bugs I think it make sense to
integrate jvmti changes but *don't* enabled jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add to more tests with jvmti enabled.
So anyone could easily run them to reproduce crashes.  This test
would be out of CI to don't introduce any bugs. Does it make sense?
Consider hang - I think that it might be product bug since I don't
see any locking on my monitors. But I am not sure. Is it possible
that any my code jvmti agent prevent VM to get into safepoint?
Could we discuss it tomorrow or his week when you have a time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executed at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\
8
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:8\
47
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiE\
nv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles\
/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitc\
hensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\
8
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventControlle\
r.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp\
:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270)
at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/sha\
re/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized
out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/\
hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at /scratch/lmesnik/ws/h\
s-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth
out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
#  assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug
build 12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode,
tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P
%I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h
0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*, void*,
char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char
const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]  JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame, mode: safepoint, requested by thread 0x00002af4dc008800
David Holmes
2018-10-23 08:34:12 UTC
Permalink
Hi Serguei,

The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks what
it's stuck on.

David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
David
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hang, with the stack attached. It seems the test might be
blocked because it invokes 2 jvmti methods. Can a jvmti agent invoke
jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread is
being suspended.
Both are blocked at a safepoint which is Okay in general but not Okay
if they hold any lock.
For instance, the thread #152 is holding the monitor JvmtiThreadState.
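For illustration, the two-threads pattern under discussion boils down to
something like the minimal agent sketch below. This is not the kitchensink
agent itself; the VMInit-time setup, the names, and the 1 ms throttle are
assumptions made for the sketch:

#include <jvmti.h>
#include <string.h>
#include <unistd.h>

static jvmtiEnv *jvmti;

/* Agent thread: repeatedly suspends and resumes the target thread,
 * concurrently with the single stepping toggled from vm_init below. */
static void JNICALL sampler(jvmtiEnv *env, JNIEnv *jni, void *arg) {
  jthread target = (jthread)arg;
  for (;;) {
    (*env)->SuspendThread(env, target);
    /* ... a real agent would sample the stack here ... */
    (*env)->ResumeThread(env, target);
    usleep(1000);  /* throttle the suspend/resume loop */
  }
}

static void JNICALL vm_init(jvmtiEnv *env, JNIEnv *jni, jthread thread) {
  /* An agent thread needs a java.lang.Thread object to run on. */
  jclass cls = (*jni)->FindClass(jni, "java/lang/Thread");
  jmethodID ctor = (*jni)->GetMethodID(jni, cls, "<init>", "()V");
  jthread agent_thread = (*jni)->NewObject(jni, cls, ctor);
  jthread target = (jthread)(*jni)->NewGlobalRef(jni, thread);

  (*env)->RunAgentThread(env, agent_thread, &sampler, target,
                         JVMTI_THREAD_NORM_PRIORITY);

  /* Meanwhile, this (Java) thread enables global single stepping. */
  (*env)->SetEventNotificationMode(env, JVMTI_ENABLE,
                                   JVMTI_EVENT_SINGLE_STEP, NULL);
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *options, void *reserved) {
  jvmtiCapabilities caps;
  jvmtiEventCallbacks cbs;
  if ((*vm)->GetEnv(vm, (void **)&jvmti, JVMTI_VERSION_1_2) != JNI_OK)
    return JNI_ERR;
  memset(&caps, 0, sizeof(caps));
  caps.can_suspend = 1;                      /* for SuspendThread */
  caps.can_generate_single_step_events = 1;  /* for SINGLE_STEP   */
  (*jvmti)->AddCapabilities(jvmti, &caps);
  memset(&cbs, 0, sizeof(cbs));
  cbs.VMInit = &vm_init;
  (*jvmti)->SetEventCallbacks(jvmti, &cbs, sizeof(cbs));
  (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                                     JVMTI_EVENT_VM_INIT, NULL);
  return JNI_OK;
}

Each of these calls is legal from its own thread; the hazard is in what
the VM has to do underneath them (VM operations plus the
JvmtiThreadState_lock), as the traces below show.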
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock in
the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start.
What thread is blocking it? Or, in reverse, which threads is this
safepoint waiting for?
I think, this safepoint operation is waiting for all threads that are
blocked on the JvmtiThreadState_lock.
The thread #152:
   - has grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all threads have
to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off
the single stepping for thread #152.
Please, let me know about the results.
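For reference, the suggested experiment is a one-line change in the
agent's event setup, along these lines (a sketch only - enable_events()
lives in libJvmtiStressModule.c, whose exact contents are not shown here):

/* Hypothetical variant of the agent's enable_events(): keep everything
 * else as-is, but leave global single stepping off. */
jvmtiError err = (*jvmti)->SetEventNotificationMode(
    jvmti, JVMTI_DISABLE, JVMTI_EVENT_SINGLE_STEP, NULL /* all threads */);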
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. This test
would be kept out of CI so it doesn't introduce any bugs. Does it make sense?
Considering the hang - I think that it might be a product bug since I don't
see any locking on my monitors. But I am not sure. Is it possible
that any code in my jvmti agent prevents the VM from getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
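(For a hang like this one, the native stacks of all threads can be
captured in one go from the live process, assuming gdb is available on
the test host:

  gdb -batch -p <pid> -ex 'thread apply all bt' > all-stacks.txt

which is essentially how a dump like the attached one is produced.)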
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270)
at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized
out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth
out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
#  assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug
build 12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode,
tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P
%I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h
0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*, void*,
char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char
const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]  JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame, mode: safepoint, requested by thread 0x00002af4dc008800
Robbin Ehn
2018-10-23 14:38:30 UTC
Permalink
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a safepoint
operation. There's no real way to tell from the stacks what it's stuck on.
I cannot find a thread that is not considered safepoint safe or is_ext_suspended
(thread 146). So the handshake should go through. The handshake will log a
warning after a while; is there such a warning from the handshake operation?

There are several threads competing for e.g. the Threads_lock, and threads waiting
for GC and several other VM ops; could it just be really slow?

/Robbin
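(Assuming the unified logging tags in this source tree, the handshake
progress - and the warning mentioned above, if it ever fires - should be
visible by rerunning the test with something like

  -Xlog:handshake*=trace,safepoint*=debug:handshake.log:uptime,timemillis,level,tags

next to the existing -Xlog:gc* option from the command line above.)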
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks enabled,
so all JavaThreads blocked trying to acquire it will be _thread_blocked and
so safepoint-safe and so won't be holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hang, with the stack attached. It seems the test might be blocked
because it invokes 2 jvmti methods. Can a jvmti agent invoke jvmti methods
from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread is being
suspended.
Both are blocked at a safepoint which is Okay in general but not Okay if
they hold any lock.
For instance, the thread #152 is holding the monitor JvmtiThreadState.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic
pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic
pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10, this=<synthetic
pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock in the
function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start.
What thread is blocking it? Or, in reverse, which threads is this safepoint
waiting for?
I think, this safepoint operation is waiting for all threads that are
blocked on the JvmtiThreadState_lock.
The thread #152:
   - has grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all threads have to
reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off the
single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense to integrate
the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially result in
both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. This test would be
kept out of CI so it doesn't introduce any bugs. Does it make sense?
Considering the hang - I think that it might be a product bug since I don't see any
locking on my monitors. But I am not sure. Is it possible that any code in my
jvmti agent prevents the VM from getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many threads
executing at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in JvmtiEventControllerPrivate::recompute_enabled ()
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270, thread=0x0,
event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of
sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
#  assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out
of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode, tiered,
compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h"
(or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal -XX:+StartAttachListener
-XX:NativeMemoryTracking=detail -XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Intel(R) ... Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0, _nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],  sp=0x00002af44f208720,
 free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char const*, char
const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char
const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*, void*, char
const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char const*,
char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]  JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]  JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]  VMThread::evaluate_operation(VM_Operation*)
[clone .constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame, mode: safepoint,
requested by thread 0x00002af4dc008800
s***@oracle.com
2018-10-23 16:09:28 UTC
Permalink
Hi David and Robbin,

I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already explained
but more sophisticated.
I suspect a scenario with the JvmtiThreadState_lock in which the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking which monitors are used by the blocked
threads.
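(With HotSpot symbols available, this can be checked straight from the
core file or the live process in gdb - for example, for the monitor at
0x2ae398024f10 seen in the compiler-thread traces; the _name/_owner
field names are as in this JDK 12 source tree:

  (gdb) print ((Monitor*)0x2ae398024f10)->_name
  (gdb) print ((Monitor*)0x2ae398024f10)->_owner

That shows which named monitor it is and which Thread* currently holds it.)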

Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks what
it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through. The
handshake will log a warning after a while; is there such a warning from
the handshake operation?
There are several threads competing for e.g. the Threads_lock, and
threads waiting for GC and several other VM ops; could it just be
really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up
the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hang, with the stack attached. It seems the test might
be blocked because it invokes 2 jvmti methods. Can a jvmti agent
invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread
is being suspended.
Both are blocked at a safepoint which is Okay in general but not
Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock
in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start.
What thread is blocking it? Or, in reverse, which threads is this
safepoint waiting for?
I think, this safepoint operation is waiting for all threads that
are blocked on the JvmtiThreadState_lock.
The thread #152:
   - has grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all threads
have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off
the single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. This test
would be kept out of CI so it doesn't introduce any bugs. Does it make sense?
Considering the hang - I think that it might be a product bug since I
don't see any locking on my monitors. But I am not sure. Is it
possible that any code in my jvmti agent prevents the VM from getting
into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP,
enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
#  assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may
be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g
%t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com,
Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char
const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
safepoint, requested by thread 0x00002af4dc008800
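For context on the assert: VM_UpdateForPopTopFrame is the VM operation behind JVMTI PopFrame, so the agent-side sequence that ends up in JvmtiThreadState::update_for_pop_top_frame() is roughly the following sketch (pop_top_frame is a made-up helper, not the kitchensink code; PopFrame additionally requires the can_pop_frame capability):

  /* PopFrame only works on a suspended target thread. */
  static jvmtiError pop_top_frame(jvmtiEnv *jvmti, jthread target) {
      jvmtiError err = (*jvmti)->SuspendThread(jvmti, target);
      if (err != JVMTI_ERROR_NONE)
          return err;
      err = (*jvmti)->PopFrame(jvmti, target);  /* runs VM_UpdateForPopTopFrame */
      (*jvmti)->ResumeThread(jvmti, target);
      return err;
  }

The assert message means the cached frame-depth bookkeeping in JvmtiThreadState disagreed with the actual stack when that VM op ran.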
s***@oracle.com
2018-10-23 23:18:02 UTC
Please, skip it - sorry for the noise.
It is hard to prove anything with the current dump.

Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock in which the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked
threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks
what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through.
The handshake will log a warning after a while; is there such a warning
from the handshake operation?
There are several threads competing for e.g. the Threads_lock, and
threads waiting for GC and several other VM ops, so could it just be
really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up
the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang with the stack attached. It seems the test might
be blocked because it invokes 2 jvmti methods. Can a jvmti agent
invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general but not
Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
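To make the race concrete: the two agent calls reduce to something like the sketch below. This is my own minimal reconstruction for illustration, not the kitchensink agent source; sampler_proc and enable_single_step are invented names, but SuspendThread and SetEventNotificationMode are the real JVMTI entry points visible in the stacks that follow.

  #include <jvmti.h>

  /* Runs on a JVMTI agent thread: suspend a target, sample it, resume.
     This mirrors the agent_sampler frame in the hang stacks. */
  static void JNICALL sampler_proc(jvmtiEnv *jvmti, JNIEnv *jni, void *arg) {
      jthread target = (jthread)arg;
      if ((*jvmti)->SuspendThread(jvmti, target) == JVMTI_ERROR_NONE) {
          /* ... walk the suspended thread's stack ... */
          (*jvmti)->ResumeThread(jvmti, target);
      }
  }

  /* Runs on an ordinary Java thread via a JNI entry point: enable
     global single stepping, as enable_events does below. */
  static void enable_single_step(jvmtiEnv *jvmti) {
      (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                                         JVMTI_EVENT_SINGLE_STEP,
                                         NULL /* all threads */);
  }

Each call is legal on its own; the trouble is that both trigger VM operations, and the single-stepping one reaches its VM op while still holding JvmtiThreadState_lock.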
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock
in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint cannot start.
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?
I think this safepoint operation is waiting for all threads that
are blocked on the JvmtiThreadState_lock.
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
   - blocked on the monitor JvmtiThreadState_lock
   - cannot reach the blocked-at-a-safepoint state (all threads
have to reach this state for this safepoint to happen)
It seems to me this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn
off the single stepping for thread #152.
Please, let me know about the results.
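If that race between SuspendThread and the single-step toggle is really the trigger, one quick agent-side experiment - purely a workaround sketch under that assumption, not a product fix - would be to serialize the agent's own JVMTI calls with a raw monitor so the two requests are never in flight at the same time:

  /* agent_lock would be created once in Agent_OnLoad, e.g. with
     (*jvmti)->CreateRawMonitor(jvmti, "agent lock", &agent_lock). */
  static jrawMonitorID agent_lock;

  static jvmtiError suspend_serialized(jvmtiEnv *jvmti, jthread target) {
      jvmtiError err;
      (*jvmti)->RawMonitorEnter(jvmti, agent_lock);
      err = (*jvmti)->SuspendThread(jvmti, target);
      (*jvmti)->RawMonitorExit(jvmti, agent_lock);
      return err;
  }

  static jvmtiError single_step_serialized(jvmtiEnv *jvmti, jvmtiEventMode mode) {
      jvmtiError err;
      (*jvmti)->RawMonitorEnter(jvmti, agent_lock);
      err = (*jvmti)->SetEventNotificationMode(jvmti, mode,
                                               JVMTI_EVENT_SINGLE_STEP, NULL);
      (*jvmti)->RawMonitorExit(jvmti, agent_lock);
      return err;
  }

That would only hide the race rather than fix it in the VM, but it should tell us whether this particular interleaving is what hangs the test.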
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add two more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. This test
would be kept out of CI so it doesn't introduce any bugs. Does it make sense?
Considering the hang - I think it might be a product bug since I
don't see any locking on my monitors. But I am not sure. Is it
possible that my jvmti agent code prevents the VM from getting into a
safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of the stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
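Frames #11-#12 above (call_start_function / JvmtiAgentThread::start_function_wrapper) show that the sampler runs on a dedicated JVMTI agent thread. For readers less familiar with that machinery, such a thread is started roughly as below; this is a sketch with an invented helper name, reusing the agent_sampler start function from frame #10:

  /* Forward declaration of the start function seen in frame #10. */
  static void JNICALL agent_sampler(jvmtiEnv *jvmti, JNIEnv *jni, void *p);

  /* Start agent_sampler on a JVMTI agent thread. RunAgentThread needs
     a java.lang.Thread object to represent the new thread. */
  static void start_sampler_thread(jvmtiEnv *jvmti, JNIEnv *jni) {
      jclass thread_cls = (*jni)->FindClass(jni, "java/lang/Thread");
      jmethodID ctor = (*jni)->GetMethodID(jni, thread_cls, "<init>", "()V");
      jthread agent_thread = (*jni)->NewObject(jni, thread_cls, ctor);
      (*jvmti)->RunAgentThread(jvmti, agent_thread, agent_sampler,
                               NULL, JVMTI_THREAD_NORM_PRIORITY);
  }

Such an agent thread is a JavaThread as far as safepoints are concerned, which is why the suspend request above can end up parked waiting for the VM thread.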
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP,
enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
During fixing kitchensink I get
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may
be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g
%t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com,
Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java
code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int,
char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
safepoint, requested by thread 0x00002af4dc008800
David Holmes
2018-10-23 23:56:30 UTC
Hi Serguei, Robbin,

One thing I noticed which Robbin should be able to expand upon is that
Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete
and is blocked here:

// Wait for a release_stable_list() call before we check again. No
// safepoint check, no timeout, and not as suspend equivalent flag
// because this JavaThread is not on the Threads list.
ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
0,
!Mutex::_as_suspend_equivalent_flag);

As the comment says this thread is no longer on the Threads_list, but
the VM_HandshakeAllThreads is not a safepoint operation and does not
hold the Threads_lock, so is it possible this thread was captured by the
JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads
before it got removed? If so we'd be hung waiting for it to handshake,
as it's not in a "safepoint-safe" or suspend-equivalent state.

David
-----
David Holmes
2018-10-23 23:58:19 UTC
I should have looked further before sending this. Many threads are in
smr_delete.

David
Post by David Holmes
Hi Serguei, Robbin,
One thing I noticed which Robbin should be able to expand upon is that
Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete
 // Wait for a release_stable_list() call before we check again. No
 // safepoint check, no timeout, and not as suspend equivalent flag
 // because this JavaThread is not on the Threads list.
 ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                        0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list, but
the VM_HandshakeAllThreads is not a safepoint operation and does not
hold the Threads_lock, so is it possible this thread was captured by the
JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads,
before it got removed? If so we'd be hung waiting it for it handshake as
it's not in a "safepoint-safe" or suspend-equivalent state.
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock that the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked
threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks
what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through.
The handshake will log a warning after a while, is there such
warning from the handshake operation?
There are several threads competing with e.g. Threads_lock, and
threads waiting for GC and several other VM ops, could it just be
really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint
checks enabled, so all JavaThreads blocked trying to acquire it
will be _thread_blocked and so safepoint-safe and so won't be
holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hangs with  stack attached.  Seems that test
might be blocked because it invoke 2 jvmti methods. Can jvmti
agent invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general but not
Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7is_ext_suspended800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two thread are blocked on the monitor
JvmtiThreadState_lock in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or in reverse, what thread this
safepoint is waiting for?
I think, this safepoint operation is waiting for all threads
that are blocked on the JvmtiThreadState_lock.
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
   - blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked at a safepoint state (all threads
have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop to reproduce after if you turn
off the single stepping for thread #152.
Please, let me know about the results.
Assuming that crashes look like VM bugs I think it make sense
to integrate jvmti changes but *don't* enabled jvmti module by
default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add to more tests with jvmti enabled.
So anyone could easily run them to reproduce crashes.  This
test would be out of CI to don't introduce any bugs. Does it
make sense?
Consider hang - I think that it might be product bug since I
don't see any locking on my monitors. But I am not sure. Is it
possible that any my code jvmti agent prevent VM to get into
safepoint?
Could we discuss it tomorrow or his week when you have a time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executed at the same time.
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\
8
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:8\
47
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiE\
nv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles\
/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitc\
hensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:76\
8
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventControlle\
r.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp\
:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/sha\
re/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP,
enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/\
hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/h\
s-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
During fixing kitchensink I get
cur_stack_depth out of sync
Do you know if i might be bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may
be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g
%t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com,
Oracle Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds
(0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java
code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int,
char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame,
mode: safepoint, requested by thread 0x00002af4dc008800
s***@oracle.com
2018-10-24 00:00:43 UTC
Permalink
Okay, thanks!
Serguei
Post by David Holmes
I should have looked further before sending this. Many threads are in
smr_delete.
David
Post by David Holmes
Hi Serguei, Robbin,
One thing I noticed which Robbin should be able to expand upon is
that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and is waiting here:
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list, but
the VM_HandshakeAllThreads is not a safepoint operation and does not
hold the Threads_lock, so is it possible this thread was captured by
the JavaThreadIteratorWithHandle being used by
VM_HandshakeAllThreads, before it got removed? If so we'd be hung
waiting for it to handshake as it's not in a "safepoint-safe" or
suspend-equivalent state.
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock that the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the
blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks
what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through.
The handshake will log a warning after a while, is there such
warning from the handshake operation?
There are several threads competing with e.g. Threads_lock, and
threads waiting for GC and several other VM ops, could it just be
really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint
checks enabled, so all JavaThreads blocked trying to acquire it
will be _thread_blocked and so safepoint-safe and so won't be
holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
It seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang with the stack attached. It seems that the test
might be blocked because it invokes 2 jvmti methods. Can a jvmti
agent invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general but
not Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
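As background, here is a minimal sketch of an agent that drives JVMTI from
two threads, the same general shape as the stress module (the callback names
and the suspend/resume loop below are illustrative assumptions, not the
actual libJvmtiStressModule.c; error handling elided):

#include <jvmti.h>
#include <string.h>

static jvmtiEnv* gJvmti = NULL;

// Agent thread body: JVMTI calls from a second thread are legal in general.
static void JNICALL sampler(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
  jthread target = (jthread)arg;
  for (;;) {                       // tight loop for brevity; real agents pace this
    jvmti->SuspendThread(target);  // error handling elided
    jvmti->ResumeThread(target);
  }
}

static void JNICALL on_vm_init(jvmtiEnv* jvmti, JNIEnv* jni, jthread thr) {
  // RunAgentThread needs an unstarted java.lang.Thread object.
  jclass cls = jni->FindClass("java/lang/Thread");
  jmethodID ctor = jni->GetMethodID(cls, "<init>", "()V");
  jthread agent_thr = (jthread)jni->NewObject(cls, ctor);
  jvmti->RunAgentThread(agent_thr, &sampler, jni->NewGlobalRef(thr),
                        JVMTI_THREAD_NORM_PRIORITY);
  // Meanwhile this thread enables single stepping globally.
  jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_SINGLE_STEP, NULL);
}

extern "C" JNIEXPORT jint JNICALL
Agent_OnLoad(JavaVM* vm, char* options, void* reserved) {
  vm->GetEnv(reinterpret_cast<void**>(&gJvmti), JVMTI_VERSION_1_2);
  jvmtiCapabilities caps;
  memset(&caps, 0, sizeof(caps));
  caps.can_suspend = 1;
  caps.can_generate_single_step_events = 1;
  gJvmti->AddCapabilities(&caps);
  jvmtiEventCallbacks cbs;
  memset(&cbs, 0, sizeof(cbs));
  cbs.VMInit = &on_vm_init;
  gJvmti->SetEventCallbacks(&cbs, sizeof(cbs));
  gJvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_VM_INIT, NULL);
  return JNI_OK;
}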
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run
(this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor
JvmtiThreadState_lock in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?
I think, this safepoint operation is waiting for all threads
that are blocked on the JvmtiThreadState_lock.
The thread #152:
   - grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all
threads have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn
off the single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense
to integrate the jvmti changes but *not* enable the jvmti module
by default.
This one is a deadlock.
However, the root cause is a race condition that can
potentially result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. The
test would be kept out of CI so it doesn't introduce any bugs. Does it
make sense?
Considering the hang - I think that it might be a product bug since I
don't see any locking on my monitors. But I am not sure. Is
it possible that any of my jvmti agent code prevents the VM from
getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler
(jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized
out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
(op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled
(env=0x2ae39801b270, thread=0x0,
event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>)
at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I got
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps
may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p
%u %g %t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2
-XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Intel(R) Xeon(R) CPU
release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds
(0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java
code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned
char*, void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int,
char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame,
mode: safepoint, requested by thread 0x00002af4dc008800
Robbin Ehn
2018-10-24 07:18:49 UTC
Permalink
One thing I noticed which Robbin should be able to expand upon is that Thread
101 is terminating and has called ThreadsSMRSupport::smr_delete and is waiting here:
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list, but the
VM_HandshakeAllThreads is not a safepoint operation and does not hold the
Threads_lock, so is it possible this thread was captured by the
JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads, before it
got removed? If so we'd be hung waiting for it to handshake as it's not in a
"safepoint-safe" or suspend-equivalent state.
In short:
# VM Thread
VM Thread is in a loop: it takes the Threads_lock, takes a new snapshot of the
Threads_list, scans the list and processes handshakes on behalf of safe threads.
It releases the snapshot and Threads_lock and checks if all handshakes are completed.

# An exiting thread
An exiting thread removes itself from _THE_ threads list, but must stick
around while it is on any snapshot of alive threads. When it is not on any
list it will cancel the handshake.

Since the VM thread takes a new snapshot on every iteration of the handshake,
any exiting thread can proceed, since it will not be on the new snapshot. It
thus cancels the handshake and the VM thread can exit the loop (if this was
the last handshake).

Constraint:
If any thread grabs a snapshot of the threads list and later tries to take a lock
which is 'used' by the VM Thread or inside the handshake, we can deadlock.

Considering that, look at e.g. JvmtiEnv::SuspendThreadList,
which calls VMThread::execute(&tsj); with a ThreadsListHandle alive - this
could deadlock AFAICT, since the thread will rest on VMOperationRequest_lock
with a Threads list snapshot, but the VM thread cannot finish the handshake
until that snapshot is released.
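
To make that concrete, here is a self-contained toy model of the constraint
in plain C++ (not HotSpot code; the snapshot counter stands in for the
published ThreadsList hazard pointer, and the loop for the handshake):

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<int>  live_snapshots{0};   // models published ThreadsList hazard ptrs
std::atomic<bool> snapshot_taken{false};
std::atomic<bool> op_done{false};      // models completion of the VM operation

// Models the VM thread: it cannot finish the handshake (and thus the
// pending operation) while any snapshot is still alive.
void vm_thread() {
  while (!snapshot_taken.load()) {}    // wait until the requester has a snapshot
  while (live_snapshots.load() != 0) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  op_done.store(true);
}

// Models JvmtiEnv::SuspendThreadList: publish a snapshot (ThreadsListHandle)
// and then block in "VMThread::execute" while still holding it.
void requester() {
  live_snapshots.fetch_add(1);         // ThreadsListHandle tlh;
  snapshot_taken.store(true);
  while (!op_done.load()) {            // VMThread::execute(&tsj);
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  live_snapshots.fetch_sub(1);         // released only after execute() returns
}

int main() {
  std::cout << "requester waits for the op, the op waits for the snapshot\n";
  std::thread vm(vm_thread), req(requester);
  vm.join();                           // never returns: the two waits form a cycle
  req.join();
}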

I suggest the first step is to add something like the patch below and fix the
obvious ones first.

Note, I have not verified that this is the problem you are seeing; I'm saying
that this seems to be a real issue. And considering how the stack traces look,
it may be this.

If you want me to go through this, just assign the bug to me if there is one.

/Robbin

diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.hpp Wed Oct 24 09:13:17 2018 +0200
@@ -167,2 +167,6 @@
}
+ public:
+ bool have_threads_list();
+ private:
+
// This field is enabled via -XX:+EnableThreadSMRStatistics:
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp Wed Oct 24 09:13:17 2018 +0200
@@ -111,2 +111,6 @@

+inline bool Thread::have_threads_list() {
+ return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp Wed Oct 24 09:13:17 2018 +0200
@@ -608,2 +608,3 @@
if (!t->is_VM_thread()) {
+ assert(!t->have_threads_list(), "Deadlock if we have exiting threads and if vm thread is running an VM op."); // fatal/guarantee
SkipGCALot sgcalot(t); // avoid re-entrant attempts to gc-a-lot
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already explained but
more sophisticated.
I suspect a scenario with JvmtiThreadState_lock that the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a safepoint
operation. There's no real way to tell from the stacks what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through. The
handshake will log a warning after a while, is there such warning from the
handshake operation?
There are several threads competing with e.g. Threads_lock, and threads
waiting for GC and several other VM ops, could it just be really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
It seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang with the stack attached. It seems that the test might be
blocked because it invokes 2 jvmti methods. Can a jvmti agent invoke
jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread is
being suspended.
Both are blocked at a safepoint which is Okay in general but not Okay
if they hold any lock.
For instance, the thread #152 is holding the monitor JvmtiThreadState.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock in
the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or, in reverse, what thread is this safepoint
waiting for?
I think, this safepoint operation is waiting for all threads that are
blocked on the JvmtiThreadState_lock.
The thread #152:
   - grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all threads have
to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off the
single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. The test
would be kept out of CI so it doesn't introduce any bugs. Does it make sense?
Considering the hang - I think that it might be a product bug since I don't
see any locking on my monitors. But I am not sure. Is it possible
that any of my jvmti agent code prevents the VM from getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many threads
executing at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800)
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270)
at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized
out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I got
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth
out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug
build 12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode,
tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P
%I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com,
Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h
0m 31s)
---------------  T H R E A D  ---------------
0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char const*,
char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*,
char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*, void*,
char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char
const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]  JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
safepoint, requested by thread 0x00002af4dc008800
David Holmes
2018-10-24 07:46:33 UTC
Permalink
Thanks Robbin! So you're not allowed to request a VM operation if you
hold a ThreadsListHandle? I suppose that is no different to not being
able to request a VM operation whilst holding the Threads_lock.

I suspect before ThreadSMR this may have been a case where we weren't
ensuring a target thread could not terminate, and now with SMR we're
ensuring that but potentially introducing a deadlock. I say potentially
because obviously we don't deadlock every time we suspend threads.
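
FWIW, sketching the shape of a fix in the same kind of toy model (plain C++,
an assumption about ordering only, not actual HotSpot code): if the requester
drops its snapshot before blocking on the VM operation, the cycle disappears.

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<int>  live_snapshots{0};   // models published ThreadsList hazard ptrs
std::atomic<bool> snapshot_done{false};
std::atomic<bool> op_done{false};      // models completion of the VM operation

void vm_thread() {                     // as before: op finishes once no snapshots
  while (!snapshot_done.load()) {}
  while (live_snapshots.load() != 0) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  op_done.store(true);
}

void requester() {
  live_snapshots.fetch_add(1);         // take the snapshot...
  // ... resolve/copy whatever is needed from it ...
  live_snapshots.fetch_sub(1);         // ...and release it *before* blocking
  snapshot_done.store(true);
  while (!op_done.load()) {            // now the "VM op" can complete
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

int main() {
  std::thread vm(vm_thread), req(requester);
  req.join();
  vm.join();                           // returns: no wait cycle
  std::cout << "completed: no snapshot held across the blocking request\n";
}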

Cheers,
David
Post by Robbin Ehn
Post by David Holmes
One thing I noticed which Robbin should be able to expand upon is
that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and is waiting here:
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list,
but the VM_HandshakeAllThreads is not a safepoint operation and does
not hold the Threads_lock, so is it possible this thread was
captured by the JavaThreadIteratorWithHandle being used by
VM_HandshakeAllThreads, before it got removed? If so we'd be hung
waiting for it to handshake as it's not in a "safepoint-safe" or
suspend-equivalent state.
# VM Thread
VM Thread is in a loop: it takes the Threads_lock, takes a new snapshot of the
Threads_list, scans the list and processes handshakes on behalf of safe threads.
It releases the snapshot and Threads_lock and checks if all handshakes are completed.
# An exiting thread
An exiting thread removes itself from _THE_ threads list, but
must stick around while it is on any snapshot of alive threads. When it is
not on any list it will cancel the handshake.
Since the VM thread takes a new snapshot on every
iteration of the handshake, any exiting thread can proceed, since it will not
be on the new snapshot. It thus cancels the handshake and the VM thread can
exit the loop (if this was the last handshake).
If any thread grabs a snapshot of the threads list and later tries to take a
lock which is 'used' by the VM Thread or inside the handshake, we can deadlock.
Considering that, look at e.g. JvmtiEnv::SuspendThreadList,
which calls VMThread::execute(&tsj); with a ThreadsListHandle alive -
this could deadlock AFAICT, since the thread will rest on
VMOperationRequest_lock with a Threads list snapshot, but the VM thread
cannot finish the handshake until that snapshot is released.
I suggest the first step is to add something like the patch below and fix
the obvious ones first.
Note, I have not verified that this is the problem you are seeing; I'm saying
that this seems to be a real issue. And considering how the stack traces
look, it may be this.
If you want me to go through this, just assign the bug to me if there is one.
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp    Tue Oct 23 13:27:41 2018
+0200
+++ b/src/hotspot/share/runtime/thread.hpp    Wed Oct 24 09:13:17 2018
+0200
@@ -167,2 +167,6 @@
   }
+  bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp    Tue Oct 23 13:27:41
2018 +0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp    Wed Oct 24 09:13:17
2018 +0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+  return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
 inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp    Tue Oct 23 13:27:41 2018
+0200
+++ b/src/hotspot/share/runtime/vmThread.cpp    Wed Oct 24 09:13:17 2018
+0200
@@ -608,2 +608,3 @@
   if (!t->is_VM_thread()) {
+    assert(!t->have_threads_list(), "Deadlock if we have exiting threads and if vm thread is running an VM op."); // fatal/guarantee
     SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock that the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks
what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go
through. The handshake will log a warning after a while, is there
such warning from the handshake operation?
There are several threads competing with e.g. Threads_lock, and
threads waiting for GC and several other VM ops, could it just be
really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint
checks enabled, so all JavaThreads blocked trying to acquire
it will be _thread_blocked and so safepoint-safe and so won't
be holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
It seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang with the stack attached. It seems that the test
might be blocked because it invokes 2 jvmti methods. Can a
jvmti agent invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general but
not Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor
JvmtiThreadState_lock in the function
ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
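For context, the code both compiler threads are parked in looks roughly
like this - a sketch of ciEnv::cache_jvmti_state() from memory, not the
exact JDK 12 source:
  void ciEnv::cache_jvmti_state() {
    VM_ENTRY_MARK;
    // frames #6/#7 in the traces above are parked on this acquire
    MutexLocker mu(JvmtiThreadState_lock);
    _jvmti_can_hotswap_or_post_breakpoint =
        JvmtiExport::can_hotswap_or_post_breakpoint();
    _jvmti_can_post_on_exceptions =
        JvmtiExport::can_post_on_exceptions();
  }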
Now, the question is: why can this safepoint not start?
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?
I think this safepoint operation is waiting for all the threads
that are blocked on the JvmtiThreadState_lock.
The thread #152:
   - has grabbed the monitor JvmtiThreadState_lock
   - is blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads:
   - are blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all
threads have to reach this state for this safepoint to happen)
It seems to me this is a bug which has to be filed.
My guess is that this will stop reproducing if you
turn off the single stepping for thread #152.
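On the agent side, turning single stepping off for just that thread is a
single call - a minimal sketch, assuming jvmti is a live jvmtiEnv* and
target is the jthread for thread #152:
  #include <jvmti.h>

  /* Disable per-thread single stepping; returns JVMTI_ERROR_NONE on
   * success. */
  static jvmtiError stop_single_stepping(jvmtiEnv *jvmti, jthread target) {
    return (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_DISABLE,
                                              JVMTI_EVENT_SINGLE_STEP,
                                              target);
  }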
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes
sense to integrate the jvmti changes but *not* enable the jvmti
module by default.
This one is a deadlock.
However, the root cause is a race condition that can
potentially result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And to add more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. These
tests would be kept out of CI so they don't introduce any bugs.
Does it make sense?
Considering the hang - I think that it might be a product bug,
since I don't see any locking on my monitors. But I am not sure.
Is it possible that my jvmti agent code prevents the VM from
getting to a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of the stack traces is needed.
It is non-trivial in this particular case, as there are so
many threads executing at the same time.
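(For reference, a dump like the ones below can be captured from the hung
process with stock gdb - nothing test-specific about it:
  gdb -p <pid> -batch -ex "thread apply all bt" > threads.txt
)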
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park (this=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::IWait (this=0x2ae398023c10, Self=0x2ae454004800, timo=0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait (this=0x2ae398023c10,
no_safepoint_check=<optimized out>, timeout=0,
as_suspend_equivalent=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae4940ffb10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend (this=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
(java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread (this=0x2ae39801b270,
java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::IWait
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait (no_safepoint_check=<optimized out>,
as_suspend_equivalent=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in JvmtiEnvThreadState::reset_current_location
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps
may be processed with "/usr/libexec/abrt-hook-ccpp %s %c
%p %u %g %t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2
-XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC
-XX:+PrintFlagsFinal -XX:+StartAttachListener
-XX:NativeMemoryTracking=detail -XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Intel(R) Xeon(R) CPU,
Linux Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31
seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java
code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int,
char const*, char const*, __va_list_tag*, Thread*,
unsigned char*, void*, void*, char const*, int, unsigned
long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int,
char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]
 VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame,
mode: safepoint, requested by thread 0x00002af4dc008800
Robbin Ehn
2018-10-24 09:10:20 UTC
Permalink
Thanks Robbin! So you're not allowed to request a VM operation if you hold a
ThreadsListHandle? I suppose that is no different to not being able to request
a VM operation whilst holding the Threads_lock.
Yes, exactly.
/Robbin
I suspect before ThreadSMR this may have been a case where we weren't ensuring a
target thread could not terminate, and now with SMR we're ensuring that but
potentially introducing a deadlock. I say potentially because obviously we don't
deadlock every time we suspend threads.
Cheers,
David
Post by Robbin Ehn
Post by David Holmes
One thing I noticed, which Robbin should be able to expand upon, is that
Thread 101 is terminating, has called ThreadsSMRSupport::smr_delete, and
is now blocked here:
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
                                         !Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list, but the
VM_HandshakeAllThreads is not a safepoint operation and does not hold the
Threads_lock, so is it possible this thread was captured by the
JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads, before
it got removed? If so we'd be hung waiting for it to handshake, as it's not
in a "safepoint-safe" or suspend-equivalent state.
# VM Thread
The VM Thread is in a loop: it takes the Threads_lock, takes a new snapshot
of the Threads_list, scans the list and processes handshakes on behalf of
safe threads. It then releases the snapshot and the Threads_lock and checks
if all handshakes are completed.
# An exiting thread
An exiting thread removes itself from _THE_ threads list, but must
stick around while it is on any snapshot of alive threads. When it is no
longer on any list it will cancel the handshake.
Since the VM thread takes a new snapshot on every iteration of the
handshake, any exiting thread can proceed, since it will not be on the new
snapshot. Thus it cancels the handshake, and the VM thread can exit the
loop (if this was the last handshake).
If any thread grabs a snapshot of the threads list and later tries to take
a lock that is 'used' by the VM Thread or inside the handshake, we can
deadlock.
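In heavily simplified pseudo-HotSpot C++, the loop described above has this
shape - a sketch for illustration, not the actual VM_HandshakeAllThreads
code; process_if_safe() and all_handshakes_completed() are invented helpers:
  void handshake_all_threads_sketch() {
    do {
      MutexLocker ml(Threads_lock);            // take Threads_lock
      // the iterator embeds a fresh ThreadsList snapshot (hazard pointer)
      for (JavaThreadIteratorWithHandle jtiwh;
           JavaThread* t = jtiwh.next(); ) {
        process_if_safe(t);                    // handshake safe threads
      }
      // snapshot and Threads_lock are released here on each iteration,
      // which is what lets an exiting thread finally get off all lists
    } while (!all_handshakes_completed());
  }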
Considering that, look at e.g. JvmtiEnv::SuspendThreadList,
which calls VMThread::execute(&tsj); with a ThreadsListHandle alive. This
could deadlock AFAICT, since the thread will rest on VMOperationRequest_lock
with a Threads list snapshot, but the VM thread cannot finish the handshake
until that snapshot is released.
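Reduced to a sketch, the hazardous pattern is (not the real
JvmtiEnv::SuspendThreadList source; VM_SuspendOp stands in for whatever
operation &tsj is):
  void suspend_list_shape() {
    ThreadsListHandle tlh;      // publishes a hazard pointer: an exiting
                                // thread on this snapshot must wait for
                                // its release
    VM_SuspendOp tsj;
    VMThread::execute(&tsj);    // parks on VMOperationRequest_lock while
                                // the snapshot is still alive -> deadlock
  }                             // tlh and its hazard pointer die only here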
I suggest the first step is to add something like the patch below and fix
the obvious ones first.
Note, I have not verified that this is the problem you are seeing; I'm
saying that this seems to be a real issue. And considering how the stack
traces look, it may be this.
If you want me to go through this, just assign me a bug if there is one.
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp    Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.hpp    Wed Oct 24 09:13:17 2018 +0200
@@ -167,2 +167,6 @@
    }
+  bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp    Tue Oct 23 13:27:41 2018
+0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp    Wed Oct 24 09:13:17 2018
+0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+  return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
  inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp    Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp    Wed Oct 24 09:13:17 2018 +0200
@@ -608,2 +608,3 @@
    if (!t->is_VM_thread()) {
+    assert(t->have_threads_list(), "Deadlock if we have exiting threads and if vm thread is running an VM op."); // fatal/guarantee
      SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
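As a usage illustration (hypothetical call site, not code from the dump):
any path that reaches VMThread::execute with a ThreadsListHandle still
alive would now fail fast, e.g.
  ThreadsListHandle tlh;            // hazard pointer still published...
  VM_ThreadSuspend vm_suspend;
  VMThread::execute(&vm_suspend);   // ...so the new assert fires here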
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with the current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already explained,
but more sophisticated.
I suspect a scenario with the JvmtiThreadState_lock in which the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking which monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks what
it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through. The
handshake will log a warning after a while; is there such a warning from
the handshake operation?
There are several threads competing for e.g. the Threads_lock, and threads
waiting for GC and several other VM ops - could it just be really slow?
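(For reference: if memory serves, that warning comes out under the
'handshake' unified-logging tag, so running with something like
  java -Xlog:handshake*=debug ...
should show it - this is from memory and worth double-checking.)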
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
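For readers less familiar with the Monitor API of that era, the distinction
is visible at the acquire site - both forms appear elsewhere in this thread:
  // Safepoint-checking acquire: a thread parked here transitions to
  // _thread_blocked, so it does not hold up a safepoint.
  MutexLocker ml(JvmtiThreadState_lock);

  // Non-checking acquire (as used for SR_lock() in java_suspend() later
  // in this thread): a parked thread stays safepoint-unsafe.
  MutexLockerEx ml2(SR_lock(), Mutex::_no_safepoint_check_flag);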
David
s***@oracle.com
2018-10-24 09:00:37 UTC
Permalink
Hi Robbin and David,

There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces.
But there is an instance of JvmtiEnv::SuspendThread which seems to
support your theory:

Thread 136 (Thread 0x2ae494100700 (LWP 28023)):
#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
(this=this@entry=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::IWait (this=this@entry=0x2ae398023c10,
Self=Self@entry=0x2ae454004800, timo=timo@entry=0) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait (this=this@entry=0x2ae398023c10,
no_safepoint_check=<optimized out>, timeout=timeout@entry=0,
as_suspend_equivalent=as_suspend_equivalent@entry=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=op@entry=0x2ae4940ffb10)
at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
(this=this@entry=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
(java_thread=java_thread@entry=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
(this=this@entry=0x2ae39801b270, java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
(this=this@entry=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6

The JavaThread::java_suspend() call from the stack trace above also holds a
ThreadsListHandle while executing VMThread::execute(&vm_suspend):

void JavaThread::java_suspend() {
  ThreadsListHandle tlh;
  if (!tlh.includes(this) || threadObj() == NULL || is_exiting()) {
    return;
  }

  { MutexLockerEx ml(SR_lock(), Mutex::_no_safepoint_check_flag);
    if (!is_external_suspend()) {
      // a racing resume has cancelled us; bail out now
      return;
    }

    // suspend is done
    uint32_t debug_bits = 0;
    // Warning: is_ext_suspend_completed() may temporarily drop the
    // SR_lock to allow the thread to reach a stable thread state if
    // it is currently in a transient thread state.
    if (is_ext_suspend_completed(false /* !called_by_wait */,
                                 SuspendRetryDelay, &debug_bits)) {
      return;
    }
  }

  VM_ThreadSuspend vm_suspend;
  VMThread::execute(&vm_suspend);
}

I'll check with Leonid tomorrow if your patch can fix this deadlock.
Thanks,
Serguei
Post by Robbin Ehn
Post by David Holmes
One thing I noticed which Robbin should be able to expand upon is
that Thread 101 is terminating and has called
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list,
but the VM_HandshakeAllThreads is not a safepoint operation and
does not hold the Threads_lock, so is it possible this thread was
captured by the JavaThreadIteratorWithHandle being used by
VM_HandshakeAllThreads, before it got removed? If so we'd be hung
waiting it for it handshake as it's not in a "safepoint-safe" or
suspend-equivalent state.
# VM Thread
VM Thread is in a loop, takes Threads_lock, takes a new snapshot of
the Threads_list, scans the list and process handshakes on behalf of
safe threads.
Releases snapshot and Threads_lock and checks if all handshakes are completed
# An exiting thread
A thread exiting thread removes it self from _THE_ threads list, but
must stick around if it is on any snapshots of alive. When it is no on
any list it will cancel the handshake.
Since VM thread during the handshake takes a new snapshot every
iteration any exiting can proceed since it will not be a the new
snapshot. Thus cancel the handshake and VM thread can exit the loop
(if this was the last handshake).
If any thread grabs a snapshot of threads list and later tries to take
a lock which is 'used' by VM Thread or inside the handshake we can
deadlock.
Considering that looking at e.g. : JvmtiEnv::SuspendThreadList
Which calls VMThread::execute(&tsj); with a ThreadsListHandle alive,
this could deadlock AFAICT. Since the thread will rest on
VMOperationRequest_lock with a Threads list snapshot but VM thread
cannot finishes handshake until that snapshot is released.
I suggest first step is to add something like this patch below and fix
the obvious ones first.
Note, I have not verified that is the problem you are seeing, I'm
saying that this seem to be real issue. And considering how the stack
traces looks, it may be this.
You want me going through this, just assign a bug if there is one?
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp    Tue Oct 23 13:27:41 2018
+0200
+++ b/src/hotspot/share/runtime/thread.hpp    Wed Oct 24 09:13:17 2018
+0200
@@ -167,2 +167,6 @@
   }
+  bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp    Tue Oct 23
13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp    Wed Oct 24
09:13:17 2018 +0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+  return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
 inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp    Tue Oct 23 13:27:41
2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp    Wed Oct 24 09:13:17
2018 +0200
@@ -608,2 +608,3 @@
   if (!t->is_VM_thread()) {
+    assert(t->have_threads_list(), "Deadlock if we have exiting
threads and if vm thread is running an VM op."); // fatal/guarantee
     SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock that the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the
stacks what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go
through. The handshake will log a warning after a while, is
there such warning from the handshake operation?
There are several threads competing with e.g. Threads_lock, and
threads waiting for GC and several other VM ops, could it just
be really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint
checks enabled, so all JavaThreads blocked trying to acquire
it will be _thread_blocked and so safepoint-safe and so won't
be holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
<http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status:failed+AND+-state:invalid>
Also it might hangs with  stack attached.  Seems that test
might be blocked because it invoke 2 jvmti methods. Can
jvmti agent invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general but
not Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7is_ext_suspended800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in
CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in
CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run
(this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two thread are blocked on the monitor
JvmtiThreadState_lock in the function
ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or in reverse, what thread this
safepoint is waiting for?
I think, this safepoint operation is waiting for all threads
that are blocked on the JvmtiThreadState_lock.
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
   - blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked at a safepoint state (all
threads have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop to reproduce after if you
turn off the single stepping for thread #152.
Please, let me know about the results.
Assuming that crashes look like VM bugs I think it make
sense to integrate jvmti changes but *don't* enabled jvmti
module by default.
This one is a deadlock.
However, the root cause is a race condition that can
potentially result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add to more tests with jvmti enabled.
So anyone could easily run them to reproduce crashes.  This
test would be out of CI to don't introduce any bugs. Does
it make sense?
Consider hang - I think that it might be product bug since
I don't see any locking on my monitors. But I am not sure.
Is it possible that any my code jvmti agent prevent VM to
get into safepoint?
Could we discuss it tomorrow or his week when you have a time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so
many threads executed at the same time.
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait (no_safepoint_check=<optimized out>,
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread (java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6

#0  pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait (no_safepoint_check=<optimized out>,
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in JvmtiEnvThreadState::reset_current_location at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled (state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true, event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270, thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode (event_thread=event_...) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration (env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
During fixing kitchensink I get
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps,
mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps
may be processed with "/usr/libexec/abrt-hook-ccpp %s %c
%p %u %g %t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2
-XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Intel(R) Xeon(R) CPU, Oracle Linux Server
release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31
seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM
Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000]
[id=13962] _threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java
code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int,
char const*, char const*, __va_list_tag*, Thread*,
unsigned char*, void*, void*, char const*, int, unsigned
long)+0x2c3
V  [libjvm.so+0x18c56ef]
 VMError::report_and_die(Thread*, void*, char const*,
int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*,
int, char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]
 VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame,
mode: safepoint, requested by thread 0x00002af4dc008800
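For context: the VM_UpdateForPopTopFrame operation in the stack above is, as
far as I can tell, issued by the JVMTI PopFrame call. A minimal, invented call
sequence that exercises it (illustrative only, not the Kitchensink agent, and
with error checks omitted) would look like:

// Illustrative only: PopFrame requires the can_pop_frame capability and a
// suspended target thread; internally it runs VM_UpdateForPopTopFrame.
static void pop_top_frame_sketch(jvmtiEnv* jvmti, jthread target) {
  jvmti->SuspendThread(target);  // PopFrame demands a suspended target
  jvmti->PopFrame(target);       // pops the top stack frame of 'target'
  jvmti->ResumeThread(target);   // target re-executes the caller's invoke
}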
Robbin Ehn
2018-10-24 09:18:32 UTC
Permalink
Hi Serguei,
Post by s***@oracle.com
Hi Robbin and David,
There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces.
But there is an instance of the JvmtiEnv::SuspendThread which seems to be
Sorry, I meant any place where we take a ThreadsListHandle and call ::execute with
that handle still alive. That was just the first one I saw, so I used it as an example.
Post by s***@oracle.com
void JavaThread::java_suspend() {
You are correct: in this particular case it seems to be this one, but we must fix
_all_ of them. That is why I suggested adding the assert. And maybe an additional
assert in Mutex on holding a threads list while locking the Threads_lock.
Post by s***@oracle.com
I'll check with Leonid tomorrow if we can check if your patch below can fix this
deadlock.
It wouldn't fix the deadlock; it would assert, so we still need to locate and fix
all the places. Maybe it's just a couple.

/Robbin
Post by s***@oracle.com
Thanks,
Serguei
Post by Robbin Ehn
Post by David Holmes
One thing I noticed which Robbin should be able to expand upon is that
Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list, but the
VM_HandshakeAllThreads is not a safepoint operation and does not hold the
Threads_lock, so is it possible this thread was captured by the
JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads, before
it got removed? If so we'd be hung waiting for it to handshake as it's not
in a "safepoint-safe" or suspend-equivalent state.
# VM Thread
The VM Thread is in a loop: it takes the Threads_lock, takes a new snapshot of the
Threads_list, scans the list and processes handshakes on behalf of safe threads.
It then releases the snapshot and the Threads_lock and checks if all handshakes are completed.
# An exiting thread
An exiting thread removes itself from _THE_ threads list, but must
stick around while it is on any snapshot of live threads. When it is on no list it
will cancel the handshake.
Since the VM thread takes a new snapshot every iteration during the handshake, any
exiting thread can proceed, since it will not be on the new snapshot. It thus cancels the
handshake, and the VM thread can exit the loop (if this was the last handshake).
If any thread grabs a snapshot of the threads list and later tries to take a lock
which is 'used' by the VM Thread or inside the handshake, we can deadlock.
Considering that, look at e.g. JvmtiEnv::SuspendThreadList,
which calls VMThread::execute(&tsj); with a ThreadsListHandle alive. This
could deadlock AFAICT, since the thread will rest on VMOperationRequest_lock
with a threads list snapshot, but the VM thread cannot finish the handshake until
that snapshot is released.
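To make the hazard concrete, the problematic shape is roughly the following, a
simplified sketch rather than the exact JvmtiEnv::SuspendThreadList source (the
validation step is abbreviated):

// Sketch of the deadlock-prone shape described above (not the exact HotSpot
// source): the ThreadsListHandle pins a snapshot of the threads list across
// the blocking VMThread::execute() call.
jvmtiError suspend_thread_list_sketch() {
  ThreadsListHandle tlh;      // snapshot of the Threads_list taken here
  // ... validate the target threads against the snapshot in tlh ...
  VM_ThreadsSuspendJVMTI tsj;
  VMThread::execute(&tsj);    // blocks on VMOperationRequest_lock with tlh
                              // still alive; an exiting thread on that
                              // snapshot cannot finish, and the VM thread's
                              // handshake cannot finish until it does
  return JVMTI_ERROR_NONE;
}                             // tlh (and the snapshot) released only here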
I suggest, as a first step, adding something like the patch below and fixing the
obvious ones first.
Note, I have not verified that this is the problem you are seeing; I'm saying that
this seems to be a real issue. And considering how the stack traces look, it may
be this.
If you want me to go through this, just assign the bug to me, if there is one?
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp    Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.hpp    Wed Oct 24 09:13:17 2018 +0200
@@ -167,2 +167,6 @@
   }
+  bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp    Tue Oct 23 13:27:41 2018
+0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp    Wed Oct 24 09:13:17 2018
+0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+  return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
 inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp    Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp    Wed Oct 24 09:13:17 2018 +0200
@@ -608,2 +608,3 @@
   if (!t->is_VM_thread()) {
+    assert(t->have_threads_list(), "Deadlock if we have exiting threads and
if vm thread is running an VM op."); // fatal/guarantee
     SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with the current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already explained,
but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock where the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks what
it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through. The
handshake will log a warning after a while; is there such a warning from
the handshake operation?
There are several threads competing for e.g. the Threads_lock, and threads
waiting for GC and several other VM ops, so could it just be really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked and so safepoint-safe and so won't be holding up the
safepoint.
David
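Schematically, the two acquisition modes David contrasts look like this
(a HotSpot-style sketch, simplified from the idioms in the traces below):

{ MutexLocker ml(JvmtiThreadState_lock);   // safepoint check while blocking:
  // a waiter becomes _thread_blocked, i.e. safepoint-safe, so it cannot
  // hold up a safepoint
}
{ MutexLockerEx ml(SR_lock(), Mutex::_no_safepoint_check_flag);
  // no safepoint check: the holder must never block at a safepoint
  // while holding this lock
}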
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang, with the stack attached. It seems that the test might be
blocked because it invokes 2 jvmti methods. Can a jvmti agent invoke
jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread
is being suspended.
Both are blocked at a safepoint which is Okay in general but not
Okay if they hold any lock.
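As a minimal illustration of that answer, calling jvmti from a dedicated agent
thread looks roughly like this (an invented sketch, not the actual Kitchensink
agent; error handling omitted):

// Illustrative only: JVMTI calls issued from a separate agent thread,
// concurrent with JVMTI calls made elsewhere, which is legal in general.
#include <jvmti.h>
#include <string.h>

static void JNICALL sampler(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
  jthread target = (jthread)arg;   // target chosen by the agent (hypothetical)
  jvmti->SuspendThread(target);    // JVMTI call from this agent thread
  // ... inspect the target here, e.g. with GetStackTrace ...
  jvmti->ResumeThread(target);
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM* vm, char* options, void* reserved) {
  jvmtiEnv* jvmti = NULL;
  vm->GetEnv((void**)&jvmti, JVMTI_VERSION_1_2);
  jvmtiCapabilities caps;
  memset(&caps, 0, sizeof(caps));
  caps.can_suspend = 1;            // needed for Suspend/ResumeThread
  jvmti->AddCapabilities(&caps);
  // A live-phase callback would then start the sampler via
  // jvmti->RunAgentThread(thread, sampler, target, JVMTI_THREAD_NORM_PRIORITY).
  return JNI_OK;
}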
For instance, the thread #152 is holding the monitor JvmtiThreadState.
#0  pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#0  pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#0  pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#5  Monitor::lock (this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock in
the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is: why can this safepoint not start?
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?
I think this safepoint operation is waiting for all the threads that
are blocked on the JvmtiThreadState_lock.
The thread enabling single stepping (#152) has:
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The threads waiting for that monitor have:
   - blocked on the monitor JvmtiThreadState_lock
   - can not reach the blocked-at-a-safepoint state (all threads
have to reach this state for this safepoint to happen)
It seems to me this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off
the single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And to add more tests with jvmti enabled,
so anyone could easily run them to reproduce crashes. These tests
would be out of CI so as not to introduce any bugs. Does it make sense?
Considering the hang: I think that it might be a product bug since I don't
see any locking on my monitors. But I am not sure. Is it possible
that any of my jvmti agent code prevents the VM from getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of the stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
Robbin Ehn
2018-10-24 10:45:41 UTC
Permalink
Hi sorry, the assert should be

assert(!t->have_threads_list(),....)

We should not have a threads list :)

/Robbin
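With the polarity flipped, the intended hunk in VMThread::execute() reads
(a sketch of the corrected patch above, not a new webrev):

  if (!t->is_VM_thread()) {
    assert(!t->have_threads_list(), "Deadlock if we have exiting threads and "
           "if vm thread is running an VM op."); // fatal/guarantee
    SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
    // ... rest of VMThread::execute() unchanged ...
  }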
s***@oracle.com
2018-10-24 21:16:29 UTC
Permalink
Leonid confirmed this deadlock is not reproducible if the Kitchensink
agent_sampler is disabled.
Also, applying the patch from Robbin (with the agent_sampler enabled) hit the
new assert, which caught another case with the same pattern, in
JvmtiEnv::GetStackTrace:

With the proposed patch the issue reproduced with this hs_err (file in the attachment):
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:607),
pid=26188, tid=26325
# assert(!t->have_threads_list()) failed: Deadlock if we have exiting
threads and if vm thread is running an VM op.
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (slowdebug build
12-internal+0-2018-10-24-2022348.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (slowdebug
12-internal+0-2018-10-24-2022348.lmesnik.hs-bigapps, mixed mode,
sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I
%h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/core.26188)
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#

--------------- S U M M A R Y ------------

Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal -XX:+StartAttachListener
-XX:NativeMemoryTracking=detail -XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/kitchensink.final.properties

Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 32 cores, 235G, Oracle
Linux Server release 7.3
Time: Wed Oct 24 13:28:30 2018 PDT elapsed time: 3 seconds (0d 0h 0m 3s)

--------------- T H R E A D ---------------

Current thread (0x00002b9f68006000): JavaThread "Jvmti-AgentSampler"
daemon [_thread_in_vm, id=26325,
stack(0x00002b9f88808000,0x00002b9f88909000)]
_threads_hazard_ptr=0x00002b9f68008e30, _nested_threads_hazard_ptr_cnt=0

Stack: [0x00002b9f88808000,0x00002b9f88909000], sp=0x00002b9f88907440,
free space=1021k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x12a04bb] VMError::report_and_die(int, char const*, char
const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char
const*, int, unsigned long)+0x6a5
V [libjvm.so+0x129fdb3] VMError::report_and_die(Thread*, void*, char
const*, int, char const*, char const*, __va_list_tag*)+0x57
V [libjvm.so+0x8ca5ab] report_vm_error(char const*, int, char const*,
char const*, ...)+0x152
V [libjvm.so+0x12e485b] VMThread::execute(VM_Operation*)+0x99
V [libjvm.so+0xdbec54] JvmtiEnv::GetStackTrace(JavaThread*, int, int,
_jvmtiFrameInfo*, int*)+0xc0
V [libjvm.so+0xd677cf] jvmti_GetStackTrace+0x2c2
C [libJvmtiStressModule.so+0x302d] trace_stack+0xa9
C [libJvmtiStressModule.so+0x3daf] agent_sampler+0x21f
V [libjvm.so+0xddf595] JvmtiAgentThread::call_start_function()+0x67
V [libjvm.so+0xddf52a]
JvmtiAgentThread::start_function_wrapper(JavaThread*, Thread*)+0xf2
V [libjvm.so+0x1218945] JavaThread::thread_main_inner()+0x17f
V [libjvm.so+0x12187ad] JavaThread::run()+0x273
V [libjvm.so+0x100e4ee] thread_native_entry(Thread*)+0x192



Leonid attached the full hs_err log to the bug report.

Thanks,
Serguei
Post by s***@oracle.com
Hi Robbin and David,
There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces.
But there is an instance of the JvmtiEnv::SuspendThread which seems to be stuck:
#0  ... in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270,
thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800,
__the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
The JavaThread::java_suspend() call from the stack trace above also holds a ThreadsListHandle while it requests the VM operation:
void JavaThread::java_suspend() {
  ThreadsListHandle tlh;
  if (!tlh.includes(this) || threadObj() == NULL || is_exiting()) {
    return;
  }
  { MutexLockerEx ml(SR_lock(), Mutex::_no_safepoint_check_flag);
    if (!is_external_suspend()) {
      // a racing resume has cancelled us; bail out now
      return;
    }
    // suspend is done
    uint32_t debug_bits = 0;
    // Warning: is_ext_suspend_completed() may temporarily drop the
    // SR_lock to allow the thread to reach a stable thread state if
    // it is currently in a transient thread state.
    if (is_ext_suspend_completed(false /* !called_by_wait */,
                                 SuspendRetryDelay, &debug_bits)) {
      return;
    }
  }
  VM_ThreadSuspend vm_suspend;
  VMThread::execute(&vm_suspend);
}
I'll check with Leonid tomorrow if we can check if your patch below
can fix this deadlock.
Thanks,
Serguei
Post by Robbin Ehn
Post by David Holmes
One thing I noticed, which Robbin should be able to expand upon, is
that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and
  // Wait for a release_stable_list() call before we check again. No
  // safepoint check, no timeout, and not as suspend equivalent flag
  // because this JavaThread is not on the Threads list.
  ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
                                         0,
!Mutex::_as_suspend_equivalent_flag);
As the comment says this thread is no longer on the Threads_list,
but VM_HandshakeAllThreads is not a safepoint operation and
does not hold the Threads_lock, so is it possible this thread was
captured by the JavaThreadIteratorWithHandle being used by
VM_HandshakeAllThreads before it got removed? If so we'd be hung
waiting for it to handshake, as it's not in a "safepoint-safe" or
suspend-equivalent state.
# VM Thread
The VM Thread is in a loop: it takes the Threads_lock, takes a new
snapshot of the Threads_list, scans the list and processes handshakes
on behalf of safe threads.
It then releases the snapshot and the Threads_lock and checks whether all
handshakes are completed.
# An exiting thread
An exiting thread removes itself from _THE_ threads list, but
must stick around as long as it is on any snapshot of alive threads.
When it is no longer on any list it will cancel its handshake.
Since the VM thread takes a new snapshot on every iteration of the
handshake, an exiting thread can proceed, because it will not be on the
new snapshot. It thus cancels the handshake, and the VM thread can exit
the loop (if this was the last handshake).
If any thread grabs a snapshot of the threads list and later tries to
take a lock which is 'used' by the VM Thread or inside the handshake, we
can deadlock.
Consider, for example, JvmtiEnv::SuspendThreadList, which calls
VMThread::execute(&tsj); with a ThreadsListHandle alive. This could
deadlock AFAICT: the thread will rest on VMOperationRequest_lock with a
Threads list snapshot, but the VM thread cannot finish the handshake
until that snapshot is released.
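
In code terms, the hazardous shape is roughly the sketch below. This is illustrative HotSpot-style pseudocode only (it does not compile outside the VM, and "tsj" stands for whatever VM_Operation the caller builds), not the actual JvmtiEnv::SuspendThreadList body:

// Sketch of the pattern described above: a Thread-SMR snapshot held
// alive across a VM operation request.
{
  ThreadsListHandle tlh;      // publishes a hazard pointer; an exiting
                              // thread on this snapshot must stick around
  // ... validate the jthread arguments against tlh ...
  VMThread::execute(&tsj);    // parks on VMOperationRequest_lock while tlh
                              // is still alive; the VM thread cannot finish
                              // e.g. VM_HandshakeAllThreads until the
                              // snapshot is released => deadlock
}                             // snapshot released only here, too late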
I suggest the first step is to add something like the patch below and
fix the obvious ones first.
Note, I have not verified that this is the problem you are seeing; I'm
saying that this seems to be a real issue. And considering how the stack
traces look, it may be this.
If you want me to go through this, just assign me a bug if there is one.
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp    Tue Oct 23 13:27:41
2018 +0200
+++ b/src/hotspot/share/runtime/thread.hpp    Wed Oct 24 09:13:17
2018 +0200
@@ -167,2 +167,6 @@
   }
+  bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp    Tue Oct 23
13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp    Wed Oct 24
09:13:17 2018 +0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+  return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
 inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp    Tue Oct 23 13:27:41
2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp    Wed Oct 24 09:13:17
2018 +0200
@@ -608,2 +608,3 @@
   if (!t->is_VM_thread()) {
+    assert(!t->have_threads_list(), "Deadlock if we have exiting
threads and if vm thread is running an VM op."); // fatal/guarantee
     SkipGCALot sgcalot(t);    // avoid re-entrant attempts to gc-a-lot
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It can be almost the same deadlock scenario that I've already
explained but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock where the flag
Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not
a safepoint operation. There's no real way to tell from the
stacks what it's stuck on.
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go
through. The handshake will log a warning after a while; is
there such a warning from the handshake operation?
There are several threads competing for e.g. the Threads_lock, and
threads waiting for GC and several other VM ops; could it just
be really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint
checks enabled, so all JavaThreads blocked trying to acquire
it will be _thread_blocked and so safepoint-safe and so
won't be holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the seviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms.
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also it might hang with the stack attached. It seems the test might
be blocked because it invokes 2 jvmti methods. Can a jvmti agent
invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another
thread is being suspended.
Both are blocked at a safepoint which is Okay in general
but not Okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState.
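
As an aside, the multi-threaded agent pattern in question - a dedicated agent thread calling JVMTI functions such as SuspendThread and GetStackTrace while other threads use the same jvmtiEnv - looks roughly like the sketch below. This is a minimal, self-contained illustration of the standard JVMTI API, not the actual libJvmtiStressModule.c source; the sampler/on_vm_init names are placeholders. The quoted stacks continue after it.

#include <jvmti.h>

static jvmtiEnv* jvmti = nullptr;

// Entry point of the dedicated agent thread started via RunAgentThread; it
// invokes JVMTI functions concurrently with any other thread using the env.
static void JNICALL sampler(jvmtiEnv* env, JNIEnv* jni, void* /*arg*/) {
  jthread self = nullptr;
  env->GetCurrentThread(&self);
  for (;;) {  // a real sampler would sleep between iterations
    jint count = 0;
    jthread* threads = nullptr;
    if (env->GetAllThreads(&count, &threads) != JVMTI_ERROR_NONE) return;
    for (jint i = 0; i < count; i++) {
      if (jni->IsSameObject(threads[i], self)) continue;  // never suspend self
      if (env->SuspendThread(threads[i]) == JVMTI_ERROR_NONE) {
        jvmtiFrameInfo frames[16];
        jint depth = 0;
        env->GetStackTrace(threads[i], 0, 16, frames, &depth);  // sample it
        env->ResumeThread(threads[i]);
      }
    }
    env->Deallocate(reinterpret_cast<unsigned char*>(threads));
  }
}

// VMInit callback: an agent thread needs a java.lang.Thread object to run on.
static void JNICALL on_vm_init(jvmtiEnv* env, JNIEnv* jni, jthread /*t*/) {
  jclass cls = jni->FindClass("java/lang/Thread");
  jthread agent_thread =
      jni->NewObject(cls, jni->GetMethodID(cls, "<init>", "()V"));
  env->RunAgentThread(agent_thread, sampler, nullptr,
                      JVMTI_THREAD_NORM_PRIORITY);
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM* vm, char*, void*) {
  vm->GetEnv(reinterpret_cast<void**>(&jvmti), JVMTI_VERSION_1_2);
  jvmtiCapabilities caps = {};
  caps.can_suspend = 1;  // required for SuspendThread/ResumeThread
  jvmti->AddCapabilities(&caps);
  jvmtiEventCallbacks callbacks = {};
  callbacks.VMInit = on_vm_init;
  jvmti->SetEventCallbacks(&callbacks, sizeof(callbacks));
  jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_VM_INIT, nullptr);
  return JNI_OK;
}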
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock
(Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker
(mutex=0x2ae398024f10, this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in
CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker
(mutex=0x2ae398024f10, this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  ciEnv::cache_jvmti_state
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in
CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in
CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0,
ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10,
Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker
(mutex=0x2ae398024f10, this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run
(this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor
JvmtiThreadState_lock in the function
ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint can not start?
What thread is blocking it? Or in reverse, what thread is this
safepoint waiting for?
I think this safepoint operation is waiting for all the
threads that are blocked on the JvmtiThreadState_lock.
The single-stepping thread (#152) has:
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation in the function
JvmtiEnvThreadState::reset_current_location()
The other threads are:
   - blocked on the monitor JvmtiThreadState_lock
   - unable to reach the blocked-at-a-safepoint state (all
threads have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you
turn off the single stepping for thread #152.
Please, let me know about the results.
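
For reference, the single stepping being discussed is toggled from the agent via the standard JVMTI calls, roughly as below. This is a hedged sketch (in a real agent the capability is requested once in Agent_OnLoad), not the actual enable_events() from libJvmtiStressModule.c; it matches the quoted stack, where JvmtiEventController::set_user_enabled is reached with thread=0x0, i.e. a global enable:

// Assumes 'jvmti' is a valid jvmtiEnv* obtained in Agent_OnLoad.
jvmtiCapabilities caps = {};
caps.can_generate_single_step_events = 1;  // must be granted before enabling
jvmti->AddCapabilities(&caps);

// Global enable (thread == nullptr), as in the quoted stack trace:
jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_SINGLE_STEP, nullptr);

// ... and the switch to flip off while diagnosing the hang:
jvmti->SetEventNotificationMode(JVMTI_DISABLE, JVMTI_EVENT_SINGLE_STEP, nullptr);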
Assuming that the crashes look like VM bugs, I think it makes
sense to integrate the jvmti changes but *not* enable the jvmti
module by default.
This one is a deadlock.
However, the root cause is a race condition that can
potentially result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add two more tests with jvmti enabled,
so anyone could easily run them to reproduce crashes.
This test would be out of CI so it doesn't introduce any bugs.
Does it make sense?
Considering the hang - I think that it might be a product bug since
I don't see any locking on my monitors. But I am not sure.
Is it possible that my jvmti agent code prevents the VM from
getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose the deadlock would be great.
Analysis of the stack traces is needed.
It is non-trivial in this particular case as there are so
many threads executing at the same time.
() from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
no_safepoint_check=<optimized out>,
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
java_thread=0x2ae3985f2000) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler
(jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized
out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
() from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0,
ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
no_safepoint_check=<optimized out>,
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
(op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled
(env=0x2ae39801b270, thread=0x0,
event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized
out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in
JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
# A fatal error has been detected by the Java Runtime
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps,
mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core
dumps may be processed with "/usr/libexec/abrt-hook-ccpp
%s %c %p %u %g %t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2
-XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, Intel(R) Xeon(R) CPU
release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31
seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM
Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000]
[id=13962] _threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled
Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int,
char const*, char const*, __va_list_tag*, Thread*,
unsigned char*, void*, void*, char const*, int, unsigned
long)+0x2c3
V  [libjvm.so+0x18c56ef]
 VMError::report_and_die(Thread*, void*, char const*,
int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*,
int, char const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]
 VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]
 thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame,
mode: safepoint, requested by thread 0x00002af4dc008800
Robbin Ehn
2018-10-25 13:15:42 UTC
Permalink
Hi, here is a fix which allows a ThreadsList to be used over a VM operation.

http://cr.openjdk.java.net/~rehn/8212933/v1_handshak_vm_cancel/webrev/

Please test it out.

/Robbin
Post by s***@oracle.com
Leonid confirmed this deadlock is not reproducible if the Kitchensink
agent_sampler is disabled.
Also, applying the patch from Robbin (with agent_sampler enabled) hit new assert
#
#
# Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:607), pid=26188,
tid=26325
# assert(!t->have_threads_list()) failed: Deadlock if we have exiting threads
and if vm thread is running an VM op.
#
Leonid Mesnik
2018-10-25 20:11:07 UTC
Permalink
Testing is still in progress; all of the 30 min tests passed and the 8 hour tasks are still running.
http://java-dev.se.oracle.com:10067/mdash/jobs/lmesnik-ks-short-test-20181025-1842-7762?search=result.status%3APASSED

I haven't run any other tests.

Leonid
Post by Robbin Ehn
Hi, here is a fix, which allows ThreadsList to be used over a VM operation.
http://cr.openjdk.java.net/~rehn/8212933/v1_handshak_vm_cancel/webrev/
Please test it out.
/Robbin
Leonid confirmed this deadlock is not reproducible if the Kitchensink agent_sampler is disabled.
Also, applying the patch from Robbin (with agent_sampler enabled) hit a new assert, which has caught another case, in JvmtiEnv::GetStackTrace. With the proposed patch the issue reproduced with the following hs_err:
#
#
# Internal Error (/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:607), pid=26188, tid=26325
# assert(!t->have_threads_list()) failed: Deadlock if we have exiting threads and if vm thread is running an VM op.
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (slowdebug build 12-internal+0-2018-10-24-2022348.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (slowdebug 12-internal+0-2018-10-24-2022348.lmesnik.hs-bigapps, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/core.26188)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
--------------- S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+PrintFlagsFinal -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSampler_java/scratch/0/kitchensink.final.properties
Time: Wed Oct 24 13:28:30 2018 PDT elapsed time: 3 seconds (0d 0h 0m 3s)
--------------- T H R E A D ---------------
Current thread (0x00002b9f68006000): JavaThread "Jvmti-AgentSampler" daemon [_thread_in_vm, id=26325, stack(0x00002b9f88808000,0x00002b9f88909000)] _threads_hazard_ptr=0x00002b9f68008e30, _nested_threads_hazard_ptr_cnt=0
Stack: [0x00002b9f88808000,0x00002b9f88909000], sp=0x00002b9f88907440, free space=1021k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x12a04bb] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x6a5
V [libjvm.so+0x129fdb3] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x57
V [libjvm.so+0x8ca5ab] report_vm_error(char const*, int, char const*, char const*, ...)+0x152
V [libjvm.so+0x12e485b] VMThread::execute(VM_Operation*)+0x99
V [libjvm.so+0xdbec54] JvmtiEnv::GetStackTrace(JavaThread*, int, int, _jvmtiFrameInfo*, int*)+0xc0
V [libjvm.so+0xd677cf] jvmti_GetStackTrace+0x2c2
C [libJvmtiStressModule.so+0x302d] trace_stack+0xa9
C [libJvmtiStressModule.so+0x3daf] agent_sampler+0x21f
V [libjvm.so+0xddf595] JvmtiAgentThread::call_start_function()+0x67
V [libjvm.so+0xddf52a] JvmtiAgentThread::start_function_wrapper(JavaThread*, Thread*)+0xf2
V [libjvm.so+0x1218945] JavaThread::thread_main_inner()+0x17f
V [libjvm.so+0x12187ad] JavaThread::run()+0x273
V [libjvm.so+0x100e4ee] thread_native_entry(Thread*)+0x192
Leonid attached the full hs_err log to the bug report.
Thanks,
Serguei
Post by s***@oracle.com
Hi Robbin and David,
There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces.
But there is an instance of JvmtiEnv::SuspendThread which seems related:
#1 0x00002ae393ba8d63 in os::PlatformEvent::park at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2 0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#4 0x00002ae393b51f2e in Monitor::wait at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5 0x00002ae393de7867 in VMThread::execute at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6 0x00002ae393d6a3bd in JavaThread::java_suspend at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7 0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8 0x00002ae3939887ae in JvmtiEnv::SuspendThread at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9 0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270, thread=0x2ae49929fdf8) at /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800, __the_thread__=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
The JavaThread::java_suspend() call from the stack trace above also holds a ThreadsListHandle while executing:
void JavaThread::java_suspend() {
  ThreadsListHandle tlh;
  if (!tlh.includes(this) || threadObj() == NULL || is_exiting()) {
    return;
  }

  { MutexLockerEx ml(SR_lock(), Mutex::_no_safepoint_check_flag);
    if (!is_external_suspend()) {
      // a racing resume has cancelled us; bail out now
      return;
    }
    // suspend is done
    uint32_t debug_bits = 0;
    // Warning: is_ext_suspend_completed() may temporarily drop the
    // SR_lock to allow the thread to reach a stable thread state if
    // it is currently in a transient thread state.
    if (is_ext_suspend_completed(false /* !called_by_wait */,
                                 SuspendRetryDelay, &debug_bits)) {
      return;
    }
  }

  VM_ThreadSuspend vm_suspend;
  VMThread::execute(&vm_suspend);
}
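Note the scope above: tlh is still alive at the VMThread::execute(&vm_suspend) call, so this thread parks on VMOperationRequest_lock while still pinning a Threads list snapshot - the exact shape Robbin describes below.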
I'll check with Leonid tomorrow whether your patch below can fix this deadlock.
Thanks,
Serguei
Post by Robbin Ehn
Post by David Holmes
// Wait for a release_stable_list() call before we check again. No
// safepoint check, no timeout, and not as suspend equivalent flag
// because this JavaThread is not on the Threads list.
ThreadsSMRSupport::delete_lock()->wait(Mutex::_no_safepoint_check_flag,
0, !Mutex::_as_suspend_equivalent_flag);
As the comment says, this thread is no longer on the Threads_list, but VM_HandshakeAllThreads is not a safepoint operation and does not hold the Threads_lock, so is it possible this thread was captured by the JavaThreadIteratorWithHandle being used by VM_HandshakeAllThreads before it got removed? If so, we'd be hung waiting for it to handshake, as it's not in a "safepoint-safe" or suspend-equivalent state.
# VM Thread
VM Thread is in a loop: it takes the Threads_lock, takes a new snapshot of the Threads_list, scans the list and processes handshakes on behalf of safe threads.
It then releases the snapshot and the Threads_lock and checks whether all handshakes are completed.
# An exiting thread
An exiting thread removes itself from _THE_ threads list, but must stick around if it is on any snapshot of alive threads. When it is no longer on any list it will cancel the handshake.
Since the VM thread takes a new snapshot every iteration during the handshake, any exiting thread can proceed, since it will not be on the new snapshot. It thus cancels the handshake and the VM thread can exit the loop (if this was the last handshake).
If any thread grabs a snapshot of the threads list and later tries to take a lock which is 'used' by the VM Thread or inside the handshake, we can deadlock.
Considering that, look at e.g. JvmtiEnv::SuspendThreadList, which calls VMThread::execute(&tsj); with a ThreadsListHandle alive. This could deadlock AFAICT, since the thread will rest on VMOperationRequest_lock with a Threads list snapshot, but the VM thread cannot finish the handshake until that snapshot is released.
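Condensed into code, the hazardous shape is roughly the following (a sketch only: VM_SomeOp is a stand-in for a real operation such as VM_ThreadSuspend, and this uses HotSpot-internal APIs, so it is illustrative rather than compilable on its own):

void hazardous_pattern(JavaThread* target) {
  ThreadsListHandle tlh;        // pins a snapshot of the Threads list
  if (!tlh.includes(target)) {
    return;
  }
  VM_SomeOp op(target);
  VMThread::execute(&op);       // parks on VMOperationRequest_lock with the
                                // snapshot still pinned; an exiting thread on
                                // that snapshot can never be released, so a
                                // concurrent VM_HandshakeAllThreads never ends
}                               // snapshot released only here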
I suggest the first step is to add something like the patch below and fix the obvious ones first.
Note, I have not verified that this is the problem you are seeing; I'm saying that this seems to be a real issue. And considering how the stack traces look, it may be this.
If you want me to go through this, just assign me the bug if there is one.
/Robbin
diff -r 622fd3608374 src/hotspot/share/runtime/thread.hpp
--- a/src/hotspot/share/runtime/thread.hpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.hpp Wed Oct 24 09:13:17 2018 +0200
@@ -167,2 +167,6 @@
}
+ bool have_threads_list();
+
diff -r 622fd3608374 src/hotspot/share/runtime/thread.inline.hpp
--- a/src/hotspot/share/runtime/thread.inline.hpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/thread.inline.hpp Wed Oct 24 09:13:17 2018 +0200
@@ -111,2 +111,6 @@
+inline bool Thread::have_threads_list() {
+ return OrderAccess::load_acquire(&_threads_hazard_ptr) != NULL;
+}
+
inline void Thread::set_threads_hazard_ptr(ThreadsList* new_list) {
diff -r 622fd3608374 src/hotspot/share/runtime/vmThread.cpp
--- a/src/hotspot/share/runtime/vmThread.cpp Tue Oct 23 13:27:41 2018 +0200
+++ b/src/hotspot/share/runtime/vmThread.cpp Wed Oct 24 09:13:17 2018 +0200
@@ -608,2 +608,3 @@
if (!t->is_VM_thread()) {
+ assert(!t->have_threads_list(), "Deadlock if we have exiting threads and if vm thread is running an VM op."); // fatal/guarantee
SkipGCALot sgcalot(t); // avoid re-entrant attempts to gc-a-lot
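For orientation, the vmThread.cpp hunk lands roughly here in VMThread::execute (an abridged sketch of the surrounding code, not the real function body):

void VMThread::execute(VM_Operation* op) {
  Thread* t = Thread::current();
  if (!t->is_VM_thread()) {
    // New check: a requesting thread must not hold a Threads list hazard
    // pointer, or it can deadlock with exiting threads (see above).
    assert(!t->have_threads_list(), "Deadlock if we have exiting threads and if vm thread is running an VM op.");
    SkipGCALot sgcalot(t); // avoid re-entrant attempts to gc-a-lot
    // ... enqueue the operation and wait for the VM thread to evaluate it ...
  }
  // ... VM thread path ...
}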
Post by David Holmes
David
-----
Post by s***@oracle.com
Please, skip it - sorry for the noise.
It is hard to prove anything with the current dump.
Thanks,
Serguei
Post by s***@oracle.com
Hi David and Robbin,
I have an idea that needs to be checked.
It could be almost the same deadlock scenario that I've already explained, but more sophisticated.
I suspect a scenario with JvmtiThreadState_lock in which the flag Monitor::_safepoint_check_always does not help much.
It can be verified by checking what monitors are used by the blocked threads.
Thanks,
Serguei
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a safepoint operation. There's no real way to tell from the stacks what it's stuck on.
I cannot find a thread that is not considered safepoint safe or is_ext_suspended (thread 146). So the handshake should go through. The handshake will log a warning after a while; is there such a warning from the handshake operation?
There are several threads competing for e.g. the Threads_lock, and threads waiting for GC and several other VM ops; could it just be really slow?
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means, this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks enabled, so all JavaThreads blocked trying to acquire it will be _thread_blocked and so safepoint-safe and so won't be holding up the safepoint.
David
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
Seems the last version also crashes with 2 other different symptoms:
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also, it might hang, with the stack attached. It seems the test might be blocked because it invokes 2 jvmti methods. Can a jvmti agent invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread is being suspended.
Both are blocked at a safepoint which is Okay in general but not Okay if they hold any lock.
For instance, the thread #152 is holding the monitor JvmtiThreadState_lock.
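(As a minimal illustration of JVMTI being driven from more than one thread - a hypothetical stand-alone agent, not the actual libJvmtiStressModule.c - the sketch below starts a sampler via RunAgentThread and calls GetStackTrace from it; every call here is standard jvmti.h API:)

#include <jvmti.h>
#include <string.h>

// Sampler running on a JVMTI agent thread; it calls JVMTI functions
// independently of the thread that ran Agent_OnLoad.
static void JNICALL sampler(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
  for (;;) {  // a real sampler would sleep between iterations
    jint n = 0;
    jthread* threads = NULL;
    if (jvmti->GetAllThreads(&n, &threads) != JVMTI_ERROR_NONE) return;
    for (jint i = 0; i < n; i++) {
      jvmtiFrameInfo frames[16];
      jint count = 0;
      jvmti->GetStackTrace(threads[i], 0, 16, frames, &count);
    }
    jvmti->Deallocate((unsigned char*)threads);
  }
}

static void JNICALL on_vm_init(jvmtiEnv* jvmti, JNIEnv* jni, jthread thread) {
  // RunAgentThread needs a java.lang.Thread object to run under.
  jclass cls = jni->FindClass("java/lang/Thread");
  jmethodID ctor = jni->GetMethodID(cls, "<init>", "()V");
  jvmti->RunAgentThread(jni->NewObject(cls, ctor), sampler, NULL,
                        JVMTI_THREAD_NORM_PRIORITY);
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM* vm, char* options, void* reserved) {
  jvmtiEnv* jvmti = NULL;
  if (vm->GetEnv((void**)&jvmti, JVMTI_VERSION_1_2) != JNI_OK) return JNI_ERR;
  jvmtiEventCallbacks cb;
  memset(&cb, 0, sizeof(cb));
  cb.VMInit = on_vm_init;
  jvmti->SetEventCallbacks(&cb, sizeof(cb));
  jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_VM_INIT, NULL);
  return JNI_OK;
}

Each such call can itself request a VM operation or a safepoint, which is where the care described above comes in.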
Also, I see a couple of more threads that are interesting as well:
#2 0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3 Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4 0x00002ae393b512c1 in lock (Self=0x2ae3984c7800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#6 0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#9 0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984c7800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#2 0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3 Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4 0x00002ae393b512c1 in lock (Self=0x2ae3984ca800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#6 0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#9 0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae3984ca800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#2 0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3 Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4 0x00002ae393b512c1 in lock (Self=0x2ae4600c2800, this=0x2ae398024f10) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
#6 0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10, this=<synthetic pointer>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7 thread_started (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8 JvmtiEventController::thread_started (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9 0x00002ae39399f3a0 in JvmtiExport::post_thread_start (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae4600c2800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is: why can this safepoint not start?
What thread is blocking it? Or, in reverse, what thread is this safepoint waiting for?
I think this safepoint operation is waiting for all threads that are blocked on the JvmtiThreadState_lock.
Meanwhile, the thread #152 has:
- grabbed the monitor JvmtiThreadState_lock
- blocked in the VM_GetCurrentLocation in the function JvmtiEnvThreadState::reset_current_location()
The threads blocked on the monitor JvmtiThreadState_lock:
- can not reach the blocked at a safepoint state (all threads have to reach this state for this safepoint to happen)
It seems to me, this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off the single stepping for thread #152.
Please, let me know about the results.
Assuming the crashes look like VM bugs, I think it makes sense to integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And add two more tests with jvmti enabled, so anyone could easily run them to reproduce the crashes. These tests would be kept out of CI so they don't introduce any bugs. Does it make sense?
Considering the hang - I think it might be a product bug, since I don't see any locking on my monitors. But I am not sure. Is it possible that my jvmti agent code prevents the VM from getting into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion on how to diagnose the deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many threads executing at the same time.
#2 0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#4 0x00002ae393b51f2e in Monitor::wait at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#7 0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8 0x00002ae3939887ae in JvmtiEnv::SuspendThread at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9 0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270, thread=0x2ae49929fdf8) at /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800, __the_thread__=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
#2 0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#4 0x00002ae393b51f2e in Monitor::wait at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5 0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6 JvmtiEnvThreadState::reset_current_location (enabled=true) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7 0x00002ae393997acf in recompute_env_thread_enabled (state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8 recompute_thread_enabled at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9 0x00002ae393998168 in JvmtiEventControllerPrivate::recompute_enabled () at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true, event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270, thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#14 0x00002ae394d97989 in enable_events () at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration (env=<optimized out>, this=<optimized out>) at /scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
During fixing kitchensink I get
assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
#
# Internal Error (/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277), pid=13926, tid=13962
# assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug build 12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# http://bugreport.java.com/bugreport/crash.jsp
#
--------------- S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50 -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+PrintFlagsFinal -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Time: Tue Oct 9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)
--------------- T H R E A D ---------------
Current thread (0x00002af3dc6ac800): VMThread "VM Thread" [stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962] _threads_hazard_ptr=0x00002af4ac090eb0, _nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000], sp=0x00002af44f208720, free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x18c4923] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x2c3
V [libjvm.so+0x18c56ef] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V [libjvm.so+0xb55aa0] report_vm_error(char const*, int, char const*, char const*, ...)+0x100
V [libjvm.so+0x11f2cfe] JvmtiThreadState::cur_stack_depth()+0x14e
V [libjvm.so+0x11f3257] JvmtiThreadState::update_for_pop_top_frame()+0x27
V [libjvm.so+0x119af99] VM_UpdateForPopTopFrame::doit()+0xb9
V [libjvm.so+0x1908982] VM_Operation::evaluate()+0x132
V [libjvm.so+0x19040be] VMThread::evaluate_operation(VM_Operation*) [clone .constprop.51]+0x18e
V [libjvm.so+0x1904960] VMThread::loop()+0x4c0
V [libjvm.so+0x1904f53] VMThread::run()+0xd3
V [libjvm.so+0x14e8300] thread_native_entry(Thread*)+0x100
VM_Operation (0x00002af4d8502910): UpdateForPopTopFrame, mode: safepoint, requested by thread 0x00002af4dc008800
Robbin Ehn
2018-10-25 20:24:24 UTC
Permalink
Great, thanks.
I did some local stress tests and t1-5.
I'll rfr it tomorrow if KS passes.
/Robbin
#12 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two thread are blocked on the monitor
JvmtiThreadState_lock in the function ciEnv::cache_jvmti_state().
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
Also, there are many threads (like #51) that are
executing JvmtiExport::post_thread_start and blocked on the same
monitor.
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
Now, the question is why this safepoint can not start?
What thread is blocking it? Or in reverse, what thread
this safepoint is waiting for?
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
I think, this safepoint operation is waiting for all
threads that are blocked on the JvmtiThreadState_lock.
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
- grabbed the monitor JvmtiThreadState_lock
- blocked in the VM_GetCurrentLocation in the
function JvmtiEnvThreadState::reset_current_location()
Post by Robbin Ehn
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by s***@oracle.com
Post by Robbin Ehn
Post by David Holmes
Post by s***@oracle.com
Post by David Holmes
Post by s***@oracle.com
- blocked on the monitor JvmtiThreadState_lock
- can not reach the blocked at a safepoint state (all
threads have to reach this state for this safepoint to happen)
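To make the suspected cycle concrete, here is a minimal pthread analogy
(an editor's sketch, not HotSpot code: state_lock stands in for
JvmtiThreadState_lock, the barrier stands in for "every thread has
reached the safepoint", and the two functions stand in for thread #152
and the blocked threads). Built with "gcc -pthread", it deadlocks the
same way:

#include <pthread.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t safepoint;   /* "every thread has stopped" */

/* Like thread #152: owns the lock, then needs the safepoint (VM op). */
static void *lock_holder(void *arg) {
    pthread_mutex_lock(&state_lock);
    pthread_barrier_wait(&safepoint); /* waits for the peer forever */
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* Like the compiler and thread-start threads: needs the lock first. */
static void *lock_waiter(void *arg) {
    pthread_mutex_lock(&state_lock);  /* blocks; never reaches the barrier */
    pthread_mutex_unlock(&state_lock);
    pthread_barrier_wait(&safepoint);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_barrier_init(&safepoint, NULL, 2);
    pthread_create(&a, NULL, lock_holder, NULL);
    pthread_create(&b, NULL, lock_waiter, NULL);
    pthread_join(a, NULL);            /* neither join ever returns */
    pthread_join(b, NULL);
    return 0;
}

The analogy only holds if the waiters' blocking really does keep the
safepoint from completing; whether that is true for this Monitor is
exactly the point David questions later in the thread.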
It seems to me this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off
the single stepping for thread #152.
Please, let me know about the results.

Assuming that the crashes look like VM bugs, I think it makes sense
to integrate the jvmti changes but *not* enable the jvmti module by
default.

This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.

And to add two more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. These tests
would be kept out of CI so as not to introduce any bugs. Does it make
sense?

Considering the hang - I think that it might be a product bug since I
don't see any locking on my monitors. But I am not sure. Is it
possible that any of my jvmti agent code prevents the VM from getting
into a safepoint?

Could we discuss it tomorrow or this week when you have time?

Yes, of course.
Let's find some time tomorrow.

Any suggestion how to diagnose the deadlock would be great.

Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.

#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread (env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270, env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper (thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry (thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6

#0  0x00002ae3927b5945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled (state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true, event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0, env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270, thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP, enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode (env=0x2ae39801b270, event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration (env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()

Thanks,
Serguei

Leonid

Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei

Hi

While fixing kitchensink I get:
cur_stack_depth out of sync

Do you know if it might be a bug in my jvmti agent?

Leonid

#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0) (fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed mode, tiered,
compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be
processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h"
(or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false
-XX:-PrintVMOptions -XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal -XX:+StartAttachListener
-XX:NativeMemoryTracking=detail -XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties

Host: scaaa118.us.oracle.com, 2.90GHz, 32 cores, 235G, Oracle Linux
Server release 7.3

Time: Tue Oct 9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)

---------------  T H R E A D  ---------------

Current thread (0x00002af3dc6ac800):  VMThread "VM Thread" [stack:
0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0, _nested_threads_hazard_ptr_cnt=0

Stack: [0x00002af44f10a000,0x00002af44f20a000], sp=0x00002af44f208720,
free space=1017k

Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char const*, char
const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char
const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*, void*, char
const*, int, char const*, char const*, __va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char const*,
char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]  JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]  JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]  VMThread::evaluate_operation(VM_Operation*)
[clone .constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100

UpdateForPopTopFrame, mode: safepoint, requested by thread 0x00002af4dc008800
s***@oracle.com
2018-10-23 23:15:43 UTC
Permalink
Hi David and Robbin,
Post by Robbin Ehn
Hi,
Post by David Holmes
Hi Serguei,
The VMThread is executing VM_HandshakeAllThreads which is not a
safepoint operation. There's no real way to tell from the stacks what
it's stuck on.
Good point.
We agreed with Leonid that he will try to reproduce this deadlock with
fewer kitchensink modules and modes.
It would really help if there were fewer threads to analyze.
Post by Robbin Ehn
I cannot find a thread that is not considered safepoint safe or
is_ext_suspended (thread 146). So the handshake should go through. The
handshake will log a warning after a while; is there such a warning from
the handshake operation?
There is Thread #136, which is executing JvmtiSuspendControl::suspend().
Not sure if it helps.
Post by Robbin Ehn
There are several threads competing for, e.g., the Threads_lock, and
threads waiting for GC and several other VM ops; could it just be
really slow?
No idea yet.

Thanks,
Serguei
Post by Robbin Ehn
/Robbin
Post by David Holmes
David
Post by s***@oracle.com
Hi David,
You are right, thanks.
It means this deadlock needs more analysis.
For completeness, the stack traces are in attachments.
Thanks,
Serguei
Post by David Holmes
Hi Serguei,
The JvmtiThreadState_lock is always acquired with safepoint checks
enabled, so all JavaThreads blocked trying to acquire it will be
_thread_blocked, hence safepoint-safe, and so won't be holding up
the safepoint.
David
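In terms of the toy pthread analogy sketched earlier in the thread,
David's point corresponds to the waiter declaring itself blocked
(safepoint-safe) before it starts contending, so the "safepoint" no
longer has to wait for it and the cycle cannot form. Again, this is
only an editor's illustration with made-up names, not HotSpot code:

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int safe_count;      /* threads currently safepoint-safe */

/* Holds the lock, then waits until its single peer is safe. */
static void *lock_holder(void *arg) {
    pthread_mutex_lock(&state_lock);
    while (atomic_load(&safe_count) < 1)
        sched_yield();             /* the "safepoint" completes here */
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* A safepoint-checked acquire: mark the thread blocked first. */
static void *lock_waiter(void *arg) {
    atomic_fetch_add(&safe_count, 1);  /* like becoming _thread_blocked */
    pthread_mutex_lock(&state_lock);
    atomic_fetch_sub(&safe_count, 1);  /* like returning to _thread_in_vm */
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, lock_holder, NULL);
    pthread_create(&b, NULL, lock_waiter, NULL);
    pthread_join(a, NULL);             /* both joins return: no deadlock */
    pthread_join(b, NULL);
    return 0;
}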
Post by s***@oracle.com
Hi,
I've added the serviceability-dev mailing list.
It can be interesting for the SVC folks. :)
Hi
It seems the last version also crashes with 2 other, different symptoms:
http://java.se.oracle.com:10065/mdash/jobs/lmesnik-ks8-20181021-0638-7157/results?search=status%3Afailed+AND+-state%3Ainvalid
Also, it might hang, with the stack attached. It seems the test might
be blocked because it invokes 2 jvmti methods. Can a jvmti agent
invoke jvmti methods from different threads?
Yes, in general.
However, you have to be careful when using debugging features.
Below, one thread is enabling single stepping while another thread
is being suspended.
Both are blocked at a safepoint, which is okay in general but not
okay if they hold any lock.
For instance, the thread #152 is holding the monitor
JvmtiThreadState_lock.
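For reference, the two JVMTI calls in question look roughly like this
from the agent's side (an editor's sketch only, not the actual
libJvmtiStressModule.c source; error handling is omitted and the names
sampler/enable_single_step are illustrative):

#include <jvmti.h>
#include <string.h>

static jvmtiEnv *jvmti;

/* Runs on a JVMTI agent thread (started with RunAgentThread), like
 * agent_sampler in the stacks below: suspends another thread. */
static void JNICALL sampler(jvmtiEnv *env, JNIEnv *jni, void *arg) {
    jthread target = (jthread)arg;
    (*env)->SuspendThread(env, target);  /* cf. jvmti_SuspendThread */
    (*env)->ResumeThread(env, target);
}

/* Called from a different thread, like enable_events: enabling
 * SINGLE_STEP is what leads to reset_current_location() and its
 * VM_GetCurrentLocation operation. */
static void enable_single_step(void) {
    (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                                       JVMTI_EVENT_SINGLE_STEP, NULL);
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *opts, void *res) {
    jvmtiCapabilities caps;
    if ((*vm)->GetEnv(vm, (void **)&jvmti, JVMTI_VERSION_1_2) != JNI_OK)
        return JNI_ERR;
    memset(&caps, 0, sizeof(caps));
    caps.can_suspend = 1;                     /* for Suspend/ResumeThread */
    caps.can_generate_single_step_events = 1; /* for JVMTI_EVENT_SINGLE_STEP */
    return (*jvmti)->AddCapabilities(jvmti, &caps) == JVMTI_ERROR_NONE
               ? JNI_OK : JNI_ERR;
}

Each call is legal on its own; the trouble analyzed below is only in
how their VM-side locking interleaves.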
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984c9100) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984c7800, this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984c7800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae3984cbb00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae3984ca800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae39350510c in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/ci/ciEnv.cpp:229
#8  0x00002ae3935d3294 in CompileBroker::invoke_compiler_on_method
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:2084
#9  0x00002ae3935d4f48 in CompileBroker::compiler_thread_loop () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/compiler/compileBroker.cpp:1798
#10 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#11 0x00002ae393d736c6 in JavaThread::run (this=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#12 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae3984ca800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#13 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50920 in ParkCommon (timo=0, ev=0x2ae460061c00) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
#3  Monitor::ILock (this=0x2ae398024f10, Self=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:461
#4  0x00002ae393b512c1 in lock (Self=0x2ae4600c2800,
this=0x2ae398024f10) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:910
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:919
#6  0x00002ae393999682 in MutexLocker (mutex=0x2ae398024f10,
this=<synthetic pointer>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutexLocker.hpp:182
#7  thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:668
#8  JvmtiEventController::thread_started (thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:1027
#9  0x00002ae39399f3a0 in JvmtiExport::post_thread_start
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiExport.cpp:1395
#10 0x00002ae393d737d8 in JavaThread::run (this=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1764
#11 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae4600c2800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#12 0x00002ae3927b1e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00002ae392cc234d in clone () from /lib64/libc.so.6
These two threads are blocked on the monitor JvmtiThreadState_lock
in the function ciEnv::cache_jvmti_state().
Also, there are many threads (like #51) that are executing
JvmtiExport::post_thread_start and blocked on the same monitor.
Now, the question is why this safepoint cannot start.
What thread is blocking it? Or, in reverse, what thread is this
safepoint waiting for?
I think this safepoint operation is waiting for all the threads that
are blocked on the JvmtiThreadState_lock.
One thread (#152) has:
   - grabbed the monitor JvmtiThreadState_lock
   - blocked in the VM_GetCurrentLocation operation in
JvmtiEnvThreadState::reset_current_location()
The other threads are:
   - blocked on the monitor JvmtiThreadState_lock
   - unable to reach the blocked-at-a-safepoint state (all threads
have to reach this state for this safepoint to happen)
It seems to me this is a bug which has to be filed.
My guess is that this will stop reproducing if you turn off
the single stepping for thread #152.
Please, let me know about the results.
Assuming that the crashes look like VM bugs, I think it makes sense to
integrate the jvmti changes but *not* enable the jvmti module by default.
This one is a deadlock.
However, the root cause is a race condition that can potentially
result in both deadlocks and crashes.
So, I'm curious if you observed crashes as well.
And to add two more tests with jvmti enabled,
so anyone could easily run them to reproduce the crashes. These tests
would be kept out of CI so as not to introduce any bugs. Does it make
sense?
Considering the hang - I think that it might be a product bug since I
don't see any locking on my monitors. But I am not sure. Is it
possible that any of my jvmti agent code prevents the VM from getting
into a safepoint?
Could we discuss it tomorrow or this week when you have time?
Yes, of course.
Let's find some time tomorrow.
Any suggestion how to diagnose the deadlock would be great.
Analysis of stack traces is needed.
It is non-trivial in this particular case as there are so many
threads executing at the same time.
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae454005800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae393d6a3bd in JavaThread::java_suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:2321
#7  0x00002ae3939ad7e1 in JvmtiSuspendControl::suspend
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:847
#8  0x00002ae3939887ae in JvmtiEnv::SuspendThread
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:955
#9  0x00002ae39393a8c6 in jvmti_SuspendThread
(env=0x2ae39801b270, thread=0x2ae49929fdf8) at
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:527
#10 0x00002ae394d973ee in agent_sampler (jvmti=0x2ae39801b270,
env=<optimized out>, p=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:274
#11 0x00002ae3939ab24d in call_start_function
(this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:85
#12 JvmtiAgentThread::start_function_wrapper
(thread=0x2ae454004800, __the_thread__=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiImpl.cpp:79
#13 0x00002ae393d7338a in JavaThread::thread_main_inner
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1795
#14 0x00002ae393d736c6 in JavaThread::run (this=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/thread.cpp:1775
#15 0x00002ae393ba0070 in thread_native_entry
(thread=0x2ae454004800) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/linux/os_linux.cpp:698
#16 0x00002ae3927b1e25 in start_thread () from
/lib64/libpthread.so.0
#17 0x00002ae392cc234d in clone () from /lib64/libc.so.6
/lib64/libpthread.so.0
#1  0x00002ae393ba8d63 in os::PlatformEvent::park
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/os/posix/os_posix.cpp:1897
#2  0x00002ae393b50cf8 in ParkCommon (timo=0, ev=0x2ae3985e7400) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:399
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:768
#4  0x00002ae393b51f2e in Monitor::wait
try=false) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/mutex.cpp:1106
#5  0x00002ae393de7867 in VMThread::execute (op=0x2ae42705f500) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/runtime/vmThread.cpp:657
#6  0x00002ae3939965f3 in
JvmtiEnvThreadState::reset_current_location
ue) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnvThreadState.cpp:312
#7  0x00002ae393997acf in recompute_env_thread_enabled
(state=0x2ae6bc000cd0, ets=0x2ae6bc000d80) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:490
#8  JvmtiEventControllerPrivate::recompute_thread_enabled
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:523
#9  0x00002ae393998168 in
JvmtiEventControllerPrivate::recompute_enabled () at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:598
#10 0x00002ae39399a244 in set_user_enabled (enabled=true,
event_type=JVMTI_EVENT_SINGLE_STEP, thread=0x0,
env=0x2ae39801b270) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:818
#11 JvmtiEventController::set_user_enabled (env=0x2ae39801b270,
thread=0x0, event_type=JVMTI_EVENT_SINGLE_STEP,
enabled=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEventController.cpp:963
#12 0x00002ae393987d2d in JvmtiEnv::SetEventNotificationMode
/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiEnv.cpp:543
#13 0x00002ae3939414eb in jvmti_SetEventNotificationMode
event_thread=event_\
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnter.cpp:5389
#14 0x00002ae394d97989 in enable_events () at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:519
#15 0x00002ae394d98070 in
Java_applications_kitchensink_process_stress_modules_JvmtiStressModule_startIteration
(env=<optimized out>, this=<optimized out>) at
/scratch/lmesnik/ws/hs-bigapps/closed/test/hotspot/jtreg/applications/kitchensink/process/stress/modules/libJvmtiStressModule.c:697
#16 0x00002ae3a43ef257 in ?? ()
#17 0x00002ae3a43eede1 in ?? ()
#18 0x00002ae42705f878 in ?? ()
#19 0x00002ae40ad334e0 in ?? ()
#20 0x00002ae42705f8e0 in ?? ()
#21 0x00002ae40ad33c68 in ?? ()
#22 0x0000000000000000 in ?? ()
Thanks,
Serguei
Leonid
Post by s***@oracle.com
Hi Leonid,
https://bugs.openjdk.java.net/browse/JDK-8043571
Thanks,
Serguei
Post by s***@oracle.com
Hi
While fixing kitchensink I get:
cur_stack_depth out of sync
Do you know if it might be a bug in my jvmti agent?
Leonid
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error
(/scratch/lmesnik/ws/hs-bigapps/open/src/hotspot/share/prims/jvmtiThreadState.cpp:277),
pid=13926, tid=13962
cur_stack_depth out of sync
#
# JRE version: Java(TM) SE Runtime Environment (12.0)
(fastdebug build
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
12-internal+0-2018-10-08-2342517.lmesnik.hs-bigapps, mixed
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may
be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g
%t e %P %I %h" (or dumping to
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/core.13926)
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=2 -XX:MaxRAMPercentage=50
-XX:+CrashOnOutOfMemoryError
-Djava.net.preferIPv6Addresses=false -XX:-PrintVMOptions
-XX:+DisplayVMOutputToStderr -XX:+UsePerfData
-Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags
-XX:+DisableExplicitGC -XX:+PrintFlagsFinal
-XX:+StartAttachListener -XX:NativeMemoryTracking=detail
-XX:+FlightRecorder
--add-exports=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED
--add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED
-Djava.io.tmpdir=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/java.io.tmpdir
-Duser.home=/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/user.home
-agentpath:/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so
applications.kitchensink.process.stress.Main
/scratch/lmesnik/ws/hs-bigapps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_KitchensinkSanity_java/scratch/0/kitchensink.final.properties
Host: scaaa118.us.oracle.com, 2.90GHz, 32 cores, 235G, Oracle Linux
Server release 7.3
Time: Tue Oct  9 16:06:07 2018 PDT elapsed time: 31 seconds (0d 0h 0m 31s)
---------------  T H R E A D  ---------------
Current thread (0x00002af3dc6ac800):  VMThread "VM Thread"
[stack: 0x00002af44f10a000,0x00002af44f20a000] [id=13962]
_threads_hazard_ptr=0x00002af4ac090eb0,
_nested_threads_hazard_ptr_cnt=0
Stack: [0x00002af44f10a000,0x00002af44f20a000],
 sp=0x00002af44f208720,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x18c4923]  VMError::report_and_die(int, char
const*, char const*, __va_list_tag*, Thread*, unsigned char*,
void*, void*, char const*, int, unsigned long)+0x2c3
V  [libjvm.so+0x18c56ef]  VMError::report_and_die(Thread*,
void*, char const*, int, char const*, char const*,
__va_list_tag*)+0x2f
V  [libjvm.so+0xb55aa0]  report_vm_error(char const*, int, char
const*, char const*, ...)+0x100
V  [libjvm.so+0x11f2cfe]
 JvmtiThreadState::cur_stack_depth()+0x14e
V  [libjvm.so+0x11f3257]
 JvmtiThreadState::update_for_pop_top_frame()+0x27
V  [libjvm.so+0x119af99]  VM_UpdateForPopTopFrame::doit()+0xb9
V  [libjvm.so+0x1908982]  VM_Operation::evaluate()+0x132
V  [libjvm.so+0x19040be]
 VMThread::evaluate_operation(VM_Operation*) [clone
.constprop.51]+0x18e
V  [libjvm.so+0x1904960]  VMThread::loop()+0x4c0
V  [libjvm.so+0x1904f53]  VMThread::run()+0xd3
V  [libjvm.so+0x14e8300]  thread_native_entry(Thread*)+0x100
UpdateForPopTopFrame, mode: safepoint, requested by thread 0x00002af4dc008800
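For context on the failing operation: VM_UpdateForPopTopFrame is, as
far as I can tell, the VM op HotSpot runs on behalf of the JVMTI
PopFrame() call, which requires the target thread to be suspended. A
hedged sketch of the agent-side call sequence (pop_top_frame is an
illustrative helper, and the agent must also have requested the
can_pop_frame capability):

#include <jvmti.h>

static jvmtiError pop_top_frame(jvmtiEnv *jvmti, jthread thread) {
    jvmtiError err = (*jvmti)->SuspendThread(jvmti, thread);
    if (err != JVMTI_ERROR_NONE)
        return err;
    /* Queues the VM_UpdateForPopTopFrame operation seen above, which
     * re-syncs the cached JVMTI frame count (cur_stack_depth). */
    err = (*jvmti)->PopFrame(jvmti, thread);
    (*jvmti)->ResumeThread(jvmti, thread);
    return err;
}

The "cur_stack_depth out of sync" assert fires when that cached count
disagrees with the actual frame count, which is consistent with the
race-condition theory discussed earlier in the thread.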