Bnd Launcher prefers to terminate using System.exit()?

Hi,

I recently observed that my OSGi framework application always takes 5 extra seconds to exit cleanly. Specifically, there is a 5-second gap between my main function returning and my shutdown hooks starting. I would normally infer that the JVM is spending this time waiting for a non-daemon thread to exit, except that the only likely candidate is a TimerTask inside the SCR (2.1.28, 2.2.0), and I have already confirmed that the Felix framework has STOPPED.

I have also noticed that Bnd’s Launcher does not take 5 seconds to stop, and that it contains these lines:

// We exit, even if there are non-daemon threads active
// though we've reported those
if (exitcode != LauncherConstants.RETURN_INSTEAD_OF_EXIT)
        System.exit(exitcode);

So are these 5 seconds a “known thing” please, with System.exit() as the accepted workaround? There really shouldn’t be any non-daemon threads running after my framework has stopped, and I thought I’d ensured that there weren’t, but…

Thanks for any advice here,
Cheers,
Chris

By default the framework exits with 0. Only when you register a Callable with the main property can you provide an exit code. It seems unlikely that you do this, so the delay is in the framework stopping. You might want to take a look at your log; I expect a deactivator (SCR or Bundle-Activator) is hanging.

Hi,

Thanks for replying. Our Framework.waitForStop() call has already returned FrameworkEvent.STOPPED by this point. I’ve used VisualVM to take a thread-dump during this 5 second delay, and there’s really only one non-daemon thread still running:

"Timer-0" #34 prio=5 os_prio=0 cpu=8.30ms elapsed=14.94s tid=0x00007f46f176f000 nid=0x378e in Object.wait()  [0x00007f46928ce000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.14.1/Native Method)
        - waiting on <no object reference available>
        at java.util.TimerThread.mainLoop(java.base@11.0.14.1/Timer.java:553)
        - waiting to re-lock in wait() <0x0000000610402e30> (a java.util.TaskQueue)
        at java.util.TimerThread.run(java.base@11.0.14.1/Timer.java:506)

   Locked ownable synchronizers:
        - None

I have been assuming that Timer-0 is from the changeCountTimer inside org.apache.felix.scr.impl.ComponentRegistry.

Cheers,
Chris

Edit: Timer-0 is from the SCR’s ComponentRegistry, which I proved using this diff:

diff --git a/scr/src/main/java/org/apache/felix/scr/impl/ComponentRegistry.java b/scr/src/main/java/org/apache/felix/scr/impl/ComponentRegistry.java
index 7169f8758e..eb83ff0f7b 100644
--- a/scr/src/main/java/org/apache/felix/scr/impl/ComponentRegistry.java
+++ b/scr/src/main/java/org/apache/felix/scr/impl/ComponentRegistry.java
@@ -732,7 +732,7 @@ public class ComponentRegistry
             final Timer timer;
             synchronized ( this.changeCountTimerLock ) {
                 if ( this.changeCountTimer == null ) {
-                    this.changeCountTimer = new Timer();
+                    this.changeCountTimer = new Timer("SCR Component Registry");
                 }
                 timer = this.changeCountTimer;
             }

Raised with Felix as FELIX-6511.
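
For anyone wanting to reproduce the naming part of this outside the SCR, here is a minimal sketch that uses only java.util.Timer. The class name TimerDaemonDemo and the helper timerThreadOf are made up for illustration and are not part of Felix. It shows that a Timer created with the no-argument constructor gets a generated "Timer-N" thread name, while the (name, isDaemon) constructor lets you set both the name and the daemon flag explicitly:

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class TimerDaemonDemo {

    /** Runs one task on the given timer and returns the thread that executed it. */
    static Thread timerThreadOf(Timer timer) {
        final Thread[] seen = new Thread[1];
        final CountDownLatch ran = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                seen[0] = Thread.currentThread();
                ran.countDown();
            }
        }, 0L);
        try {
            if (!ran.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timer task did not run");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for timer task", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // Default constructor: generated name, and no daemon flag is requested.
        Timer plain = new Timer();
        // Named daemon timer: easy to spot in a thread dump, and it does not
        // keep the JVM alive on its own.
        Timer named = new Timer("SCR Component Registry", true);

        Thread plainThread = timerThreadOf(plain);
        Thread namedThread = timerThreadOf(named);
        System.out.println(plainThread.getName() + " daemon=" + plainThread.isDaemon());
        System.out.println(namedThread.getName() + " daemon=" + namedThread.isDaemon());

        // Without cancel(), the non-daemon timer thread would keep the JVM
        // running after main returns.
        plain.cancel();
        named.cancel();
    }
}
```

Whether the SCR should make its timer a daemon, or cancel it earlier during shutdown, is a question for the Felix issue; the sketch only illustrates the java.util.Timer behaviour involved.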

I still don’t understand why the launcher does not just exit(0). You need to jump through some hoops to not get this behavior?

I am not using Bnd’s Launcher here. However, Bnd’s Launcher clearly doesn’t have this problem because it does terminate using System.exit(). I was therefore wondering if the Launcher was consciously working around this bug in the SCR.

Cheers,
Chris

Nope. When I wrote the launcher (about 10 years ago now) I realized that developers were always forgetting to set the daemon flag, so calling exit at that point was the most reliable option.

Then, for testing purposes, it became necessary to use the launcher without it calling exit; that is why we have the special case.
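
The report-then-exit approach described above can be sketched in plain Java roughly as follows. This is not the Bnd Launcher's actual code: the class LingeringThreads, the helper nonDaemonThreads, and the value given to RETURN_INSTEAD_OF_EXIT are all placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class LingeringThreads {

    // Placeholder sentinel value; the real LauncherConstants value may differ.
    static final int RETURN_INSTEAD_OF_EXIT = -1;

    /** Returns the live non-daemon threads other than the calling thread. */
    static List<Thread> nonDaemonThreads() {
        List<Thread> result = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int exitcode = 0;

        // Report any threads that would otherwise keep the JVM alive...
        for (Thread t : nonDaemonThreads()) {
            System.err.println("Non-daemon thread still running: " + t.getName());
        }

        // ...and then exit anyway, unless the caller asked us to return instead.
        if (exitcode != RETURN_INSTEAD_OF_EXIT) {
            System.exit(exitcode);
        }
    }
}
```

The design trade-off is that System.exit() makes shutdown independent of whatever threads bundles leave behind, at the cost of hiding bugs like the one discussed in this thread unless the report is actually read.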