AO race condition leads to AO deadlock on DB migration

Description

When we migrate between DBs, we first create AO promises:

at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects$1$1.apply(TenantAwareActiveObjects.java:84)
at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects$1$1.apply(TenantAwareActiveObjects.java:79)
at com.atlassian.util.concurrent.Promises$Of$1.apply(Promises.java:263)
at com.atlassian.util.concurrent.Promises$2.onSuccess(Promises.java:185)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1319)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.add(ExecutionList.java:101)
at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:170)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1322)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1258)
at com.atlassian.util.concurrent.Promises$Of.then(Promises.java:249)
at com.atlassian.util.concurrent.Promises$Of.done(Promises.java:239)
at com.atlassian.util.concurrent.Promises$Of.flatMap(Promises.java:260)
at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects$1.load(TenantAwareActiveObjects.java:79)
at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects$1.load(TenantAwareActiveObjects.java:74)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
- locked <0x773e> (a com.google.common.cache.LocalCache$StrongEntry)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4830)
at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects.restartActiveObjects(TenantAwareActiveObjects.java:159)
at com.atlassian.activeobjects.osgi.ActiveObjectsServiceFactory.onHotRestart(ActiveObjectsServiceFactory.java:272)
at sun.reflect.GeneratedMethodAccessor779.invoke(Unknown Source:-1)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:32)
at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:38)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:299)
at com.atlassian.event.internal.AsynchronousAbleEventDispatcher.dispatch(AsynchronousAbleEventDispatcher.java:88)
at com.atlassian.event.internal.LockFreeEventPublisher$Publisher.dispatch(LockFreeEventPublisher.java:222)
at com.atlassian.event.internal.LockFreeEventPublisher.publish(LockFreeEventPublisher.java:95)
at com.atlassian.fisheye.event.FisheyeEventPublisher$EventPublication.publish(FisheyeEventPublisher.java:63)
at com.atlassian.fisheye.event.FisheyeEventPublisher.publish(FisheyeEventPublisher.java:35)
at com.atlassian.crucible.actions.admin.database.DBEditHelper.startNewDB(DBEditHelper.java:440)
at com.atlassian.crucible.actions.admin.database.DBEditHelper.changeDB(DBEditHelper.java:209)
at com.atlassian.crucible.actions.admin.database.DBEditHelper.migrateToDB(DBEditHelper.java:332)
at com.atlassian.crucible.actions.admin.database.MigrateDatabaseAction$AsynchronousMigrater.migrateToDB(MigrateDatabaseAction.java:83)

Then we almost immediately kill the executor:

at java.util.concurrent.ThreadPoolExecutor.shutdownNow(ThreadPoolExecutor.java:1416)
at com.atlassian.sal.core.executor.ThreadLocalDelegateExecutorService.shutdownNow(ThreadLocalDelegateExecutorService.java:38)
at com.atlassian.activeobjects.osgi.ActiveObjectsServiceFactory.startCleaning(ActiveObjectsServiceFactory.java:222)
at com.atlassian.activeobjects.backup.ActiveObjectsDatabaseCleaner.doCleanup(ActiveObjectsDatabaseCleaner.java:56)
at com.atlassian.activeobjects.backup.ActiveObjectsDatabaseCleaner.cleanup(ActiveObjectsDatabaseCleaner.java:45)
at com.atlassian.dbexporter.importer.TableDefinitionImporter$DatabaseCleanerAroundImporter.before(TableDefinitionImporter.java:128)
at com.atlassian.dbexporter.importer.AbstractImporter.importNode(AbstractImporter.java:41)
at com.atlassian.dbexporter.DbImporter.importData(DbImporter.java:69)
at com.atlassian.activeobjects.backup.ActiveObjectsBackup.restore(ActiveObjectsBackup.java:151)
at sun.reflect.GeneratedMethodAccessor780.invoke(Unknown Source:-1)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.atlassian.applinks.host.OsgiServiceProxyFactory$DynamicServiceInvocationHandler.invoke(OsgiServiceProxyFactory.java:92)
at com.sun.proxy.$Proxy144.restore(Unknown Source:-1)
at com.atlassian.crucible.actions.admin.database.DBEditHelper.migrateToDB(DBEditHelper.java:339)
at com.atlassian.crucible.actions.admin.database.MigrateDatabaseAction$AsynchronousMigrater.migrateToDB(MigrateDatabaseAction.java:83)

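As a result, we end up with promises whose tasks were still queued when the executor was killed. They never get executed, and the instance blocks as soon as it needs to use AO.
Rather than using ThreadPoolExecutor#shutdownNow, which discards queued tasks, we should use ThreadPoolExecutor#shutdown, which stops accepting new tasks but lets the already-queued ones run to completion.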
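
The difference between the two calls can be reproduced outside of AO. The following is a minimal sketch using only java.util.concurrent, not the actual ActiveObjectsServiceFactory code; the class name ShutdownDemo and the method queuedTasksRun are hypothetical names chosen for illustration. A single worker is kept busy so that three further tasks sit in the queue, and then the executor is stopped either way.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it does not mirror the real AO code paths.
public class ShutdownDemo {

    /** Returns how many of the three queued tasks ran after the executor was stopped. */
    public static int queuedTasksRun(boolean graceful) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ran = new AtomicInteger();
        try {
            // Occupy the only worker so that later tasks stay in the queue.
            executor.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() interrupts us
                }
            });
            started.await();

            // These stand in for the pending AO promise callbacks.
            for (int i = 0; i < 3; i++) {
                executor.execute(ran::incrementAndGet);
            }

            if (graceful) {
                executor.shutdown();     // queued tasks still run
            } else {
                executor.shutdownNow();  // queued tasks are drained and never run
            }
            release.countDown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return ran.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown():    " + queuedTasksRun(true));  // 3
        System.out.println("shutdownNow(): " + queuedTasksRun(false)); // 0
    }
}
```

With shutdownNow, anything waiting on the result of a drained task waits forever, which matches the blocked instance described above. Note that shutdown still rejects tasks submitted afterwards, so work scheduled after the cleanup starts would need a fresh executor.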

Environment

None

Status

Assignee

Kamil Cichy

Reporter

Kamil Cichy

Labels

None

Add-on Type

None

Team

None

CC

None

Risk factor

None

QA Kickoff Status

None

QA Demo Status

None

Affects versions

1.1.3
1.2.2

Priority

Minor