8460: Fix deadlock at shutdown by closing event stream before unmounting.
author: Tom Clegg <tom@curoverse.com>, Tue, 13 Dec 2016 17:46:02 +0000 (12:46 -0500)
committer: Tom Clegg <tom@curoverse.com>, Tue, 13 Dec 2016 17:46:02 +0000 (12:46 -0500)
If llfuse shuts down while a thread is in a "with
llfuse.lock_released" block, the llfuse lock cannot be reacquired, so
the "with" block waits forever instead of exiting. The event listener
thread lands in this state easily because handling a "collection
updated" event makes network requests in a lock_released block.

This deadlock bug started occurring frequently in
tests.test_token_expiry.TokenExpiryTest when using arvados-ws.
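
The ordering this commit enforces (stop the event listener first, then unmount) can be sketched in isolation. This is a minimal illustration, not the arvados_fuse code: FakeEventStream and shutdown are hypothetical names, the listener is simulated with a plain thread, and the unmount step is passed in as a callable so that nothing here depends on llfuse or fusermount.

```python
import threading


class FakeEventStream:
    """Hypothetical stand-in for the event listener (not the arvados API)."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Poll until asked to stop; a real listener would handle
        # "collection updated" events here.
        while not self._stop.wait(0.01):
            pass

    def close(self, timeout=None):
        # Ask the listener to stop and wait up to `timeout` seconds for it.
        self._stop.set()
        self._thread.join(timeout=timeout)
        return not self._thread.is_alive()


def shutdown(events, unmount, timeout=2.0):
    """Close the event stream (if any) before unmounting; return the steps run."""
    steps = []
    if events:
        events.close(timeout=timeout)
        steps.append("close_events")
    unmount()
    steps.append("unmount")
    return steps
```

Called as shutdown(FakeEventStream(), lambda: None), the listener thread has already exited by the time the unmount callable runs, so no thread is left waiting to reacquire a lock that will never be available again.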

services/fuse/arvados_fuse/__init__.py
services/fuse/arvados_fuse/command.py
services/fuse/tests/mount_test_base.py

diff --git a/services/fuse/arvados_fuse/__init__.py b/services/fuse/arvados_fuse/__init__.py
index 63a5513f35d02457d672544d3c46f8ce65a71d7f..4c1731f319e231a5ea26ddab2c8c1b0f6be5c38a 100644
@@ -399,7 +399,6 @@ class Operations(llfuse.Operations):
                 parent.invalidate()
                 parent.update()
 
-
     @catch_exceptions
     def getattr(self, inode):
         if inode not in self.inodes:
diff --git a/services/fuse/arvados_fuse/command.py b/services/fuse/arvados_fuse/command.py
index f2948f9e45f295b43544615ad75c764c35856053..ffcfc6500f5c5ac31289da3c404166b386f74374 100644
@@ -126,6 +126,8 @@ class Mount(object):
         return self
 
     def __exit__(self, exc_type, exc_value, traceback):
+        if self.operations.events:
+            self.operations.events.close(timeout=self.args.unmount_timeout)
         subprocess.call(["fusermount", "-u", "-z", self.args.mountpoint])
         self.llfuse_thread.join(timeout=self.args.unmount_timeout)
         if self.llfuse_thread.is_alive():
diff --git a/services/fuse/tests/mount_test_base.py b/services/fuse/tests/mount_test_base.py
index 20192f9d84302e1d9967136bcfa19e03ef7012dd..1319aebdccaa1e9dcc0f3e4323fa66340cc05a7f 100644
@@ -66,6 +66,8 @@ class MountTestBase(unittest.TestCase):
 
     def tearDown(self):
         if self.llfuse_thread:
+            if self.operations.events:
+                self.operations.events.close(timeout=10)
             subprocess.call(["fusermount", "-u", "-z", self.mounttmp])
             t0 = time.time()
             self.llfuse_thread.join(timeout=10)