fsmonitor: acquire localrepo.wlock prior to emitting hg.update state
authorWez Furlong <wez@fb.com>
Thu, 18 May 2017 12:48:07 -0700
changeset 32334 6e0d1043e8fc
parent 32333 566cfe9cbbb9
child 32335 354329178dee
fsmonitor: acquire localrepo.wlock prior to emitting hg.update state we see some weird things in the watchman logs where the mercurial process is seemingly confused about which hg.update state it is publishing through watchman. On closer examination, we're seeing conflicting pids for the clients involved and this implies a race. To resolve this, we extend the wlock around the state-enter/state-leave events that are emitted to watchman. Test Plan: Some manual testing: In one window, run this, and then checkout a different rev: ``` $ watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]' { "version": "4.9.0", "subscribe": "wez", "clock": "c:1495034090:814028:1:312576" } { "state-enter": "hg.update", "version": "4.9.0", "clock": "c:1495034090:814028:1:312596", "unilateral": true, "subscription": "wez", "metadata": { "status": "ok", "distance": 125, "rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a", "partial": false }, "root": "/data/users/wez/fbsource" } { "root": "/data/users/wez/fbsource", "metadata": { "status": "ok", "distance": 125, "rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a", "partial": false }, "subscription": "wez", "unilateral": true, "version": "4.9.0", "clock": "c:1495034090:814028:1:312627", "state-leave": "hg.update" } ``` Tailed the watchman log file and looked for invalid state assertion errors, then ran my `rebase-all` script to update/rebase all of my heads. Didn't trigger the error condition (but couldn't reliably trigger it previously anyway), and the output captured above shows that the states are being emitted correctly.
hgext/fsmonitor/__init__.py
--- a/hgext/fsmonitor/__init__.py	Fri May 19 13:12:42 2017 +0200
+++ b/hgext/fsmonitor/__init__.py	Thu May 18 12:48:07 2017 -0700
@@ -600,14 +600,25 @@
         self.node = node
         self.distance = distance
         self.partial = partial
+        self._lock = None
 
     def __enter__(self):
+        # We explicitly need to take a lock here, before we proceed to update
+        # watchman about the update operation, so that we don't race with
+        # some other actor.  merge.update is going to take the wlock almost
+        # immediately anyway, so this is effectively extending the lock
+        # around a couple of short sanity checks.
+        self._lock = self.repo.wlock()
         self._state('state-enter')
         return self
 
     def __exit__(self, type_, value, tb):
-        status = 'ok' if type_ is None else 'failed'
-        self._state('state-leave', status=status)
+        try:
+            status = 'ok' if type_ is None else 'failed'
+            self._state('state-leave', status=status)
+        finally:
+            if self._lock:
+                self._lock.release()
 
     def _state(self, cmd, status='ok'):
         if not util.safehasattr(self.repo, '_watchmanclient'):