fsmonitor: acquire localrepo.wlock prior to emitting hg.update state
we see some weird things in the watchman logs where the mercurial
process is seemingly confused about which hg.update state it is publishing
through watchman.
On closer examination, we're seeing conflicting pids for the clients involved
and this implies a race.
To resolve this, we extend the wlock around the state-enter/state-leave
events that are emitted to watchman.
Test Plan:
Some manual testing:
In one window, run this, and then checkout a different rev:
```
$ watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]'
{
"version": "4.9.0",
"subscribe": "wez",
"clock": "c:
1495034090:814028:1:312576"
}
{
"state-enter": "hg.update",
"version": "4.9.0",
"clock": "c:
1495034090:814028:1:312596",
"unilateral": true,
"subscription": "wez",
"metadata": {
"status": "ok",
"distance": 125,
"rev": "
a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
"partial": false
},
"root": "/data/users/wez/fbsource"
}
{
"root": "/data/users/wez/fbsource",
"metadata": {
"status": "ok",
"distance": 125,
"rev": "
a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
"partial": false
},
"subscription": "wez",
"unilateral": true,
"version": "4.9.0",
"clock": "c:
1495034090:814028:1:312627",
"state-leave": "hg.update"
}
```
Tailed the watchman log file and looked for invalid state assertion errors,
then ran my `rebase-all` script to update/rebase all of my heads.
Didn't trigger the error condition (but couldn't reliably trigger it previously
anyway), and the output captured above shows that the states are being emitted
correctly.
--- a/hgext/fsmonitor/__init__.py Fri May 19 13:12:42 2017 +0200
+++ b/hgext/fsmonitor/__init__.py Thu May 18 12:48:07 2017 -0700
@@ -600,14 +600,25 @@
self.node = node
self.distance = distance
self.partial = partial
+ self._lock = None
def __enter__(self):
+ # We explicitly need to take a lock here, before we proceed to update
+ # watchman about the update operation, so that we don't race with
+ # some other actor. merge.update is going to take the wlock almost
+ # immediately anyway, so this is effectively extending the lock
+ # around a couple of short sanity checks.
+ self._lock = self.repo.wlock()
self._state('state-enter')
return self
def __exit__(self, type_, value, tb):
- status = 'ok' if type_ is None else 'failed'
- self._state('state-leave', status=status)
+ try:
+ status = 'ok' if type_ is None else 'failed'
+ self._state('state-leave', status=status)
+ finally:
+ if self._lock:
+ self._lock.release()
def _state(self, cmd, status='ok'):
if not util.safehasattr(self.repo, '_watchmanclient'):