run-tests: wait for test threads after first error
author Gregory Szorc <gregory.szorc@gmail.com>
Sat, 28 Mar 2015 19:39:03 -0700
changeset 24507 a0668a587c04
parent 24506 60bbb4079c28
child 24508 fbe2fb71a6e6
The test runner has the ability to stop on first error. Tests are executed in new Python threads. The test runner starts new threads when it has capacity to do so.

Before this patch, the "stop on first error" logic would return immediately from the "run tests" function, without waiting on test threads to complete. There was thus a race between the test runner thread doing cleanup work and the test threads that were still running. For example, a test thread could be in the middle of executing a test shell script when the test runner removed the test's temporary directory. Depending on timing, this could result in any number of different outputs from the test runner.

This patch eliminates the race condition by having the test runner explicitly wait for test threads to complete before continuing.

I discovered this issue as I modified the test harness in a subsequent patch and was reliably able to tickle the race condition.
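
The pattern described above can be sketched outside of run-tests.py. The following is a minimal illustration, not the harness code: `run_jobs`, `maxjobs`, `stop_on_first_error`, and the zero-argument job callables are hypothetical names introduced here. It assumes each worker thread reports exactly one result to a shared queue, and it decrements the running count before checking for failure so that the drain loop knows how many threads are still outstanding.

```python
import queue
import threading


def run_jobs(jobs, maxjobs=2, stop_on_first_error=True):
    """Run zero-argument callables in threads (hypothetical helper).

    Each job returns True on success. Returns (results, stoppedearly).
    """
    done = queue.Queue()
    results = []
    pending = list(jobs)
    running = 0
    stoppedearly = False

    def worker(job):
        # Every worker puts exactly one item on the queue, even on error,
        # so the scheduler can count completions reliably.
        try:
            ok = job()
        except Exception:
            ok = False
        done.put(ok)

    while pending or running:
        if not done.empty() or running == maxjobs or not pending:
            try:
                ok = done.get(True, 1)
            except queue.Empty:
                continue
            # Account for the finished thread before deciding to stop.
            running -= 1
            results.append(ok)
            if stop_on_first_error and not ok:
                stoppedearly = True
                break
        if pending and running < maxjobs:
            t = threading.Thread(target=worker, args=(pending.pop(0),))
            t.start()
            running += 1

    # If we stopped early, wait for threads that were already started.
    # Returning now would let the caller's cleanup race with them.
    if stoppedearly:
        while running:
            try:
                results.append(done.get(True, 1))
                running -= 1
            except queue.Empty:
                continue

    return results, stoppedearly
```

With this shape, a caller can remove temporary directories once `run_jobs` returns, because no started job is still executing at that point. Jobs that were never started are simply left in `pending` and dropped.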
tests/run-tests.py
tests/test-run-tests.t
--- a/tests/run-tests.py	Sat Mar 28 00:21:30 2015 -0700
+++ b/tests/run-tests.py	Sat Mar 28 19:39:03 2015 -0700
@@ -1390,16 +1390,19 @@
                 done.put(('!', test, 'run-test raised an error, see traceback'))
                 raise
 
+        stoppedearly = False
+
         try:
             while tests or running:
                 if not done.empty() or running == self._jobs or not tests:
                     try:
                         done.get(True, 1)
+                        running -= 1
                         if result and result.shouldStop:
+                            stoppedearly = True
                             break
                     except queue.Empty:
                         continue
-                    running -= 1
                 if tests and not running == self._jobs:
                     test = tests.pop(0)
                     if self._loop:
@@ -1413,6 +1416,18 @@
                                          args=(test, result))
                     t.start()
                     running += 1
+
+            # If we stop early we still need to wait on started tests to
+            # finish. Otherwise, there is a race between the test completing
+            # and the test's cleanup code running. This could result in the
+            # test reporting incorrect.
+            if stoppedearly:
+                while running:
+                    try:
+                        done.get(True, 1)
+                        running -= 1
+                    except queue.Empty:
+                        continue
         except KeyboardInterrupt:
             for test in runtests:
                 test.abort()
--- a/tests/test-run-tests.t	Sat Mar 28 00:21:30 2015 -0700
+++ b/tests/test-run-tests.t	Sat Mar 28 19:39:03 2015 -0700
@@ -265,7 +265,8 @@
    this test is still more bytes than success.
   
   Failed test-failure*.t: output changed (glob)
-  # Ran 2 tests, 0 skipped, 0 warned, 1 failed.
+  Failed test-nothing.t: output changed
+  # Ran 2 tests, 0 skipped, 0 warned, 2 failed.
   python hash seed: * (glob)
   [1]