comparison mercurial/obsolete.py @ 18068:4bec77e62c00

obsolete: compute successors set Successors set are an important part of obsolescence. It is necessary to detect and solve divergence situation. This changeset add a core function to compute them, a debug command to audit them and solid test on the concept. Check function docstring for details about the concept.
author Pierre-Yves David <pierre-yves.david@logilab.fr>
date Thu, 13 Dec 2012 15:38:43 +0100
parents e02feadd15ea
children f84e731cbd20
comparison
equal deleted inserted replaced
18067:6f62e005781d 18068:4bec77e62c00
400 for suc in mark[1]: 400 for suc in mark[1]:
401 if suc not in seen: 401 if suc not in seen:
402 seen.add(suc) 402 seen.add(suc)
403 remaining.add(suc) 403 remaining.add(suc)
404 404
405 def successorssets(repo, initialnode, cache=None):
406 """Return all set of successors of initial nodes
407
408 Successors set of changeset A are a group of revision that succeed A. It
409 succeed A as a consistent whole, each revision being only partial
410 replacement. Successors set contains non-obsolete changeset only.
411
412 In most cases a changeset A have zero (changeset pruned) or a single
413 successors set that contains a single successor (changeset A replaced by
414 A')
415
416 When changeset is split, it results successors set containing more than
417 a single element. Divergent rewriting will result in multiple successors
418 sets.
419
420 They are returned as a list of tuples containing all valid successors sets.
421
422 Final successors unknown locally are considered plain prune (obsoleted
423 without successors).
424
425 The optional `cache` parameter is a dictionary that may contains
426 precomputed successors sets. It is meant to reuse the computation of
427 previous call to `successorssets` when multiple calls are made at the same
428 time. The cache dictionary is updated in place. The caller is responsible
429 for its live spawn. Code that makes multiple calls to `successorssets`
430 *must* use this cache mechanism or suffer terrible performances."""
431
432 succmarkers = repo.obsstore.successors
433
434 # Stack of nodes we search successors sets for
435 toproceed = [initialnode]
436 # set version of above list for fast loop detection
437 # element added to "toproceed" must be added here
438 stackedset = set(toproceed)
439 if cache is None:
440 cache = {}
441
442 # This while loop is the flattened version of a recursive search for
443 # successors sets
444 #
445 # def successorssets(x):
446 # successors = directsuccessors(x)
447 # ss = [[]]
448 # for succ in directsuccessors(x):
449 # # product as in itertools cartesian product
450 # ss = product(ss, successorssets(succ))
451 # return ss
452 #
453 # But we can not use plain recursive calls here:
454 # - that would blow the python call stack
455 # - obsolescence markers may have cycles, we need to handle them.
456 #
457 # The `toproceed` list act as our call stack. Every node we search
458 # successors set for are stacked there.
459 #
460 # The `stackedset` is set version of this stack used to check if a node is
461 # already stacked. This check is used to detect cycles and prevent infinite
462 # loop.
463 #
464 # successors set of all nodes are stored in the `cache` dictionary.
465 #
466 # After this while loop ends we use the cache to return the successors sets
467 # for the node requested by the caller.
468 while toproceed:
469 # Every iteration tries to compute the successors sets of the topmost
470 # node of the stack: CURRENT.
471 #
472 # There are four possible outcomes:
473 #
474 # 1) We already know the successors sets of CURRENT:
475 # -> mission accomplished, pop it from the stack.
476 # 2) Node is not obsolete:
477 # -> the node is its own successors sets. Add it to the cache.
478 # 3) We do not know successors set of direct successors of CURRENT:
479 # -> We add those successors to the stack.
480 # 4) We know successors sets of all direct successors of CURRENT:
481 # -> We can compute CURRENT successors set and add it to the
482 # cache.
483 #
484 current = toproceed[-1]
485 if current in cache:
486 # case (1): We already know the successors sets
487 stackedset.remove(toproceed.pop())
488 elif current not in succmarkers:
489 # case (2): The node is not obsolete.
490 if current in repo:
491 # We have a valid last successors.
492 cache[current] = [(current,)]
493 else:
494 # Final obsolete version is unknown locally.
495 # Do not count that as a valid successors
496 cache[current] = []
497 else:
498 # cases (3) and (4)
499 #
500 # We proceed in two phases. Phase 1 aims to distinguish case (3)
501 # from case (4):
502 #
503 # For each direct successors of CURRENT, we check whether its
504 # successors sets are known. If they are not, we stack the
505 # unknown node and proceed to the next iteration of the while
506 # loop. (case 3)
507 #
508 # During this step, we may detect obsolescence cycles: a node
509 # with unknown successors sets but already in the call stack.
510 # In such a situation, we arbitrary set the successors sets of
511 # the node to nothing (node pruned) to break the cycle.
512 #
513 # If no break was encountered we proceeed to phase 2.
514 #
515 # Phase 2 computes successors sets of CURRENT (case 4); see details
516 # in phase 2 itself.
517 #
518 # Note the two levels of iteration in each phase.
519 # - The first one handles obsolescence markers using CURRENT as
520 # precursor (successors markers of CURRENT).
521 #
522 # Having multiple entry here means divergence.
523 #
524 # - The second one handles successors defined in each marker.
525 #
526 # Having none means pruned node, multiple successors means split,
527 # single successors are standard replacement.
528 #
529 for mark in succmarkers[current]:
530 for suc in mark[1]:
531 if suc not in cache:
532 if suc in stackedset:
533 # cycle breaking
534 cache[suc] = []
535 else:
536 # case (3) If we have not computed successors sets
537 # of one of those successors we add it to the
538 # `toproceed` stack and stop all work for this
539 # iteration.
540 toproceed.append(suc)
541 stackedset.add(suc)
542 break
543 else:
544 continue
545 break
546 else:
547 # case (4): we know all successors sets of all direct
548 # successors
549 #
550 # Successors set contributed by each marker depends on the
551 # successors sets of all its "successors" node.
552 #
553 # Each different marker is a divergence in the obsolescence
554 # history. It contributes successors sets dictinct from other
555 # markers.
556 #
557 # Within a marker, a successor may have divergent successors
558 # sets. In such a case, the marker will contribute multiple
559 # divergent successors sets. If multiple successors have
560 # divergents successors sets, a cartesian product is used.
561 succssets = []
562 for mark in succmarkers[current]:
563 # successors sets contributed by this marker
564 markss = [[]]
565 for suc in mark[1]:
566 productresult = []
567 for prefix in markss:
568 for suffix in cache[suc]:
569 newss = list(prefix)
570 for part in suffix:
571 # do not duplicated entry in successors set
572 # first entry wins.
573 if part not in newss:
574 newss.append(part)
575 productresult.append(newss)
576 markss = productresult
577 succssets.extend(markss)
578 cache[current] = list(set(tuple(r) for r in succssets if r))
579 return cache[initialnode]
580
405 def _knownrevs(repo, nodes): 581 def _knownrevs(repo, nodes):
406 """yield revision numbers of known nodes passed in parameters 582 """yield revision numbers of known nodes passed in parameters
407 583
408 Unknown revisions are silently ignored.""" 584 Unknown revisions are silently ignored."""
409 torev = repo.changelog.nodemap.get 585 torev = repo.changelog.nodemap.get