view contrib/automation/README.rst @ 42285:65b3ef162b39
automation: initial support for running Linux tests
Building on top of our Windows automation support, this commit
implements support for performing automated tasks on remote Linux
machines. Specifically, we implement support for running tests
on ephemeral EC2 instances. This seems to be a worthwhile place
to start, as building packages on Linux is more or less a solved
problem because we already have facilities for building in Docker
containers, which provide "good enough" reproducibility guarantees.
The new `run-tests-linux` command works similarly to
`run-tests-windows`: it ensures an AMI with hg dependencies is
available, provisions a temporary EC2 instance with this AMI, pushes
local changes to that instance via SSH, then invokes `run-tests.py`.
Using this new command, I am able to run the entire test harness
substantially faster than I can on my local machine, courtesy of
access to EC2 instances with very high core counts:
wall: 16:20 ./run-tests.py -l (i7-6700K)
wall: 14:00 automation.py run-tests-linux --ec2-instance c5.2xlarge
wall: 8:30 automation.py run-tests-linux --ec2-instance m5.4xlarge
wall: 8:04 automation.py run-tests-linux --ec2-instance c5.4xlarge
wall: 4:30 automation.py run-tests-linux --ec2-instance c5.9xlarge
wall: 3:57 automation.py run-tests-linux --ec2-instance m5.12xlarge
wall: 3:05 automation.py run-tests-linux --ec2-instance m5.24xlarge
wall: 3:02 automation.py run-tests-linux --ec2-instance c5.18xlarge
~3 minute wall time to run pretty much the entire test harness is
not too bad!
The AMIs ship with multiple versions of Python installed, and the
run-tests-linux command's --python argument selects which one to use:
automation.py run-tests-linux --python system3
automation.py run-tests-linux --python 3.5
automation.py run-tests-linux --python pypy2.7
By default, the system Python 2.7 is used. Using this functionality,
I was able to identify some unexpected test failures on PyPy!
Included in the feature is support for running with alternate
filesystems. You can simply pass --filesystem to the command to
specify the type of filesystem to run tests on. When the ephemeral
instance is started, a new filesystem will be created and tests
will run from it:
wall: 4:30 automation.py run-tests-linux --ec2-instance c5.9xlarge
wall: 4:20 automation.py run-tests-linux --ec2-instance c5d.9xlarge --filesystem xfs
wall: 4:24 automation.py run-tests-linux --ec2-instance c5d.9xlarge --filesystem tmpfs
wall: 4:26 automation.py run-tests-linux --ec2-instance c5d.9xlarge --filesystem ext4
We also support multiple Linux distributions:
$ automation.py run-tests-linux --distro debian9
total time: 298.1s; setup: 60.7s; tests: 237.5s; setup overhead: 20.4%
$ automation.py run-tests-linux --distro ubuntu18.04
total time: 286.1s; setup: 61.3s; tests: 224.7s; setup overhead: 21.4%
$ automation.py run-tests-linux --distro ubuntu18.10
total time: 278.5s; setup: 58.2s; tests: 220.3s; setup overhead: 20.9%
$ automation.py run-tests-linux --distro ubuntu19.04
total time: 265.8s; setup: 42.5s; tests: 223.3s; setup overhead: 16.0%
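The "setup overhead" figure in each summary line is the setup time as a
share of the total time. A minimal sketch that recomputes it from the
numbers above; the function name is hypothetical and is not part of
`automation.py`:

```python
def setup_overhead_percent(setup_seconds, total_seconds):
    """Return setup time as a percentage of total wall time."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * setup_seconds / total_seconds


# (distro, total seconds, setup seconds) copied from the runs above.
runs = [
    ("debian9", 298.1, 60.7),
    ("ubuntu18.04", 286.1, 61.3),
    ("ubuntu18.10", 278.5, 58.2),
    ("ubuntu19.04", 265.8, 42.5),
]

for distro, total, setup in runs:
    print("%s: %.1f%%" % (distro, setup_overhead_percent(setup, total)))
```

Rounded to one decimal place, this reproduces the 20.4%, 21.4%, 20.9% and
16.0% figures reported above.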
Debian and Ubuntu are supported because those are what I use and am
most familiar with. It should be easy enough to add support for other
distros.
Unlike the Windows AMIs, Linux EC2 instances bill per second. So
the cost of instantiating an ephemeral instance isn't as severe.
That being said, there is some overhead, as it takes several dozen
seconds to boot the instance, push local changes to it, and build
Mercurial. During this time, the instance is largely CPU idle and
wasting money. Even with this inefficiency, running tests is
relatively cheap: $0.15-$0.25 per full test run. A machine running
tests as efficiently as these EC2 instances would cost, say, $6,000, so
you can run the test harness >20,000 times for the cost of an
equivalent machine. Running tests in EC2 is almost certainly cheaper
than buying a beefy machine for developers to use :)
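The ">20,000 times" estimate follows from dividing the assumed machine
price by the per-run cost. A small sketch of that arithmetic, using only
the figures quoted above ($6,000 and $0.15-$0.25 per run); the function
name is hypothetical, and amounts are kept in integer cents to avoid
floating point rounding:

```python
def runs_per_budget(budget_cents, cost_per_run_cents):
    """Return how many whole test runs a budget covers at a per-run cost."""
    if cost_per_run_cents <= 0:
        raise ValueError("cost_per_run_cents must be positive")
    return budget_cents // cost_per_run_cents


MACHINE_PRICE_CENTS = 6000 * 100  # the "say $6,000" machine from the text

for cost_cents in (25, 15):  # $0.25 and $0.15 per full test run
    runs = runs_per_budget(MACHINE_PRICE_CENTS, cost_cents)
    print("at $0.%02d per run: %d runs" % (cost_cents, runs))
```

At the quoted prices this works out to between 24,000 and 40,000 runs,
which is consistent with the ">20,000" figure.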
# no-check-commit because foo_bar function names
Differential Revision: https://phab.mercurial-scm.org/D6319
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 27 Apr 2019 11:48:26 -0700 |
parents | b05a3e28cf24 |
children | 92593d72e10b |
====================
Mercurial Automation
====================

This directory contains code and utilities for building and testing Mercurial
on remote machines.

The ``automation.py`` Script
============================

``automation.py`` is an executable Python script (requires Python 3.5+)
that serves as a driver to common automation tasks.

When executed, the script will *bootstrap* a virtualenv in
``<source-root>/build/venv-automation`` then re-execute itself using
that virtualenv. So there is no need for the caller to have a virtualenv
explicitly activated. This virtualenv will be populated with various
dependencies (as defined by the ``requirements.txt`` file).

To see what you can do with this script, simply run it::

   $ ./automation.py

Local State
===========

By default, local state required to interact with remote servers is stored
in the ``~/.hgautomation`` directory.

We attempt to limit persistent state to this directory. Even when
performing tasks that may have side-effects, we try to limit those
side-effects so they don't impact the local system. For example, when we
SSH into a remote machine, we create a temporary directory for the SSH
config so the user's known hosts file isn't updated.

AWS Integration
===============

Various automation tasks integrate with AWS to provide access to
resources such as EC2 instances for generic compute.

This obviously requires an AWS account and credentials to work.

We use the ``boto3`` library for interacting with AWS APIs. We do not employ
any special functionality for telling ``boto3`` where to find AWS credentials.
See https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
for how ``boto3`` works. Once you have configured your environment such
that ``boto3`` can find credentials, interaction with AWS should *just work*.

.. hint::

   Typically you have a ``~/.aws/credentials`` file containing AWS
   credentials. If you manage multiple credentials, you can override which
   *profile* to use at run-time by setting the ``AWS_PROFILE`` environment
   variable.

Resource Management
-------------------

Depending on the task being performed, various AWS services will be accessed.
This of course requires AWS credentials with permissions to access these
services.

The following AWS services can be accessed by automation tasks:

* EC2
* IAM
* Simple Systems Manager (SSM)

Various resources will also be created as part of performing various tasks.
This also requires various permissions. The following AWS resources can be
created by automation tasks:

* EC2 key pairs
* EC2 security groups
* EC2 instances
* IAM roles and instance profiles
* SSM command invocations

When possible, we prefix resource names with ``hg-`` so they can easily
be identified as belonging to Mercurial.

.. important::

   We currently assume that AWS accounts utilized by *us* are single
   tenancy. Attempts to have discrete users of ``automation.py`` (including
   sharing credentials across machines) using the same AWS account can result
   in them interfering with each other and things breaking.

Cost of Operation
-----------------

``automation.py`` tries to be frugal with regards to utilization of remote
resources. Persistent remote resources are minimized in order to keep costs
in check. For example, EC2 instances are often ephemeral and only live as long
as the operation being performed.

Under normal operation, recurring costs are limited to:

* Storage costs for AMI / EBS snapshots. This should be just a few pennies
  per month.

When running EC2 instances, you'll be billed accordingly. Default instance
types vary by operation. We try to be respectful of your money when choosing
defaults. For example, for Windows instances, which are billed per hour, we
use ``t3.medium`` instances, which cost ~$0.07 per hour. For operations that
scale well to many CPUs like running Linux tests, we may use a more powerful
instance like ``c5.9xlarge``. However, since Linux instances are billed
per second and the cost of running a ``c5.9xlarge`` for half the time
of a ``c5.4xlarge`` is roughly the same, the choice is justified.

.. note::

   When running Windows EC2 instances, AWS bills at the full hourly cost, even
   if the instance doesn't run for a full hour (per-second billing doesn't
   apply to Windows AMIs).

Managing Remote Resources
-------------------------

Occasionally, there may be an error purging a temporary resource. Or you
may wish to forcefully purge remote state. Commands can be invoked to manually
purge remote resources.

To terminate all EC2 instances that we manage::

   $ automation.py terminate-ec2-instances

To purge all EC2 resources that we manage::

   $ automation.py purge-ec2-resources

Remote Machine Interfaces
=========================

The code that connects to a remote machine and executes things is
theoretically machine agnostic as long as the remote machine conforms to
an *interface*. In other words, to perform actions like running tests
remotely or triggering packaging, it shouldn't matter if the remote machine
is an EC2 instance, a virtual machine, etc. This section attempts to document
the interface that remote machines need to provide in order to be valid
*targets* for remote execution. These interfaces are often not ideal nor
the most flexible. Instead, they have often evolved as the requirements of
our automation code have evolved.

Linux
-----

Remote Linux machines expose an SSH server on port 22. The SSH server
must allow the ``hg`` user to authenticate using the SSH key generated by
the automation code. The ``hg`` user should be part of the ``hg`` group
and it should have ``sudo`` access without password prompting.

The SSH channel must support SFTP to facilitate transferring files from
client to server.

``/bin/bash`` must be executable and point to a bash shell executable.

The ``/hgdev`` directory must exist and all its content owned by ``hg::hg``.

The ``/hgdev/pyenv`` directory should contain an installation of
``pyenv``. Various Python distributions should be installed. The exact
versions shouldn't matter. ``pyenv global`` should have been run so
``/hgdev/pyenv/shims/`` is populated with redirector scripts that point
to the appropriate Python executable.

The ``/hgdev/venv-bootstrap`` directory must contain a virtualenv
with Mercurial installed. The ``/hgdev/venv-bootstrap/bin/hg`` executable
is referenced by various scripts and the client.

The ``/hgdev/src`` directory MUST contain a clone of the Mercurial
source code. The state of the working directory is not important.

In order to run tests, the ``/hgwork`` directory will be created.
This may require running various ``mkfs.*`` executables and ``mount``
to provision a new filesystem. This will require elevated privileges
via ``sudo``.

Various dependencies to run the Mercurial test harness are also required.
Documenting them is beyond the scope of this document. Various tests
also require other optional dependencies and missing dependencies will
be printed by the test runner when a test is skipped.
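
The filesystem part of the Linux interface above can be spot-checked with a
few lines of Python run on the candidate machine. The following is a minimal
sketch, not code from ``automation.py``: the function and constant names are
hypothetical, and it only looks at the paths named in this section. It does
not verify ownership, ``sudo`` access, SSH or SFTP.

```python
import os

# Paths come from the Linux interface description; the check is illustrative.
REQUIRED_DIRS = ["hgdev", "hgdev/pyenv", "hgdev/pyenv/shims", "hgdev/src"]
REQUIRED_EXECUTABLES = ["bin/bash", "hgdev/venv-bootstrap/bin/hg"]


def missing_interface_paths(root="/"):
    """Return required paths that are absent (or not executable) under root."""
    missing = []
    for rel in REQUIRED_DIRS:
        if not os.path.isdir(os.path.join(root, rel)):
            missing.append("/" + rel)
    for rel in REQUIRED_EXECUTABLES:
        path = os.path.join(root, rel)
        # Executables must be regular files (symlinks are followed) with +x.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append("/" + rel)
    return missing


if __name__ == "__main__":
    problems = missing_interface_paths()
    if problems:
        print("missing or not executable: " + ", ".join(problems))
    else:
        print("all checked paths are present")
```

The ``root`` parameter exists only so the check can be pointed at a scratch
directory when experimenting; on a real target it would stay at ``/``.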