automation: perform tasks on remote machines
Sometimes you don't have access to the machine a task requires.
For example, you may not have access to a Windows
machine required to build Windows binaries or run tests on that
platform.
This commit introduces a pile of code intended to help
"automate" common tasks, like building release artifacts.
In its current form, the automation code provides functionality
for performing tasks on Windows EC2 instances.
The hgautomation.aws module provides functionality for integrating
with AWS. It manages AWS resources such as IAM roles, EC2
security groups, AMIs, and instances.
The hgautomation.windows module provides a higher-level
interface for performing tasks on remote Windows machines.
The hgautomation.cli module provides a command-line interface to
these higher-level primitives.
I attempted to structure Windows remote machine interaction
around WinRM / PowerShell Remoting. This is kinda/sorta like
SSH + shell, but for Windows. In theory, most of the functionality
is cloud provider agnostic, as we should be able to use any
established WinRM connection to interact with a remote. In
reality, we're tightly coupled to AWS at the moment because
I didn't want to prematurely add abstractions for a 2nd cloud
provider. (1 was hard enough to implement.)
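To illustrate the "any established WinRM connection" idea, here is a minimal sketch, not the code in this commit. It assumes a connection object exposing an `execute_ps(script)` method that returns an `(output, streams, had_errors)` tuple; that interface, the `run_powershell` helper, and the `EchoClient` stand-in are all hypothetical names chosen for illustration.

```python
class EchoClient(object):
    """Hypothetical stand-in for a real WinRM connection.

    It lets the sketch run without a remote machine. A real client would
    send the script to the remote and return what PowerShell produced.
    """

    def execute_ps(self, script):
        # Assumed return shape: (output, streams, had_errors).
        return 'ran: %s' % script, None, False


def run_powershell(client, script):
    """Run a PowerShell script over an already-established connection.

    Nothing here knows which cloud provider (if any) created the machine;
    the only requirement is the assumed execute_ps() method on ``client``.
    """
    output, streams, had_errors = client.execute_ps(script)
    if had_errors:
        raise RuntimeError('remote PowerShell reported errors: %r' % script)
    return output


if __name__ == '__main__':
    print(run_powershell(EchoClient(), 'hostname'))
```

Because the helper only depends on the connection object, swapping AWS for another provider would, in theory, only change how that object gets created.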
In the aws module is code for creating an image with a fully
functional Mercurial development environment. It contains VC9,
VC2017, msys, and other dependencies. The image is fully capable
of building all the existing Mercurial release artifacts and
running tests.
There are a few things that don't work, such as running
Windows tests with Python 3. But building the Windows release
artifacts does work, and that was an impetus for this work.
(Although we don't yet support code signing.)
Getting this functionality to work was extremely time consuming.
It took hours debugging permissions failures and other wonky
behavior due to PowerShell Remoting. (The permissions model for
PowerShell is crazy and you brush up against all kinds of
issues stemming from the privileges of the user running
PowerShell and the permissions of the PowerShell session
itself.)
The functionality around AWS resource management could use some
improving. In theory we support shared tenancy via resource
name prefixing. In reality, we don't offer a way to configure
this.
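For the curious, resource name prefixing can be sketched roughly like this. It is a simplified illustration under assumptions, not the code in the aws module: the helper names `prefixed_name` and `owned_by` and the example prefixes are hypothetical.

```python
def prefixed_name(prefix, name):
    """Return a resource name scoped to one tenant via a prefix."""
    return '%s%s' % (prefix, name)


def owned_by(prefix, names):
    """Return only the names that carry this tenant's prefix.

    Note: an empty prefix matches every name, so a tenant sharing an
    account would need a non-empty prefix.
    """
    return [n for n in names if n.startswith(prefix)]


if __name__ == '__main__':
    # Two hypothetical tenants sharing one AWS account.
    existing = [
        prefixed_name('alice-', 'windows-dev'),
        prefixed_name('bob-', 'windows-dev'),
    ]
    print(owned_by('alice-', existing))
```

Each tenant only ever creates, lists, or deletes resources whose names start with its own prefix, so two users can share an account without stepping on each other.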
Speaking of AWS resource management, I thought about using a tool
like Terraform to manage resources. But at our scale, writing a
few dozen lines of code to manage resources seemed acceptable.
Maybe we should reconsider this if things grow out of control.
Time will tell.
Currently, emphasis is placed on Windows. But I only started
there because it was likely to be the most difficult to implement.
It should be relatively trivial to automate tasks on remote Linux
machines. In fact, I have a ~1 year old script to run tests on a
remote EC2 instance. I will likely be porting that to this new
"framework" in the near future.
# no-check-commit because foo_bar functions
Differential Revision: https://phab.mercurial-scm.org/D6142
#!/usr/bin/env python
#
# Dumps output generated by Mercurial's command server in a formatted style to a
# given file or stderr if '-' is specified. Output is also written in its raw
# format to stdout.
#
# $ ./hg serve --cmds pipe | ./contrib/debugcmdserver.py -
# o, 52 -> 'capabilities: getencoding runcommand\nencoding: UTF-8'
from __future__ import absolute_import, print_function
import struct
import sys
if len(sys.argv) != 2:
    print('usage: debugcmdserver.py FILE')
    sys.exit(1)
outputfmt = '>cI'
outputfmtsize = struct.calcsize(outputfmt)
if sys.argv[1] == '-':
    log = sys.stderr
else:
    log = open(sys.argv[1], 'a')
def read(size):
    data = sys.stdin.read(size)
    if not data:
        raise EOFError
    sys.stdout.write(data)
    sys.stdout.flush()
    return data
try:
    while True:
        header = read(outputfmtsize)
        channel, length = struct.unpack(outputfmt, header)
        log.write('%s, %-4d' % (channel, length))
        if channel in 'IL':
            log.write(' -> waiting for input\n')
        else:
            data = read(length)
            log.write(' -> %r\n' % data)
        log.flush()
except EOFError:
    pass
finally:
    if log != sys.stderr:
        log.close()