Tools for AWS Cloud Management in MagellanMapper

MagellanMapper provides a few tools for basic management of an Amazon Web Services (AWS) platform to run pipelines in the cloud. These tools wrap the AWS Boto3 Python client to simplify common tasks related to MagellanMapper.

Dependencies

  • awscli: AWS Command Line Interface for basic uploading and downloading of images and processed files to and from S3. Install via Pip.

  • boto3: AWS Python client to manage EC2 instances.
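
Before using these tools, it can help to confirm that both dependencies are available. The following is a minimal sketch rather than part of MagellanMapper: the function name check_aws_dependencies is hypothetical, and it assumes that awscli exposes an executable named aws on the PATH.

```python
import importlib.util
import shutil


def check_aws_dependencies():
    """Report whether the AWS CLI and Boto3 appear to be installed.

    Hypothetical helper for illustration; not part of MagellanMapper.
    """
    return {
        # awscli is assumed to install an executable named "aws"
        "awscli": shutil.which("aws") is not None,
        # find_spec returns None when the package cannot be located
        "boto3": importlib.util.find_spec("boto3") is not None,
    }


if __name__ == "__main__":
    for name, found in check_aws_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A "missing" result for either entry suggests installing that package before continuing.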

Launch EC2 Instances

You can launch a new EC2 instance of a custom AMI via MagellanMapper, such as an AMI with MagellanMapper pre-installed as described below.

To start an instance, use this command:

python -u -m magmap.io.aws --ec2_start "Name" "ami-xxxxxxxx" "m5.4xlarge" \
  "subnet-xxxxxxxx" "sg-xxxxxxxx" "UserName" 50,2000 [2]
  • Name is a name of your choice for the instance

  • ami is your previously saved AMI with MagellanMapper

  • m5.4xlarge is the instance type, which can be changed depending on your performance requirements

  • subnet is your subnet group

  • sg is your security group

  • UserName is the user name whose security key will be uploaded for SSH access

  • 50,2000 creates a 50 GB swap drive and a 2000 GB data drive; change these sizes depending on your needs

  • 2 starts two instances (optional, defaults to 1)
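
To show roughly how these arguments could map onto Boto3, here is a sketch of the keyword arguments for the EC2 client's run_instances call. It is not MagellanMapper's implementation. The helper names, the device names /dev/sdf and /dev/sdg, the gp2 volume type, and the use of UserName as the key pair name are all assumptions made for illustration.

```python
def build_run_instances_params(name, ami_id, instance_type, subnet_id,
                               sec_group_id, key_name, ebs_sizes_gb,
                               count=1):
    """Build keyword arguments for the Boto3 EC2 ``run_instances`` call.

    Illustrative only: the device names, volume type, and the mapping of
    arguments to parameters are assumptions, not MagellanMapper's code.
    """
    # assumed device names, one per requested EBS volume (swap, then data)
    device_names = ["/dev/sdf", "/dev/sdg"]
    mappings = [
        {"DeviceName": dev, "Ebs": {"VolumeSize": size, "VolumeType": "gp2"}}
        for dev, size in zip(device_names, ebs_sizes_gb)
    ]
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "SubnetId": subnet_id,
        "SecurityGroupIds": [sec_group_id],
        "KeyName": key_name,
        "BlockDeviceMappings": mappings,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": name}],
        }],
    }


def launch(params):
    """Launch instances; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("ec2").run_instances(**params)


params = build_run_instances_params(
    "Name", "ami-xxxxxxxx", "m5.4xlarge", "subnet-xxxxxxxx",
    "sg-xxxxxxxx", "UserName", (50, 2000), count=2)
print(params["BlockDeviceMappings"])
```

Keeping the parameter builder separate from the launch call lets you inspect the request before any instance is started or billed.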

To log into the server with graphical support, SSH into your server instance with port forwarding to allow VNC access:

ssh -L 5900:localhost:5900 -i [your_aws_pem] ec2-user@[your_server_ip]

Start a graphical server (e.g. vncserver) to run ImageJ/Fiji for stitching or for Mayavi dependency setup. Now you can access the server graphically using a VNC client pointing to localhost with port 5900.
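
If the VNC client cannot connect, a quick check is whether the forwarded port is listening locally. The sketch below uses a hypothetical helper, port_is_open, built on Python's standard socket module. Note that a successful connection only shows that the SSH tunnel is listening on the local port, not that a VNC server is running on the remote side.

```python
import socket


def port_is_open(host="localhost", port=5900, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, or unreachable
        return False


if __name__ == "__main__":
    print("Forwarded VNC port reachable:", port_is_open())
```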

Set Up Volumes

The setup_server.sh script sets up volumes on a new server instance, formatting and mounting the data and swap drives or creating a swap file:

bin/setup_server.sh -d [path_to_data_device] -w [path_to_swap_device] \
    -f [size_of_swap_file] -u [username]
  • -d is the main data drive, typically a drive large enough for your full image file

  • -w is a swap drive or file path

  • -f is a swap file size in GB, used if swap is set to a path rather than a device

  • -n maps device names to NVMe names, so that drive names such as sdf are translated to the corresponding NVMe-style names (e.g. /dev/nvme0n1)

  • -u is the username, used to change ownership of the mounted drives; defaults to ec2-user and should be changed to ubuntu for Ubuntu-based AMIs

  • -s sets up fresh drives, including formatting; omit it when re-mounting drives that have already been formatted
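
The flags above combine differently for first-time setup and for later sessions. The following sketch assembles the argument list in Python. The helper build_setup_server_cmd is hypothetical and the device paths are placeholders; only the flags described above are used.

```python
import shlex


def build_setup_server_cmd(data_dev=None, swap=None, swap_size_gb=None,
                           user=None, nvme=False, fresh=False):
    """Assemble an argument list for bin/setup_server.sh.

    Hypothetical helper; it only uses flags documented for the script.
    """
    cmd = ["bin/setup_server.sh"]
    if data_dev:
        cmd += ["-d", data_dev]
    if swap:
        cmd += ["-w", swap]
    if swap_size_gb is not None:
        cmd += ["-f", str(swap_size_gb)]
    if user:
        cmd += ["-u", user]
    if nvme:
        cmd.append("-n")
    if fresh:
        cmd.append("-s")
    return cmd


# first-time setup on an Ubuntu-based AMI (device paths are placeholders)
first = build_setup_server_cmd("/dev/sdf", "/dev/sdg", user="ubuntu",
                               nvme=True, fresh=True)
# later sessions: re-mount the same drives without reformatting
later = build_setup_server_cmd("/dev/sdf", "/dev/sdg", user="ubuntu",
                               nvme=True)
print(shlex.join(first))
print(shlex.join(later))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the MagellanMapper directory on the server.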


Set Up a Server with MagellanMapper

After launching a server, you can set up MagellanMapper by downloading it within the server or by deploying custom files, such as a local branch, using the deploy.sh script.

Install/Update From Main Repository

In the standard setup, graphical support (e.g. via vncserver) is typically required during installation for Mayavi and stitching, but you can alternatively run a lightweight install without a GUI (see Readme).

See Installation for downloading and installing MagellanMapper within the server.

Install/Update From Local Branch

The deployment script allows deploying MagellanMapper using local files including custom modifications. It also downloads Fiji for the image stitching pipeline. Use this command to deploy local files to a server:

bin/deploy.sh -p [path_to_your_aws_pem] -i [server_ip] \
    -d [optional_file0] -d [optional_file1]
  • This script by default will:

    • Archive the MagellanMapper Git directory and scp it to the server, using your .pem file to access it

    • Download and install ImageJ/Fiji onto the server

    • Update Fiji and install BigStitcher for image stitching

  • To only update an existing MagellanMapper directory on the server, add -u

  • To add multiple files or folders such as .aws credentials, use the -d option as many times as you’d like

  • After running this script, log in and install MagellanMapper if it has not been previously set up

Run MagellanMapper on Server

When returning to a server with MagellanMapper already set up, you’ll need to perform the following tasks:

  • Re-mount drives using setup_server.sh

  • Activate the Conda environment set up during installation

Now you can use the pipelines.sh script to perform tasks on whole images. For example, this command fully processes a multi-tile image with tile stitching, import to a NumPy array, and cell detection, with AWS S3 import/export and Slack notifications along the way, followed by server clean-up/shutdown:

bin/process_nohup.sh -d "out_experiment.txt" -o -- bin/pipelines.sh \
  -i "/data/HugeImage.czi" -a "my/s3/bucket" -n \
  "https://hooks.slack.com/services/my/incoming/webhook" -p full -c