Using Amazon Elastic Block Store (EBS) for Persistent Data Storage

You can create an Amazon Elastic Block Store (EBS) volume from the AWS Console. Like EFS, EBS provides persistent data storage, but it performs better when working with a large number of small files. Unlike EFS, an EBS volume cannot be shared across multiple VMs. To share persistent data storage, use EFS.

EBS is available in most regions with Amazon EC2 P3 instances.

Creating an EBS Volume

  1. Open the EBS Volumes Console.

    Go to the main AWS console, click EC2, then expand Elastic Block Store from the side menu, if necessary, and click Volumes.

  2. Click Create Volume.
  3. Make selections at the Create Volume page.
    • Select General Purpose SSD (gp2) for the Volume Type.
    • Specify the volume size and Availability Zone.
    • (Optional) Add Tags.
    • Encryption is not needed if you are working with public datasets.
    • Snapshot ID is not needed.
  4. Review the options and then click Create Volume.

Attaching an EBS Volume to the EC2 Instance

  1. Once you have created the EBS volume, select the volume and then select Actions->Attach Volume.
  2. Specify your EC2-instance ID as well as a drive letter for the device name (for example, sdf), then click Attach.

    This creates a /dev/xvdf virtual disk (or one named for the drive letter that you picked) on your EC2 instance.

    You can view the volume by running the lsblk command.
    ~$ lsblk 
    
    NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0  128G  0 disk
    └─xvda1 202:1    0  128G  0 part /
    xvdf    202:16   0  250G  0 disk
  3. Create a filesystem on the EBS volume.
    ~# mkfs.ext4 /dev/xvdf 
    
    mke2fs 1.42.13 (17-May-2015) 
    Creating filesystem with 65536000 4k blocks and 16384000 inodes 
    Filesystem UUID: b0e3dee3-bf86-4e69-9488-cf4d4b57b367 
    Superblock backups stored on blocks:
     32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
     4096000, 7962624, 11239424, 20480000, 23887872 
    Allocating group tables: done 
    Writing inode tables: done 
    Creating journal (32768 blocks): done 
    Writing superblocks and filesystem accounting information: done 
  4. Mount the volume to a mount directory, creating the directory first if it does not exist.
    ~# mkdir -p /data
    ~# mount /dev/xvdf /data

    To mount the volume automatically every time the instance restarts, add an entry to /etc/fstab. For details, refer to "Making a Volume Available for Use" in the Amazon documentation.
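
The /etc/fstab entry for this volume might look like the following sketch. It assumes the ext4 filesystem and the /data mount point used in the steps above. The fstab_line helper name is hypothetical, and the nofail option (which lets the instance boot even if the volume is missing) is an assumption based on common AWS guidance, not something this guide specifies. The UUID shown is the one from the example mkfs.ext4 output; yours will differ.

```shell
#!/bin/sh
# fstab_line UUID MOUNTPOINT
# Hypothetical helper: prints an /etc/fstab entry for an ext4 volume.
# Using the filesystem UUID avoids depending on the device name,
# which can change between boots.
fstab_line() {
  printf 'UUID=%s  %s  ext4  defaults,nofail  0  2\n' "$1" "$2"
}

# Print the entry for review (UUID taken from the example mkfs.ext4 output).
fstab_line b0e3dee3-bf86-4e69-9488-cf4d4b57b367 /data

# To apply it (assumption: run on the instance, as a user with sudo rights):
#   fstab_line "$(sudo blkid -s UUID -o value /dev/xvdf)" /data | sudo tee -a /etc/fstab
#   sudo mount -a    # confirm the entry mounts cleanly before rebooting
```

Running sudo mount -a before a reboot is a cheap way to catch a malformed entry while you still have a working session on the instance.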

Deleting an EBS Volume

Be aware that deleting an EBS volume is permanent; you cannot undelete it.

  1. Open the EBS Volumes Console.

    Go to the main AWS console, click EC2, then expand Elastic Block Store from the side menu, if necessary, and click Volumes.

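  2. Select your EBS volume.
  3. Detach the volume from the EC2 instance. If the volume is still mounted, unmount it on the instance first (for example, with sudo umount /data).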

    Select Actions->Detach Volume, then click Yes, Detach from the confirmation dialog.

  4. Delete the storage volume.

    Select Actions->Delete Volume and then click Yes, Delete from the confirmation dialog.

Copying Datasets to the EBS Volume

Once you have created the EBS volume, you can upload datasets to the volume or copy a dataset from an existing EFS file system.

Uploading a Dataset to the EBS Volume

  1. Mount the EBS volume to /data.

    Issue the following to perform the one-time mount.

    sudo mkdir /data 
    sudo mount /dev/xvdf /data 
    sudo chmod 777 /data 
  2. Copy the dataset onto the EBS volume in /data.
    scp -i <.pem> -r local_dataset_dir/ ubuntu@<ec2-instance>:/data
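
Before copying a large dataset, it can help to confirm that /data is really backed by the EBS volume. If the mount step was skipped, the files land on the root disk instead. The following is a minimal sketch, assuming the /dev/xvdf device name used in this guide; the helper names backing_device and is_on_device are hypothetical.

```shell
#!/bin/sh
# backing_device PATH
# Prints the device (first column of POSIX-format df output) that stores PATH.
backing_device() {
  df -P "$1" | awk 'NR==2 {print $1}'
}

# is_on_device PATH DEVICE
# Succeeds only if PATH is stored on DEVICE.
is_on_device() {
  [ "$(backing_device "$1")" = "$2" ]
}

# Example (run on the EC2 instance before starting the scp transfer):
#   is_on_device /data /dev/xvdf || echo "/data is not on the EBS volume" >&2
```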

Copying a Dataset from Existing EFS Storage to the EBS Volume

  1. Mount the EFS storage to /efs, using the EFS storage DNS name.

    Issue the following to perform the one-time mount.

    sudo mkdir /efs
    sudo mount -t nfs4 -o \
      nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
      EFS-DNS-NAME:/ /efs
    sudo chmod 777 /efs
     
  2. Copy the dataset from the EFS to the EBS volume.
    sudo cp -r /efs/<dataset> /data
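
After the copy finishes, a quick sanity check is to compare the number of files in the source and destination trees. The following is a minimal sketch with hypothetical helper names (count_files, same_file_count). It assumes file names contain no newline characters, and it compares only counts, not file contents.

```shell
#!/bin/sh
# count_files DIR
# Prints the number of regular files under DIR.
count_files() {
  find "$1" -type f | wc -l | tr -d '[:space:]'
}

# same_file_count SRC DST
# Succeeds only if both trees contain the same number of regular files.
same_file_count() {
  [ "$(count_files "$1")" = "$(count_files "$2")" ]
}

# Example (run on the instance; replace <dataset> with your directory name):
#   same_file_count "/efs/<dataset>" "/data/<dataset>" && echo "file counts match"
```

A matching count does not prove the copy is byte-for-byte identical; for a stricter check, diff -rq on the two directories compares contents at the cost of reading every file.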

Managing an EBS Volume Using AWS CLI

It is recommended that you use the AWS Console for EBS management. If you need to manage EBS volumes with the CLI, NVIDIA provides scripts on GitHub at https://github.com/nvidia/ngc-examples.

These scripts will let you perform basic EBS management and can serve as the basis for further automation.
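
For orientation, the console steps in this section map onto a handful of aws ec2 subcommands. The sketch below is not the NVIDIA scripts; it is an illustration that assumes the AWS CLI is installed and configured. The function names and the DRY_RUN switch are hypothetical. By default the commands are only printed so that you can review them; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the scripts from the ngc-examples repository.
# DRY_RUN defaults to 1: commands are printed for review instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# create_volume SIZE_GIB AVAILABILITY_ZONE: create a General Purpose SSD volume.
create_volume() {
  run aws ec2 create-volume --volume-type gp2 --size "$1" --availability-zone "$2"
}

# attach_volume VOLUME_ID INSTANCE_ID DEVICE: attach the volume to an instance.
attach_volume() {
  run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# detach_volume VOLUME_ID: detach the volume (unmount it on the instance first).
detach_volume() {
  run aws ec2 detach-volume --volume-id "$1"
}

# delete_volume VOLUME_ID: permanently delete the volume; this cannot be undone.
delete_volume() {
  run aws ec2 delete-volume --volume-id "$1"
}
```

For example, create_volume 250 us-east-1a prints the create-volume command for a 250 GiB volume. The Availability Zone here is a placeholder and must match the zone of the EC2 instance you plan to attach the volume to.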