For clean Markdown of any page, append .md to the page URL. For a complete documentation index, see https://docs.nvidia.com/aistore/blog/llms.txt. For full documentation content, see https://docs.nvidia.com/aistore/blog/llms-full.txt.

The goal now is to deploy our first ETL and have AIStore run it on each storage node, harnessing the distributed power (and close to data - meaning, **fast**). For the problem statement, background and terms, please see the previous post:

* [AIStore & ETL: Introduction](https://aiatscale.org/blog/2021/10/21/ais-etl-1)

To quickly get to the point, we'll assume that an instance of AIStore has been already deployed on Kubernetes. 

> Check out our dedicated [ais-k8s repository](https://github.com/NVIDIA/ais-k8s/) for the multiple easy ways to accomplish Kubernetes deployments.
> For a quick local deployment without the operator, refer to [deploy/dev/k8s](https://github.com/NVIDIA/aistore/blob/main/deploy/dev/k8s/README).

We'll be using PyTorch's `torchvision` to transform [ImageNet dataset](https://www.image-net.org/) - as illustrated:

![AIS-ETL Overview](https://files.buildwithfern.com/aistore.docs.buildwithfern.com/aistore/56b0bd5ae6b74aedeb7c23586009443d2ff0f292c3a29a83e4faec21a9cec787/pages/assets/ais_etl_series/ais-etl-overview.png)

Also, in the examples below you'll notice `ais` command. That's [AIStore’s CLI](https://aiatscale.org/docs/cli) tool providing unmatched (well, almost) convenience and ease-of-use. Most of the time, though, we'll show the equivalent `curl`.

> For a variety of practical reasons, `curl` proves to be handly in use cases - in `bash` and Python scripts on the client side. Big part of that is its (i.e., the `curl`'s) ubiquitous nature. Great tool, overall.

## The Dataset

The dataset we have here is derived from the original ImageNet and is only slightly different. Its training part exists under a `train/` directory, validation - under `val/`. Each `*.jpg` image has a corresponding `*.cls` object with the corresponding class number. Image are assigned to one of the 1000 classes; classes are represented as integers in the `[0 - 999]` range:

```console
$ ais ls ais://imagenet
NAME             SIZE
train/0000353.cls     1B
train/0000353.jpg     17.88KiB
...
val/1280048.cls       3B
val/1280048.jpg       82.78KiB
```

A random (non-transformed) image from the dataset (and again, notice `ais` [CLI](https://aiatscale.org/docs/cli) usage):

```console
$ ais get ais://imagenet/train/0278350.jpg
$ open 0278350.jpg
```

![example dog image](https://files.buildwithfern.com/aistore.docs.buildwithfern.com/aistore/92ac7740a6c238c58c1f207921f61163307864b203a0e2135b6ee5edda59292b/pages/assets/imagenet_pytorch_aistore/0278350.jpg)

And the associated *class*:

```console
$ ais object cat ais://imagenet/train/0278350.cls
217
```

## The Plan and the Code

The plan, essentially, is two-fold:

1. Deploy provided transformation code (called `code.py` below) as ETL K8s container aka *transformer*.
2. Drive *transformer* from the PyTorch-based client to transform requested objects (shards) as required.

In the end, each image from the dataset, before it reaches the model, goes through a series of the following (`code.py`) transformations:

```python
# `code.py`:
import io, sys
import torch
from PIL import Image
from torchvision import transforms

def img_to_bytes(img):
    buf = io.BytesIO()
    img = img.convert('RGB')
    img.save(buf, format='JPEG')
    return buf.getvalue()

preprocessing = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    transforms.ToPILImage(),
    transforms.Lambda(img_to_bytes),
])

if __name__ == "__main__":
    input_bytes = sys.stdin.buffer.read()
    image = Image.open(io.BytesIO(input_bytes)).convert('RGB')
    processed_bytes = preprocessing(image)
    sys.stdout.buffer.write(processed_bytes)
```

## Initializing

We will use `python3` (`python:3.9`) *runtime* to install `torch` and `torchvision` packages.

To make sure that `code.py` (above) can have its imports, the following (`deps.txt`) dependencies must be installed:

```
torch==1.6.0
torchvision==0.7.0
```

With transforming code and dependencies covered, we are now fully ready to initialize ETL in the cluster:

```console
$ ais etl init code \
  --name my-first-etl \
  --from-file code.py \
  --deps-file deps.txt \
  --runtime python3 \
  --comm-type io://
```

Notice the "elements" of this `ais` command:

* user-given name of the specific (`code.py`) transformation;
* the dependencies;
* the aforementioned `runtime`, and, finally -
* `--comm-type` ("communication type") option briefly already [mentioned](https://aiatscale.org/blog/2021/10/21/ais-etl-1) - and we'll discuss it in-depth in our future postings.

Two ways to check that ETL is up and running: `ais` CLI ("way") and `kubectl`:

```
$ ais etl list
NAME
my-first-etl

$ kubectl -n ais get pods | grep ‘my-first-etl’
ais   my-first-etl-iacjhrvc                            1/1     Running   0          50s
```

Recap what we just did:

1. We prepared Python3 code (`code.py`) and provided dependencies (`deps.txt`) to run it.
2. We started transformer in the AIS cluster to, subsequently, augment images from the ImageNet dataset.

## Transforming a single object

To get a single object, we will use the [AIStore’s CLI](https://aiatscale.org/docs/cli), and we will show equivalent `curl` commands which can be useful in certain situations.
Those commands are convenient for quick testing and specific solutions which are written e.g. in `bash`.

The original image:

```console
$ ais get ais://imagenet/train/0278350.jpg 0278350.jpg

# Equivalent `curl`:
$ curl -O 0278350.jpg https://aistore/v1/objects/imagenet/train/0278350.jpg
```

and, the image after the transformation:

```console
$ ais etl object my-first-etl ais://imagenet/train/0278350.jpg 0278350.jpg

# Equivalent `curl`:
$ curl -O 0278350.jpg https://aistore/v1/objects/imagenet/train/0278350.jpg?uuid=my-first-etl
```

Notice that the only difference between these two `curl` (and, respectively, `ais`) commands above - is the `uuid` parameter referencing existing and available (`my-first-etl`) transformation that we have just [previously initialized](#initializing).

Post-transform `0278350.jpg` image:

![example dog image transformed](https://files.buildwithfern.com/aistore.docs.buildwithfern.com/aistore/91b6ba1ca3820ff165e31f3d761dec694d0d73bfb8fb80c9316cd71a48ca8aa6/pages/assets/imagenet_pytorch_aistore/0278350-transformed.jpg)

### AIS/PyTorch connector

So far we have set up ETL and tried our first cluster-resident transformations. We can now start running a real training model. For this purpose, we have prepared a slightly modified version of the [PyTorch ImageNet example](https://github.com/pytorch/examples/tree/master/imagenet) that can be found [here](https://gist.github.com/VirrageS/7e2c80635e0efae3e63b5e3d5d2aaaf6). The script contains training and validation code for the ImageNet dataset.

Next step is to modify the script to utilize `my-first-etl` transformer.

The typical code for loading ImageNet from a local directory looks like this:

```python
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

train_loader = torch.utils.data.DataLoader(datasets.ImageFolder(
       os.path.join(args.data, 'train'),
       transforms.Compose([
           transforms.RandomResizedCrop(224),
           transforms.RandomHorizontalFlip(),
           transforms.ToTensor(),
           normalize,
       ]),
    ),
    batch_size=args.batch_size, shuffle=True,
    num_workers=args.workers, pin_memory=True)

val_loader = torch.utils.data.DataLoader(datasets.ImageFolder(
       os.path.join(args.data, 'val'),
       transforms.Compose([
           transforms.Resize(256),
           transforms.CenterCrop(224),
           transforms.ToTensor(),
           normalize,
       ]),
    ),
    batch_size=args.batch_size, shuffle=False,
    num_workers=args.workers, pin_memory=True)
```

> Full code for the example above is also available - see [ImageNet PyTorch training with `dataset.ImageFolder`](https://github.com/NVIDIA/aistore/blob/main/examples/etl-imagenet-dataset/train_pytorch.py).

In the world of PyTorch, all datasets are subclasses of [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) with [`torchvision.datasets.ImageFolder`](https://pytorch.org/vision/stable/datasets.html) being the standard for handling datasets with labeled images.

To integrate with PyTorch and *offload* transformations to AIStore, we introduce `aistore.pytorch.Dataset` - the implementation of [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset).

With `aistore.pytorch.Dataset`, the example above works out as follows:

```python
import aistore
from aistore.client import Bck

...

train_loader = torch.utils.data.DataLoader(
    aistore.pytorch.Dataset(
        "http://aistore-sample-proxy:51080", # AIS IP address or hostname
	Bck("imagenet"),
        prefix="train/", transform_id="my-first-etl",
        transform_filter=lambda object_name: object_name.endswith('.jpg'),
    ),
    batch_size=args.batch_size, shuffle=True,
    num_workers=args.workers, pin_memory=True)

val_loader = torch.utils.data.DataLoader(
    aistore.pytorch.Dataset(
        "http://aistore-sample-proxy:51080", # AIS IP address or hostname
	Bck("imagenet"),
        prefix="val/", transform_id="my-second-etl", # We skipped setting up this ETL.
        transform_filter=lambda object_name: object_name.endswith('.jpg'),
    ),
    batch_size=args.batch_size, shuffle=False,
    num_workers=args.workers, pin_memory=True)
```

Complete code is available here:

* [ImageNet PyTorch training with `aistore.pytorch.Dataset`](https://github.com/NVIDIA/aistore/blob/main/examples/etl-imagenet-dataset/train_aistore.py)

## References

1. [AIStore & ETL: Introduction](https://aiatscale.org/blog/2021/10/21/ais-etl-1)
2. GitHub:
    - [AIStore](https://github.com/NVIDIA/aistore)
    - [AIS/Kubernetes Operator, AIS on bare-metal, Deployment Playbooks, Helm](https://github.com/NVIDIA/ais-k8s)
    - [AIS-ETL containers and specs](https://github.com/NVIDIA/ais-etl)
2. Documentation, blogs, videos:
    - https://aiatscale.org
    - https://github.com/NVIDIA/aistore/tree/main/docs

PS. Note that we have omitted setting-up ETL for the validation loader - leaving it as an exercise for the reader. To be continued...