# MinIO Batch Job
MinIO Batch Jobs is a MinIO object management feature that lets you manage objects at scale. MinIO currently supports the following job type:

- Replicate objects between buckets on multiple sites

Upcoming job types:

- Copy objects from NAS to MinIO
- Copy objects from HDFS to MinIO

## Replication Job
To perform replication via batch jobs, you create a job. The job consists of a job description YAML that describes:

- The source location from which the objects must be copied
- The target location to which the objects must be copied
- Fine-grained filters that select the relevant source objects to copy

The MinIO batch jobs framework also provides:

- Automatic retries of a failed job, driven by user input
- Real-time monitoring of job progress
- Notifications upon completion or failure, sent to a user-configured target
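The retry behaviour can be pictured as a bounded loop with a pause between tries. The Python sketch below is an illustration only, not MinIO's implementation. `run_with_retry` and its parameters are hypothetical names modelled on the `attempts` and `delay` retry settings of a job definition, and the sketch assumes that `attempts` counts the total number of tries and that the delay is a fixed pause; the server may count or space retries differently.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retry(
    task: Callable[[], T],
    attempts: int,
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task() until it succeeds or `attempts` tries have been made.

    Assumption: `attempts` is the total number of tries, and the pause
    between two tries is exactly `delay_seconds`.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as err:
            last_error = err
            # Pause only if another try is still allowed.
            if attempt < attempts:
                sleep(delay_seconds)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.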

The following YAML describes the structure of a replication job. Each value is documented with an inline comment.

```yaml
replicate:
  apiVersion: v1
  # source of the objects to be replicated
  source:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target where the objects must be replicated
  target:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # optional flags based filtering criteria
  # for all source objects
  flags:
    filter:
      newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
      olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
      createdAfter: "date" # match objects created after "date"
      createdBefore: "date" # match objects created before "date"

      ## NOTE: tags are not supported when "source" is remote.
      # tags:
      #   - key: "name"
      #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

      ## NOTE: metadata filter not supported when "source" is non MinIO.
      # metadata:
      #   - key: "content-type"
      #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

    notify:
      endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
      token: "Bearer xxxxx" # optional authentication token for the notification endpoint

    retry:
      attempts: 10 # number of retries for the job before giving up
      delay: "500ms" # least amount of delay between each retry
```
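As a worked illustration of the template, the Python sketch below renders a minimal job definition with a local source and a remote target, and rejects the combination the notes in the template rule out (both sides remote). `Location` and `render_replicate_job` are hypothetical helpers, not part of MinIO or `mc`, and the bucket names, endpoint, and credentials are placeholders. Requiring both keys for a remote side is an assumption of this sketch.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Location:
    """One side of a replication job (hypothetical helper, not a MinIO API)."""
    bucket: str
    prefix: str = ""
    endpoint: Optional[str] = None    # set only when this side is remote
    access_key: Optional[str] = None
    secret_key: Optional[str] = None

    def is_remote(self) -> bool:
        return self.endpoint is not None

    def to_yaml_lines(self, indent: str) -> List[str]:
        quote = json.dumps  # JSON string quoting is valid YAML double-quoted style
        lines = [
            f"{indent}type: minio",
            f"{indent}bucket: {quote(self.bucket)}",
            f"{indent}prefix: {quote(self.prefix)}",
        ]
        if self.is_remote():
            lines += [
                f"{indent}endpoint: {quote(self.endpoint)}",
                f"{indent}credentials:",
                f"{indent}  accessKey: {quote(self.access_key)}",
                f"{indent}  secretKey: {quote(self.secret_key)}",
            ]
        return lines


def render_replicate_job(source: Location, target: Location) -> str:
    # The template notes say: if one side is remote, the other must be local.
    if source.is_remote() and target.is_remote():
        raise ValueError("source and target cannot both be remote")
    for side in (source, target):
        # Assumption of this sketch: a remote side always carries both keys.
        if side.is_remote() and not (side.access_key and side.secret_key):
            raise ValueError("a remote side needs accessKey and secretKey")
    lines = ["replicate:", "  apiVersion: v1", "  source:"]
    lines += source.to_yaml_lines("    ")
    lines += ["  target:"]
    lines += target.to_yaml_lines("    ")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_replicate_job(
        Location(bucket="photos", prefix="2022/"),
        Location(bucket="photos-backup", endpoint="https://minio.example.net",
                 access_key="ACCESS-KEY", secret_key="SECRET-KEY"),
    ))
```

The output is indented with spaces only, which matters because YAML does not allow tabs for indentation.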

You can create and run multiple 'replication' jobs at a time; there are no predefined limits.

## Batch Jobs Terminology

### Job
A job is the basic unit of work for MinIO Batch Jobs. A job is a self-describing YAML file. Once this YAML is submitted and evaluated, MinIO performs the requested actions on each object that matches the criteria described in the job YAML file.

### Type
Type describes the job type, such as replicating objects between MinIO sites. Each job performs a single type of operation across all objects that match the job description criteria.
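To make "objects that match the job description criteria" concrete, the Python sketch below shows one way the `newerThan`, `olderThan`, and tag filters could be evaluated for a single object. It is an approximation for illustration, not MinIO's code: `parse_age` and `matches` are hypothetical names, the duration grammar is inferred from the `7d10h31s` example, and `fnmatchcase` stands in for whatever wildcard matching the server actually applies.

```python
import re
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase
from typing import Dict, Iterable, Optional, Tuple

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_age(text: str) -> timedelta:
    """Parse strings such as '7d' or '7d10h31s' (assumed grammar)."""
    parts = re.findall(r"(\d+)([dhms])", text)
    # Reject input with leftovers the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def matches(
    modified: datetime,
    tags: Dict[str, str],
    *,
    now: datetime,
    newer_than: Optional[str] = None,
    older_than: Optional[str] = None,
    tag_filters: Iterable[Tuple[str, str]] = (),
) -> bool:
    """Return True if the object passes every filter that is set."""
    age = now - modified
    if newer_than is not None and not age < parse_age(newer_than):
        return False
    if older_than is not None and not age > parse_age(older_than):
        return False
    # Every tag filter must find its key with a value matching the pattern.
    return all(
        key in tags and fnmatchcase(tags[key], pattern)
        for key, pattern in tag_filters
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    three_days_old = now - timedelta(days=3)
    print(matches(three_days_old, {"name": "pickles"}, now=now,
                  newer_than="7d", tag_filters=[("name", "pick*")]))
```

Only filters that are set take part in the decision, which mirrors the optional nature of each entry under `filter` in the job definition.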

## Batch Jobs via Commandline
[mc](http://github.com/minio/mc) provides the 'mc batch' command to create, start, and manage jobs.

```
NAME:
  mc batch - manage batch jobs

USAGE:
  mc batch COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  generate  generate a new batch job definition
  start     start a new batch job
  list, ls  list all current batch jobs
  status    summarize job events on MinIO server in real-time
  describe  describe job definition for a job
```

### Generate a job YAML
```
mc batch generate alias/ replicate
```

### Start the batch job (returns the job ID)
```
mc batch start alias/ ./replicate.yaml
Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2022-09-26 17:19:06.296974771 -0700 PDT'
```
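Scripts that go on to call `mc batch status` or `mc batch describe` need the job ID from this message. The Python sketch below pulls it out of the text form shown above; `extract_job_id` is a hypothetical helper, and it assumes the ID stays wrapped in backticks after the word `job`, which may change between `mc` releases.

```python
import re


def extract_job_id(start_output: str) -> str:
    """Return the job ID from the human-readable 'mc batch start' message."""
    match = re.search(r"job `([^`]+)`", start_output)
    if match is None:
        raise ValueError("no job ID found in output")
    return match.group(1)


if __name__ == "__main__":
    sample = (
        "Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` "
        "on '2022-09-26 17:19:06.296974771 -0700 PDT'"
    )
    print(extract_job_id(sample))
```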

### List all batch jobs
```
mc batch list alias/
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### List all 'replicate' batch jobs
```
mc batch list alias/ --type replicate
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### Real-time 'status' for a batch job
```
mc batch status myminio/ E24HH4nNMcgY5taynaPfxu
●∙∙
Objects:        28766
Versions:       28766
Throughput:     3.0 MiB/s
Transferred:    406 MiB
Elapsed:        2m14.227222868s
CurrObjName:    share/doc/xml-core/examples/foo.xmlcatalogs
```
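If you capture a snapshot of this output, the `Key: value` lines can be turned into a dictionary for logging or simple checks. The Python sketch below assumes the layout shown above (one field per line, name before the first colon); `parse_status` is a hypothetical helper and the field names are taken from the sample only.

```python
from typing import Dict


def parse_status(output: str) -> Dict[str, str]:
    """Collect 'Key: value' lines; lines without a colon are skipped."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = """\
Objects:        28766
Versions:       28766
Throughput:     3.0 MiB/s
Transferred:    406 MiB
Elapsed:        2m14.227222868s
CurrObjName:    share/doc/xml-core/examples/foo.xmlcatalogs
"""
    for name, value in parse_status(sample).items():
        print(f"{name} = {value}")
```

Values are kept as strings because units such as `MiB/s` are part of the field.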

### 'describe' the batch job YAML
```
mc batch describe myminio/ E24HH4nNMcgY5taynaPfxu
replicate:
  apiVersion: v1
...
```