mirror of https://github.com/minio/minio.git
# MinIO Batch Job
MinIO Batch Jobs is a MinIO object management feature that lets you manage objects at scale. Jobs currently supported by MinIO:

- Replicate objects between buckets on multiple sites

Upcoming jobs:

- Copy objects from NAS to MinIO
- Copy objects from HDFS to MinIO

## Replication Job
To perform replication via batch jobs, you create a job. The job consists of a job description YAML that describes:

- The source location from which the objects are copied
- The target location to which the objects are copied
- Fine-grained filters that select which source objects to copy

The MinIO batch jobs framework also provides:

- Automatic retries of a failed job, driven by user-configured settings
- Real-time monitoring of job progress
- Notifications on completion or failure, sent to a user-configured target

The following YAML describes the structure of a replication job. Each value is documented in the inline comments.

```yaml
replicate:
  apiVersion: v1
  # source of the objects to be replicated
  source:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target where the objects must be replicated
  target:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # optional flags based filtering criteria
  # for all source objects
  flags:
    filter:
      newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
      olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
      createdAfter: "date" # match objects created after "date"
      createdBefore: "date" # match objects created before "date"

      ## NOTE: tags are not supported when "source" is remote.
      # tags:
      #   - key: "name"
      #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

      ## NOTE: metadata filter not supported when "source" is non MinIO.
      # metadata:
      #   - key: "content-type"
      #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

    notify:
      endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
      token: "Bearer xxxxx" # optional authentication token for the notification endpoint

    retry:
      attempts: 10 # number of retries for the job before giving up
      delay: "500ms" # least amount of delay between each retry
```
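
As a concrete illustration of the template above, the following sketch writes a minimal job file that copies objects under one prefix from a local bucket to a remote MinIO deployment. The bucket names, prefix, endpoint, and credentials are hypothetical placeholders, and the nesting is assumed to follow the template; adjust both to your deployment.

```shell
#!/bin/sh
# Write a minimal replication job definition to replicate.yaml.
# All values below (bucket names, prefix, endpoint, credentials) are
# hypothetical placeholders, not defaults.
cat > replicate.yaml <<'EOF'
replicate:
  apiVersion: v1
  source:
    type: minio            # source is local: no endpoint or credentials
    bucket: photos
    prefix: 2022/
  target:
    type: minio            # target is remote, so the source must be local
    bucket: photos-copy
    prefix: 2022/
    endpoint: "https://replica.example.net:9000"
    credentials:
      accessKey: ACCESS-KEY
      secretKey: SECRET-KEY
  flags:
    filter:
      olderThan: "7d"      # only copy objects older than seven days
    retry:
      attempts: 10
      delay: "500ms"
EOF
```

The quoted heredoc delimiter (`'EOF'`) keeps the shell from expanding anything inside the YAML, so the file is written exactly as shown.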

You can create and run multiple 'replication' jobs at the same time; there are no predefined limits.
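
Because several jobs can run at once, a set of job files can be submitted in a loop. This is a minimal sketch, assuming each job definition lives in its own YAML file and that `mc batch start` returns after submission, as the sample output later in this document suggests. The function name `start_all_jobs` and the file names in the usage line are hypothetical.

```shell
# start_all_jobs ALIAS FILE...
# Submit every job file to the deployment behind ALIAS, one after another.
# Missing files are reported on stderr and skipped; the loop stops at the
# first submission that fails.
start_all_jobs() {
    alias_name="$1"
    shift
    for job_file in "$@"; do
        if [ ! -f "$job_file" ]; then
            echo "skipping missing job file: $job_file" >&2
            continue
        fi
        mc batch start "$alias_name" "$job_file" || return 1
    done
}
```

For example, `start_all_jobs myminio/ ./replicate-logs.yaml ./replicate-images.yaml` would submit two independent jobs.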

## Batch Jobs Terminology

### Job
A job is the basic unit of work for MinIO Batch Jobs. A job is defined by a self-describing YAML file. Once this YAML is submitted and evaluated, MinIO performs the requested actions on each object that matches the criteria described in the job YAML file.

### Type
Type describes the job type, such as replicating objects between MinIO sites. Each job performs a single type of operation across all objects that match the job description criteria.

## Batch Jobs via Commandline
[mc](http://github.com/minio/mc) provides the 'mc batch' command to create, start, and manage batch jobs.

```
NAME:
  mc batch - manage batch jobs

USAGE:
  mc batch COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  generate  generate a new batch job definition
  start     start a new batch job
  list, ls  list all current batch jobs
  status    summarize job events on MinIO server in real-time
  describe  describe job definition for a job
```

### Generate a job YAML
```
mc batch generate alias/ replicate
```

### Start the batch job (returns the job ID)
```
mc batch start alias/ ./replicate.yaml
Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2022-09-26 17:19:06.296974771 -0700 PDT'
```
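
Later commands such as `mc batch status` and `mc batch describe` take the job ID as an argument, so a script may want to capture it. The sketch below pulls the ID out of the confirmation line. It assumes the output keeps the format shown above, with the ID enclosed in backticks after the word `job`, which may differ between `mc` versions. The function name `extract_job_id` is hypothetical.

```shell
# extract_job_id: read `mc batch start` output on stdin and print the job ID,
# i.e. the text between the pair of backticks that follows the word "job".
# Lines that do not match this shape produce no output.
extract_job_id() {
    sed -n 's/.*job `\([^`]*\)`.*/\1/p'
}
```

A possible use is `jid="$(mc batch start alias/ ./replicate.yaml | extract_job_id)"`, after which `$jid` can be passed to the other subcommands.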

### List all batch jobs
```
mc batch list alias/
ID                      TYPE       USER        STARTED
E24HH4nNMcgY5taynaPfxu  replicate  minioadmin  1 minute ago
```

### List all 'replicate' batch jobs
```
mc batch list alias/ --type replicate
ID                      TYPE       USER        STARTED
E24HH4nNMcgY5taynaPfxu  replicate  minioadmin  1 minute ago
```
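
When a script needs only the job IDs from a listing, the table can be filtered with `awk`. This is a sketch, assuming the output has a single header row and whitespace-separated columns as in the sample above; the function name `job_ids_of_type` is hypothetical.

```shell
# job_ids_of_type TYPE: read `mc batch list` output on stdin and print the
# ID column of every row whose TYPE column equals TYPE.
# The first line is treated as the header row and skipped.
job_ids_of_type() {
    awk -v want="$1" 'NR > 1 && $2 == want { print $1 }'
}
```

For example, `mc batch list alias/ | job_ids_of_type replicate` would print one job ID per line.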

### Real-time 'status' for a batch job
```
mc batch status myminio/ E24HH4nNMcgY5taynaPfxu
●∙∙
Objects: 28766
Versions: 28766
Throughput: 3.0 MiB/s
Transferred: 406 MiB
Elapsed: 2m14.227222868s
CurrObjName: share/doc/xml-core/examples/foo.xmlcatalogs
```
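
The sample figures above are mutually consistent if `Throughput` is read as an average over the elapsed time. That reading is an assumption here, not something the output states. A quick check with the sample values (406 MiB in 2m14.227222868s), using a hypothetical helper named `average_throughput`:

```shell
# average_throughput MIB MINUTES SECONDS
# Average throughput = transferred / elapsed, printed with one decimal place.
average_throughput() {
    awk -v mib="$1" -v min="$2" -v sec="$3" \
        'BEGIN { printf "%.1f MiB/s\n", mib / (min * 60 + sec) }'
}

# Values taken from the sample status output above.
average_throughput 406 2 14.227222868
```

This prints `3.0 MiB/s`, matching the `Throughput` line in the sample.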

### 'describe' the batch job YAML
```
mc batch describe myminio/ E24HH4nNMcgY5taynaPfxu
replicate:
  apiVersion: v1
...
```