mirror of
https://github.com/minio/minio.git
synced 2025-01-18 02:03:15 -05:00
# MinIO Batch Job

MinIO Batch Jobs is a MinIO object management feature that lets you manage objects at scale. Jobs currently supported by MinIO:

- Replicate objects between buckets on multiple sites

Upcoming Jobs:

- Copy objects from NAS to MinIO
- Copy objects from HDFS to MinIO

## Replication Job

To perform replication via batch jobs, you create a job. The job consists of a job description YAML that describes:

- Source location from which the objects must be copied
- Target location to which the objects must be copied
- Fine-grained filtering to pick the relevant objects to copy from the source

The MinIO batch jobs framework also provides:

- Automatic retry of a failed job, driven by user input
- Real-time monitoring of job progress
- Notifications upon completion or failure, sent to a user-configured target

The following YAML describes the structure of a replication job; each value is documented and self-describing.

```yaml
replicate:
  apiVersion: v1
  # source of the objects to be replicated
  source:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target where the objects must be replicated
  target:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # optional flags based filtering criteria
  # for all source objects
  flags:
    filter:
      newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
      olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
      createdAfter: "date" # match objects created after "date"
      createdBefore: "date" # match objects created before "date"

      ## NOTE: tags are not supported when "source" is remote.
      # tags:
      #   - key: "name"
      #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

      ## NOTE: metadata filter not supported when "source" is non MinIO.
      # metadata:
      #   - key: "content-type"
      #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

    notify:
      endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
      token: "Bearer xxxxx" # optional authentication token for the notification endpoint

    retry:
      attempts: 10 # number of retries for the job before giving up
      delay: "500ms" # least amount of delay between each retry
```

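As an illustration of the template above, the following sketch writes a filled-in job definition to `replicate.yaml`. It assumes a local source bucket and a remote MinIO target. The bucket names, prefix, endpoints, and credentials (`mybucket`, `backup`, `https://minio-remote.example.net`, and so on) are placeholders chosen for this example, not values defined by MinIO.

```sh
#!/bin/sh
# Sketch: write a filled-in replication job definition.
# All names, endpoints and credentials below are hypothetical placeholders.
cat > replicate.yaml <<'EOF'
replicate:
  apiVersion: v1
  source:
    type: minio
    bucket: mybucket          # hypothetical local source bucket
    prefix: photos/
  target:
    type: minio
    bucket: backup            # hypothetical bucket on the remote site
    prefix: photos/
    endpoint: "https://minio-remote.example.net"   # hypothetical remote endpoint
    credentials:
      accessKey: REMOTE-ACCESS-KEY
      secretKey: REMOTE-SECRET-KEY
  flags:
    filter:
      newerThan: "7d"         # only objects newer than seven days
    notify:
      endpoint: "https://notify.example.net/hook"  # hypothetical notification endpoint
      token: "Bearer xxxxx"
    retry:
      attempts: 10
      delay: "500ms"
EOF
echo "wrote replicate.yaml"
```

Because the target is remote, the source leaves `endpoint` and `credentials` unset, which follows the NOTE comments in the template.
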
You can create and run multiple 'replication' jobs at a time; there are no predefined limits.

## Batch Jobs Terminology

### Job

A job is the basic unit of work for MinIO Batch Job. A job is a self-describing YAML file. Once this YAML is submitted and evaluated, MinIO performs the requested actions on each object that matches the criteria described in the job YAML file.

### Type

Type describes the job type, such as replicating objects between MinIO sites. Each job performs a single type of operation across all objects that match the job description criteria.

## Batch Jobs via Commandline

[mc](http://github.com/minio/mc) provides the 'mc batch' command to create, start, and manage submitted jobs.

```
NAME:
  mc batch - manage batch jobs

USAGE:
  mc batch COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  generate  generate a new batch job definition
  start     start a new batch job
  list, ls  list all current batch jobs
  status    summarize job events on MinIO server in real-time
  describe  describe job definition for a job
```

### Generate a job yaml

```
mc batch generate alias/ replicate
```

### Start the batch job (returns the JID)

```
mc batch start alias/ ./replicate.yaml
Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2022-09-26 17:19:06.296974771 -0700 PDT'
```

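For scripting, the job ID can be captured from the `mc batch start` output. The sketch below assumes the success message keeps the format shown above, with the ID between backticks. `extract_job_id` and `start_job` are hypothetical helper names for this example, not part of `mc`.

```sh
#!/bin/sh
# Print the backtick-quoted job ID found in 'mc batch start' output read from stdin.
# Assumes the message format shown in the example above.
extract_job_id() {
  sed -n 's/.*job `\([^`]*\)`.*/\1/p'
}

# Start a job and print only its ID; fails if no ID is found in the output.
# $1 = mc alias (for example "alias/"), $2 = path to the job YAML.
start_job() {
  out=$(mc batch start "$1" "$2") || return 1
  id=$(printf '%s\n' "$out" | extract_job_id)
  [ -n "$id" ] && printf '%s\n' "$id"
}
```

A call such as `jid=$(start_job alias/ ./replicate.yaml)` would then leave the ID in `jid` for use with the other `mc batch` subcommands.
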
### List all batch jobs

```
mc batch list alias/
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### List all 'replicate' batch jobs

```
mc batch list alias/ --type replicate
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

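The tabular output can also be post-processed. As a sketch, assuming the column layout shown above (a header row, then one job per line with the ID in the first column and the type in the second), the hypothetical helper `job_ids` prints the IDs of jobs of a given type. Whether `mc` prints the same layout when its output is piped is not verified here.

```sh
#!/bin/sh
# Print job IDs from 'mc batch list' output read on stdin.
# $1 = job type to keep (optional; all types are printed when omitted).
job_ids() {
  awk -v want="$1" 'NR > 1 && NF >= 2 && (want == "" || $2 == want) { print $1 }'
}
```

For example, `mc batch list alias/ | job_ids replicate` would print one job ID per line.
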
### Real-time 'status' for a batch job

```
mc batch status myminio/ E24HH4nNMcgY5taynaPfxu
●∙∙
Objects:        28766
Versions:       28766
Throughput:     3.0 MiB/s
Transferred:    406 MiB
Elapsed:        2m14.227222868s
CurrObjName:    share/doc/xml-core/examples/foo.xmlcatalogs
```

### 'describe' the batch job yaml.

```
mc batch describe myminio/ E24HH4nNMcgY5taynaPfxu
replicate:
  apiVersion: v1
...
```