
MinIO Object Storage (S3-Compatible) – Complete Guide

MinIO is a high-performance, Kubernetes-native, S3-compatible object storage system focused on simplicity, scalability, and speed. It is widely used in modern data infrastructure and MLOps workflows for storing model artifacts, datasets, feature store snapshots, logs, Parquet/Delta/Iceberg table files, vector embeddings, and analytical query outputs.

This guide covers everything from architectural concepts to hands-on setup, advanced operations, security hardening, performance tuning, automation, and troubleshooting – with practical examples.

When To Use MinIO

Choose MinIO when you need standards-compliant S3 APIs, strong consistency, horizontal scalability, erasure-coded durability, and simple ops across on-prem, edge, or multi-cloud.

1. Core Concepts

| Concept | Description |
|---|---|
| Object Storage | Stores immutable objects (data + metadata) grouped into buckets. |
| S3 Compatibility | MinIO implements the Amazon S3 APIs and signatures (v2/v4) for broad ecosystem integration. |
| Erasure Coding | Protects against disk/node failures using Reed-Solomon parity (e.g. 4+2, 8+8). |
| Distributed Mode | Aggregates storage across multiple servers/drives for capacity and resiliency. |
| Strong Consistency | Reads always reflect the latest completed write (important for ML metadata correctness). |
| Tenant Isolation | Multi-tenancy via separate MinIO deployments or bucket/policy boundaries. |
| Identity & Policy | Users, groups, STS, OIDC, LDAP, Azure AD, and OpenShift auth supported. |
| Encryption | SSE-S3, SSE-KMS, SSE-C; TLS in transit; optional external KMS (HashiCorp Vault, AWS KMS). |
| Rolling Upgrades | Non-disruptive rolling upgrades with minimal downtime. |

2. Typical MLOps Use Cases

  1. Model Artifact Registry (store .pt, .onnx, .pkl, .h5, .gguf).
  2. Training Dataset Lake (images, parquet, JSONL, audio, video segments).
  3. Feature Store Backups (Hudi/Iceberg/Delta table files).
  4. Experiment Tracking Outputs (weights, metrics, logs, TensorBoard runs).
  5. Vector Index Snapshots (FAISS, ScaNN, Milvus export layers).
  6. ETL Staging for Spark / Trino / DuckDB / Polars.
  7. Data Versioning (DVC, LakeFS layering over MinIO, Quilt, Pachyderm backends).

3. Architecture Overview

┌──────────────────────────────────────────────────────────┐
│                 AI / Data Platform Stack                 │
│  (Spark, Ray, Airflow, Kubeflow, Trino, MLflow, Feast)   │
└───────────────▲───────────────────────────▲──────────────┘
                │ S3 API / SDK              │ Batch & Interactive
        ┌───────┴───────────────────────────┴────────┐
        │                MinIO Layer                 │
        │  • Auth (IAM / OIDC / LDAP)                │
        │  • Encryption (TLS, SSE, KMS)              │
        │  • Erasure Coding / Healing                │
        │  • Versioning / Lifecycle / Replication    │
        └───────▲───────────────▲───────────────▲────┘
                │               │               │
        ┌───────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
        │ Local Disks  │ │ NVMe / JBOD │ │ Object Tier │
        │ (HDD / SSD)  │ │  (Hybrid)   │ │ Edge Nodes  │
        └──────────────┘ └─────────────┘ └─────────────┘

4. Installation Methods

4.1 Local Binary (Single-Node Dev)

curl -L https://dl.min.io/server/minio/release/$(uname -s | tr '[:upper:]' '[:lower:]')-amd64/minio -o minio
chmod +x minio
./minio server /data --console-address :9090

Environment variables (optional):

export MINIO_ROOT_USER="minioadmin"
export MINIO_ROOT_PASSWORD="strongpassword123!"

Access:

  • API: http://localhost:9000
  • Console: http://localhost:9090
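
To verify connectivity, a minimal sanity check with boto3 (assuming the default credentials above; any S3 SDK works equally well):

import boto3

# Point the SDK at the local MinIO endpoint; the region value is arbitrary but required.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='strongpassword123!',
    region_name='us-east-1',
)
print(s3.list_buckets()['Buckets'])  # Empty list on a fresh install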

4.2 Docker (Single Instance)

docker run -d \
-p 9000:9000 -p 9090:9090 \
-e MINIO_ROOT_USER=minioadmin \
-e MINIO_ROOT_PASSWORD=strongpassword123! \
-v $(pwd)/data:/data \
--name minio \
quay.io/minio/minio server /data --console-address ":9090"

4.3 Distributed Docker (4-Node Example)

# Run on each of the four hosts (host1..host4), one container per host:
docker run -d --name minio \
  -p 9000:9000 -p 9090:9090 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=strongpassword123! \
  -v /mnt/drive:/data \
  quay.io/minio/minio server \
  "http://host{1...4}/data" --console-address ":9090"

Note: Replace host{1...4} with actual resolvable hostnames / IPs. The {1...4} ellipsis (three dots) is MinIO's own expansion notation, so keep the argument quoted; every node must be able to reach the others on port 9000.

4.4 Kubernetes (Helm)

helm repo add minio https://charts.min.io/
helm repo update

helm install minio minio/minio \
--namespace storage --create-namespace \
--set rootUser=minioadmin \
--set rootPassword=strongpassword123! \
--set resources.requests.memory=2Gi \
--set persistence.size=200Gi

Port Forward for local access:

kubectl port-forward svc/minio 9000:9000 -n storage
kubectl port-forward svc/minio-console 9090:9001 -n storage

4.5 Operator (Multi-Tenant in K8s)

The MinIO Operator provides CRDs such as Tenant for declarative provisioning.

kubectl apply -k github.com/minio/operator/resources/overlays/stable

Create a tenant (YAML excerpt):

apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: mlops-storage
spec:
  pools:
    - servers: 4
      volumesPerServer: 4
      size: 2Ti
  mountPath: /export
  credsSecret:
    name: mlops-root-creds
  exposeServices:
    console: true

5. Command-Line Client (mc)

Install:

curl -L https://dl.min.io/client/mc/release/$(uname -s | tr '[:upper:]' '[:lower:]')-amd64/mc -o mc
chmod +x mc && sudo mv mc /usr/local/bin/

Add alias:

mc alias set local http://localhost:9000 minioadmin strongpassword123!

Common operations:

mc mb local/datasets
mc cp ./images/*.jpg local/datasets/images/
mc ls --recursive local/datasets
mc rm --recursive --force local/datasets/tmp/
mc mirror ./localfolder local/backupfolder

5.1 Bucket Versioning & Lifecycle

mc version enable local/datasets
mc ilm add local/datasets --expiry-days 30 --noncurrent-expiry-days 7
mc ilm ls local/datasets
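
With versioning enabled, noncurrent versions are visible through any S3 SDK. A minimal boto3 sketch (endpoint and credentials as configured earlier):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='strongpassword123!',
)

# List current/noncurrent versions and delete markers under a prefix.
resp = s3.list_object_versions(Bucket='datasets', Prefix='images/')
for v in resp.get('Versions', []):
    print(v['Key'], v['VersionId'], 'latest' if v['IsLatest'] else 'noncurrent')
for m in resp.get('DeleteMarkers', []):
    print('delete-marker:', m['Key'], m['VersionId'])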

5.2 Policies & Users

Create a user + read-only policy:

mc admin user add local analyst analyst-pass-123
cat > readonly.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::datasets",
        "arn:aws:s3:::datasets/*"
      ]
    }
  ]
}
EOF

mc admin policy add local readonly readonly.json
mc admin policy set local readonly user=analyst

5.2.1 Policy Validation Pattern (Simulated Test)

MinIO does not currently expose an AWS-style "policy simulator" API, but you can validate least-privilege policies safely by:

  1. Creating a temporary test user.
  2. Attaching the draft policy.
  3. Attempting both the expected-allowed and intentionally disallowed operations.
  4. Reviewing mc admin trace -v output for 403 AccessDenied vs. 200 responses.

Example quick check:

mc --config-dir /tmp/mc-test alias set ro http://localhost:9000 analyst analyst-pass-123
mc --config-dir /tmp/mc-test ls ro/datasets            # Should succeed (read allowed)
mc --config-dir /tmp/mc-test cp README.md ro/datasets/ # Should fail with AccessDenied (no write)

Clean up:

mc admin user remove local analyst
mc admin policy remove local readonly

5.3 Server Health & Info

mc admin info local
mc admin health info local
mc admin trace -v local

6. SDK Usage (S3-Compatible APIs)

import boto3

session = boto3.session.Session()
s3 = session.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='strongpassword123!',
    region_name='us-east-1',
)

bucket = 'datasets'
s3.create_bucket(Bucket=bucket)

# Upload
s3.upload_file('train.parquet', bucket, 'ml/train/train.parquet')

# Stream download: read only the first 100 bytes from the body
obj = s3.get_object(Bucket=bucket, Key='ml/train/train.parquet')
data = obj['Body'].read(100)
print(len(data))
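
Presigned URLs let you hand short-lived download links to jobs without sharing credentials; a sketch reusing the s3 client and bucket from the snippet above:

# Generate a time-limited GET link for the uploaded object.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': bucket, 'Key': 'ml/train/train.parquet'},
    ExpiresIn=3600,  # valid for one hour
)
print(url)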

7. Advanced Features

7.1 Erasure Coding

MinIO automatically applies erasure coding in distributed or multi-drive setups. Each erasure set splits its drives into data + parity. Example: 12 drives with EC 8+4 means any 4 drives may fail without data loss.

Check:

mc admin info local | grep -i 'erasure'
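
The durability/efficiency trade-off is simple arithmetic; an illustrative helper (plain Python, not a MinIO API):

def ec_summary(data: int, parity: int) -> None:
    """Print fault tolerance and storage efficiency for an erasure-coding layout."""
    total = data + parity
    print(f"EC {data}+{parity}: tolerates {parity} failed drives, "
          f"usable fraction {data / total:.0%}")

ec_summary(8, 4)  # EC 8+4: tolerates 4 failed drives, usable fraction 67%
ec_summary(4, 2)  # EC 4+2: tolerates 2 failed drives, usable fraction 67%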

7.2 Object Locking & Immutability

mc mb --with-lock local/audit
mc retention set --default GOVERNANCE 30d local/audit
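
The same retention semantics are available per object via the S3 API; a minimal boto3 sketch (assumes the audit bucket was created with locking enabled, as above; the key name is hypothetical):

from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='strongpassword123!',
)

# Write an object protected from deletion (without bypass) for 30 days.
s3.put_object(
    Bucket='audit',
    Key='logs/train-run-001.json',  # hypothetical key for illustration
    Body=b'{"status": "complete"}',
    ObjectLockMode='GOVERNANCE',
    ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=30),
)
print(s3.get_object_retention(Bucket='audit', Key='logs/train-run-001.json'))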

7.3 Bucket Replication

mc replicate add local/datasets --remote-bucket remote/datasets \
  --replicate "delete,delete-marker,existing-objects"

7.4 KMS Integration (HashiCorp Vault example)

Set env:

export MINIO_KMS_VAULT_APPROLE_ID=... 
export MINIO_KMS_VAULT_APPROLE_SECRET=...
export MINIO_KMS_VAULT_ENDPOINT=https://vault.internal:8200

7.5 Tiering (Offload to Cloud)

mc admin tier add s3 local TIER1 \
  --endpoint https://s3.amazonaws.com \
  --access-key AKIA... --secret-key ... \
  --bucket remote-tier-bucket \
  --region us-east-1

mc ilm add local/datasets --transition-days 60 --storage-class TIER1

Note: Remote tier names must be uppercase; --bucket names the bucket on the remote target that receives transitioned objects.

8. Performance Tuning

| Layer | Recommendation |
|---|---|
| Network | 10/25/40GbE; enable MTU 9000 (jumbo frames) if consistent end-to-end. |
| Filesystem | XFS or ext4; disable atime; align with RAID striping. |
| Drives | Prefer homogeneous sets; NVMe for metadata-heavy workloads (small objects). |
| Parallelism | Use multipart uploads for objects >64MB; tune concurrency in the SDK. |
| Client Tuning | Increase max_concurrent_requests and S3 SDK thread pools. |
| Encryption | Offload TLS with modern ciphers; ensure CPU AES-NI support. |
| Healing | Schedule outside peak hours; monitor via mc admin heal. |
Multipart Threshold

Use multipart uploads for objects larger than ~64MB (MinIO auto-optimizes many cases, but explicitly enabling multipart in SDKs improves throughput for >512MB objects). For very large datasets (multi-GB), target part size 64–128MB balancing memory and parallelism.
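
In boto3, for instance, the threshold, part size, and concurrency are set via TransferConfig; a sketch with illustrative starting values (not measured optima; the file name is hypothetical):

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='strongpassword123!',
)

MB = 1024 * 1024
cfg = TransferConfig(
    multipart_threshold=64 * MB,  # switch to multipart above 64MB
    multipart_chunksize=64 * MB,  # 64MB parts balance memory vs. parallelism
    max_concurrency=16,           # parallel part uploads
)
# 'large-shard.parquet' is a hypothetical local file for illustration.
s3.upload_file('large-shard.parquet', 'datasets', 'ml/shards/large-shard.parquet', Config=cfg)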

Bench seed (simple):

mc mb local/bench
# Generate 50 random 64MiB files (GNU coreutils head).
for i in {1..50}; do head -c 64M /dev/urandom > file.$i; done
# Upload with 16-way parallelism (requires GNU parallel).
time parallel -j 16 mc cp file.{} local/bench/ ::: {1..50}

9. Monitoring & Observability

MinIO exposes Prometheus metrics at /minio/v2/metrics/cluster (plus per-node endpoints). By default the endpoint requires a bearer token (generated with mc admin prometheus generate); set MINIO_PROMETHEUS_AUTH_TYPE=public to allow unauthenticated scrapes.

Enable scrape (Prometheus YAML snippet):

- job_name: 'minio'
  metrics_path: /minio/v2/metrics/cluster
  scheme: http
  static_configs:
    - targets: ['minio-1:9000', 'minio-2:9000']
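
If the endpoint is not public, scrapes must carry the bearer token emitted by mc admin prometheus generate; a quick check sketched with Python requests (the token value is a placeholder):

import requests

# Paste the bearer_token field emitted by `mc admin prometheus generate local`.
TOKEN = "<bearer-token>"

resp = requests.get(
    "http://localhost:9000/minio/v2/metrics/cluster",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
# Show only the S3 request counters.
print("\n".join(l for l in resp.text.splitlines() if l.startswith("minio_s3_requests")))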

Key metrics:

  • minio_s3_requests_total
  • minio_disk_storage_used_bytes
  • minio_node_heal_objects_total
  • minio_s3_errors_total
  • minio_iam_policy_count

Tracing: use mc admin trace -v for live S3 API events.

10. Data Lifecycle Management

| Feature | Purpose |
|---|---|
| Versioning | Recover from accidental deletes or regressed training artifacts. |
| Lifecycle Expiry | Remove stale checkpoints / intermediate shards. |
| Transition | Tier cold data to cheaper storage. |
| Replication | DR / geo locality. |
| Object Lock | Compliance, audit trails. |

Sample lifecycle JSON (multi-rule):

{
  "Rules": [
    {
      "ID": "expire-temp",
      "Filter": { "Prefix": "tmp/" },
      "Status": "Enabled",
      "Expiration": { "Days": 3 }
    },
    {
      "ID": "transition-embeddings",
      "Filter": { "Prefix": "embeddings/" },
      "Status": "Enabled",
      "Transitions": [{ "Days": 30, "StorageClass": "TIER1" }]
    }
  ]
}

Apply:

mc ilm import local/datasets < lifecycle.json

11. Backup & Disaster Recovery

Strategies:

  1. Bucket replication to secondary cluster.
  2. Periodic snapshot via mc mirror to immutable target.
  3. Offsite tape / glacier archive for legal retention.
  4. Metadata export (mc admin info --json).

Mirror example:

mc alias set dr http://dr-site:9000 minioadmin strongpasswordDR!
mc mirror --overwrite --watch local/datasets dr/datasets

12. Security Hardening Checklist

| Area | Action |
|---|---|
| Root Credentials | Rotate regularly; store in a secret manager; never reuse. |
| TLS | Use valid certificates (MINIO_SERVER_URL=https://...). |
| IAM | Grant least privilege; avoid the wildcard s3:*. |
| Audit | Enable object locking for compliance buckets. |
| Network | Restrict ingress with firewalls / security groups. |
| Encryption | SSE-KMS for sensitive model weights (PII). |
| Multi-Tenant | Separate tenants for prod vs. staging. |
| Key Rotation | Integrate with an external KMS rotation schedule. |

Sample restrictive training role policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::datasets"],
      "Condition": { "StringLike": { "s3:prefix": ["training/*"] } }
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::datasets/training/*"]
    }
  ]
}

The statements are split because the s3:prefix condition only applies to ListBucket requests; combining both actions in one conditioned statement would deny GetObject, since object requests carry no s3:prefix key.

12.1 Multi-Tenant Strategy (Kubernetes Operator)

When isolating environments (e.g., dev, staging, prod), deploy separate MinIO Tenants rather than relying solely on bucket policies, to gain:

  • Strict resource quotas (per tenant pool size).
  • Independent upgrade cadence / blast radius.
  • Credential & KMS separation.

Sample second tenant (partial YAML):

apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: prod-artifacts
spec:
  pools:
    - servers: 6
      volumesPerServer: 6
      size: 4Ti
  features:
    bucketDNS: true
  exposeServices:
    console: true

Isolation vs Shared Cluster

If hardware is constrained, you can still run separate tenants sharing physical nodes; the Operator maps distinct StatefulSets and PVCs per tenant.

13. Integration Highlights

| Tool | Notes |
|---|---|
| MLflow | Set the MLFLOW_S3_ENDPOINT_URL env var. |
| Spark | Set spark.hadoop.fs.s3a.endpoint to the MinIO URL; enable path-style access (see the sketch below). |
| Ray | Use s3:// URIs for dataset load/save. |
| Trino/Presto | Hive connector with S3 endpoint overrides. |
| Feast | Registry + offline store referencing s3:// URIs. |
| Airflow | Use S3Hook with a custom endpoint. |
| Hugging Face | Sync model caches to MinIO, e.g. via an s3fs mount or explicit uploads. |
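
For the Spark row, a minimal PySpark session sketch (assumes hadoop-aws and its AWS SDK dependency are on the classpath; endpoint and credentials follow the earlier examples):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("minio-etl")
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:9000")
    .config("spark.hadoop.fs.s3a.access.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.secret.key", "strongpassword123!")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")        # required for MinIO
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")  # plain HTTP in dev
    .getOrCreate()
)

# Read the parquet data uploaded in earlier sections.
df = spark.read.parquet("s3a://datasets/ml/train/")
df.show(5)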

14. Troubleshooting

| Symptom | Check |
|---|---|
| Slow uploads | Network MTU mismatch; DNS; multipart disabled. |
| Access denied | Policy or credential mismatch; trace to view the failing call. |
| High 5xx errors | Disk full, drive offline, healing backlog. |
| Missing objects | Versioning: check noncurrent versions. |
| TLS failures | Cert CN/SAN mismatch vs. endpoint URL. |

Commands:

mc admin trace -v local | head
mc admin heal --recursive local/datasets
mc admin prometheus generate local

15. Capacity Planning Quick Start

Formula (approx):

Raw Capacity = Number of Drives × Drive Size
Usable ≈ Raw × (Data Drives / Total Drives)

Example: 12 × 14TB drives with EC 8+4 → usable ≈ 168TB × (8/12) ≈ 112TB before overhead (plan for ~15% free space).
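
The same arithmetic as a reusable sketch (the reserve fraction reflects the ~15% free-space guidance above):

def usable_tb(drives: int, drive_tb: float, data: int, parity: int,
              reserve: float = 0.15) -> float:
    """Approximate usable capacity after erasure coding and a free-space reserve."""
    raw = drives * drive_tb
    return raw * (data / (data + parity)) * (1 - reserve)

print(f"{usable_tb(12, 14, 8, 4):.0f} TB usable")  # ~95 TB after the 15% reserve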

16. Operational Runbook (Sample)

| Task | Frequency | Command / Procedure |
|---|---|---|
| Credential rotation | 90 days | mc admin user info; update secrets. |
| Metrics review | Daily | Grafana dashboard. |
| Disk health | Weekly | SMART checks / mc admin info. |
| Lifecycle audit | Monthly | mc ilm ls. |
| Upgrade | Quarterly | Rolling restart with new image. |

17. Minimal Python Data Pipeline Example

import boto3
import pandas as pd

ep = 'http://localhost:9000'
ak = 'minioadmin'; sk = 'strongpassword123!'
bucket = 'datasets'

s3 = boto3.client('s3', endpoint_url=ep, aws_access_key_id=ak, aws_secret_access_key=sk)

# Create the bucket if it does not already exist.
try:
    s3.create_bucket(Bucket=bucket)
except s3.exceptions.BucketAlreadyOwnedByYou:
    pass

# Build a toy feature set and upload it as a date-partitioned parquet file.
df = pd.DataFrame({"x": range(1000), "y": [v * v for v in range(1000)]})
df.to_parquet('features.parquet')
s3.upload_file('features.parquet', bucket, 'features/2025-09-17/features.parquet')

print("Uploaded features")
print([o['Key'] for o in s3.list_objects_v2(Bucket=bucket, Prefix='features/')['Contents']])

18. Next Steps

Integrate this storage backend with your feature store, experiment tracker, and model registry to unify artifact management under a single consistent S3 abstraction.


Last Updated: 2025-09-17