Installation and Setup
This guide walks you through installing PostgresML on several platforms and configuring it for your first machine learning project.
Prerequisites
Before installing PostgresML, ensure you have:
- PostgreSQL 14 or later
- Administrative (superuser) privileges on PostgreSQL
- At least 2GB of free disk space
- Python 3.7+ (for Python SDK, optional)
Installation Methods
Method 1: Using Docker (Recommended for Development)
The easiest way to get started with PostgresML is using Docker:
# Run the PostgresML container with GPU support (optional; requires the NVIDIA Container Toolkit)
docker run -it \
-v postgresml_data:/var/lib/postgresql \
--gpus all \
-p 5433:5432 \
-p 8000:8000 \
ghcr.io/postgresml/postgresml:latest \
sudo -u postgresml psql -d postgresml
# Or without GPU support (port 8000 is published so the bundled dashboard is reachable)
docker run -it \
-v postgresml_data:/var/lib/postgresql \
-p 5433:5432 \
-p 8000:8000 \
ghcr.io/postgresml/postgresml:latest
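The GPU and non-GPU invocations differ only in whether the container is granted GPU access. As a minimal sketch, the helper below assembles the argument list for either case. The function name `docker_run_args` is hypothetical, and the use of Docker's `--gpus all` flag is an assumption about how GPU access is meant to be granted here; the image name, volume, and port mapping are taken from the commands above.

```python
from typing import List


def docker_run_args(gpu: bool = False, host_port: int = 5433) -> List[str]:
    """Build a `docker run` argument list for the PostgresML image (illustrative)."""
    args = ["docker", "run", "-it", "-v", "postgresml_data:/var/lib/postgresql"]
    if gpu:
        # Assumption: GPU access is requested with Docker's --gpus flag.
        args += ["--gpus", "all"]
    # Map the chosen host port to PostgreSQL's port inside the container.
    args += ["-p", f"{host_port}:5432", "ghcr.io/postgresml/postgresml:latest"]
    return args


if __name__ == "__main__":
    print(" ".join(docker_run_args(gpu=True)))
```

The list can be passed to `subprocess.run` unchanged, which avoids shell quoting issues.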
Access the PostgresML database:
psql postgres://postgres@localhost:5433/postgresml
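The connection URL packs the user, host, port, and database name into one string. Some clients want these as separate parameters, so here is a minimal sketch that splits the URL using only Python's standard library. The helper name `parse_dsn` is hypothetical and not part of any PostgresML SDK.

```python
from urllib.parse import urlparse


def parse_dsn(dsn: str) -> dict:
    """Split a postgres:// URL into its components (illustrative helper)."""
    parts = urlparse(dsn)
    return {
        "user": parts.username,
        "host": parts.hostname,
        # Fall back to PostgreSQL's default port when the URL omits one.
        "port": parts.port or 5432,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(parse_dsn("postgres://postgres@localhost:5433/postgresml"))
```

The port comes out as 5433 rather than 5432 because the `docker run` commands map host port 5433 to the container's port 5432.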
Method 2: Ubuntu/Debian Installation
Install PostgresML on Ubuntu or Debian systems:
# Add PostgresML APT repository
echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" | \
sudo tee -a /etc/apt/sources.list
# Update and install
sudo apt update
sudo apt install -y postgresml-14 # or postgresml-15, postgresml-16
# The package installs the pgml extension files; you still enable it per database with CREATE EXTENSION
Method 3: From Source
For advanced users who want to build from source:
# Install dependencies
sudo apt install -y \
build-essential \
libpq-dev \
postgresql-server-dev-14 \
python3-dev \
libopenblas-dev \
cmake \
pkg-config \
libclang-dev
# Clone the repository
git clone https://github.com/postgresml/postgresml
cd postgresml/pgml-extension
# Build and install (requires a Rust toolchain with cargo, which the apt step above does not install)
cargo install cargo-pgrx
cargo pgrx init --pg14 /usr/bin/pg_config
cargo pgrx install --release
# Install Python dependencies
cd ../pgml-python
pip install -e .
Method 4: Cloud Deployment
PostgresML offers managed cloud hosting at https://postgresml.org:
- Sign up for a free account
- Create a new database
- Get your connection string
- Start using PostgresML immediately
Enabling the Extension
After installation, enable PostgresML in your database:
-- Connect to your database
\c your_database
-- Enable the pgml extension
CREATE EXTENSION IF NOT EXISTS pgml;
-- Verify installation
SELECT pgml.version();
Expected output (the version number depends on the release you installed):
version
---------
2.7.0
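If a script needs to confirm that the installed version is recent enough, comparing version strings as text gives wrong answers (for example, "2.10.0" sorts before "2.7.0"). The sketch below compares them numerically. Both helper names are hypothetical, and it assumes the value returned by `pgml.version()` is a purely numeric dotted string like the sample output.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.7.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at least `minimum`, compared numerically."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    print(meets_minimum("2.10.1", "2.7.0"))
```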
Installing Additional Components
PostgresML Dashboard
The dashboard provides a web UI for managing models:
# Install dashboard (included in Docker image)
sudo apt install -y postgresml-dashboard
# Start the dashboard
postgresml-dashboard --database-url postgres://user:password@localhost:5432/your_db
Access the dashboard at http://localhost:8000
Python SDK
Install the PostgresML Python SDK for programmatic access:
pip install pgml
Basic usage:
from pgml import Database
# Connect to PostgresML
db = Database("postgres://user:password@localhost:5432/your_db")
# Query with SDK
results = db.query("SELECT pgml.version()")
print(results)
JavaScript SDK
Install the PostgresML JavaScript SDK:
npm install pgml
Basic usage:
const pgml = require('pgml');
const client = pgml.newClient({
  connectionString: 'postgres://user:password@localhost:5432/your_db'
});
async function main() {
  const results = await client.query('SELECT pgml.version()');
  console.log(results);
}
// Invoke the async entry point and surface any connection or query error
main().catch(console.error);
Configuration
Memory Settings
For optimal performance, adjust PostgreSQL configuration:
# Edit postgresql.conf
sudo nano /etc/postgresql/14/main/postgresql.conf
Recommended settings:
# Memory settings
shared_buffers = 2GB # ~25% of system RAM (example assumes 8GB total)
effective_cache_size = 6GB # ~75% of system RAM (example assumes 8GB total)
work_mem = 64MB # Per operation memory
maintenance_work_mem = 512MB # For VACUUM, CREATE INDEX
# Parallel query settings
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
# Extension settings
shared_preload_libraries = 'pgml'
Restart PostgreSQL:
sudo systemctl restart postgresql
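The `shared_buffers` and `effective_cache_size` values are percentages of system RAM, so they should be recomputed for each machine. As a rough sketch, the helper below applies the 25% and 75% rules of thumb to a given amount of RAM. The function name `recommend_memory` is hypothetical, and the output is in megabytes so that small machines do not round down to zero.

```python
def recommend_memory(total_ram_gb: int) -> dict:
    """Apply the 25% / 75% rules of thumb to a machine's total RAM."""
    total_mb = total_ram_gb * 1024
    return {
        # 25% of RAM for PostgreSQL's own buffer cache.
        "shared_buffers": f"{total_mb * 25 // 100}MB",
        # 75% of RAM as the planner's estimate of available cache.
        "effective_cache_size": f"{total_mb * 75 // 100}MB",
    }


if __name__ == "__main__":
    for name, value in recommend_memory(8).items():
        print(f"{name} = {value}")
```

For an 8GB machine this yields 2048MB and 6144MB, matching the 2GB and 6GB values in the recommended settings.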
GPU Support
If you have NVIDIA GPUs and want to use them for training:
# Install CUDA toolkit (NVIDIA GPUs)
sudo apt install -y nvidia-cuda-toolkit
# Verify GPU is detected
nvidia-smi
# PostgresML will automatically use the GPU when one is available
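A provisioning script may want to check for the NVIDIA tools before assuming GPU training is possible. The sketch below only tests whether `nvidia-smi` is on the `PATH`; the function name is hypothetical, and a positive result shows the driver utilities are installed, not that PostgresML will actually use the GPU.

```python
import shutil


def nvidia_smi_available() -> bool:
    """Return True when the nvidia-smi executable can be found on PATH."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    if nvidia_smi_available():
        print("nvidia-smi found; run it to confirm the GPU is detected")
    else:
        print("nvidia-smi not found; training will fall back to CPU")
```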
Verification
Verify your installation is working correctly:
-- Test basic functionality
SELECT pgml.version();
-- Create a simple test
CREATE TABLE test_data AS
SELECT
random() AS feature1,
random() AS feature2,
(random() > 0.5)::int AS label
FROM generate_series(1, 1000);
-- Train a simple model
SELECT * FROM pgml.train(
'test_model',
'classification',
'test_data',
'label'
);
-- Make predictions
SELECT
feature1,
feature2,
pgml.predict('test_model', ARRAY[feature1, feature2]) AS prediction
FROM test_data
LIMIT 5;
-- Clean up
DROP TABLE test_data;
If all commands execute successfully, your installation is complete!
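Note that `label` in `test_data` is generated independently of `feature1` and `feature2`, so this check confirms that training and prediction run, not that the model is accurate. The sketch below reproduces the same kind of data in plain Python, with no database involved, and measures the majority-class baseline. The function name and the fixed seed are illustrative choices.

```python
import random


def majority_baseline(n_rows: int = 1000, seed: int = 0) -> float:
    """Fraction of rows in the larger class when labels are random coin flips."""
    rng = random.Random(seed)
    labels = []
    for _ in range(n_rows):
        # Features are drawn but never influence the label, as in test_data.
        _feature1, _feature2 = rng.random(), rng.random()
        labels.append(int(rng.random() > 0.5))
    positives = sum(labels)
    return max(positives, n_rows - positives) / n_rows


if __name__ == "__main__":
    print(f"majority-class baseline: {majority_baseline():.3f}")
```

The baseline lands close to 0.5, so accuracy on held-out rows from a model trained on this table should hover around the same value. A score near 0.5 here is expected and does not indicate a broken installation.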
Troubleshooting
Extension Not Loading
If the extension fails to load:
-- Check extension availability
SELECT * FROM pg_available_extensions WHERE name = 'pgml';
-- Check shared libraries
SHOW shared_preload_libraries;
-- Try loading explicitly
LOAD 'pgml';
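`SHOW shared_preload_libraries` returns a comma-separated list, and a plain substring search can give a false positive when another library's name contains "pgml". As a small sketch, the hypothetical helpers below split the setting and check for an exact match.

```python
from typing import List


def preloaded_libraries(setting: str) -> List[str]:
    """Split the shared_preload_libraries value into individual library names."""
    return [name.strip().strip('"') for name in setting.split(",") if name.strip()]


def is_preloaded(setting: str, library: str = "pgml") -> bool:
    """Return True only when `library` appears as a whole entry in the list."""
    return library in preloaded_libraries(setting)


if __name__ == "__main__":
    print(is_preloaded("pg_stat_statements, pgml"))
```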
Permission Issues
Ensure your user has the necessary permissions:
-- Grant permissions
GRANT ALL ON SCHEMA pgml TO your_user;
GRANT ALL ON ALL TABLES IN SCHEMA pgml TO your_user;
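When these grants are issued for several roles from a script, the role name should be quoted as an SQL identifier, not pasted in raw. The sketch below generates the two statements for a given role. The function names are hypothetical, and the quoting follows the standard SQL rule of wrapping the name in double quotes and doubling any embedded double quote.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Quote a name as an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(role: str) -> List[str]:
    """Build the two pgml GRANT statements for one role."""
    quoted = quote_identifier(role)
    return [
        f"GRANT ALL ON SCHEMA pgml TO {quoted};",
        f"GRANT ALL ON ALL TABLES IN SCHEMA pgml TO {quoted};",
    ]


if __name__ == "__main__":
    for statement in grant_statements("your_user"):
        print(statement)
```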
Python Dependencies
If you encounter Python-related errors:
# Ensure Python packages are installed
pip install numpy pandas scikit-learn xgboost lightgbm
# For transformers support
pip install transformers torch
Next Steps
Now that PostgresML is installed, continue with:
- Basic Usage - Learn fundamental operations
- Training Models - Train your first ML model
- Making Predictions - Use models for inference