Installation and Setup

This guide walks you through installing PostgresML on various platforms and configuring it for your first machine learning project.

Prerequisites

Before installing PostgresML, ensure you have:

  • PostgreSQL 14 or later
  • Administrative (superuser) privileges on PostgreSQL
  • At least 2GB of free disk space
  • Python 3.7+ (for Python SDK, optional)
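
Two of the prerequisites above (the Python version and free disk space) can be checked from a script before installing. The following is a minimal sketch using only the Python standard library; the function names `python_ok` and `disk_ok` are illustrative and not part of PostgresML, and the PostgreSQL version and superuser checks are not covered here.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)            # "Python 3.7+" from the prerequisites list
MIN_FREE_BYTES = 2 * 1024**3   # "At least 2GB of free disk space"


def python_ok(version=None):
    """Return True if the given (major, minor) version is 3.7 or later.

    Defaults to the version of the running interpreter.
    """
    version = sys.version_info[:2] if version is None else version
    return tuple(version) >= MIN_PYTHON


def disk_ok(path="/", free_bytes=None):
    """Return True if at least 2GB is free on the filesystem holding `path`."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= MIN_FREE_BYTES


if __name__ == "__main__":
    print("Python 3.7+:", python_ok())
    print("2GB free disk:", disk_ok())
```

Pass the directory that will hold the PostgreSQL data (for example the Docker volume location) as `path` if it lives on a different filesystem than `/`.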

Installation Methods

Method 1: Docker

The easiest way to get started with PostgresML is with Docker:

# Run the PostgresML container with GPU support
# (--gpus requires the NVIDIA Container Toolkit on the host)
docker run -it \
-v postgresml_data:/var/lib/postgresql \
--gpus all \
-p 5433:5432 \
-p 8000:8000 \
ghcr.io/postgresml/postgresml:latest \
sudo -u postgresml psql -d postgresml

# Or without GPU support
docker run -it \
-v postgresml_data:/var/lib/postgresql \
-p 5433:5432 \
ghcr.io/postgresml/postgresml:latest

Access the PostgresML database:

psql postgres://postgres@localhost:5433/postgresml
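
If `psql` cannot connect, it helps to first confirm that the mapped port is reachable at all. The sketch below is one way to do that with the Python standard library; `port_open` is a hypothetical helper name, and port 5433 is taken from the `-p 5433:5432` mapping in the Docker commands above.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    # 5433 is the host port mapped to the container's PostgreSQL port 5432
    print("PostgresML port reachable:", port_open("localhost", 5433))
```

A `True` result only shows that something is listening on the port; authentication and database availability still need to be confirmed with `psql`.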

Method 2: Ubuntu/Debian Installation

Install PostgresML on Ubuntu or Debian systems:

# Add PostgresML APT repository
echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" | \
sudo tee -a /etc/apt/sources.list

# Update and install
sudo apt update
sudo apt install -y postgresml-14 # or postgresml-15, postgresml-16

# The package installs the extension files automatically;
# you still enable it per database with CREATE EXTENSION (see "Enabling the Extension")

Method 3: From Source

For advanced users who want to build from source:

# Install dependencies
sudo apt install -y \
build-essential \
libpq-dev \
postgresql-server-dev-14 \
python3-dev \
libopenblas-dev \
cmake \
pkg-config \
libclang-dev

# Clone the repository
git clone https://github.com/postgresml/postgresml
cd postgresml/pgml-extension

# Build and install (cargo requires a working Rust toolchain)
cargo install cargo-pgrx
cargo pgrx init --pg14 /usr/bin/pg_config
cargo pgrx install --release

# Install Python dependencies
cd ../pgml-python
pip install -e .

Method 4: Cloud Deployment

PostgresML offers managed cloud hosting at https://postgresml.org:

  1. Sign up for a free account
  2. Create a new database
  3. Get your connection string
  4. Start using PostgresML immediately

Enabling the Extension

After installation, enable PostgresML in your database:

-- Connect to your database
\c your_database

-- Enable the pgml extension
CREATE EXTENSION IF NOT EXISTS pgml;

-- Verify installation
SELECT pgml.version();

Expected output (the version number reflects the release you installed):

 version
---------
 2.7.0

Installing Additional Components

PostgresML Dashboard

The dashboard provides a web UI for managing models:

# Install dashboard (included in Docker image)
sudo apt install -y postgresml-dashboard

# Start the dashboard
postgresml-dashboard --database-url postgres://user:password@localhost:5432/your_db

Access the dashboard at http://localhost:8000
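
The `--database-url` value above uses placeholder credentials. Real passwords often contain characters such as `@` or `/` that must be percent-encoded inside a URL. A minimal sketch of building such a URL safely follows; `database_url` is an illustrative helper, not part of PostgresML or the dashboard.

```python
from urllib.parse import quote


def database_url(user, password, host="localhost", port=5432, dbname="your_db"):
    """Build a postgres:// URL, percent-encoding the user name and password."""
    # safe="" makes quote() encode every reserved character, including "/" and "@"
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgres://{user_enc}:{password_enc}@{host}:{port}/{dbname}"


if __name__ == "__main__":
    print(database_url("user", "p@ss/word"))
```

The same encoded URL can be used wherever this guide shows a `postgres://user:password@...` connection string.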

Python SDK

Install the PostgresML Python SDK for programmatic access:

pip install pgml

Basic usage:

from pgml import Database

# Connect to PostgresML
db = Database("postgres://user:password@localhost:5432/your_db")

# Query with SDK
results = db.query("SELECT pgml.version()")
print(results)

JavaScript SDK

Install the PostgresML JavaScript SDK:

npm install pgml

Basic usage:

const pgml = require('pgml');

const client = pgml.newClient({
  connectionString: 'postgres://user:password@localhost:5432/your_db'
});

async function main() {
  const results = await client.query('SELECT pgml.version()');
  console.log(results);
}

// Run the example and report any connection or query error
main().catch(console.error);

Configuration

Memory Settings

For optimal performance, adjust PostgreSQL configuration:

# Edit postgresql.conf
sudo nano /etc/postgresql/14/main/postgresql.conf

Recommended settings:

# Memory settings (example values for a server with 8GB of RAM)
shared_buffers = 2GB # 25% of system RAM
effective_cache_size = 6GB # 75% of system RAM
work_mem = 64MB # Per operation memory
maintenance_work_mem = 512MB # For VACUUM, CREATE INDEX

# Parallel query settings
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

# Extension settings (if other libraries are already listed, append pgml, separated by commas)
shared_preload_libraries = 'pgml'

Restart PostgreSQL:

sudo systemctl restart postgresql
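
The 25% and 75% rules of thumb in the recommended settings can be applied to any machine size. The following sketch computes the two values from the total system RAM; `recommend_memory` and `to_conf_lines` are hypothetical helper names, and the output uses megabytes, which PostgreSQL accepts as a unit.

```python
def recommend_memory(total_ram_gb):
    """Apply the 25% / 75% rules of thumb to the given amount of system RAM."""
    total_mb = int(total_ram_gb * 1024)
    return {
        "shared_buffers": f"{total_mb // 4}MB",            # 25% of system RAM
        "effective_cache_size": f"{total_mb * 3 // 4}MB",  # 75% of system RAM
    }


def to_conf_lines(settings):
    """Format the settings as lines suitable for postgresql.conf."""
    return [f"{name} = {value}" for name, value in settings.items()]


if __name__ == "__main__":
    for line in to_conf_lines(recommend_memory(8)):
        print(line)
```

For an 8GB server this yields 2048MB and 6144MB, matching the 2GB and 6GB values shown above. Treat the results as starting points; `work_mem` and the parallel worker settings depend on the workload and are left out of this sketch.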

GPU Support

If you have NVIDIA GPUs and want to use them for training:

# Install CUDA toolkit (NVIDIA GPUs)
sudo apt install -y nvidia-cuda-toolkit

# Verify GPU is detected
nvidia-smi

# PostgresML will automatically use GPU when available
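
The `nvidia-smi` check above can also be scripted, for example as part of a provisioning step. This is a minimal sketch, assuming `nvidia-smi` is on the `PATH` when the driver is installed; `list_gpus` is an illustrative name, and the script says nothing about whether PostgresML itself will use the GPU.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are detected."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities are not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # nvidia-smi exists but failed, e.g. no driver loaded
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print("Detected GPUs:", gpus if gpus else "none")
```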

Verification

Verify your installation is working correctly:

-- Test basic functionality
SELECT pgml.version();

-- Create a simple test
CREATE TABLE test_data AS
SELECT
random() AS feature1,
random() AS feature2,
(random() > 0.5)::int AS label
FROM generate_series(1, 1000);

-- Train a simple model
SELECT * FROM pgml.train(
'test_model',
'classification',
'test_data',
'label'
);

-- Make predictions
SELECT
feature1,
feature2,
pgml.predict('test_model', ARRAY[feature1, feature2]) AS prediction
FROM test_data
LIMIT 5;

-- Clean up
DROP TABLE test_data;

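If all commands execute successfully, your installation is complete! Because the labels in this test are random, the predictions will be no better than chance; the test only confirms that training and inference run.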

Troubleshooting

Extension Not Loading

If the extension fails to load:

-- Check extension availability
SELECT * FROM pg_available_extensions WHERE name = 'pgml';

-- Check shared libraries
SHOW shared_preload_libraries;

-- Try loading explicitly
LOAD 'pgml';
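
When reading the output of `SHOW shared_preload_libraries`, keep in mind that the setting is a comma-separated list and may contain other libraries besides `pgml`. The sketch below parses such a value; `preload_entries` and `is_preloaded` are hypothetical helpers for illustration.

```python
def preload_entries(setting):
    """Split a shared_preload_libraries value into individual library names."""
    # Entries may be surrounded by whitespace or quotes
    return [part.strip().strip("\"'") for part in setting.split(",") if part.strip()]


def is_preloaded(setting, library="pgml"):
    """Return True if `library` is listed in the setting value."""
    return library in preload_entries(setting)


if __name__ == "__main__":
    print(is_preloaded("pg_stat_statements, pgml"))
```

Comparing whole entries rather than searching for a substring avoids a false match on a differently named library that merely contains `pgml`.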

Permission Issues

Ensure your user has the necessary permissions:

-- Grant permissions
GRANT ALL ON SCHEMA pgml TO your_user;
GRANT ALL ON ALL TABLES IN SCHEMA pgml TO your_user;

Python Dependencies

If you encounter Python-related errors:

# Ensure Python packages are installed
pip install numpy pandas scikit-learn xgboost lightgbm

# For transformers support
pip install transformers torch
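
To see which of the packages above are actually missing, you can check whether each one is importable. The following sketch uses `importlib` from the standard library; `missing_packages` is an illustrative name. Run it with the same Python interpreter that PostgresML uses, since a different interpreter or virtual environment may have a different set of packages.

```python
import importlib.util

# pip package name -> import name (they differ for scikit-learn)
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
    "lightgbm": "lightgbm",
    "transformers": "transformers",
    "torch": "torch",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", " ".join(missing))
    else:
        print("All packages found")
```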

Next Steps

Now that PostgresML is installed, continue with: