What is PostgresML?

PostgresML is an end-to-end machine learning platform built entirely inside PostgreSQL. It allows you to train models and make predictions using SQL queries, without moving data out of your database.

Overview

PostgresML adds model training and deployment to PostgreSQL itself, so the database that stores your data also trains models on it and serves predictions. This removes the need for a separate ML serving stack and for the pipelines that copy data into it.

Key Features

🤖 In-Database ML

  • Train models directly on your data without ETL
  • Use familiar SQL syntax for machine learning
  • Real-time predictions with low latency
  • Data stays in the database, which reduces security exposure

🎯 Pre-trained Models

  • Access to Hugging Face transformers
  • State-of-the-art NLP models
  • Computer vision models
  • Time series forecasting models

⚡ Vector Operations

  • Built-in vector similarity search
  • Store and query embeddings efficiently
  • Semantic search capabilities
  • RAG (Retrieval Augmented Generation) support
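
To make the idea behind vector similarity search concrete, here is a minimal sketch in plain Python, run outside the database. It ranks stored embeddings by cosine similarity to a query embedding. This is an illustration of the concept only, not how PostgresML implements search; the document ids and three-dimensional vectors are made up for the example, and real embeddings usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings, invented for this sketch.
documents = {
    "returns-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "refund-steps": [0.8, 0.2, 0.1],
}

results = top_k([1.0, 0.0, 0.0], documents, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Inside the database, the same ranking would typically be expressed as an ORDER BY on a distance operator over an indexed vector column, so the comparison does not scan every row.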

🔧 Multiple Algorithms

  • Classification and regression
  • Clustering and dimensionality reduction
  • Natural language processing
  • Time series analysis

How PostgresML Works

PostgresML extends PostgreSQL with machine learning capabilities through a native extension. The workflow is simple:

  1. Store your data in PostgreSQL tables
  2. Train models using SQL functions
  3. Make predictions with SQL queries
  4. Evaluate and deploy models in production
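
Steps 2 and 3 above can be sketched as the SQL a client would send. The Python below only builds the statements and their parameters; it does not connect to a database. It assumes the extension exposes `pgml.train` and `pgml.predict` with the argument names shown, which should be checked against the version you install. The project, table, and column names (`churn_model`, `customer_features`, `churned`) are hypothetical, and the `%s` placeholders follow the style used by drivers such as psycopg.

```python
def train_statement(project, task, relation, target, algorithm="xgboost"):
    """Build a parameterized call to pgml.train (assumed signature)."""
    sql = (
        "SELECT * FROM pgml.train("
        "project_name => %s, task => %s, relation_name => %s, "
        "y_column_name => %s, algorithm => %s)"
    )
    return sql, (project, task, relation, target, algorithm)


def predict_statement(project, features):
    """Build a parameterized call to pgml.predict (assumed signature)."""
    sql = "SELECT pgml.predict(%s, %s::REAL[]) AS prediction"
    return sql, (project, list(features))


# Hypothetical project, table, and target column, for illustration only.
train_sql, train_params = train_statement(
    "churn_model", "classification", "customer_features", "churned"
)
predict_sql, predict_params = predict_statement("churn_model", [0.4, 12.0, 3.0])

print(train_sql, train_params)
print(predict_sql, predict_params)
```

Passing values as parameters rather than formatting them into the SQL string lets the driver handle quoting. Each statement and its parameter tuple can then be handed to a cursor's `execute` method.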

This approach provides several advantages:

  • No data movement between systems
  • Leverage PostgreSQL's ACID guarantees
  • Use existing database security and permissions
  • Scale with your database infrastructure

Use Cases

Real-Time Recommendations

  • Product recommendations in e-commerce
  • Content personalization
  • User behavior prediction

Natural Language Processing

  • Text classification and sentiment analysis
  • Named entity recognition
  • Question answering systems
  • Semantic search

Fraud Detection

  • Real-time transaction analysis
  • Anomaly detection
  • Risk scoring

Time Series Forecasting

  • Sales forecasting
  • Demand prediction
  • Resource planning

Vector Search & RAG

  • Semantic document search
  • Chatbots with context
  • Knowledge base queries

Architecture Components

PostgresML Extension

A PostgreSQL extension written in Rust that provides machine learning functions and operations directly in the database.

Dashboard

A web application for managing models, monitoring performance, and visualizing results.

SDK

Client libraries for Python, JavaScript, and other languages to interact with PostgresML programmatically.

Model Store

Built-in model registry for versioning and deploying trained models.

Comparison with Other Solutions

Feature          | PostgresML    | External ML Services | Standalone ML Platforms
Data Movement    | None          | Required             | Required
Latency          | Very Low      | Medium-High          | Medium
Setup Complexity | Low           | Medium               | High
SQL Integration  | Native        | Via APIs             | Via ETL
Cost             | Database only | Additional services  | Additional infrastructure

Supported Algorithms

PostgresML supports a wide range of algorithms through scikit-learn, XGBoost, LightGBM, and more:

  • Linear Models: Linear/Logistic Regression, Ridge, Lasso
  • Tree-based: Random Forest, Gradient Boosting, XGBoost, LightGBM
  • Neural Networks: MLP, Deep Learning via transformers
  • Clustering: K-Means, DBSCAN, Hierarchical
  • Dimensionality Reduction: PCA, t-SNE, UMAP
  • NLP: Transformers from Hugging Face

Getting Started

Ready to try PostgresML? Continue with the Installation Guide to set up PostgresML on your PostgreSQL database.