
Recommendation Systems using Neural Collaborative Filtering

16 min read
Duong Nguyen Thuan
AI/ML Engineer, MLOps Enthusiast

Recommendation systems are ubiquitous in modern applications, from Netflix suggesting movies to Amazon recommending products. Neural Collaborative Filtering (NCF) represents a significant advancement in recommendation technology by leveraging deep learning to model complex user-item interactions.

What are Recommendation Systems?

Recommendation systems are algorithms designed to suggest relevant items to users based on various factors such as past behavior, preferences, and similarities with other users. They solve the information overload problem by filtering and prioritizing content that users are most likely to find valuable.

Types of Recommendation Systems

  1. Content-Based Filtering: Recommends items similar to those a user has liked in the past based on item features
  2. Collaborative Filtering: Uses the preferences of similar users to make recommendations
  3. Hybrid Systems: Combines multiple recommendation strategies

Why Collaborative Filtering?

Collaborative filtering is particularly powerful because it can discover hidden patterns in user behavior without requiring explicit item features. It answers the question: "Users who liked what you liked also enjoyed these items."

Traditional Collaborative Filtering

Before diving into Neural Collaborative Filtering, let's understand traditional approaches:

Matrix Factorization

Matrix Factorization (MF) is the foundation of many collaborative filtering systems. It decomposes the user-item interaction matrix into two lower-dimensional matrices representing latent factors.

import numpy as np

# User-item rating matrix (users x items)
# Non-zero entries are explicit ratings (1-5); 0 = no observed interaction
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])

# Matrix Factorization: R ≈ P × Q^T
# P: user latent factors (users x k)
# Q: item latent factors (items x k)
# k: number of latent dimensions

The goal is to learn matrices P and Q such that their product approximates the original interaction matrix:

$$R \approx P \times Q^T$$
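
As a concrete illustration, here is a minimal NumPy sketch that learns P and Q by stochastic gradient descent on the observed entries of R above; the learning rate, regularization strength, and epoch count are illustrative choices, not tuned values:

def factorize(R, k=2, lr=0.01, reg=0.02, epochs=500):
    """Learn P (users x k) and Q (items x k) so that P @ Q.T ≈ R on observed entries"""
    num_users, num_items = R.shape
    rng = np.random.default_rng(42)
    P = rng.normal(scale=0.1, size=(num_users, k))
    Q = rng.normal(scale=0.1, size=(num_items, k))

    for _ in range(epochs):
        for u, i in zip(*R.nonzero()):  # iterate only over observed entries
            err = R[u, i] - P[u] @ Q[i]
            # Gradient step with L2 regularization on the latent factors
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

P, Q = factorize(R, k=2)
print(np.round(P @ Q.T, 1))  # reconstruction approximates the observed ratings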

Limitations of Matrix Factorization

While effective, traditional MF has several limitations:

  • Linear interactions: Models user-item affinity with a dot product, which is linear in the latent factors
  • Limited expressiveness: Cannot capture complex, non-linear relationships
  • Feature engineering: Requires manual feature engineering for side information
  • Cold start problem: Struggles with new users or items with no interaction history

Neural Collaborative Filtering (NCF)

Neural Collaborative Filtering, introduced by He et al. in 2017, addresses these limitations by replacing the dot product with a neural network that can learn arbitrary functions from data.

NCF Architecture

The NCF framework consists of three main components:

  1. Embedding Layer: Maps sparse user and item IDs to dense vectors
  2. Neural CF Layers: Multiple fully connected layers to learn interactions
  3. Output Layer: Predicts the probability of user-item interaction

import torch
import torch.nn as nn

class NCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim, hidden_layers):
        """
        Neural Collaborative Filtering Model

        Args:
            num_users: Number of unique users
            num_items: Number of unique items
            embedding_dim: Dimension of embedding vectors
            hidden_layers: List of hidden layer sizes
        """
        super(NCF, self).__init__()

        # Embedding layers
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Neural CF layers
        layers = []
        input_dim = embedding_dim * 2

        for hidden_dim in hidden_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(0.2))
            input_dim = hidden_dim

        self.fc_layers = nn.Sequential(*layers)

        # Output layer
        self.output_layer = nn.Linear(input_dim, 1)
        self.sigmoid = nn.Sigmoid()

        # Initialize weights
        self._init_weights()

    def _init_weights(self):
        """Initialize model weights"""
        nn.init.normal_(self.user_embedding.weight, std=0.01)
        nn.init.normal_(self.item_embedding.weight, std=0.01)

        for layer in self.fc_layers:
            if isinstance(layer, nn.Linear):
                nn.init.xavier_uniform_(layer.weight)
                nn.init.zeros_(layer.bias)

    def forward(self, user_ids, item_ids):
        """
        Forward pass

        Args:
            user_ids: Tensor of user IDs
            item_ids: Tensor of item IDs

        Returns:
            Predicted interaction probabilities
        """
        # Get embeddings
        user_embedded = self.user_embedding(user_ids)
        item_embedded = self.item_embedding(item_ids)

        # Concatenate user and item embeddings
        x = torch.cat([user_embedded, item_embedded], dim=-1)

        # Pass through neural network layers
        x = self.fc_layers(x)

        # Output layer with sigmoid activation
        output = self.sigmoid(self.output_layer(x))

        # Squeeze only the last dim so a batch of size 1 stays 1-D
        return output.squeeze(-1)
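
A quick smoke test of the class above, with made-up sizes, confirms that the forward pass returns one probability per (user, item) pair:

model = NCF(num_users=100, num_items=200, embedding_dim=16, hidden_layers=[32, 16])
users = torch.LongTensor([0, 5, 42])
items = torch.LongTensor([10, 3, 7])
probs = model(users, items)   # three interaction probabilities in (0, 1)
print(probs.shape)            # torch.Size([3])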

Generalized Matrix Factorization (GMF)

GMF is a neural network interpretation of traditional matrix factorization:

class GMF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim):
        """
        Generalized Matrix Factorization

        Equivalent to traditional MF, but implemented as a neural network
        """
        super(GMF, self).__init__()

        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)
        self.output_layer = nn.Linear(embedding_dim, 1)
        self.sigmoid = nn.Sigmoid()

        self._init_weights()

    def _init_weights(self):
        nn.init.normal_(self.user_embedding.weight, std=0.01)
        nn.init.normal_(self.item_embedding.weight, std=0.01)
        nn.init.xavier_uniform_(self.output_layer.weight)

    def forward(self, user_ids, item_ids):
        user_embedded = self.user_embedding(user_ids)
        item_embedded = self.item_embedding(item_ids)

        # Element-wise product of user and item embeddings
        x = user_embedded * item_embedded

        # Weighted sum + sigmoid (with unit weights this reduces to a dot product)
        output = self.sigmoid(self.output_layer(x))
        return output.squeeze(-1)

Multi-Layer Perceptron (MLP) for NCF

The MLP component learns non-linear interactions:

class MLP(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim, hidden_layers):
        """
        Multi-Layer Perceptron for NCF
        """
        super(MLP, self).__init__()

        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Build MLP layers
        layers = []
        input_dim = embedding_dim * 2

        for hidden_dim in hidden_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            layers.append(nn.BatchNorm1d(hidden_dim))
            layers.append(nn.Dropout(0.2))
            input_dim = hidden_dim

        self.mlp_layers = nn.Sequential(*layers)
        self.output_layer = nn.Linear(input_dim, 1)
        self.sigmoid = nn.Sigmoid()

        self._init_weights()

    def _init_weights(self):
        nn.init.normal_(self.user_embedding.weight, std=0.01)
        nn.init.normal_(self.item_embedding.weight, std=0.01)

        for layer in self.mlp_layers:
            if isinstance(layer, nn.Linear):
                nn.init.xavier_uniform_(layer.weight)
                nn.init.zeros_(layer.bias)

    def forward(self, user_ids, item_ids):
        user_embedded = self.user_embedding(user_ids)
        item_embedded = self.item_embedding(item_ids)

        # Concatenate embeddings
        x = torch.cat([user_embedded, item_embedded], dim=-1)

        # Pass through MLP
        x = self.mlp_layers(x)

        # Output
        output = self.sigmoid(self.output_layer(x))
        return output.squeeze(-1)

Neural Matrix Factorization (NeuMF)

NeuMF combines GMF and MLP to leverage both linear and non-linear interactions:

class NeuMF(nn.Module):
    def __init__(self, num_users, num_items, gmf_dim, mlp_dim, mlp_layers):
        """
        Neural Matrix Factorization - combines GMF and MLP

        Args:
            num_users: Number of users
            num_items: Number of items
            gmf_dim: Embedding dimension for GMF
            mlp_dim: Embedding dimension for MLP
            mlp_layers: List of MLP hidden layer sizes
        """
        super(NeuMF, self).__init__()

        # GMF embeddings
        self.gmf_user_embedding = nn.Embedding(num_users, gmf_dim)
        self.gmf_item_embedding = nn.Embedding(num_items, gmf_dim)

        # MLP embeddings
        self.mlp_user_embedding = nn.Embedding(num_users, mlp_dim)
        self.mlp_item_embedding = nn.Embedding(num_items, mlp_dim)

        # MLP layers
        layers = []
        input_dim = mlp_dim * 2

        for hidden_dim in mlp_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(0.2))
            input_dim = hidden_dim

        self.mlp_layers = nn.Sequential(*layers)

        # Final prediction layer takes the concatenated GMF and MLP outputs
        self.output_layer = nn.Linear(gmf_dim + input_dim, 1)
        self.sigmoid = nn.Sigmoid()

        self._init_weights()

    def _init_weights(self):
        nn.init.normal_(self.gmf_user_embedding.weight, std=0.01)
        nn.init.normal_(self.gmf_item_embedding.weight, std=0.01)
        nn.init.normal_(self.mlp_user_embedding.weight, std=0.01)
        nn.init.normal_(self.mlp_item_embedding.weight, std=0.01)

        for layer in self.mlp_layers:
            if isinstance(layer, nn.Linear):
                nn.init.xavier_uniform_(layer.weight)
                nn.init.zeros_(layer.bias)

        nn.init.xavier_uniform_(self.output_layer.weight)

    def forward(self, user_ids, item_ids):
        # GMF part: element-wise product of embeddings
        gmf_user = self.gmf_user_embedding(user_ids)
        gmf_item = self.gmf_item_embedding(item_ids)
        gmf_output = gmf_user * gmf_item

        # MLP part: concatenated embeddings through hidden layers
        mlp_user = self.mlp_user_embedding(user_ids)
        mlp_item = self.mlp_item_embedding(item_ids)
        mlp_input = torch.cat([mlp_user, mlp_item], dim=-1)
        mlp_output = self.mlp_layers(mlp_input)

        # Concatenate GMF and MLP outputs
        concat = torch.cat([gmf_output, mlp_output], dim=-1)

        # Final prediction
        output = self.sigmoid(self.output_layer(concat))
        return output.squeeze(-1)
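
He et al. also report that NeuMF trains better when initialized from separately pre-trained GMF and MLP models rather than from scratch. A minimal warm-start sketch, assuming the GMF, MLP, and NeuMF classes above with matching dimensions:

def warm_start_neumf(neumf, gmf, mlp):
    """Copy pre-trained GMF/MLP weights into a NeuMF model"""
    neumf.gmf_user_embedding.weight.data.copy_(gmf.user_embedding.weight.data)
    neumf.gmf_item_embedding.weight.data.copy_(gmf.item_embedding.weight.data)
    neumf.mlp_user_embedding.weight.data.copy_(mlp.user_embedding.weight.data)
    neumf.mlp_item_embedding.weight.data.copy_(mlp.item_embedding.weight.data)

    # Copy the Linear layers one by one; the two Sequentials interleave
    # different non-parametric layers, so a full load_state_dict won't match
    neumf_linears = [l for l in neumf.mlp_layers if isinstance(l, nn.Linear)]
    mlp_linears = [l for l in mlp.mlp_layers if isinstance(l, nn.Linear)]
    for dst, src in zip(neumf_linears, mlp_linears):
        dst.weight.data.copy_(src.weight.data)
        dst.bias.data.copy_(src.bias.data)

    return neumf

The paper additionally blends the two pre-trained output layers with a trade-off weight; the sketch skips that step for brevity.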

Training NCF Models

Here's how to train an NCF model:

import torch
from torch.utils.data import Dataset, DataLoader
import torch.optim as optim

class RatingDataset(Dataset):
    """Dataset for user-item interactions"""
    def __init__(self, user_ids, item_ids, ratings):
        self.user_ids = torch.LongTensor(user_ids)
        self.item_ids = torch.LongTensor(item_ids)
        self.ratings = torch.FloatTensor(ratings)

    def __len__(self):
        return len(self.ratings)

    def __getitem__(self, idx):
        return self.user_ids[idx], self.item_ids[idx], self.ratings[idx]

def train_ncf(model, train_loader, val_loader, epochs=20, lr=0.001):
    """
    Train an NCF model

    Args:
        model: NCF model instance
        train_loader: Training data loader
        val_loader: Validation data loader
        epochs: Number of training epochs
        lr: Learning rate
    """
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)

    # Loss function and optimizer
    criterion = nn.BCELoss()
    optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=1e-5)

    # Learning rate scheduler
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode='min', factor=0.5, patience=3
    )

    best_val_loss = float('inf')

    for epoch in range(epochs):
        # Training phase
        model.train()
        train_loss = 0.0

        for user_ids, item_ids, ratings in train_loader:
            user_ids = user_ids.to(device)
            item_ids = item_ids.to(device)
            ratings = ratings.to(device)

            # Forward pass
            predictions = model(user_ids, item_ids)
            loss = criterion(predictions, ratings)

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_loss += loss.item()

        # Validation phase
        model.eval()
        val_loss = 0.0

        with torch.no_grad():
            for user_ids, item_ids, ratings in val_loader:
                user_ids = user_ids.to(device)
                item_ids = item_ids.to(device)
                ratings = ratings.to(device)

                predictions = model(user_ids, item_ids)
                loss = criterion(predictions, ratings)
                val_loss += loss.item()

        # Calculate average losses
        train_loss /= len(train_loader)
        val_loss /= len(val_loader)

        print(f'Epoch {epoch+1}/{epochs}:')
        print(f'  Train Loss: {train_loss:.4f}')
        print(f'  Val Loss: {val_loss:.4f}')

        # Update learning rate
        scheduler.step(val_loss)

        # Save best model
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            torch.save(model.state_dict(), 'best_ncf_model.pth')
            print(f'  Model saved with val_loss: {val_loss:.4f}')

        print()

    return model

Negative Sampling

For implicit feedback data (clicks, views, etc.), we need negative sampling:

import random

def negative_sampling(user_ids, item_ids, num_items, num_negatives=4):
    """
    Generate negative samples for implicit feedback

    Args:
        user_ids: List of user IDs with positive interactions
        item_ids: List of item IDs with positive interactions
        num_items: Total number of items
        num_negatives: Number of negative samples per positive sample

    Returns:
        Extended lists with negative samples
    """
    # Set of positive interactions for fast membership checks
    positive_interactions = set(zip(user_ids, item_ids))

    extended_users = []
    extended_items = []
    extended_labels = []

    for user, item in zip(user_ids, item_ids):
        # Add the positive sample
        extended_users.append(user)
        extended_items.append(item)
        extended_labels.append(1.0)

        # Add negative samples (assumes each user has at least
        # num_negatives items they never interacted with)
        neg_count = 0
        while neg_count < num_negatives:
            neg_item = random.randint(0, num_items - 1)

            # Ensure it's truly a negative sample
            if (user, neg_item) not in positive_interactions:
                extended_users.append(user)
                extended_items.append(neg_item)
                extended_labels.append(0.0)
                neg_count += 1

    return extended_users, extended_items, extended_labels
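
For example, two positive interactions with num_negatives=4 expand into ten training samples:

users, items, labels = negative_sampling(
    user_ids=[0, 1],
    item_ids=[3, 7],
    num_items=100,
    num_negatives=4
)
print(len(labels))   # 10
print(sum(labels))   # 2.0 — only the original two samples are positive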

Complete Training Example

Here's a complete example using the MovieLens dataset:

import pandas as pd
from sklearn.model_selection import train_test_split

# Load data (example with MovieLens format)
def load_movielens_data(filepath):
    """Load and preprocess MovieLens data"""
    # Assuming format: userId, movieId, rating, timestamp
    df = pd.read_csv(filepath)

    # Convert to implicit feedback (1 if rating >= 4, 0 otherwise)
    df['label'] = (df['rating'] >= 4).astype(float)

    # Create user and item mappings
    user_ids = df['userId'].unique()
    item_ids = df['movieId'].unique()

    user_map = {id: idx for idx, id in enumerate(user_ids)}
    item_map = {id: idx for idx, id in enumerate(item_ids)}

    df['user_idx'] = df['userId'].map(user_map)
    df['item_idx'] = df['movieId'].map(item_map)

    return df, len(user_ids), len(item_ids)

# Main training script
def main():
    # Load data
    df, num_users, num_items = load_movielens_data('ratings.csv')

    # Split data
    train_df, val_df = train_test_split(df, test_size=0.2, random_state=42)

    # Keep only positive interactions; negatives come from sampling
    train_pos = train_df[train_df['label'] == 1]
    train_users, train_items, train_labels = negative_sampling(
        train_pos['user_idx'].tolist(),
        train_pos['item_idx'].tolist(),
        num_items,
        num_negatives=4
    )

    # Create datasets
    train_dataset = RatingDataset(train_users, train_items, train_labels)
    val_dataset = RatingDataset(
        val_df['user_idx'].tolist(),
        val_df['item_idx'].tolist(),
        val_df['label'].tolist()
    )

    # Create data loaders
    train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=256, shuffle=False)

    # Initialize model
    model = NeuMF(
        num_users=num_users,
        num_items=num_items,
        gmf_dim=32,
        mlp_dim=32,
        mlp_layers=[64, 32, 16]
    )

    # Train model
    model = train_ncf(model, train_loader, val_loader, epochs=20, lr=0.001)

    print("Training completed!")

if __name__ == '__main__':
    main()

Evaluation Metrics

Evaluating recommendation systems requires specialized metrics:

import numpy as np

def hit_rate_at_k(predictions, ground_truth, k=10):
    """
    Calculate Hit Rate@K

    Hit rate measures whether a relevant item appears in the top-K recommendations
    """
    hits = 0
    total = len(ground_truth)

    for user, items in ground_truth.items():
        if user in predictions:
            top_k = predictions[user][:k]
            if any(item in top_k for item in items):
                hits += 1

    return hits / total if total > 0 else 0.0

def ndcg_at_k(predictions, ground_truth, k=10):
    """
    Calculate Normalized Discounted Cumulative Gain@K

    NDCG measures ranking quality, giving more weight to highly ranked items
    """
    ndcgs = []

    for user, true_items in ground_truth.items():
        if user not in predictions:
            continue

        pred_items = predictions[user][:k]

        # Calculate DCG
        dcg = 0.0
        for i, item in enumerate(pred_items):
            if item in true_items:
                dcg += 1.0 / np.log2(i + 2)

        # Calculate ideal DCG
        idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(true_items), k)))

        # Calculate NDCG
        if idcg > 0:
            ndcgs.append(dcg / idcg)

    return np.mean(ndcgs) if ndcgs else 0.0

def evaluate_model(model, test_loader, num_users, num_items, k=10):
    """
    Evaluate an NCF model on a held-out test set

    Returns:
        Dictionary with evaluation metrics
    """
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)
    model.eval()

    # Build the ground truth from the positive interactions in the test set
    ground_truth = {}
    for user_ids, item_ids, labels in test_loader:
        for user, item, label in zip(user_ids.tolist(), item_ids.tolist(), labels.tolist()):
            if label == 1.0:
                ground_truth.setdefault(user, set()).add(item)

    # Score every item for each user in the ground truth
    all_predictions = {}
    with torch.no_grad():
        for user_id in ground_truth:
            item_ids = torch.arange(num_items).to(device)
            user_ids = torch.full((num_items,), user_id, dtype=torch.long).to(device)

            # Predict scores for all items
            scores = model(user_ids, item_ids).cpu().numpy()

            # Get top-K items
            top_k_items = np.argsort(scores)[::-1][:k]
            all_predictions[user_id] = top_k_items.tolist()

    # Calculate metrics
    return {
        f'HR@{k}': hit_rate_at_k(all_predictions, ground_truth, k=k),
        f'NDCG@{k}': ndcg_at_k(all_predictions, ground_truth, k=k)
    }
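
The metric functions can be sanity-checked on toy dictionaries. Here user 0's relevant item is ranked first and user 1's relevant item is missing from the top-K entirely:

predictions = {0: [5, 2, 9], 1: [4, 8, 1]}
ground_truth = {0: {5}, 1: {7}}

print(hit_rate_at_k(predictions, ground_truth, k=3))   # 0.5 — a hit for user 0 only
print(ndcg_at_k(predictions, ground_truth, k=3))       # 0.5 — user 0 scores 1.0, user 1 scores 0.0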

Making Recommendations

Once trained, use the model to generate recommendations:

def get_recommendations(model, user_id, num_items, k=10, exclude_items=None):
    """
    Get top-K recommendations for a user

    Args:
        model: Trained NCF model
        user_id: User ID to generate recommendations for
        num_items: Total number of items
        k: Number of recommendations to return
        exclude_items: Set of item IDs to exclude (e.g., already interacted items)

    Returns:
        List of (item_id, score) tuples
    """
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)
    model.eval()

    if exclude_items is None:
        exclude_items = set()

    with torch.no_grad():
        # Predict scores for all items
        user_ids = torch.full((num_items,), user_id, dtype=torch.long).to(device)
        item_ids = torch.arange(num_items).to(device)

        scores = model(user_ids, item_ids).cpu().numpy()

    # Create list of (item_id, score) pairs, skipping excluded items
    item_scores = [(i, score) for i, score in enumerate(scores)
                   if i not in exclude_items]

    # Sort by score and return top-K
    item_scores.sort(key=lambda x: x[1], reverse=True)

    return item_scores[:k]

# Example usage
def recommend_for_user(model, user_id, user_history, num_items, k=10):
    """
    Generate recommendations for a user

    Args:
        model: Trained model
        user_id: User to recommend for
        user_history: Set of items the user has already interacted with
        num_items: Total number of items
        k: Number of recommendations
    """
    recommendations = get_recommendations(
        model,
        user_id,
        num_items,
        k=k,
        exclude_items=user_history
    )

    print(f"Top {k} recommendations for User {user_id}:")
    for rank, (item_id, score) in enumerate(recommendations, 1):
        print(f"  {rank}. Item {item_id}: {score:.4f}")

    return recommendations

Advanced Techniques

1. Incorporating Side Information

Enhance NCF with user and item features:

class NCF_with_Features(nn.Module):
    def __init__(self, num_users, num_items, user_feat_dim, item_feat_dim,
                 embedding_dim, hidden_layers):
        """NCF with side information"""
        super(NCF_with_Features, self).__init__()

        # Embeddings
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Feature processing
        self.user_feat_fc = nn.Linear(user_feat_dim, embedding_dim)
        self.item_feat_fc = nn.Linear(item_feat_dim, embedding_dim)

        # Neural network layers
        layers = []
        input_dim = embedding_dim * 4  # user_emb + user_feat + item_emb + item_feat

        for hidden_dim in hidden_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(0.2))
            input_dim = hidden_dim

        self.fc_layers = nn.Sequential(*layers)
        self.output_layer = nn.Linear(input_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, user_ids, item_ids, user_features, item_features):
        # Get embeddings
        user_emb = self.user_embedding(user_ids)
        item_emb = self.item_embedding(item_ids)

        # Process features
        user_feat = torch.relu(self.user_feat_fc(user_features))
        item_feat = torch.relu(self.item_feat_fc(item_features))

        # Concatenate all representations
        x = torch.cat([user_emb, user_feat, item_emb, item_feat], dim=-1)

        # Pass through network
        x = self.fc_layers(x)
        output = self.sigmoid(self.output_layer(x))

        return output.squeeze(-1)

2. Attention Mechanism

Add attention to focus on important features:

class AttentionNCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim, hidden_layers):
        """NCF with a feature-level attention mechanism"""
        super(AttentionNCF, self).__init__()

        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Attention network: scores every dimension of the concatenated
        # embedding so the softmax yields a distribution over features
        self.attention = nn.Sequential(
            nn.Linear(embedding_dim * 2, embedding_dim),
            nn.Tanh(),
            nn.Linear(embedding_dim, embedding_dim * 2)
        )

        # Rest of the network
        layers = []
        input_dim = embedding_dim * 2

        for hidden_dim in hidden_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            input_dim = hidden_dim

        self.fc_layers = nn.Sequential(*layers)
        self.output_layer = nn.Linear(input_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, user_ids, item_ids):
        user_emb = self.user_embedding(user_ids)
        item_emb = self.item_embedding(item_ids)

        # Calculate attention weights over the feature dimension
        concat = torch.cat([user_emb, item_emb], dim=-1)
        attention_weights = torch.softmax(self.attention(concat), dim=-1)

        # Re-weight the concatenated features by their attention scores
        attended = concat * attention_weights

        # Pass through network
        x = self.fc_layers(attended)
        output = self.sigmoid(self.output_layer(x))

        return output.squeeze(-1)

3. Multi-Task Learning

Learn multiple objectives simultaneously:

class MultiTaskNCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim, hidden_layers):
        """NCF with multi-task learning (rating prediction + ranking)"""
        super(MultiTaskNCF, self).__init__()

        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Shared layers
        layers = []
        input_dim = embedding_dim * 2

        for hidden_dim in hidden_layers:
            layers.append(nn.Linear(input_dim, hidden_dim))
            layers.append(nn.ReLU())
            input_dim = hidden_dim

        self.shared_layers = nn.Sequential(*layers)

        # Task-specific heads
        self.rating_head = nn.Linear(input_dim, 1)  # Regression
        self.ranking_head = nn.Sequential(
            nn.Linear(input_dim, 1),
            nn.Sigmoid()
        )  # Binary classification

    def forward(self, user_ids, item_ids, task='ranking'):
        user_emb = self.user_embedding(user_ids)
        item_emb = self.item_embedding(item_ids)

        x = torch.cat([user_emb, item_emb], dim=-1)
        shared_output = self.shared_layers(x)

        if task == 'rating':
            return self.rating_head(shared_output).squeeze(-1)
        else:  # ranking
            return self.ranking_head(shared_output).squeeze(-1)
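
During training the two heads can be optimized jointly. A sketch of a single training step, where the dummy batch and the 0.5 weighting between the rating (regression) and ranking (classification) losses are illustrative choices:

mse = nn.MSELoss()
bce = nn.BCELoss()

model = MultiTaskNCF(num_users=100, num_items=200,
                     embedding_dim=16, hidden_layers=[32, 16])
optimizer = optim.Adam(model.parameters(), lr=0.001)

user_ids = torch.LongTensor([0, 1, 2])
item_ids = torch.LongTensor([3, 4, 5])
ratings = torch.FloatTensor([4.0, 2.5, 5.0])   # explicit ratings
clicks = torch.FloatTensor([1.0, 0.0, 1.0])    # implicit labels

# Weighted sum of the two task losses
loss = mse(model(user_ids, item_ids, task='rating'), ratings) \
     + 0.5 * bce(model(user_ids, item_ids, task='ranking'), clicks)

optimizer.zero_grad()
loss.backward()
optimizer.step()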

Best Practices

1. Hyperparameter Tuning

Key hyperparameters to tune (a random-search sketch follows the list):

  • Embedding dimension: 8-128 (typically 32-64 works well)
  • Hidden layer sizes: [64, 32, 16] is a good starting point
  • Learning rate: 0.001-0.01
  • Batch size: 256-1024
  • Negative sampling ratio: 4-10 negatives per positive
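
A minimal random-search loop over these ranges, reusing the NCF model and data loaders defined earlier; note it assumes train_ncf is extended to also return its best validation loss, which the version above does not do:

import random

search_space = {
    'embedding_dim': [16, 32, 64],
    'hidden_layers': [[64, 32, 16], [128, 64], [32, 16]],
    'lr': [0.001, 0.005, 0.01],
}

best_config, best_loss = None, float('inf')
for trial in range(5):
    config = {name: random.choice(options) for name, options in search_space.items()}
    model = NCF(num_users, num_items,
                embedding_dim=config['embedding_dim'],
                hidden_layers=config['hidden_layers'])

    # Assumes train_ncf is modified to return (model, best_val_loss)
    model, val_loss = train_ncf(model, train_loader, val_loader,
                                epochs=5, lr=config['lr'])
    if val_loss < best_loss:
        best_loss, best_config = val_loss, config

print(f'Best config: {best_config} (val loss {best_loss:.4f})')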

2. Regularization

Prevent overfitting:

# L2 regularization in the optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)

# Dropout in the layers
nn.Dropout(0.2)

# Early stopping (inside the epoch loop)
if val_loss < best_val_loss:
    best_val_loss = val_loss
    patience_counter = 0
else:
    patience_counter += 1
    if patience_counter >= patience:
        print("Early stopping triggered")
        break

3. Data Preprocessing

Important preprocessing steps:

def preprocess_data(df):
    """Preprocess interaction data"""
    # Remove users/items with too few interactions
    user_counts = df['user_id'].value_counts()
    item_counts = df['item_id'].value_counts()

    df = df[df['user_id'].isin(user_counts[user_counts >= 5].index)]
    df = df[df['item_id'].isin(item_counts[item_counts >= 5].index)]

    # Normalize timestamps if available
    if 'timestamp' in df.columns:
        df['timestamp'] = (df['timestamp'] - df['timestamp'].min()) / \
                          (df['timestamp'].max() - df['timestamp'].min())

    return df

Advantages and Limitations

Advantages of NCF

  • Non-linear interactions: Can model complex user-item relationships
  • Flexibility: Easy to incorporate additional features
  • Scalability: Efficient training with mini-batch gradient descent
  • Strong performance: Often outperforms traditional matrix factorization
  • Deep learning benefits: Can leverage transfer learning and pre-training

Limitations

  • Computational cost: More expensive than simple matrix factorization
  • Data hungry: Requires substantial training data
  • Cold start: Still struggles with new users/items
  • Interpretability: Black-box nature makes recommendations hard to explain
  • Hyperparameter sensitivity: Requires careful tuning

Real-World Applications

1. E-commerce

# Product recommendation system
model = NeuMF(
    num_users=1000000,
    num_items=500000,
    gmf_dim=64,
    mlp_dim=64,
    mlp_layers=[128, 64, 32]
)

# Recommend products based on purchase history
recommendations = get_recommendations(
    model,
    user_id=12345,
    num_items=500000,
    k=20
)

2. Content Streaming

# Movie/music recommendation
# Incorporate temporal dynamics
class TemporalNCF(NCF):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.time_embedding = nn.Embedding(24, 8)  # Hour of day

    def forward(self, user_ids, item_ids, time_features):
        # Sketch only: concatenate the time embedding with the user/item
        # embeddings (the first fc layer would need widening by 8 units)
        pass

3. Social Media

# Post/content recommendation
# Consider engagement signals (likes, shares, comments)
engagement_weights = {
    'like': 1.0,
    'share': 2.0,
    'comment': 3.0
}
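
One way to use these weights is as per-sample importance in the loss. A sketch, assuming a hypothetical events list of (user_id, item_id, engagement_type) tuples and a trained NCF model:

events = [(0, 3, 'like'), (0, 7, 'share'), (1, 3, 'comment')]

users = torch.LongTensor([u for u, _, _ in events])
items = torch.LongTensor([i for _, i, _ in events])
labels = torch.ones(len(events))  # every event is a positive interaction
weights = torch.FloatTensor([engagement_weights[e] for _, _, e in events])

# Weighted BCE: stronger engagement contributes more to the loss
criterion = nn.BCELoss(reduction='none')
loss = (criterion(model(users, items), labels) * weights).mean()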

Conclusion

Neural Collaborative Filtering represents a significant advancement in recommendation systems by leveraging deep learning to model complex user-item interactions. The flexibility of NCF allows for:

  • Better personalization through non-linear modeling
  • Integration of diverse signals (features, context, temporal dynamics)
  • Scalability to large datasets
  • Continuous improvement through online learning

While NCF has revolutionized recommendation systems, it's important to remember that the best approach often depends on your specific use case, data availability, and computational resources. Consider starting with simpler methods and gradually increasing complexity as needed.

Key Takeaways

  1. NCF extends traditional collaborative filtering with neural networks
  2. Combining GMF and MLP (NeuMF) often yields best results
  3. Proper negative sampling is crucial for implicit feedback
  4. Regularization and hyperparameter tuning are essential
  5. Always evaluate with appropriate metrics (HR@K, NDCG@K)

Further Reading

  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T.-S. (2017). "Neural Collaborative Filtering." Proceedings of the 26th International Conference on World Wide Web (WWW 2017). https://arxiv.org/abs/1708.05031

Code Repository

The complete implementation with example datasets is available on GitHub. Feel free to experiment and adapt the code for your specific use case!

# Clone the repository
git clone https://github.com/YourRepo/ncf-recommendation
cd ncf-recommendation

# Install dependencies
pip install torch pandas numpy scikit-learn

# Train the model
python train.py --data movielens --epochs 20 --batch-size 256