MongoDB with Ruby on Rails
Complete guide to NoSQL database integration
Fundamentals
Document-Oriented Storage
MongoDB stores data in BSON (Binary JSON) format, which is a binary representation of JSON documents. This allows for flexible, schema-less data storage where each document can have different fields.
// Example MongoDB Document
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"name": "John Doe",
"email": "[email protected]",
"age": 30,
"address": {
"street": "123 Main St",
"city": "New York",
"zip": "10001"
},
"interests": ["programming", "music", "travel"],
"created_at": ISODate("2024-01-15T10:30:00Z")
}
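The document above maps naturally onto a Ruby hash. A stdlib-only sketch of that mapping and the JSON round trip (BSON-specific types like ObjectId and ISODate are shown here as plain strings, since this uses only the standard json library):

```ruby
require 'json'

# The document above as a plain Ruby hash. BSON adds richer types
# (ObjectId, dates, binary), represented here as strings for illustration.
user_doc = {
  "_id"       => "507f1f77bcf86cd799439011",
  "name"      => "John Doe",
  "email"     => "[email protected]",
  "address"   => { "city" => "New York", "zip" => "10001" },
  "interests" => ["programming", "music", "travel"]
}

json     = JSON.generate(user_doc)  # serialize to JSON text
restored = JSON.parse(json)         # parse back into a hash

puts restored["address"]["city"]    # prints "New York" -- nesting survives
```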
Core Features
- Document-Oriented: Data is stored in BSON (Binary JSON) format with rich data types
- Schema-Less: No predefined structure required – documents can evolve over time
- Scalable: Horizontal scaling with sharding across multiple servers
- High Performance: Frequently accessed data served from the WiredTiger cache, efficient indexing, and fast queries
- Rich Query Language: Support for complex queries, aggregation pipelines, and geospatial queries
- Replication: Built-in replication for high availability and data redundancy
- GridFS: File storage system for large files
MongoDB vs Relational Databases
Aspect | MongoDB | Relational Databases |
---|---|---|
Data Model | Document-oriented (JSON-like) | Table-based (rows and columns) |
Schema | Flexible, schema-less | Rigid, predefined schema |
Relationships | Embedded documents or references | Foreign keys and joins |
Scaling | Horizontal (sharding) | Vertical (bigger hardware) |
Transactions | Multi-document ACID (MongoDB 4.0+); single-document writes always atomic | Full ACID support |
Query Language | MongoDB Query Language | SQL |
Advantages for Rails Applications
✅ Development Benefits
- Rapid Prototyping: No schema migrations needed for initial development
- Flexible Data Models: Easy to evolve data structures as requirements change
- JSON-like Structure: Natural fit for Rails hashes and JSON APIs
- Rich Data Types: Support for arrays, nested objects, and complex data structures
- Mongoid ORM: Rails-like interface with familiar ActiveRecord patterns
✅ Performance Benefits
- Read-Heavy Workloads: Excellent performance for applications with more reads than writes
- Horizontal Scaling: Can scale across multiple servers easily
- Cache-Friendly Reads: Fast queries when the working set fits in memory and indexes are in place
- Aggregation Pipeline: Powerful data processing capabilities
- Geospatial Queries: Built-in support for location-based features
⚠️ Considerations & Trade-offs
- No Referential Integrity: Must handle relationships in application code
- Learning Curve: Different query patterns and data modeling concepts
- Transaction Caveats: Multi-document ACID transactions exist since MongoDB 4.0 but add overhead; single-document atomicity remains the idiomatic pattern
- Storage Overhead: Document storage can be less space-efficient than normalized tables
- Deployment Complexity: Different hosting and management considerations
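The "no referential integrity" trade-off means cascading deletes live in application code (in Mongoid, typically via dependent: :destroy on the association). A stdlib-only sketch of the idea, with in-memory arrays standing in for collections and all names purely illustrative:

```ruby
# Two "collections" linked by user_id -- the database will not
# enforce this link, so the application must.
users = [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }]
posts = [
  { id: 10, user_id: 1, title: "Hello" },
  { id: 11, user_id: 1, title: "World" },
  { id: 12, user_id: 2, title: "Hi" }
]

# Referential integrity by hand: deleting a user must also delete
# that user's posts, since there is no foreign-key cascade.
def destroy_user!(users, posts, user_id)
  posts.reject! { |p| p[:user_id] == user_id } # cascade first
  users.reject! { |u| u[:id] == user_id }
end

destroy_user!(users, posts, 1)
puts posts.size # prints 1: only Grace's post remains
```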
Ideal Use Cases
- Content Management Systems: Flexible content structures with varying fields
- Real-time Analytics: Fast aggregation and reporting on large datasets
- IoT Applications: Time-series data and sensor readings
- Mobile Apps: JSON APIs and flexible data models
- E-commerce Catalogs: Product data with varying attributes
- Social Media Platforms: User-generated content with complex relationships
- Logging Systems: High-volume write operations
When to Avoid MongoDB
- Workloads dominated by complex multi-document transactions
- Strict referential integrity requirements
- Heavy write workloads with complex relationships
- Existing SQL-based reporting systems
- Team with limited NoSQL experience
Core Components
Mongod (Database Server)
The primary database process that handles data storage, queries, and data management. It manages the data files and provides the database interface.
Mongos (Query Router)
Acts as a query router for sharded clusters. It routes client requests to the appropriate shards and aggregates results.
Config Servers
Store metadata and configuration settings for sharded clusters. They maintain information about data distribution across shards.
Data Organization
Database Structure:
└── Database (e.g., myapp_development)
    ├── Collection (e.g., users)
    │   ├── Document 1
    │   ├── Document 2
    │   └── Document 3
    ├── Collection (e.g., products)
    │   ├── Document 1
    │   └── Document 2
    └── Collection (e.g., orders)
        ├── Document 1
        └── Document 2
Indexing Strategy
MongoDB uses B-tree indexes for efficient query performance. Indexes can be created on single fields, compound fields, or special types like text and geospatial indexes.
// Common Index Types
// Single field index
db.users.createIndex({ "email": 1 })
// Compound index
db.users.createIndex({ "email": 1, "created_at": -1 })
// Text index
db.products.createIndex({ "name": "text", "description": "text" })
// Geospatial index
db.locations.createIndex({ "location": "2dsphere" })
What is Mongoid?
Mongoid is the official MongoDB ODM (Object Document Mapper) for Ruby. It provides a Rails-like interface for working with MongoDB documents, similar to how ActiveRecord works with relational databases.
Key Features
- ActiveRecord-like Interface: Familiar methods like find, where, and create
- Validations: Rails-style validations for document integrity
- Associations: Support for embedded and referenced relationships
- Callbacks: Lifecycle hooks like before_save and after_create
- Scopes: Reusable query chains
- Indexing: Declarative index definitions
- Serialization: Custom serialization for complex data types
Basic Mongoid Model
class User
include Mongoid::Document
include Mongoid::Timestamps
# Field definitions
field :email, type: String
field :name, type: String
field :age, type: Integer
field :active, type: Boolean, default: true
# Validations
validates :email, presence: true, uniqueness: true
validates :name, presence: true
# Indexes
index({ email: 1 }, { unique: true })
# Scopes
scope :active, -> { where(active: true) }
scope :adults, -> { where(:age.gte => 18) }
# Instance methods
def full_name
"#{name} (#{email})"
end
end
Mongoid vs ActiveRecord Comparison
Feature | Mongoid | ActiveRecord |
---|---|---|
Data Storage | BSON Documents | Relational Tables |
Schema | Dynamic fields | Predefined columns |
Relationships | Embedded + Referenced | Foreign keys + Joins |
Queries | MongoDB Query Language | SQL |
Migrations | Not needed | Required for schema changes |
Transactions | Multi-document (4.0+); single-document atomic by default | Full ACID support |
Setup & Configuration
System Requirements
- Operating System: macOS 10.14+, Ubuntu 18.04+, CentOS 7+, Windows 10+
- Memory: Minimum 4GB RAM (8GB+ recommended for production)
- Storage: SSD recommended for better performance
- Ruby: Ruby 2.7+ with Rails 6.0+
- Network: Port 27017 available for MongoDB
Required Tools
# Verify Ruby and Rails versions
ruby --version # Should be 2.7+
rails --version # Should be 6.0+
# Check if MongoDB is already installed
mongod --version
# Verify network connectivity
telnet localhost 27017
macOS Installation
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Add MongoDB tap
brew tap mongodb/brew
# Install MongoDB Community Edition
brew install mongodb-community
# Start MongoDB service
brew services start mongodb/brew/mongodb-community
# Verify installation
mongod --version
mongosh --version  # MongoDB 6.0 ships mongosh; the legacy mongo shell was removed
Ubuntu/Debian Installation
# Import MongoDB public GPG key (note: apt-key is deprecated on
# Ubuntu 22.04+; prefer a keyring file, per MongoDB's install docs)
wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | sudo apt-key add -
# Add MongoDB repository
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
# Update package database
sudo apt-get update
# Install MongoDB
sudo apt-get install -y mongodb-org
# Start MongoDB service
sudo systemctl start mongod
sudo systemctl enable mongod
# Verify installation
mongod --version
Windows Installation
# Download MongoDB Community Server from:
# https://www.mongodb.com/try/download/community
# The MSI installer can register MongoDB as a Windows service for you.
# For a manual (zip) install, extract to C:\mongodb and create a data directory:
mkdir C:\data\db
# Start MongoDB (from C:\mongodb\bin)
mongod --dbpath C:\data\db
# Note: the old "mongod --install" service flag was removed in MongoDB 4.0;
# use the MSI installer or sc.exe to set up a Windows service.
Docker Installation (Alternative)
# Pull MongoDB image
docker pull mongo:6.0
# Run MongoDB container
docker run -d \
--name mongodb \
-p 27017:27017 \
-v mongodb_data:/data/db \
mongo:6.0
# Connect to MongoDB
docker exec -it mongodb mongosh
Step 1: Add Mongoid to Gemfile
# Gemfile
source 'https://rubygems.org'
# ... other gems ...
# MongoDB ODM
gem 'mongoid', '~> 8.0'
# Note: mongoid already depends on the mongo driver and bson gems,
# so they do not need to be listed separately. The legacy bson_ext
# gem is obsolete and should not be used with modern drivers.
Step 2: Install Dependencies
# Install gems
bundle install
# Verify Mongoid installation
rails runner "puts Mongoid::VERSION"
Step 3: Generate Configuration
# Generate Mongoid configuration file
rails generate mongoid:config
# This creates:
# - config/mongoid.yml
# (recent Mongoid versions also generate config/initializers/mongoid.rb;
# it does not modify config/application.rb)
Step 4: Configure Database
# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_development
      options:
        server_selection_timeout: 5
        max_pool_size: 5
        min_pool_size: 1
        max_idle_time: 300
        wait_queue_timeout: 2500
test:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_test
      options:
        server_selection_timeout: 5
        max_pool_size: 5
        min_pool_size: 1
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        # Mongoid expects read/write options in this nested form,
        # not flat read_preference/write_concern keys
        read:
          mode: :secondary
        write:
          w: 1
          j: true
Step 5: Update Application Configuration
# config/application.rb
require_relative "boot"
require "rails"
# Pick the frameworks you want:
require "action_controller/railtie"
require "action_mailer/railtie"
require "action_view/railtie"
require "action_cable/engine"
require "rails/test_unit/railtie"
# require "active_record/railtie" # Comment out if using MongoDB only
module MyApp
class Application < Rails::Application
# ... other configuration ...
# Initialize Mongoid
config.mongoid.logger = Rails.logger
end
end
Development Environment
Local Development Setup
# .env.development
MONGODB_URI=mongodb://localhost:27017/myapp_development
MONGODB_USERNAME=
MONGODB_PASSWORD=
# Enable query logging in development
# config/environments/development.rb
config.mongoid.logger = Rails.logger
config.mongoid.logger.level = Logger::DEBUG
Test Environment
Test Database Configuration
# config/environments/test.rb
config.mongoid.logger = Rails.logger
config.mongoid.logger.level = Logger::INFO
# Optional: clean collections between examples with the
# database_cleaner-mongoid gem
Production Environment
# Production environment variables
MONGODB_URI=mongodb+srv://username:[email protected]/myapp_production
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem
# Enhanced production configuration
# config/mongoid.yml (production section)
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        read:
          mode: :secondary
        write:
          w: 1
          j: true
        ssl: true
        ssl_ca_cert: <%= ENV['MONGODB_SSL_CA_CERT'] %>
        ssl_cert: <%= ENV['MONGODB_SSL_CERT'] %>
        ssl_key: <%= ENV['MONGODB_SSL_KEY'] %>
        retry_writes: true
        retry_reads: true
Database Connection Test
# Test MongoDB connection
rails runner "puts 'MongoDB connected!' if Mongoid.default_client.database"
# Test basic operations
rails runner "
user = User.create(name: 'Test User', email: '[email protected]')
puts 'User created: ' + user.name
user.destroy
puts 'Test completed successfully!'
"
Health Check Script
# lib/tasks/mongodb.rake
namespace :mongodb do
desc "Check MongoDB connection and basic operations"
task health_check: :environment do
begin
# Test connection
client = Mongoid.default_client
database = client.database
puts "✅ MongoDB Connection: OK"
puts "✅ Database: #{database.name}"
# Test write operation
test_collection = database.collection('health_check')
test_collection.insert_one({ test: true, timestamp: Time.current })
puts "✅ Write Operation: OK"
# Test read operation
result = test_collection.find({ test: true }).first
puts "✅ Read Operation: OK"
# Cleanup
test_collection.delete_many({ test: true })
puts "✅ Cleanup: OK"
puts "\n🎉 MongoDB Health Check: PASSED"
rescue => e
puts "❌ MongoDB Health Check: FAILED"
puts "Error: #{e.message}"
exit 1
end
end
end
Performance Testing
# Test connection pool and performance
rails runner "
start_time = Time.current
# Test bulk operations
100.times do |i|
User.create(name: \"User #{i}\", email: \"user#{i}@example.com\")
end
end_time = Time.current
puts \"Created 100 users in #{end_time - start_time} seconds\"
# Test queries (to_a forces the lazy criteria to actually hit the database)
start_time = Time.current
users = User.where(:name => /User/).limit(10).to_a
end_time = Time.current
puts \"Queried #{users.size} users in #{end_time - start_time} seconds\"
# Cleanup
User.delete_all
puts \"Cleanup completed\"
"
Common Issues & Solutions
Connection Refused
Problem: Cannot connect to MongoDB server
# Check if MongoDB is running
brew services list | grep mongodb
sudo systemctl status mongod
# Start MongoDB if not running
brew services start mongodb/brew/mongodb-community
sudo systemctl start mongod
# Check port availability
lsof -i :27017
Authentication Errors
Problem: Authentication failed
# Check connection string format
# Correct: mongodb://username:password@host:port/database
# Wrong: mongodb://host:port/database
# Test connection without authentication first
# Then gradually add authentication requirements
SSL/TLS Issues
Problem: SSL certificate validation errors
# For development only, you can disable certificate verification
# (NOT safe for production)
options:
  ssl: true
  ssl_verify: false

# For production, use real certificate paths
options:
  ssl: true
  ssl_ca_cert: /path/to/ca-certificate.crt
  ssl_cert: /path/to/client-certificate.crt
  ssl_key: /path/to/client-key.pem
Debugging Commands
# Check MongoDB logs (Homebrew on Apple Silicon uses /opt/homebrew/var)
tail -f /usr/local/var/log/mongodb/mongo.log
sudo tail -f /var/log/mongodb/mongod.log
# Check Rails logs for MongoDB queries
tail -f log/development.log | grep -i mongo
# Test MongoDB shell connection
mongosh
# or, on MongoDB 5.0 and earlier, the legacy shell
mongo
# Check Mongoid configuration
rails runner "puts Mongoid.clients"
rails runner "puts Mongoid.default_client.database.name"
Models & Data Modeling
Creating Your First Model
# app/models/user.rb
class User
include Mongoid::Document
include Mongoid::Timestamps
# Field definitions
field :email, type: String
field :name, type: String
field :age, type: Integer
field :active, type: Boolean, default: true
field :preferences, type: Hash, default: {}
field :tags, type: Array, default: []
# Validations
validates :email, presence: true, uniqueness: true
validates :name, presence: true
validates :age, numericality: { greater_than: 0, less_than: 150 }
# Indexes
index({ email: 1 }, { unique: true })
index({ name: 1 })
index({ active: 1, created_at: -1 })
# Scopes
scope :active, -> { where(active: true) }
scope :adults, -> { where(:age.gte => 18) }
scope :recent, -> { order(created_at: -1) }
# Instance methods
def full_name
"#{name} (#{email})"
end
def adult?
age >= 18
end
end
Model Components Explained
- Mongoid::Document: Makes the class a MongoDB document
- Mongoid::Timestamps: Adds created_at and updated_at fields
- field: Defines document fields with types and options
- validates: Rails-style validations for data integrity
- index: Creates database indexes for performance
- scope: Reusable query chains
Field Types Reference
Mongoid Type | Ruby Type | Description | Example |
---|---|---|---|
String | String | Text data | "John Doe" |
Integer | Integer | Whole numbers | 25 |
Float | Float | Decimal numbers | 99.99 |
Boolean | TrueClass/FalseClass | True/false values | true |
Date | Date | Date without time | Date.new(2024, 1, 15) |
DateTime | DateTime | Date with time | DateTime.current |
Array | Array | List of values | ["ruby", "rails"] |
Hash | Hash | Key-value pairs | {theme: "dark"} |
ObjectId | BSON::ObjectId | MongoDB ObjectId | BSON::ObjectId.new |
Field Configuration Options
class Product
include Mongoid::Document
# Basic field with type
field :name, type: String
# Field with default value
field :price, type: Float, default: 0.0
# Field with custom getter/setter
field :slug, type: String
def slug=(value)
super(value&.downcase&.gsub(/\s+/, '-'))
end
# Field with localization
field :description, type: String, localize: true
# Field with custom serialization
field :metadata, type: Hash, default: {}
# Virtual attributes (not stored in database)
attr_accessor :temporary_note
# Custom field methods
def display_price
"$#{price.round(2)}"
end
def expensive?
price > 100
end
end
Field Options Reference
Option | Type | Description | Example |
---|---|---|---|
type | Class | Data type for the field | type: String |
default | Any | Default value for new documents | default: true |
localize | Boolean | Enable localization for the field | localize: true |
as | Symbol | Alias for the field name | as: :title |
overwrite | Boolean | Overwrite existing field definition | overwrite: true |
Custom Field Types
# Custom field type for Money
class Money
include Mongoid::Fields::Serializable
def initialize(amount = 0, currency = 'USD')
@amount = amount.to_f
@currency = currency
end
def serialize
{ amount: @amount, currency: @currency }
end
def deserialize(object)
return self if object.nil?
@amount = object['amount'].to_f
@currency = object['currency']
self
end
def to_s
"#{@currency} #{@amount}"
end
end
# Usage in model
class Product
include Mongoid::Document
field :price, type: Money, default: -> { Money.new(0) }
end
Embedded vs Referenced Documents
Embedded Documents (One-to-Few)
Use when the related data is small, doesn't change frequently, and is always accessed together with the parent.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
# Embedded one-to-one
embeds_one :profile
# Embedded one-to-many
embeds_many :addresses
end
class Profile
include Mongoid::Document
field :bio, type: String
field :avatar_url, type: String
field :location, type: String
embedded_in :user
end
class Address
include Mongoid::Document
field :street, type: String
field :city, type: String
field :state, type: String
field :zip_code, type: String
field :primary, type: Boolean, default: false
embedded_in :user
end
Referenced Documents (One-to-Many/Many-to-Many)
Use when data is large, changes frequently, or needs to be shared across multiple documents.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
# Referenced one-to-many
has_many :posts
# Referenced many-to-many
has_and_belongs_to_many :roles
end
class Post
include Mongoid::Document
field :title, type: String
field :content, type: String
field :published_at, type: DateTime
belongs_to :user
validates :title, presence: true
end
class Role
include Mongoid::Document
field :name, type: String
field :description, type: String
has_and_belongs_to_many :users
end
When to Use Each Strategy
Strategy | Use When | Advantages | Disadvantages |
---|---|---|---|
Embedded | Small data, always accessed together | Fast reads, atomic updates | Document size limits, no sharing |
Referenced | Large data, shared across documents | Flexible, reusable, smaller documents | Multiple queries, no atomic updates |
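The two strategies store the same data in different shapes. A pure-Ruby sketch with hashes standing in for documents (field names illustrative):

```ruby
# Embedded: addresses live inside the user document. One read fetches
# everything, and the whole document updates atomically.
embedded_user = {
  _id: 1,
  name: "Ada",
  addresses: [
    { street: "123 Main St", city: "New York" }
  ]
}

# Referenced: posts are separate documents pointing back via user_id.
# The posts can grow without bloating the user document, but loading
# a user's posts is a second query.
referenced_user = { _id: 1, name: "Ada" }
posts = [
  { _id: 10, user_id: 1, title: "Hello" },
  { _id: 11, user_id: 1, title: "World" }
]

# The application-level "join":
user_posts = posts.select { |p| p[:user_id] == referenced_user[:_id] }
puts user_posts.size # prints 2
```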
Index Types and Usage
class User
include Mongoid::Document
field :email, type: String
field :username, type: String
field :name, type: String
field :status, type: String, default: "active"
field :created_at, type: DateTime
field :last_login_at, type: DateTime
field :location, type: Array # [longitude, latitude]
field :tags, type: Array, default: []
field :metadata, type: Hash, default: {}
# Single field indexes
index({ email: 1 }, { unique: true })
index({ username: 1 }, { unique: true })
index({ status: 1 })
index({ created_at: -1 })
index({ last_login_at: -1 })
# Compound indexes (order matters!)
index({ status: 1, created_at: -1 })
index({ email: 1, status: 1 })
index({ status: 1, last_login_at: -1 })
# Text search indexes
index({ name: "text", bio: "text" })
# Geospatial indexes
index({ location: "2dsphere" })
# Array indexes
index({ tags: 1 })
# Sparse indexes (skip null values)
index({ phone: 1 }, { sparse: true })
# TTL indexes (auto-delete after time)
index({ created_at: 1 }, { expire_after_seconds: 86400 }) # 24 hours
# Partial indexes (only index documents matching a filter)
index({ email: 1 }, { partial_filter_expression: { status: "active" } })
# Background option (deprecated and ignored since MongoDB 4.2,
# where index builds no longer lock the collection)
index({ username: 1 }, { background: true })
end
Index Management
# Create all indexes for a model
User.create_indexes
# Create indexes for all models
Mongoid.create_indexes
# Drop all indexes for a model
User.remove_indexes
# Check existing indexes
User.collection.indexes.each do |index|
puts "Index: #{index['name']}"
puts "Keys: #{index['key']}"
puts "Options: #{index['options']}"
puts "---"
end
# Create indexes with specific options
User.collection.indexes.create_one(
{ email: 1, status: 1 },
{
background: true,
name: "email_status_idx"
}
)
# Drop specific index
User.collection.indexes.drop_one("email_status_idx")
# Check index usage statistics
User.collection.aggregate([
{ "$indexStats" => {} }
])
Index Best Practices
- Compound Index Order: Most selective field first
- Covered Queries: Include all queried fields in index
- Avoid Over-Indexing: Each index has write overhead
- Background Indexing: Deprecated since MongoDB 4.2; modern index builds do not lock the collection
- Monitor Usage: Remove unused indexes
- TTL Indexes: For time-based data cleanup
- Partial Indexes: For conditional queries
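The "order matters" rule for compound indexes follows from index prefixes: an index on { status, created_at } can serve queries on status alone, but not on created_at alone. A pure-Ruby sketch of that prefix check (an illustrative simplification for equality predicates only, not how the real query planner works):

```ruby
# Can a compound index serve an equality-only query? Only if the
# queried fields form a prefix of the index's field list. Equality
# predicates are order-insensitive, hence the sort before comparing.
def index_covers?(index_fields, query_fields)
  prefix = index_fields.take(query_fields.size)
  prefix.sort == query_fields.sort
end

index = [:status, :created_at]

puts index_covers?(index, [:status])               # prints true  (a prefix)
puts index_covers?(index, [:status, :created_at])  # prints true
puts index_covers?(index, [:created_at])           # prints false (not a prefix)
```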
Defining Scopes
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :age, type: Integer
field :status, type: String
field :created_at, type: DateTime
# Basic scopes
scope :active, -> { where(status: "active") }
scope :inactive, -> { where(status: "inactive") }
scope :adults, -> { where(:age.gte => 18) }
scope :recent, -> { order(created_at: -1) }
# Scopes with parameters
scope :older_than, ->(age) { where(:age.gt => age) }
scope :created_after, ->(date) { where(:created_at.gt => date) }
# Chained scopes
scope :active_adults, -> { active.adults }
scope :recent_active, -> { active.recent }
# Scopes with complex logic (always escape user input
# before interpolating it into a regex)
scope :search, ->(query) {
  escaped = Regexp.escape(query)
  any_of(
    { name: /#{escaped}/i },
    { email: /#{escaped}/i }
  )
}
end
# Usage
User.active.adults.recent.limit(10)
User.search("john").older_than(25)
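Under the hood a scope is just a lambda that narrows a criteria object, which is why scopes chain. A stdlib-only sketch of that composition over an array of hashes (illustrative only; real Mongoid scopes build query selectors rather than filtering in Ruby):

```ruby
users = [
  { name: "Ada",   age: 36, status: "active" },
  { name: "Grace", age: 17, status: "active" },
  { name: "Alan",  age: 41, status: "inactive" }
]

# Scopes as lambdas: each takes a collection and returns a narrowed one.
active = ->(coll) { coll.select { |u| u[:status] == "active" } }
adults = ->(coll) { coll.select { |u| u[:age] >= 18 } }

# Chaining scopes is just function composition.
result = adults.call(active.call(users))
puts result.map { |u| u[:name] }.inspect # prints ["Ada"]
```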
Custom Query Methods
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :age, type: Integer
field :location, type: Array
# Class methods for complex queries
def self.find_by_email_domain(domain)
where(email: /@#{domain}$/)
end
def self.find_nearby(lat, lng, radius_km = 10)
where(
location: {
"$near" => {
"$geometry" => {
"type" => "Point",
"coordinates" => [lng, lat]
},
"$maxDistance" => radius_km * 1000
}
}
)
end
def self.age_distribution
collection.aggregate([
{ "$group" => {
"_id" => "$age",
"count" => { "$sum" => 1 }
}},
{ "$sort" => { "_id" => 1 } }
])
end
# Instance methods
def display_name
name.present? ? name : email
end
def age_group
case age
when 0..17 then "minor"
when 18..25 then "young_adult"
when 26..64 then "adult"
else "senior"
end
end
end
Model Callbacks
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :slug, type: String
field :last_login_at, type: DateTime
# Before callbacks
before_create :generate_slug
before_save :normalize_email
before_update :track_changes
# After callbacks
after_create :send_welcome_email
after_save :update_search_index
after_destroy :cleanup_related_data
# Around callbacks
around_save :log_operation_time
private
def generate_slug
self.slug = name.parameterize
end
def normalize_email
self.email = email.downcase.strip
end
def track_changes
Rails.logger.info "User #{id} changed: #{changes.keys.join(', ')}"
end
def send_welcome_email
UserMailer.welcome(self).deliver_later
end
def update_search_index
SearchIndexJob.perform_later(self)
end
def cleanup_related_data
Post.where(user_id: id).destroy_all
end
def log_operation_time
start_time = Time.current
yield
Rails.logger.info "Operation took #{Time.current - start_time} seconds"
end
end
Callback Types
Callback | Triggered When | Common Uses |
---|---|---|
before_create | Before document is created | Generate slugs, set defaults |
after_create | After document is created | Send notifications, create related data |
before_save | Before any save operation | Normalize data, validate custom rules |
after_save | After any save operation | Update search indexes, cache invalidation |
before_destroy | Before document is deleted | Validate deletion, backup data |
after_destroy | After document is deleted | Cleanup related data, audit logging |
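The ordering guarantees in the table can be sketched with a miniature lifecycle runner (pure Ruby and purely illustrative; Mongoid's real implementation is built on ActiveSupport::Callbacks):

```ruby
# A toy document lifecycle: before_save -> persist -> after_save.
class TinyDocument
  attr_reader :log

  def initialize
    @log = []
  end

  def save
    @log << :before_save # normalize data, generate slugs, etc.
    @log << :persist     # the actual database write
    @log << :after_save  # update search indexes, bust caches
  end
end

doc = TinyDocument.new
doc.save
puts doc.log.inspect # prints [:before_save, :persist, :after_save]
```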
E-commerce Product Model
class Product
include Mongoid::Document
include Mongoid::Timestamps
# Basic fields
field :name, type: String
field :description, type: String
field :sku, type: String
field :price, type: Float
field :cost, type: Float
field :status, type: String, default: "draft"
field :category, type: String
field :tags, type: Array, default: []
field :inventory, type: Integer, default: 0
field :weight, type: Float
field :dimensions, type: Hash, default: {}
# Complex fields
field :metadata, type: Hash, default: {}
field :seo_data, type: Hash, default: {}
field :pricing_tiers, type: Array, default: []
# Embedded documents
embeds_many :variants
embeds_many :images
embeds_many :reviews
# Referenced associations
belongs_to :brand
has_many :order_items
# Validations
validates :name, presence: true
validates :sku, presence: true, uniqueness: true
validates :price, numericality: { greater_than: 0 }
validates :inventory, numericality: { greater_than_or_equal_to: 0 }
# Indexes
index({ sku: 1 }, { unique: true })
index({ name: "text", description: "text" })
index({ category: 1, status: 1 })
index({ price: 1 })
index({ tags: 1 })
index({ brand_id: 1 }) # brand is referenced, so index the foreign key
# Scopes
scope :active, -> { where(status: "active") }
scope :in_stock, -> { where(:inventory.gt => 0) }
scope :by_category, ->(category) { where(category: category) }
scope :price_range, ->(min, max) { where(:price.gte => min, :price.lte => max) }
scope :featured, -> { where("metadata.featured" => true) }
# Callbacks
before_save :update_seo_slug
after_save :update_search_index
# Instance methods
def available?
  status == "active" && inventory > 0
end
def profit_margin
  return 0 if cost.to_f.zero?
  ((price - cost) / cost * 100).round(2)
end
def update_inventory(quantity)
  inc(inventory: quantity) # Mongoid's atomic $inc
end
def primary_image
  images.where(primary: true).first || images.first
end
def average_rating
  return 0 if reviews.empty?
  (reviews.sum(&:rating).to_f / reviews.size).round(2)
end
private
def update_seo_slug
  self.seo_data = seo_data.merge(
    slug: name.to_s.parameterize,
    title: "#{name} - #{brand&.name}",
    description: description.to_s.truncate(160)
  )
end
def update_search_index
SearchIndexJob.perform_later(self)
end
end
Social Media Post Model
class Post
include Mongoid::Document
include Mongoid::Timestamps
# Basic fields
field :content, type: String
field :visibility, type: String, default: "public"
field :status, type: String, default: "published"
field :location, type: Array # [longitude, latitude]
field :language, type: String, default: "en"
# Complex fields
field :media_urls, type: Array, default: []
field :hashtags, type: Array, default: []
field :mentions, type: Array, default: []
field :metadata, type: Hash, default: {}
# Embedded documents
embeds_many :comments
embeds_many :reactions
# Referenced associations
belongs_to :user
has_many :shares
# (hashtags live in the Array field above; a has_and_belongs_to_many
# association with the same name would clash with that field)
# Validations
validates :content, presence: true, length: { maximum: 280 }
validates :visibility, inclusion: { in: %w[public private friends] }
# Indexes
index({ user_id: 1, created_at: -1 })
index({ visibility: 1, created_at: -1 })
index({ location: "2dsphere" })
index({ hashtags: 1 })
# Scopes
scope :public_posts, -> { where(visibility: "public") }
scope :recent, -> { order(created_at: -1) }
scope :by_user, ->(user) { where(user: user) }
scope :with_media, -> { where(:media_urls.ne => []) }
scope :trending, -> { where(:created_at.gte => 24.hours.ago) }
# Callbacks
before_save :extract_hashtags_and_mentions
after_create :notify_followers
# Instance methods
def like_count
reactions.where(type: "like").count
end
def comment_count
comments.count
end
def share_count
shares.count
end
def engagement_rate
total_engagement = like_count + comment_count + share_count
user.followers_count > 0 ? (total_engagement.to_f / user.followers_count * 100).round(2) : 0
end
def can_be_viewed_by?(viewer)
return true if visibility == "public"
return true if user == viewer
return true if visibility == "friends" && user.friends.include?(viewer)
false
end
private
def extract_hashtags_and_mentions
self.hashtags = content.scan(/#\w+/).map(&:downcase)
self.mentions = content.scan(/@\w+/).map(&:downcase)
end
def notify_followers
NotificationJob.perform_later(self)
end
end
Queries & Aggregation
Finding Documents
# Find all documents
users = User.all
# Find by ID (ObjectId)
user = User.find("507f1f77bcf86cd799439011")
user = User.find(BSON::ObjectId.from_string("507f1f77bcf86cd799439011"))
# Find by field value
user = User.find_by(email: "[email protected]")
users = User.where(status: "active")
# Find first/last documents
first_user = User.first
last_user = User.last
first_active = User.where(status: "active").first
# Find with multiple conditions
active_adults = User.where(:age.gte => 18, status: "active")
recent_users = User.where(:created_at.gte => 1.week.ago)
premium_users = User.where(:subscription_type.in => ["premium", "enterprise"])
# Find with OR conditions
users = User.any_of(
{ email: /@gmail.com$/ },
{ email: /@yahoo.com$/ }
)
# Find with NOT conditions
non_gmail_users = User.where(:email.nin => [/@gmail.com$/])
# Find with EXISTS conditions
users_with_phone = User.where(:phone.exists => true)
users_without_bio = User.where(:bio.exists => false)
Query Operators Reference
Operator | Description | Example | MongoDB Equivalent |
---|---|---|---|
eq | Equal to | where(status: "active") | $eq |
ne | Not equal to | where(:status.ne => "inactive") | $ne |
gt | Greater than | where(:age.gt => 18) | $gt |
gte | Greater than or equal | where(:age.gte => 18) | $gte |
lt | Less than | where(:age.lt => 65) | $lt |
lte | Less than or equal | where(:age.lte => 65) | $lte |
in | In array | where(:status.in => ["active", "pending"]) | $in |
nin | Not in array | where(:status.nin => ["deleted", "banned"]) | $nin |
exists | Field exists | where(:phone.exists => true) | $exists |
regex | Regular expression | where(:email => /@gmail.com$/) | $regex |
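Each symbol operator expands into a MongoDB selector hash. A pure-Ruby sketch of that expansion (a simplified illustrative translator, not Mongoid's actual implementation):

```ruby
# Translate {field => {op => value}} conditions into MongoDB selectors,
# e.g. {age: {gte: 18}} becomes {"age" => {"$gte" => 18}}.
def to_selector(conditions)
  conditions.each_with_object({}) do |(field, spec), sel|
    sel[field.to_s] =
      if spec.is_a?(Hash)
        spec.transform_keys { |op| "$#{op}" } # gte -> $gte, lt -> $lt, ...
      else
        spec # a bare value is plain equality
      end
  end
end

puts to_selector(age: { gte: 18 }).inspect
# a hash equivalent to {"age" => {"$gte" => 18}}
puts to_selector(status: "active", age: { lt: 65 }).inspect
# plain equality on status, a $lt range on age
```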
Limiting and Sorting
# Limit results
recent_users = User.order(created_at: -1).limit(10)
# Skip documents (pagination) – page N skips (N - 1) * per_page
page_2_users = User.order(created_at: -1).skip(10).limit(10)
# Sort by multiple fields
users = User.order(:status.asc, :created_at.desc)
# Sort by embedded fields
users = User.order("profile.age" => -1)
# Distinct values
unique_statuses = User.distinct(:status)
unique_domains = User.distinct(:email).map { |email| email.split('@').last }.uniq
# Count documents
total_users = User.count
active_count = User.where(status: "active").count
recent_count = User.where(:created_at.gte => 1.day.ago).count
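The skip/limit pattern maps a 1-based page number onto an offset. A small pure-Ruby helper showing the arithmetic (illustrative; for large collections, range-based pagination on an indexed field scales better than large skips):

```ruby
# Offset for a 1-based page number: page 1 skips 0, page 2 skips
# per_page, and so on -- i.e. .skip((page - 1) * per_page).limit(per_page).
def page_offset(page, per_page)
  (page - 1) * per_page
end

puts page_offset(1, 10) # prints 0
puts page_offset(2, 10) # prints 10
puts page_offset(3, 25) # prints 50
```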
Complex Query Operators
# Array operators
users_with_tags = User.where(:tags.all => ["ruby", "rails"])
users_with_any_tag = User.where(:tags.in => ["ruby", "rails"])
users_without_tags = User.where(:tags.nin => ["php", "java"])
# Array element matching
users_with_first_tag = User.where("tags.0" => "ruby")
users_with_size = User.where(:tags.with_size => 3) # .size would collide with Symbol#size
# Object/Embedded document queries
users_with_city = User.where("address.city" => "New York")
users_with_coordinates = User.where("location.0" => { "$gte" => -74, "$lte" => -73 })
# Nested object queries
premium_users = User.where("subscription.plan" => "premium")
active_premium = User.where("subscription.plan" => "premium", "subscription.status" => "active")
# Date range queries
today_users = User.where(:created_at.gte => Date.current.beginning_of_day)
this_week_users = User.where(:created_at.gte => Date.current.beginning_of_week)
last_month_users = User.where(:created_at.gte => 1.month.ago)
# Text search (requires text index)
search_results = User.where("$text" => { "$search" => "john developer" })
# Sorting by relevance requires projecting the text score via $meta
search_with_score = User.collection.find(
{ "$text" => { "$search" => "ruby rails" } },
projection: { score: { "$meta" => "textScore" } }
).sort(score: { "$meta" => "textScore" })
Logical Operators
# AND conditions (default)
active_adults = User.where(:status => "active", :age.gte => 18)
# OR conditions
gmail_or_yahoo = User.any_of(
{ email: /@gmail\.com$/ },
{ email: /@yahoo\.com$/ }
)
# NOR conditions (neither condition true)
not_gmail_not_yahoo = User.nor(
{ email: /@gmail\.com$/ },
{ email: /@yahoo\.com$/ }
)
# Complex logical combinations
complex_query = User.where(:status => "active").any_of(
{ :age.gte => 18, :verified => true },
{ :age.gte => 21, :verified => false }
).nor(
{ email: /@spam\.com$/ }
)
Geospatial Queries
# Near query (requires 2dsphere index)
nearby_users = User.where(
location: {
"$near" => {
"$geometry" => {
"type" => "Point",
"coordinates" => [-73.935242, 40.730610] # NYC coordinates
},
"$maxDistance" => 5000 # 5km radius
}
}
)
# Within polygon
polygon_users = User.where(
location: {
"$geoWithin" => {
"$geometry" => {
"type" => "Polygon",
"coordinates" => [[
[-74, 40], [-74, 41], [-73, 41], [-73, 40], [-74, 40]
]]
}
}
}
)
# Intersects with polygon
intersecting_users = User.where(
location: {
"$geoIntersects" => {
"$geometry" => {
"type" => "Polygon",
"coordinates" => [[
[-74, 40], [-74, 41], [-73, 41], [-73, 40], [-74, 40]
]]
}
}
}
)
Basic Aggregation
# Simple aggregation
result = User.collection.aggregate([
{ "$match" => { status: "active" } },
{ "$group" => {
"_id" => "$age_group",
"count" => { "$sum" => 1 },
"avg_age" => { "$avg" => "$age" }
}},
{ "$sort" => { "_id" => 1 } }
])
# Reusing a Mongoid criteria's selector in a pipeline
# (Mongoid criteria do not accept a raw aggregation pipeline)
result = User.collection.aggregate([
{ "$match" => User.where(status: "active").selector },
{ "$group" => {
"_id" => "$age_group",
"count" => { "$sum" => 1 }
}}
])
Aggregation Stages
Stage | Description | Example |
---|---|---|
$match | Filter documents | { "$match" => { status: "active" } } |
$group | Group by field | { "$group" => { "_id" => "$category", "count" => { "$sum" => 1 } } } |
$sort | Sort results | { "$sort" => { "count" => -1 } } |
$limit | Limit results | { "$limit" => 10 } |
$skip | Skip documents | { "$skip" => 20 } |
$project | Select fields | { "$project" => { name: 1, email: 1, _id: 0 } } |
$lookup | Join collections | { "$lookup" => { from: "posts", localField: "_id", foreignField: "user_id", as: "posts" } } |
$unwind | Deconstruct arrays | { "$unwind" => "$tags" } |
$addFields | Add computed fields | { "$addFields" => { "full_name" => { "$concat" => ["$first_name", " ", "$last_name"] } } } |
Advanced Aggregation Examples
# User statistics by age group
age_stats = User.collection.aggregate([
{ "$addFields" => {
"age_group" => {
"$switch" => {
"branches" => [
{ "case" => { "$lt" => ["$age", 18] }, "then" => "minor" },
{ "case" => { "$lt" => ["$age", 25] }, "then" => "young_adult" },
{ "case" => { "$lt" => ["$age", 65] }, "then" => "adult" }
],
"default" => "senior"
}
}
}},
{ "$group" => {
"_id" => "$age_group",
"count" => { "$sum" => 1 },
"avg_age" => { "$avg" => "$age" },
"min_age" => { "$min" => "$age" },
"max_age" => { "$max" => "$age" }
}},
{ "$sort" => { "count" => -1 } }
])
# User activity timeline
activity_timeline = User.collection.aggregate([
{ "$match" => { "created_at" => { "$gte" => 30.days.ago } } }, # raw pipelines take plain MongoDB operators, not Mongoid's :field.gte syntax
{ "$group" => {
"_id" => {
"year" => { "$year" => "$created_at" },
"month" => { "$month" => "$created_at" },
"day" => { "$dayOfMonth" => "$created_at" }
},
"new_users" => { "$sum" => 1 }
}},
{ "$sort" => { "_id" => 1 } }
])
# Top users by post count
top_users = User.collection.aggregate([
{ "$lookup" => {
"from" => "posts",
"localField" => "_id",
"foreignField" => "user_id",
"as" => "posts"
}},
{ "$addFields" => {
"post_count" => { "$size" => "$posts" }
}},
{ "$match" => { "post_count" => { "$gt" => 0 } } },
{ "$sort" => { "post_count" => -1 } },
{ "$limit" => 10 },
{ "$project" => {
"name" => 1,
"email" => 1,
"post_count" => 1,
"_id" => 0
}}
])
Aggregation Operators
Category | Operators | Description |
---|---|---|
Arithmetic | $add, $subtract, $multiply, $divide, $mod | Mathematical operations |
Comparison | $eq, $ne, $gt, $gte, $lt, $lte | Value comparisons |
Logical | $and, $or, $not, $nor | Logical operations |
String | $concat, $substr, $toLower, $toUpper | String manipulation |
Date | $year, $month, $dayOfMonth, $hour | Date/time operations |
Array | $size, $push, $addToSet, $first, $last | Array operations |
Conditional | $cond, $ifNull, $switch | Conditional logic ($case is not a standalone operator; cases live inside $switch branches) |
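Several of the operators above compose naturally inside a single stage. The fragment below (field names first_name, last_name, and age are illustrative) builds a pipeline combining $concat, $cond, and $gte; run it with User.collection.aggregate(pipeline):

```ruby
# Pipeline fragment combining operators from the table.
# $addFields computes new fields; $match then filters on one of them.
pipeline = [
  { "$addFields" => {
    "full_name" => { "$concat" => ["$first_name", " ", "$last_name"] },
    "is_adult"  => { "$cond" => { "if"   => { "$gte" => ["$age", 18] },
                                  "then" => true,
                                  "else" => false } }
  } },
  { "$match" => { "is_adult" => true } }
]
# Run with: User.collection.aggregate(pipeline)
```

Building the pipeline as a plain Ruby array keeps it easy to test and reuse before sending it to the server.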
Query Performance Best Practices
# Use indexes for frequently queried fields
class User
include Mongoid::Document
field :email, type: String
field :status, type: String
field :created_at, type: DateTime
# Create compound indexes for common query patterns
index({ email: 1 }, { unique: true })
index({ status: 1, created_at: -1 })
index({ "profile.age" => 1, status: 1 })
# Text search index
index({ name: "text", bio: "text" })
# Geospatial index
index({ location: "2dsphere" })
end
# Use covered queries (every filtered and returned field is in the index)
# Note: _id must also be excluded from the projection (or included in the
# index) for the query to be fully covered
# Good: status and created_at are both in the { status: 1, created_at: -1 } index
User.where(status: "active").only(:status, :created_at)
# Avoid: :name is not in the index, forcing a document fetch
User.where(status: "active").only(:status, :created_at, :name)
# Use projection to limit returned fields
users = User.where(status: "active").only(:name, :email)
# Use limit for large result sets
recent_users = User.order(created_at: -1).limit(100)
# Use skip with limit for pagination
page_users = User.order(created_at: -1).skip(offset).limit(per_page)
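Note that skip-based pagination rescans and discards every skipped document, so deep pages get progressively slower. A common alternative is keyset (cursor-based) pagination; here is a minimal sketch, assuming a descending created_at sort (next_page_filter is a hypothetical helper, not a Mongoid API):

```ruby
# Keyset pagination: instead of skipping N documents, filter on the
# last created_at value seen on the previous page.
def next_page_filter(last_created_at)
  last_created_at ? { "created_at" => { "$lt" => last_created_at } } : {}
end

# First page: next_page_filter(nil) == {}
# Later pages (driver-level):
#   User.collection.find(next_page_filter(cursor)).sort(created_at: -1).limit(per_page)
```

The trade-off: keyset pagination cannot jump to an arbitrary page number, but each page costs the same regardless of depth.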
Query Analysis and Optimization
# Analyze query performance
explanation = User.where(status: "active").explain
# Check if query uses index
puts "Uses index: #{explanation['queryPlanner']['winningPlan']['inputStage']['indexName']}"
# Check execution time
puts "Execution time: #{explanation['executionStats']['executionTimeMillis']}ms"
# Check documents examined vs returned
puts "Docs examined: #{explanation['executionStats']['totalDocsExamined']}"
puts "Docs returned: #{explanation['executionStats']['nReturned']}"
# Profile slow queries (level 1 logs only operations slower than slowms;
# level 2 profiles every operation)
Mongoid.default_client.database.command(profile: 1, slowms: 100)
# Monitor query performance
class QueryLogger
def self.log(query, duration)
Rails.logger.info "Query: #{query} took #{duration}ms"
end
end
# Use in models
class User
include Mongoid::Document
def self.with_logging(label = name)
start_time = Time.current
result = yield
duration = ((Time.current - start_time) * 1000).round
QueryLogger.log(label, duration) # Mongoid has no to_sql; pass a descriptive label instead
result
end
end
Bulk Operations
# Bulk insert
users_data = [
{ name: "John", email: "[email protected]" },
{ name: "Jane", email: "[email protected]" },
{ name: "Bob", email: "[email protected]" }
]
User.collection.insert_many(users_data)
# Bulk update
User.collection.update_many(
{ status: "pending" },
{ "$set" => { status: "active", updated_at: Time.current } }
)
# Upsert (use update_one: an upsert matches or inserts at most one document)
User.collection.update_one(
{ email: "[email protected]" },
{ "$set" => { name: "John Updated", updated_at: Time.current } },
{ upsert: true }
)
# Bulk delete
User.collection.delete_many({ status: "inactive" })
# Using Mongoid for bulk operations
User.where(status: "pending").update_all(status: "active")
# Batch processing (Mongoid has no find_in_batches; the driver already
# streams results in cursor batches, and each_slice yields app-level batches)
User.where(status: "pending").each_slice(1000) do |batch|
batch.each do |user|
user.update(status: "active")
end
end
E-commerce Analytics
# Sales analytics
sales_analytics = Order.collection.aggregate([
{ "$match" => {
created_at: { "$gte" => 30.days.ago },
status: "completed"
}},
{ "$group" => {
"_id" => {
"year" => { "$year" => "$created_at" },
"month" => { "$month" => "$created_at" },
"day" => { "$dayOfMonth" => "$created_at" }
},
"total_sales" => { "$sum" => "$total_amount" },
"order_count" => { "$sum" => 1 },
"avg_order_value" => { "$avg" => "$total_amount" }
}},
{ "$sort" => { "_id" => 1 } }
])
# Product performance
product_performance = Product.collection.aggregate([
{ "$lookup" => {
"from" => "order_items",
"localField" => "_id",
"foreignField" => "product_id",
"as" => "orders"
}},
{ "$addFields" => {
"total_sold" => { "$sum" => "$orders.quantity" },
# $multiply cannot take array operands; map over the joined docs instead
"total_revenue" => { "$sum" => {
"$map" => {
"input" => "$orders",
"as" => "item",
"in" => { "$multiply" => ["$$item.quantity", "$$item.price"] }
}
}}
}},
{ "$match" => { "total_sold" => { "$gt" => 0 } } },
{ "$sort" => { "total_revenue" => -1 } },
{ "$limit" => 10 }
])
# Customer segmentation
customer_segments = User.collection.aggregate([
{ "$lookup" => {
"from" => "orders",
"localField" => "_id",
"foreignField" => "user_id",
"as" => "orders"
}},
{ "$addFields" => {
"total_spent" => { "$sum" => "$orders.total_amount" },
"order_count" => { "$size" => "$orders" },
"avg_order_value" => { "$avg" => "$orders.total_amount" }
}},
{ "$addFields" => {
"segment" => {
"$switch" => {
"branches" => [
{ "case" => { "$gte" => ["$total_spent", 1000] }, "then" => "premium" },
{ "case" => { "$gte" => ["$total_spent", 500] }, "then" => "regular" }
],
"default" => "new"
}
}
}},
{ "$group" => {
"_id" => "$segment",
"count" => { "$sum" => 1 },
"avg_spent" => { "$avg" => "$total_spent" }
}}
])
Social Media Analytics
# Engagement analytics
engagement_analytics = Post.collection.aggregate([
{ "$match" => {
created_at: { "$gte" => 7.days.ago },
visibility: "public"
}},
{ "$addFields" => {
"total_engagement" => {
"$add" => [
{ "$size" => "$reactions" },
{ "$size" => "$comments" },
"$share_count"
]
}
}},
{ "$group" => {
"_id" => {
"year" => { "$year" => "$created_at" },
"month" => { "$month" => "$created_at" },
"day" => { "$dayOfMonth" => "$created_at" }
},
"total_posts" => { "$sum" => 1 },
"total_engagement" => { "$sum" => "$total_engagement" },
"avg_engagement" => { "$avg" => "$total_engagement" }
}},
{ "$sort" => { "_id" => 1 } }
])
# Trending hashtags
trending_hashtags = Post.collection.aggregate([
{ "$match" => {
created_at: { "$gte" => 24.hours.ago },
visibility: "public"
}},
{ "$unwind" => "$hashtags" },
{ "$group" => {
"_id" => "$hashtags",
"count" => { "$sum" => 1 },
"total_engagement" => { "$sum" => { "$add" => [
{ "$size" => "$reactions" },
{ "$size" => "$comments" }
]}}
}},
{ "$sort" => { "count" => -1 } },
{ "$limit" => 10 }
])
# User activity timeline
user_activity = User.collection.aggregate([
{ "$lookup" => {
"from" => "posts",
"localField" => "_id",
"foreignField" => "user_id",
"as" => "posts"
}},
{ "$addFields" => {
"post_count" => { "$size" => "$posts" },
"last_post_date" => { "$max" => "$posts.created_at" }
}},
{ "$addFields" => {
"activity_level" => {
"$switch" => {
"branches" => [
{ "case" => { "$gte" => ["$post_count", 10] }, "then" => "high" },
{ "case" => { "$gte" => ["$post_count", 5] }, "then" => "medium" }
],
"default" => "low"
}
}
}},
{ "$group" => {
"_id" => "$activity_level",
"count" => { "$sum" => 1 },
"avg_posts" => { "$avg" => "$post_count" }
}}
])
Content Management Queries
# Content search with relevance
content_search = Article.collection.aggregate([
{ "$match" => {
"$text" => { "$search" => "ruby rails mongodb" },
status: "published"
}},
{ "$addFields" => {
"relevance_score" => { "$meta" => "textScore" }
}},
{ "$sort" => { "relevance_score" => -1 } },
{ "$limit" => 20 }
])
# Category content distribution
category_distribution = Article.collection.aggregate([
{ "$match" => { status: "published" } },
{ "$group" => {
"_id" => "$category",
"article_count" => { "$sum" => 1 },
"total_views" => { "$sum" => "$view_count" },
"avg_rating" => { "$avg" => "$rating" }
}},
{ "$sort" => { "article_count" => -1 } }
])
# Author performance
author_performance = User.collection.aggregate([
{ "$lookup" => {
"from" => "articles",
"localField" => "_id",
"foreignField" => "author_id",
"as" => "articles"
}},
{ "$addFields" => {
"published_articles" => {
"$size" => {
"$filter" => {
"input" => "$articles",
"cond" => { "$eq" => ["$$this.status", "published"] }
}
}
},
"total_views" => { "$sum" => "$articles.view_count" },
"avg_rating" => { "$avg" => "$articles.rating" }
}},
{ "$match" => { "published_articles" => { "$gt" => 0 } } },
{ "$sort" => { "total_views" => -1 } },
{ "$limit" => 10 }
])
Associations & Relationships
One-to-One Embedding
What is One-to-One Embedding? This is when you embed a single document within another document. It's perfect for data that belongs exclusively to the parent and is always accessed together. Think of it like a user having one profile or one set of preferences.
When to Use: Use one-to-one embedding when the embedded data is small, doesn't change frequently, and is always accessed together with the parent document. Examples include user profiles, preferences, settings, or configuration data.
Benefits: Fast reads (no additional queries), atomic updates, and simple data access. The embedded document is stored directly within the parent document, so there's no need for joins.
Limitations: The embedded document cannot be shared between multiple parents, and the parent document size increases. MongoDB has a 16MB document size limit.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
embeds_one :profile
embeds_one :preferences
validates :email, presence: true, uniqueness: true
end
class Profile
include Mongoid::Document
field :bio, type: String
field :avatar_url, type: String
field :location, type: String
field :website, type: String
field :birth_date, type: Date
embedded_in :user
validates :bio, length: { maximum: 500 }
def display_location
location.present? ? location : "Not specified"
end
end
class Preferences
include Mongoid::Document
field :theme, type: String, default: "light"
field :language, type: String, default: "en"
field :notifications, type: Hash, default: {}
field :privacy_settings, type: Hash, default: {}
embedded_in :user
def notification_enabled?(type)
notifications[type.to_s] == true
end
end
# Usage examples
user = User.create(email: "[email protected]", name: "John Doe")
user.create_profile(
bio: "Ruby developer passionate about clean code",
location: "New York, NY",
website: "https://johndoe.dev"
)
user.create_preferences(
theme: "dark",
language: "en",
notifications: { email: true, push: false, sms: true }
)
# Accessing embedded documents
user.profile.bio # => "Ruby developer passionate about clean code"
user.preferences.notification_enabled?(:email) # => true
user.profile.display_location # => "New York, NY"
One-to-Many Embedding
What is One-to-Many Embedding? This allows you to embed multiple documents within a parent document. It's like having a collection of related items that belong exclusively to one parent. Think of a user having multiple addresses, phone numbers, or social media accounts.
When to Use: Use one-to-many embedding when you have a collection of related data that:
- Always belongs to one parent (never shared)
- Is relatively small in size
- Is accessed together with the parent
- Doesn't change frequently
Benefits: Atomic updates (all embedded documents are updated together), fast reads (no additional queries), and simple data access. You can also query embedded documents directly using MongoDB's dot notation.
Considerations: Be mindful of document size limits (16MB), and remember that embedded documents cannot be shared between parents. For large collections or frequently changing data, consider using referenced relationships instead.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
embeds_many :addresses
embeds_many :phone_numbers
embeds_many :social_accounts
validates :email, presence: true, uniqueness: true
end
class Address
include Mongoid::Document
field :street, type: String
field :city, type: String
field :state, type: String
field :zip_code, type: String
field :country, type: String, default: "USA"
field :primary, type: Boolean, default: false
field :address_type, type: String, default: "home" # home, work, billing
embedded_in :user
validates :street, :city, :state, presence: true
scope :primary, -> { where(primary: true) }
scope :by_type, ->(type) { where(address_type: type) }
def full_address
[street, city, state, zip_code, country].compact.join(", ")
end
def make_primary!
user.addresses.update_all(primary: false)
update!(primary: true)
end
end
class PhoneNumber
include Mongoid::Document
field :number, type: String
field :type, type: String, default: "mobile" # mobile, home, work
field :primary, type: Boolean, default: false
field :verified, type: Boolean, default: false
embedded_in :user
validates :number, presence: true, format: { with: /\A\+?[\d\s\-\(\)]+\z/ }
scope :verified, -> { where(verified: true) }
scope :primary, -> { where(primary: true) }
end
class SocialAccount
include Mongoid::Document
field :platform, type: String # twitter, linkedin, github
field :username, type: String
field :url, type: String
field :verified, type: Boolean, default: false
embedded_in :user
validates :platform, :username, presence: true
scope :verified, -> { where(verified: true) }
scope :by_platform, ->(platform) { where(platform: platform) }
end
# Usage examples
user = User.create(email: "[email protected]", name: "Jane Smith")
# Add addresses
user.addresses.create(
street: "123 Main St",
city: "New York",
state: "NY",
zip_code: "10001",
primary: true
)
user.addresses.create(
street: "456 Work Ave",
city: "New York",
state: "NY",
zip_code: "10002",
address_type: "work"
)
# Add phone numbers
user.phone_numbers.create(
number: "+1-555-123-4567",
type: "mobile",
primary: true,
verified: true
)
# Add social accounts
user.social_accounts.create(
platform: "github",
username: "janesmith",
url: "https://github.com/janesmith",
verified: true
)
# Querying embedded documents
primary_address = user.addresses.primary.first
verified_phones = user.phone_numbers.verified
github_account = user.social_accounts.by_platform("github").first
Embedded Document Best Practices
When to Use Embedded Documents: Embedded documents are perfect for data that has a strong parent-child relationship and is always accessed together. They provide excellent performance for read operations since all data is retrieved in a single query.
Size Considerations: Keep embedded documents small and manageable. MongoDB has a 16MB document size limit, so avoid embedding large arrays or complex nested structures. If your embedded data grows large, consider moving to referenced relationships.
Access Patterns: Use embedded documents when the data is always accessed together with the parent. If you frequently need to access embedded data independently, consider using referenced relationships instead.
Update Patterns: Embedded documents support atomic updates, which means all embedded documents are updated together. This is great for data consistency but can be inefficient if you only need to update a single embedded document.
Best Practices Summary:
- Use for small, related data: Addresses, phone numbers, preferences
- Always accessed together: Profile with user, addresses with user
- Limited size: Avoid embedding large arrays or complex nested structures
- Atomic updates: Updates to embedded documents are atomic
- No sharing: Embedded documents cannot be shared between parents
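Since the 16MB limit comes up repeatedly, it can help to guard against runaway embedding before a write fails. A rough sketch (near_size_limit? is a hypothetical helper; JSON byte size only approximates BSON size, but it is close enough for an early warning):

```ruby
require 'json'

# MongoDB's hard per-document BSON limit
MAX_DOCUMENT_BYTES = 16 * 1024 * 1024

# Flag documents whose serialized size approaches the limit.
# JSON size is a proxy for BSON size, not an exact measurement.
def near_size_limit?(document_hash, threshold: 0.8)
  document_hash.to_json.bytesize > MAX_DOCUMENT_BYTES * threshold
end

user_doc = { name: "Jane", addresses: [{ city: "New York" }] * 3 }
near_size_limit?(user_doc) # => false for a small document
```

If a check like this starts firing for a model, that is a strong signal to move the growing embedded array into a referenced collection.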
Embedded Document Queries
Querying Embedded Documents: MongoDB provides powerful querying capabilities for embedded documents. You can query embedded fields using dot notation, which allows you to search within nested structures efficiently. This is one of the key advantages of MongoDB's document model.
Dot Notation: Use dot notation to access nested fields. For example, "addresses.city" refers to the city field within the addresses array. This works for both single embedded documents and arrays of embedded documents.
Array Queries: When querying arrays of embedded documents, you can use operators like $exists to check for the presence of elements, or array indices to access specific positions. This is particularly useful for finding users with multiple addresses or specific types of embedded documents.
Complex Queries: You can combine multiple conditions on embedded documents to create sophisticated queries. This allows you to find documents that match specific criteria across different embedded structures, making MongoDB very flexible for complex data relationships.
# Find users with specific embedded document criteria
users_with_ny_address = User.where("addresses.city" => "New York")
users_with_verified_phone = User.where("phone_numbers.verified" => true)
users_with_github = User.where("social_accounts.platform" => "github")
# Find users whose primary address is in a specific state
# ($elemMatch ensures both conditions match the same embedded address;
# separate dot-notation conditions may match different array elements)
users_in_california = User.where(addresses: { "$elemMatch" => { primary: true, state: "CA" } })
# Find users with multiple addresses
users_with_multiple_addresses = User.where("addresses.1" => { "$exists" => true })
# Find users with verified social accounts
users_with_verified_social = User.where("social_accounts.verified" => true)
# Complex embedded queries
users_with_complete_profile = User.where(
"profile.bio" => { "$exists" => true, "$ne" => "" },
"addresses.primary" => true,
"phone_numbers.verified" => true
)
One-to-Many Relationships
What are Referenced Relationships? Referenced relationships use document IDs to link documents across different collections. Unlike embedded relationships, the related documents are stored separately and referenced by their IDs. This is similar to foreign keys in relational databases.
When to Use Referenced Relationships: Use referenced relationships when:
- Data is large or complex
- Documents need to be shared between multiple parents
- Data changes frequently
- You need to query related documents independently
- You want to avoid hitting document size limits
Benefits: Referenced relationships provide flexibility and scalability. They allow you to:
- Share documents between multiple parents
- Query related documents independently
- Handle large datasets efficiently
- Update related documents without affecting the parent
Considerations: Referenced relationships require additional queries to fetch related data, which can lead to N+1 query problems. Use eager loading (includes) to optimize performance when you need to access related documents.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :status, type: String, default: "active"
# One-to-many associations
has_many :posts
has_many :comments
has_many :orders
has_many :notifications
validates :email, presence: true, uniqueness: true
# Scopes for related data
scope :with_posts, -> { includes(:posts) }
scope :active_with_posts, -> { where(status: "active").includes(:posts) }
def post_count
posts.count
end
def recent_posts(limit = 5)
posts.order(created_at: -1).limit(limit)
end
def total_comments
comments.count
end
end
class Post
include Mongoid::Document
include Mongoid::Timestamps
field :title, type: String
field :content, type: String
field :status, type: String, default: "draft"
field :published_at, type: DateTime
field :view_count, type: Integer, default: 0
field :tags, type: Array, default: []
belongs_to :user
has_many :comments
has_many :likes
validates :title, presence: true
validates :content, presence: true, length: { minimum: 10 }
scope :published, -> { where(status: "published") }
scope :recent, -> { order(created_at: -1) }
scope :popular, -> { order(view_count: -1) }
def publish!
update!(status: "published", published_at: Time.current)
end
def increment_view_count!
inc(view_count: 1) # Mongoid's atomic increment (there is no ActiveRecord-style increment!)
end
end
class Comment
include Mongoid::Document
include Mongoid::Timestamps
field :content, type: String
field :status, type: String, default: "approved"
field :rating, type: Integer
belongs_to :user
belongs_to :post
validates :content, presence: true, length: { minimum: 2 }
validates :rating, numericality: { greater_than: 0, less_than: 6 }, allow_nil: true
scope :approved, -> { where(status: "approved") }
scope :recent, -> { order(created_at: -1) }
end
# Usage examples
user = User.create(email: "[email protected]", name: "John Author")
# Create posts
post1 = user.posts.create(
title: "Getting Started with MongoDB",
content: "MongoDB is a powerful NoSQL database...",
tags: ["mongodb", "nosql", "database"]
)
post2 = user.posts.create(
title: "Advanced Rails Patterns",
content: "In this post, we'll explore advanced Rails patterns...",
tags: ["rails", "ruby", "patterns"]
)
# Add comments to posts
other_user = User.create(email: "[email protected]", name: "Jane Reader")
other_user.comments.create(
post: post1,
content: "Great article! Very helpful.",
rating: 5
)
# Query relationships
user.posts.published.count # => 0 (posts are drafts)
user.posts.first.publish! # => Publish the first post
user.posts.published.count # => 1
# Eager loading to avoid N+1 queries
users_with_posts = User.includes(:posts).where(:id.in => [user.id, other_user.id])
users_with_posts.each do |u|
puts "#{u.name} has #{u.posts.count} posts"
end
Many-to-Many Relationships
What are Many-to-Many Relationships? Many-to-many relationships allow documents to be associated with multiple other documents in both directions. This is useful when you have complex relationships where both sides can have multiple connections. Think of users having multiple roles, or users following multiple other users.
When to Use Many-to-Many Relationships: Use many-to-many relationships when:
- Both sides of the relationship can have multiple connections
- You need to query relationships from both directions
- The relationship data is simple (just IDs)
- You want to avoid creating intermediate documents
Implementation Options: Mongoid offers two main approaches for many-to-many relationships:
- has_and_belongs_to_many: Simple relationships stored as arrays of IDs on both documents
- Intermediate model: A separate document with two belongs_to associations, used when the relationship carries extra data (Mongoid does not support has_many :through)
Performance Considerations: Many-to-many relationships can become complex to query efficiently. Consider using indexes on the relationship arrays and be mindful of array size limits. For very large relationships, consider using a separate collection as an intermediate table.
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
# Many-to-many associations
has_and_belongs_to_many :roles
has_and_belongs_to_many :groups
has_and_belongs_to_many :followed_users, class_name: "User", inverse_of: :followers
has_and_belongs_to_many :followers, class_name: "User", inverse_of: :followed_users
validates :email, presence: true, uniqueness: true
def admin?
roles.any? { |role| role.name == "admin" }
end
def follow!(user)
followed_users << user unless followed_users.include?(user)
end
def unfollow!(user)
followed_users.delete(user)
end
def following?(user)
followed_users.include?(user)
end
end
class Role
include Mongoid::Document
field :name, type: String
field :description, type: String
field :permissions, type: Array, default: []
has_and_belongs_to_many :users
validates :name, presence: true, uniqueness: true
scope :active, -> { where(:name.nin => ["deleted"]) }
def has_permission?(permission)
permissions.include?(permission.to_s)
end
end
class Group
include Mongoid::Document
field :name, type: String
field :description, type: String
field :privacy, type: String, default: "public" # public, private, secret
has_and_belongs_to_many :users
has_many :posts
validates :name, presence: true
scope :public_groups, -> { where(privacy: "public") }
def member_count
users.count
end
def add_member(user)
users << user unless users.include?(user)
end
def remove_member(user)
users.delete(user)
end
end
# Usage examples
# Create roles
admin_role = Role.create(name: "admin", permissions: ["manage_users", "manage_content"])
moderator_role = Role.create(name: "moderator", permissions: ["moderate_content"])
user_role = Role.create(name: "user", permissions: ["create_content"])
# Create groups
ruby_group = Group.create(name: "Ruby Developers", description: "Ruby programming community")
rails_group = Group.create(name: "Rails Developers", description: "Rails framework community")
# Create users and assign roles
user1 = User.create(email: "[email protected]", name: "Admin User")
user2 = User.create(email: "[email protected]", name: "Moderator User")
user3 = User.create(email: "[email protected]", name: "Regular User")
user1.roles << admin_role
user2.roles << moderator_role
user3.roles << user_role
# Add users to groups
ruby_group.add_member(user1)
ruby_group.add_member(user2)
rails_group.add_member(user1)
rails_group.add_member(user3)
# Follow relationships
user1.follow!(user2)
user1.follow!(user3)
user2.follow!(user1)
# Query relationships
User.where(:role_ids.in => [admin_role.id]).count # => 1 (HABTM stores foreign keys in role_ids)
ruby_group.users.count # => 2
user1.followed_users.count # => 2
user1.followers.count # => 1
# Check permissions
user1.admin? # => true
user2.admin? # => false
admin_role.has_permission?("manage_users") # => true
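The performance note above recommends indexing the relationship arrays. A sketch of the index declarations for the models in this section (Mongoid stores HABTM foreign keys in *_ids arrays; build the indexes with rake db:mongoid:create_indexes):

```ruby
# Index configuration sketch for the HABTM models above
class User
  include Mongoid::Document
  # Queries such as User.where(:role_ids.in => [id]) hit these indexes
  index({ role_ids: 1 })
  index({ followed_user_ids: 1 })
end

class Role
  include Mongoid::Document
  # Reverse lookups: which users hold this role
  index({ user_ids: 1 })
end
```

MongoDB's multikey indexes index each array element individually, so membership checks on these arrays stay fast as the arrays grow.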
Referenced Relationship Best Practices
When to Use Referenced Relationships: Referenced relationships are ideal for data that needs to be shared, queried independently, or updated frequently. They provide the flexibility to handle complex data relationships while maintaining good performance characteristics.
Performance Optimization: The key to good performance with referenced relationships is proper indexing and eager loading. Always create indexes on foreign key fields and use includes() to avoid N+1 query problems when accessing related documents.
Data Independence: Referenced relationships allow documents to exist independently. This is perfect for data that can be shared between multiple parents or needs to be queried and updated independently of its parent document.
Scalability Considerations: While referenced relationships provide flexibility, they can become performance bottlenecks if not properly optimized. Monitor query performance and consider denormalization for frequently accessed data.
Best Practices Summary:
- Use for large or shared data: Posts, comments, orders
- Independent entities: Data that can exist on its own
- Frequent updates: Data that changes often
- Eager loading: Use includes() to avoid N+1 queries
- Index foreign keys: Always index referenced fields
Polymorphic Relationships
What are Polymorphic Relationships? Polymorphic relationships allow a document to belong to multiple different types of documents. This is useful when you have a common behavior or feature that can be applied to different types of content. Think of comments that can be attached to posts, articles, videos, or any other content type.
When to Use Polymorphic Relationships: Use polymorphic relationships when:
- You have common behavior across different document types
- You want to avoid creating separate collections for each relationship
- The related data has the same structure regardless of the parent type
- You need to query related data across different parent types
Implementation: Polymorphic relationships in Mongoid use two fields on the child document:
- commentable_type: Stores the class name of the parent
- commentable_id: Stores the ID of the parent document
Performance Considerations: Polymorphic relationships can be slower to work with than regular relationships because resolving each parent requires a query against that parent's own collection. Use a compound index on the type and ID fields, and be aware that Mongoid cannot eager-load polymorphic associations.
class Comment
include Mongoid::Document
include Mongoid::Timestamps
field :content, type: String
field :status, type: String, default: "approved"
# Polymorphic association
belongs_to :commentable, polymorphic: true
belongs_to :user
validates :content, presence: true
scope :approved, -> { where(status: "approved") }
scope :recent, -> { order(created_at: -1) }
end
class Post
include Mongoid::Document
include Mongoid::Timestamps
field :title, type: String
field :content, type: String
field :status, type: String, default: "draft"
belongs_to :user
has_many :comments, as: :commentable
validates :title, presence: true
end
class Article
include Mongoid::Document
include Mongoid::Timestamps
field :title, type: String
field :content, type: String
field :category, type: String
belongs_to :author, class_name: "User"
has_many :comments, as: :commentable
validates :title, presence: true
end
class Video
include Mongoid::Document
include Mongoid::Timestamps
field :title, type: String
field :url, type: String
field :duration, type: Integer
belongs_to :creator, class_name: "User"
has_many :comments, as: :commentable
validates :title, :url, presence: true
end
# Usage examples
user = User.create(email: "[email protected]", name: "Commenter")
# Create different content types
post = Post.create(
title: "MongoDB Guide",
content: "Complete guide to MongoDB...",
user: User.first
)
article = Article.create(
title: "Rails Best Practices",
content: "Learn Rails best practices...",
category: "Programming",
author: User.first
)
video = Video.create(
title: "MongoDB Tutorial",
url: "https://youtube.com/watch?v=abc123",
duration: 1800,
creator: User.first
)
# Add comments to different content types
user.comments.create(
commentable: post,
content: "Great post about MongoDB!"
)
user.comments.create(
commentable: article,
content: "Very helpful article!"
)
user.comments.create(
commentable: video,
content: "Excellent tutorial!"
)
# Query polymorphic relationships
post.comments.count # => 1
article.comments.count # => 1
video.comments.count # => 1
# Find all comments by a user
user.comments.each do |comment| # Mongoid cannot eager load a polymorphic belongs_to; commentable is fetched lazily
puts "Comment on #{comment.commentable.class.name}: #{comment.content}"
end
# Find comments on specific content types
Comment.where(commentable_type: "Post").count # => 1
Comment.where(commentable_type: "Article").count # => 1
Comment.where(commentable_type: "Video").count # => 1
Polymorphic Best Practices
Design Considerations: When implementing polymorphic relationships, ensure that all parent types have a consistent interface. This makes it easier to work with the polymorphic association and reduces the complexity of your code.
Performance Optimization: Polymorphic relationships require careful indexing to maintain good performance. Always create compound indexes on both the type and ID fields. Because Mongoid cannot eager load the polymorphic side of a belongs_to association, batch or cache parent lookups yourself when accessing polymorphic associations in a loop.
Type Safety: Since polymorphic relationships store class names as strings, it's important to validate the commentable_type values and handle cases where the referenced class might not exist or might have been renamed.
Use Cases: Polymorphic relationships are perfect for features that can be applied to multiple types of content, such as comments, likes, attachments, or any other shared behavior across different document types.
Best Practices Summary:
- Use for shared behavior: Comments, likes, attachments
- Consistent interface: All commentable objects should have similar methods
- Index polymorphic fields: Index both commentable_type and commentable_id
- Eager loading: includes() does not work on a polymorphic belongs_to in Mongoid – batch or cache parent lookups manually
- Type safety: Validate commentable_type values
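The type-safety point above can be sketched in plain Ruby (the ALLOWED_TYPES constant and helper method are illustrative assumptions, not Mongoid API); in a Mongoid model the same rule would typically be written as validates :commentable_type, inclusion: { in: %w[Post Article Video] }.

```ruby
# Illustrative whitelist check for polymorphic type safety.
# Rejecting unknown type names up front guards against renamed or
# deleted classes before any constant lookup is attempted.
ALLOWED_TYPES = %w[Post Article Video].freeze

def safe_commentable_type(type_name)
  return nil unless ALLOWED_TYPES.include?(type_name)
  type_name
end

puts safe_commentable_type("Post").inspect   # "Post"
puts safe_commentable_type("Widget").inspect # nil
```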
Eager Loading Strategies
What is Eager Loading? Eager loading is a technique that loads related documents in a single query instead of making separate queries for each relationship. This is crucial for avoiding the N+1 query problem, where accessing related data results in many additional database queries.
The N+1 Query Problem: When you have a collection of documents and need to access their related data, without eager loading, MongoDB will make one query for the main documents and then one additional query for each document to fetch its related data. This can result in hundreds or thousands of queries for large datasets.
When to Use Eager Loading: Use eager loading whenever you know you'll need to access related data for multiple documents. This is especially important when:
- Displaying lists with related data
- Iterating through documents and accessing relationships
- Building complex views that need multiple related documents
Performance Impact: Eager loading can dramatically improve performance by reducing the number of database queries. However, it does load more data into memory, so use it judiciously and only for relationships you actually need.
# Avoid N+1 queries with eager loading
# Bad: N+1 queries
users = User.all
users.each do |user|
puts "#{user.name} has #{user.posts.count} posts" # one count query per user
end
# Good: Eager loading
users = User.includes(:posts).all
users.each do |user|
puts "#{user.name} has #{user.posts.size} posts" # size uses the loaded documents; count would still query
end
# Multiple associations
users = User.includes(:posts, :comments, :roles).all
# Nested eager loading
users = User.includes(posts: :comments).all
# Conditional eager loading
users = User.includes(:posts).where(:status => "active")
# Polymorphic eager loading
comments = Comment.includes(:user).all # :commentable cannot be eager loaded (polymorphic belongs_to)
# Custom eager loading with scopes
users = User.includes(:posts).where(:status => "active")
users.each do |user|
puts "#{user.name}: #{user.posts.published.count} published posts"
end
Indexing for Relationships
Why Index Relationships? Proper indexing is crucial for maintaining good performance when working with relationships in MongoDB. Without proper indexes, queries that involve relationships can become very slow, especially as your data grows.
Key Indexing Strategies: When working with relationships, you should create indexes on:
- Foreign key fields: Always index the fields that reference other documents
- Polymorphic fields: Index both the type and ID fields for polymorphic relationships
- Compound indexes: Create compound indexes for frequently used query combinations
- Array fields: Index array fields that store relationship IDs
Performance Considerations: Indexes improve query performance but add overhead to write operations. Monitor your index usage and remove unused indexes. For polymorphic relationships, consider creating separate indexes for each type if you frequently query by specific types.
Index Maintenance: Regularly review your indexes to ensure they're being used effectively. MongoDB provides tools to analyze index usage and identify unused or inefficient indexes.
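One such tool is the $indexStats aggregation stage, which reports per-index usage counters. A sketch of checking it through Mongoid (requires a live MongoDB connection, so treat this as a console snippet rather than runnable application code):

```ruby
# Indexes whose "accesses.ops" stays low over a long window are
# candidates for removal.
Comment.collection.aggregate([{ "$indexStats" => {} }]).each do |stats|
  puts "#{stats['name']}: #{stats.dig('accesses', 'ops')} ops since #{stats.dig('accesses', 'since')}"
end
```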
class User
include Mongoid::Document
field :email, type: String
field :status, type: String
has_many :posts
has_many :comments
# Index foreign keys in related models
index({ email: 1 }, { unique: true })
index({ status: 1 })
end
class Post
include Mongoid::Document
field :title, type: String
field :status, type: String
belongs_to :user # defines the user_id foreign key field automatically
# Index foreign key
index({ user_id: 1 })
index({ status: 1, user_id: 1 })
index({ created_at: -1, user_id: 1 })
end
class Comment
include Mongoid::Document
field :content, type: String
# The belongs_to associations below define the foreign key fields
# (user_id, post_id, commentable_type, commentable_id) automatically,
# so explicit field declarations for them are redundant.
belongs_to :user
belongs_to :post, optional: true # belongs_to is required by default in Mongoid 6+
belongs_to :commentable, polymorphic: true, optional: true
# Index foreign keys
index({ user_id: 1 })
index({ post_id: 1 })
index({ commentable_type: 1, commentable_id: 1 })
index({ created_at: -1, user_id: 1 })
end
# Create all indexes (per model, or via: rake db:mongoid:create_indexes)
User.create_indexes
Post.create_indexes
Comment.create_indexes
Relationship Query Optimization
# Use projection to limit returned fields
users = User.only(:name, :email).includes(:posts)
# Use scopes for common queries
class User
include Mongoid::Document
has_many :posts
# MongoDB has no joins, so a cross-collection filter needs a second query
scope :with_recent_posts, -> {
where(:id.in => Post.where(:created_at.gte => 1.week.ago).distinct(:user_id))
}
# Aggregation pipelines return raw documents rather than Mongoid criteria,
# so expose them as class methods instead of scopes
def self.with_post_count
collection.aggregate([
{ "$lookup" => {
"from" => "posts",
"localField" => "_id",
"foreignField" => "user_id",
"as" => "posts"
}},
{ "$addFields" => { "post_count" => { "$size" => "$posts" } } },
{ "$match" => { "post_count" => { "$gt" => 0 } } }
])
end
end
# Batch processing for large datasets (Mongoid has no find_in_batches;
# each_slice works because criteria are Enumerable)
User.all.each_slice(1000) do |batch|
batch.each do |user|
user.posts.includes(:comments).each do |post|
# Process post and comments
end
end
end
# Use aggregation for complex relationship queries
top_users = User.collection.aggregate([
{ "$lookup" => {
"from" => "posts",
"localField" => "_id",
"foreignField" => "user_id",
"as" => "posts"
}},
{ "$addFields" => {
"post_count" => { "$size" => "$posts" },
"total_views" => { "$sum" => "$posts.view_count" }
}},
{ "$match" => { "post_count" => { "$gt" => 0 } } },
{ "$sort" => { "total_views" => -1 } },
{ "$limit" => 10 }
])
E-commerce Relationships
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :status, type: String, default: "active"
# Embedded relationships
embeds_one :profile
embeds_many :addresses
embeds_many :payment_methods
# Referenced relationships
has_many :orders
has_many :reviews
has_many :wishlist_items
has_and_belongs_to_many :favorite_categories
validates :email, presence: true, uniqueness: true
def total_spent
orders.completed.sum(:total_amount)
end
def average_order_value
completed_orders = orders.completed
return 0 if completed_orders.empty?
completed_orders.sum(:total_amount) / completed_orders.count
end
end
class Order
include Mongoid::Document
include Mongoid::Timestamps
field :order_number, type: String
field :status, type: String, default: "pending"
field :total_amount, type: Float
field :shipping_address, type: Hash
field :billing_address, type: Hash
belongs_to :user
has_many :order_items
has_many :payments
validates :order_number, presence: true, uniqueness: true
scope :completed, -> { where(status: "completed") }
scope :pending, -> { where(status: "pending") }
def complete!
update!(status: "completed")
end
def total_items
order_items.sum(:quantity)
end
end
class Product
include Mongoid::Document
include Mongoid::Timestamps
field :name, type: String
field :description, type: String
field :price, type: Float
field :sku, type: String
field :category, type: String
field :tags, type: Array, default: []
field :inventory, type: Integer, default: 0
belongs_to :brand
has_many :order_items
has_many :reviews
has_many :wishlist_items
validates :name, :price, :sku, presence: true
validates :sku, uniqueness: true
scope :in_stock, -> { where(:inventory.gt => 0) }
scope :by_category, ->(category) { where(category: category) }
def available?
inventory > 0
end
def average_rating
reviews.any? ? reviews.avg(:rating) : 0
end
end
class Review
include Mongoid::Document
include Mongoid::Timestamps
field :rating, type: Integer
field :title, type: String
field :content, type: String
field :verified_purchase, type: Boolean, default: false
belongs_to :user
belongs_to :product
validates :rating, presence: true, numericality: { greater_than: 0, less_than: 6 }
validates :content, presence: true, length: { minimum: 10 }
scope :verified, -> { where(verified_purchase: true) }
scope :recent, -> { order(created_at: -1) }
end
# Usage examples
user = User.create(email: "[email protected]", name: "John Customer")
# Add embedded data
user.create_profile(
phone: "+1-555-123-4567",
preferences: { newsletter: true, marketing: false }
)
user.addresses.create(
street: "123 Main St",
city: "New York",
state: "NY",
zip_code: "10001",
primary: true
)
# Create products and orders
product1 = Product.create(
name: "Ruby Programming Book",
price: 29.99,
sku: "BOOK-RUBY-001",
category: "Books",
inventory: 50
)
product2 = Product.create(
name: "Rails Framework Guide",
price: 39.99,
sku: "BOOK-RAILS-001",
category: "Books",
inventory: 30
)
# Create order
order = user.orders.create(
order_number: "ORD-#{Time.current.to_i}",
total_amount: 69.98,
shipping_address: user.addresses.where(primary: true).first.attributes,
billing_address: user.addresses.where(primary: true).first.attributes
)
# Add items to order
order.order_items.create(
product: product1,
quantity: 1,
price: product1.price
)
order.order_items.create(
product: product2,
quantity: 1,
price: product2.price
)
# Add review
user.reviews.create(
product: product1,
rating: 5,
title: "Excellent book!",
content: "This book helped me learn Ruby programming from scratch.",
verified_purchase: true
)
# Query relationships
product1.average_rating # => 5.0
user.orders.completed.count # => 0 (the order is still pending)
order.complete!
user.orders.completed.count # => 1
user.total_spent # => 69.98
user.average_order_value # => 69.98
Social Network Relationships
class User
include Mongoid::Document
field :username, type: String
field :email, type: String
field :name, type: String
field :bio, type: String
field :status, type: String, default: "active"
# Embedded relationships
embeds_one :profile
embeds_many :social_links
# Referenced relationships
has_many :posts
has_many :comments
has_many :messages_sent, class_name: "Message", foreign_key: "sender_id"
has_many :messages_received, class_name: "Message", foreign_key: "recipient_id"
# Many-to-many relationships
has_and_belongs_to_many :followed_users, class_name: "User", inverse_of: :followers
has_and_belongs_to_many :followers, class_name: "User", inverse_of: :followed_users
has_and_belongs_to_many :groups
validates :username, :email, presence: true, uniqueness: true
def follow!(user)
followed_users << user unless followed_users.include?(user)
end
def unfollow!(user)
followed_users.delete(user)
end
def following?(user)
followed_users.include?(user)
end
def feed_posts
Post.where(:user_id.in => followed_users.pluck(:id) + [id])
.order(created_at: -1)
.includes(:user, :comments)
end
end
class Post
include Mongoid::Document
include Mongoid::Timestamps
field :content, type: String
field :visibility, type: String, default: "public"
field :location, type: Array # [longitude, latitude]
field :tags, type: Array, default: []
belongs_to :user
has_many :comments
has_many :likes
has_many :shares
validates :content, presence: true, length: { maximum: 1000 }
scope :public_posts, -> { where(visibility: "public") }
scope :recent, -> { order(created_at: -1) }
def like_count
likes.count
end
def comment_count
comments.count
end
def share_count
shares.count
end
end
class Group
include Mongoid::Document
field :name, type: String
field :description, type: String
field :privacy, type: String, default: "public"
field :rules, type: Array, default: []
has_and_belongs_to_many :users
has_many :posts
validates :name, presence: true
scope :public_groups, -> { where(privacy: "public") }
def member_count
users.count
end
def add_member(user)
users << user unless users.include?(user)
end
def remove_member(user)
users.delete(user)
end
def is_member?(user)
users.include?(user)
end
end
class Message
include Mongoid::Document
include Mongoid::Timestamps
field :content, type: String
field :read, type: Boolean, default: false
field :message_type, type: String, default: "text" # text, image, file
belongs_to :sender, class_name: "User"
belongs_to :recipient, class_name: "User"
validates :content, presence: true
scope :unread, -> { where(read: false) }
scope :recent, -> { order(created_at: -1) }
def mark_as_read!
update!(read: true)
end
end
# Usage examples
user1 = User.create(username: "john_doe", email: "[email protected]", name: "John Doe")
user2 = User.create(username: "jane_smith", email: "[email protected]", name: "Jane Smith")
user3 = User.create(username: "bob_wilson", email: "[email protected]", name: "Bob Wilson")
# Follow relationships
user1.follow!(user2)
user1.follow!(user3)
user2.follow!(user1)
# Create group
ruby_group = Group.create(
name: "Ruby Developers",
description: "Community for Ruby developers",
privacy: "public"
)
ruby_group.add_member(user1)
ruby_group.add_member(user2)
# Create posts
post1 = user1.posts.create(
content: "Just learned about MongoDB associations!",
visibility: "public",
tags: ["mongodb", "rails", "learning"]
)
post2 = user2.posts.create(
content: "Great post, John! MongoDB is amazing.",
visibility: "public",
tags: ["mongodb", "agreement"]
)
# Add comments
user2.comments.create(
post: post1,
content: "Thanks for sharing this!"
)
# Send messages
user1.messages_sent.create(
recipient: user2,
content: "Hey Jane, check out my new post about MongoDB!"
)
# Query relationships
user1.followed_users.count # => 2
user1.followers.count # => 1
user1.feed_posts.count # => 2 (post1 and post2, including user1's own post)
ruby_group.member_count # => 2
user2.messages_received.unread.count # => 1
Validations & Callbacks
Core Validation Types
What are Validations? Validations are rules that ensure data integrity and quality before documents are saved to the database. They act as a safety net, preventing invalid or inconsistent data from being stored. Validations run before save operations and can prevent documents from being persisted if they don't meet the specified criteria.
Why Use Validations? Validations provide several important benefits:
- Data Integrity: Ensure only valid data is stored
- Business Rules: Enforce application-specific requirements
- Error Prevention: Catch issues early in the development cycle
- User Experience: Provide clear feedback about validation errors
- Security: Prevent malicious or malformed data
Validation Types: MongoDB with Mongoid provides a comprehensive set of validation types:
- Presence: Ensures fields are not blank or nil
- Uniqueness: Prevents duplicate values across documents
- Format: Validates against regular expressions or patterns
- Length: Ensures string fields meet size requirements
- Numericality: Validates numeric ranges and types
- Inclusion/Exclusion: Restricts values to specific sets
Validation Timing: Validations run at specific points in the document lifecycle:
- Before Save: Validations run before any save operation
- Before Update: Can be configured to run before updates
- Conditional: Can be made conditional based on other fields
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :age, type: Integer
field :username, type: String
field :bio, type: String
field :website, type: String
field :status, type: String, default: "active"
# Presence validations
validates :email, presence: true
validates :name, presence: true
validates :username, presence: true
# Uniqueness validations
validates :email, uniqueness: true
validates :username, uniqueness: { case_sensitive: false }
# Format validations
validates :email, format: { with: URI::MailTo::EMAIL_REGEXP }
validates :website, format: { with: URI::regexp(%w[http https]), allow_blank: true }
# Length validations
validates :name, length: { minimum: 2, maximum: 50 }
validates :bio, length: { maximum: 500 }
validates :username, length: { minimum: 3, maximum: 20 }
# Numerical validations
validates :age, numericality: {
greater_than: 0,
less_than: 150,
only_integer: true,
allow_nil: true
}
# Inclusion validations
validates :status, inclusion: {
in: %w[active inactive suspended deleted],
message: "must be active, inactive, suspended, or deleted"
}
# Exclusion validations
validates :username, exclusion: {
in: %w[admin root system],
message: "cannot be a reserved username"
}
# Custom validation methods
validate :username_format
validate :age_consistency
private
def username_format
return if username.blank?
unless username.match?(/\A[a-zA-Z0-9_]+\z/)
errors.add(:username, "can only contain letters, numbers, and underscores")
end
end
def age_consistency
return if age.blank?
if age < 13 && status == "active"
errors.add(:age, "must be at least 13 for active accounts")
end
end
end
Validation Options Reference
Validation Options Overview: Each validation type supports various options that allow you to customize the validation behavior. Understanding these options helps you create more precise and flexible validation rules that match your specific business requirements.
Common Options: Most validations support these common options:
- message: Custom error message for the validation
- allow_blank: Skip validation if field is blank
- allow_nil: Skip validation if field is nil
- if: Only run validation if condition is met
- unless: Skip validation if condition is met
Performance Considerations: Some validations, like uniqueness, can be expensive on large collections. Consider using database indexes to improve performance, and be mindful of validation complexity when dealing with high-volume data.
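As an illustration of that last point, a uniqueness validation issues a lookup on every save and is not race-safe on its own; pairing it with a unique index makes the database the final arbiter. A minimal sketch, assuming a Mongoid User model like those elsewhere in this guide:

```ruby
class User
  include Mongoid::Document
  field :email, type: String
  # The validation produces a friendly error message, but two concurrent
  # saves can both pass its pre-save query; the unique index below makes
  # MongoDB reject the second insert regardless.
  validates :email, uniqueness: true
  index({ email: 1 }, { unique: true })
end
```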
Validation | Options | Description | Example |
---|---|---|---|
presence | true | Field must be present | validates :email, presence: true |
uniqueness | scope, case_sensitive | Field must be unique | validates :email, uniqueness: true |
format | with, allow_blank | Field must match regex | validates :email, format: { with: /\A[^@\s]+@[^@\s]+\z/ } |
length | minimum, maximum, is | String length constraints | validates :name, length: { minimum: 2 } |
numericality | greater_than, less_than, only_integer | Number constraints | validates :age, numericality: { greater_than: 0 } |
inclusion | in, message | Value must be in list | validates :status, inclusion: { in: %w[active inactive] } |
exclusion | in, message | Value must not be in list | validates :username, exclusion: { in: %w[admin] } |
Custom Validations
What are Custom Validations? Custom validations allow you to implement complex business logic that goes beyond the built-in validation types. They give you complete control over the validation process and enable you to create sophisticated validation rules that are specific to your application's requirements.
When to Use Custom Validations: Use custom validations when:
- You need complex business logic validation
- Multiple fields need to be validated together
- You need conditional validation based on other fields
- Built-in validations don't cover your specific requirements
- You need to validate against external data or services
Implementation Patterns: Custom validations follow these common patterns:
- Single Field Validation: Validate one field with complex logic
- Cross-Field Validation: Validate relationships between multiple fields
- Conditional Validation: Only validate under certain conditions
- External Validation: Validate against external APIs or services
Error Handling: Custom validations use the errors.add method to record validation errors. You can add errors to specific fields or to the base object, and you can provide custom error messages that are user-friendly and informative.
class Product
include Mongoid::Document
field :name, type: String
field :price, type: Float
field :sku, type: String
field :category, type: String
field :tags, type: Array, default: []
field :metadata, type: Hash, default: {}
validates :name, presence: true
validates :price, numericality: { greater_than: 0 }
validates :sku, presence: true, uniqueness: true
# Custom validations
validate :sku_format
validate :price_consistency
validate :tags_limit
validate :metadata_structure
private
def sku_format
return if sku.blank?
unless sku.match?(/\A[A-Z]{2,3}-\d{3,6}\z/)
errors.add(:sku, "must be in format: XX-123 or XXX-123456")
end
end
def price_consistency
return if price.blank?
# Check if price is reasonable for category
case category
when "electronics"
if price < 10
errors.add(:price, "electronics must cost at least $10")
end
when "books"
if price > 200
errors.add(:price, "books cannot cost more than $200")
end
end
end
def tags_limit
return if tags.blank?
if tags.length > 10
errors.add(:tags, "cannot have more than 10 tags")
end
if tags.any? { |tag| tag.length > 20 }
errors.add(:tags, "each tag must be 20 characters or less")
end
end
def metadata_structure
return if metadata.blank?
required_keys = ["brand", "weight", "dimensions"]
missing_keys = required_keys - metadata.keys
if missing_keys.any?
errors.add(:metadata, "must include: #{missing_keys.join(', ')}")
end
end
end
Conditional Validations
What are Conditional Validations? Conditional validations allow you to apply validation rules only under specific circumstances. This is useful when certain fields should only be validated based on the state of other fields or the document's current condition. Conditional validations make your validation logic more flexible and context-aware.
When to Use Conditional Validations: Use conditional validations when:
- Certain fields are only required in specific states
- Validation rules change based on other field values
- You want to avoid unnecessary validation overhead
- Different business rules apply in different contexts
- You need to validate optional fields only when they're provided
Conditional Methods: Conditional validations can use different types of conditions:
- Symbol: Reference to a predicate method that returns true/false
- Lambda/Proc: Inline condition object, useful for short or complex checks
Note: string conditions (e.g. if: "admin?") are deprecated in modern Rails and should be avoided.
Performance Benefits: Conditional validations can improve performance by avoiding unnecessary validation checks. They also make your validation logic more readable and maintainable by clearly expressing when each validation should apply.
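As a plain-Ruby illustration of how a lambda condition gates a check (the order hashes and the requires_payment lambda are assumptions for this sketch; in a Mongoid model the equivalent would be validates :payment_method, presence: true, if: -> { total_amount.to_f > 0 }):

```ruby
# Hypothetical condition mirroring a requires_payment? predicate:
# the validation should only apply when the order is live and money is owed.
requires_payment = lambda do |order|
  order[:status] != "cancelled" && order[:total_amount].to_f > 0
end

pending   = { status: "pending",   total_amount: 25.0 }
cancelled = { status: "cancelled", total_amount: 25.0 }
free      = { status: "pending",   total_amount: 0.0 }

puts requires_payment.call(pending)   # true  -> payment_method gets validated
puts requires_payment.call(cancelled) # false -> validation skipped
puts requires_payment.call(free)      # false -> validation skipped
```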
class Order
include Mongoid::Document
field :order_number, type: String
field :total_amount, type: Float
field :status, type: String, default: "pending"
field :payment_method, type: String
field :shipping_address, type: Hash
field :billing_address, type: Hash
validates :order_number, presence: true, uniqueness: true
validates :total_amount, numericality: { greater_than: 0 }
# Conditional validations
validates :payment_method, presence: true, if: :requires_payment?
validates :shipping_address, presence: true, if: :requires_shipping?
validates :billing_address, presence: true, if: :requires_billing?
# Conditional validation with custom method
validate :payment_method_valid, if: :requires_payment?
validate :address_format, if: :requires_shipping?
private
def requires_payment?
status != "cancelled" && total_amount > 0
end
def requires_shipping?
status != "cancelled" && !digital_product?
end
def requires_billing?
status != "cancelled"
end
def digital_product?
# Logic to determine if product is digital
false
end
def payment_method_valid
valid_methods = %w[credit_card paypal stripe]
unless valid_methods.include?(payment_method)
errors.add(:payment_method, "must be one of: #{valid_methods.join(', ')}")
end
end
def address_format
required_fields = %w[street city state zip_code]
required_fields.each do |field|
# Check each required key explicitly so missing keys are flagged,
# not just keys that are present but blank
if shipping_address.blank? || shipping_address[field].blank?
errors.add(:shipping_address, "#{field} is required")
end
end
end
end
Cross-Field Validations
class User
include Mongoid::Document
field :email, type: String
field :email_confirmation, type: String
field :password, type: String
field :password_confirmation, type: String
field :birth_date, type: Date
field :registration_date, type: Date
field :premium_expires_at, type: DateTime
validates :email, presence: true, format: { with: URI::MailTo::EMAIL_REGEXP }
validates :password, presence: true, length: { minimum: 8 }
validates :birth_date, presence: true
# Cross-field validations
validate :email_confirmation_matches
validate :password_confirmation_matches
validate :birth_date_reasonable
validate :premium_expiry_after_registration
private
def email_confirmation_matches
return if email.blank? || email_confirmation.blank?
unless email == email_confirmation
errors.add(:email_confirmation, "doesn't match email")
end
end
def password_confirmation_matches
return if password.blank? || password_confirmation.blank?
unless password == password_confirmation
errors.add(:password_confirmation, "doesn't match password")
end
end
def birth_date_reasonable
return if birth_date.blank?
if birth_date > Date.current
errors.add(:birth_date, "cannot be in the future")
end
if birth_date < 100.years.ago.to_date
errors.add(:birth_date, "seems too old")
end
end
def premium_expiry_after_registration
return if premium_expires_at.blank? || registration_date.blank?
if premium_expires_at <= registration_date
errors.add(:premium_expires_at, "must be after registration date")
end
end
end
Model Callbacks
class User
include Mongoid::Document
include Mongoid::Timestamps
field :email, type: String
field :name, type: String
field :username, type: String
field :slug, type: String
field :last_login_at, type: DateTime
field :login_count, type: Integer, default: 0
field :status, type: String, default: "active"
validates :email, presence: true, uniqueness: true
validates :name, presence: true
# Before callbacks
before_create :generate_username
before_save :normalize_email
before_update :track_changes
before_validation :generate_slug
# After callbacks
after_create :send_welcome_email
after_save :update_search_index
after_destroy :cleanup_related_data
# Around callbacks
around_save :log_operation_time
private
def generate_username
return if username.present?
base_username = name.parameterize
counter = 1
loop do
candidate = "#{base_username}#{counter}"
unless User.where(username: candidate).exists?
self.username = candidate
break
end
counter += 1
end
end
def normalize_email
self.email = email.downcase.strip if email.present?
end
def track_changes
changed_fields = changes.keys - %w[updated_at]
if changed_fields.any?
Rails.logger.info "User #{id} changed: #{changed_fields.join(', ')}"
end
end
def generate_slug
return if slug.present?
self.slug = name.parameterize
end
def send_welcome_email
UserMailer.welcome(self).deliver_later
end
def update_search_index
SearchIndexJob.perform_later(self)
end
def cleanup_related_data
# Clean up related posts, comments, etc.
Post.where(user_id: id).destroy_all
Comment.where(user_id: id).destroy_all
end
def log_operation_time
start_time = Time.current
yield
duration = ((Time.current - start_time) * 1000).round
Rails.logger.info "User save operation took #{duration}ms"
end
end
Callback Types and Usage
Callback Lifecycle: Understanding when each callback runs is crucial for implementing the right logic at the right time. Callbacks follow a specific order during document operations, and knowing this order helps you avoid common pitfalls and implement effective business logic.
Before vs After Callbacks: The choice between before and after callbacks depends on your needs:
- Before Callbacks: Use when you need to modify the document or prevent the operation
- After Callbacks: Use for side effects that don't affect the current operation
- Around Callbacks: Use when you need to wrap the entire operation with custom logic
Common Patterns: Each callback type has established patterns and best practices:
- Data Preparation: Use before callbacks for normalization and defaults
- Side Effects: Use after callbacks for notifications and external updates
- Performance Monitoring: Use around callbacks for timing and logging
- Cleanup: Use after callbacks for resource cleanup and maintenance
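The lifecycle order can be made concrete with a plain-Ruby simulation (no Mongoid involved; a real save also fires validation and create/update callbacks, so treat this as a simplified sketch of saving an existing document):

```ruby
# Simplified simulation of callback order around a save:
# before_save runs first, around_save wraps the persistence step,
# and after_save runs once the write has completed.
log = []

run_save = lambda do
  log << "before_save"
  log << "around_save:enter"
  log << "persist"
  log << "around_save:exit"
  log << "after_save"
end

run_save.call
puts log.join(" -> ")
# before_save -> around_save:enter -> persist -> around_save:exit -> after_save
```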
Callback | Triggered When | Common Uses | Example |
---|---|---|---|
before_create | Before document is created | Generate slugs, set defaults | before_create :generate_slug |
after_create | After document is created | Send notifications, create related data | after_create :send_welcome_email |
before_save | Before any save operation | Normalize data, validate custom rules | before_save :normalize_email |
after_save | After any save operation | Update search indexes, cache invalidation | after_save :update_search_index |
before_update | Before document is updated | Track changes, validate updates | before_update :track_changes |
after_update | After document is updated | Notify subscribers, update related data | after_update :notify_subscribers |
before_destroy | Before document is deleted | Validate deletion, backup data | before_destroy :validate_deletion |
after_destroy | After document is deleted | Cleanup related data, audit logging | after_destroy :cleanup_related_data |
around_save | Around save operations | Performance monitoring, transactions | around_save :log_operation_time |
E-commerce Product Validations
Real-World Application: E-commerce applications require sophisticated validation rules to ensure data quality and business rule compliance. Product data must be accurate, complete, and consistent to provide a good user experience and maintain inventory integrity.
Business Requirements: E-commerce products have specific validation needs:
- Pricing Rules: Prices must be positive and follow business logic
- Inventory Management: Stock levels must be tracked accurately
- Product Identification: SKUs must be unique and follow patterns
- Category Classification: Products must belong to valid categories
- Metadata Requirements: Essential product information must be complete
Validation Strategy: The validation approach combines multiple techniques:
- Basic Validations: Presence, format, and range checks
- Custom Validations: Complex business logic and cross-field validation
- Conditional Validations: Different rules for different product states
- Performance Optimizations: Efficient validation for high-volume data
User Experience: Good validation provides clear, actionable error messages that help users understand what needs to be fixed. This improves data quality and reduces support requests.
class Product
include Mongoid::Document
include Mongoid::Timestamps
field :name, type: String
field :description, type: String
field :price, type: Float
field :cost, type: Float
field :sku, type: String
field :category, type: String
field :status, type: String, default: "draft"
field :inventory, type: Integer, default: 0
field :weight, type: Float
field :dimensions, type: Hash, default: {}
field :tags, type: Array, default: []
field :metadata, type: Hash, default: {}
# Basic validations
validates :name, presence: true, length: { minimum: 2, maximum: 100 }
validates :price, numericality: { greater_than: 0 }
validates :sku, presence: true, uniqueness: true
validates :category, presence: true, inclusion: { in: %w[electronics books clothing digital] }
validates :status, inclusion: { in: %w[draft active inactive archived] }
validates :inventory, numericality: { greater_than_or_equal_to: 0, only_integer: true }
# Custom validations
validate :sku_format
validate :price_vs_cost
validate :weight_required_for_shipping
validate :dimensions_format
validate :tags_limit
# Callbacks
before_save :normalize_name
before_create :generate_sku_if_missing
after_save :update_search_index
after_destroy :cleanup_related_data
private
def sku_format
return if sku.blank?
unless sku.match?(/\A[A-Z]{2,3}-\d{3,6}\z/)
errors.add(:sku, "must be in format: XX-123 or XXX-123456")
end
end
def price_vs_cost
return if price.blank? || cost.blank?
if price < cost
errors.add(:price, "cannot be less than cost")
end
if price > cost * 10
errors.add(:price, "markup seems too high")
end
end
def weight_required_for_shipping
return if category == "digital"
if weight.blank? || weight <= 0
errors.add(:weight, "is required for physical products")
end
end
def dimensions_format
return if dimensions.blank?
required_keys = %w[length width height]
missing_keys = required_keys - dimensions.keys
if missing_keys.any?
errors.add(:dimensions, "must include: #{missing_keys.join(', ')}")
end
dimensions.each do |key, value|
unless value.is_a?(Numeric) && value > 0
errors.add(:dimensions, "#{key} must be a positive number")
end
end
end
def tags_limit
return if tags.blank?
if tags.length > 10
errors.add(:tags, "cannot have more than 10 tags")
end
if tags.any? { |tag| tag.length > 20 }
errors.add(:tags, "each tag must be 20 characters or less")
end
end
def normalize_name
self.name = name.titleize if name.present?
end
def generate_sku_if_missing
return if sku.present?
prefix = category.upcase[0..2]
# A count-based counter can collide under concurrent creates;
# the uniqueness validation on :sku acts as the backstop
counter = Product.where(category: category).count + 1
self.sku = "#{prefix}-#{counter.to_s.rjust(3, '0')}"
end
def update_search_index
SearchIndexJob.perform_later(self)
end
def cleanup_related_data
OrderItem.where(product_id: id).destroy_all
Review.where(product_id: id).destroy_all
end
end
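The SKU helpers above are plain string work and can be exercised in isolation. A small sketch under that assumption (the standalone method name is illustrative, not part of the model):

```ruby
# Build a SKU the same way generate_sku_if_missing does: a 3-letter
# category prefix plus a zero-padded counter.
def build_sku(category, counter)
  "#{category.upcase[0..2]}-#{counter.to_s.rjust(3, '0')}"
end

# Mirror of the pattern enforced by the sku_format validation.
SKU_FORMAT = /\A[A-Z]{2,3}-\d{3,6}\z/

sku = build_sku("electronics", 7)  # => "ELE-007"
sku.match?(SKU_FORMAT)             # => true
```

Keeping generation and validation aligned like this means an auto-generated SKU can never fail its own format check.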
Social Media Post Validations
Content Moderation Challenges: Social media platforms require sophisticated validation rules to ensure content quality, user safety, and platform integrity. Posts must be validated for content appropriateness, format compliance, and user engagement features.
Validation Requirements: Social media posts have unique validation needs:
- Content Safety: Filter inappropriate or harmful content
- Format Compliance: Ensure hashtags and mentions follow proper format
- User Engagement: Validate mentions and hashtags for real users
- Media Management: Control media attachments and file types
- Scheduling Logic: Ensure scheduled posts are set for future dates
Content Processing: Social media posts often require automatic processing:
- Hashtag Extraction: Automatically detect and format hashtags
- Mention Validation: Verify that mentioned users exist
- Content Analysis: Check for inappropriate language or spam
- Media Validation: Ensure media files are valid and accessible
User Experience: Social media validation must balance content quality with user convenience. Clear error messages help users understand why their posts were rejected, while automatic processing reduces the burden on users.
class Post
include Mongoid::Document
include Mongoid::Timestamps
field :content, type: String
field :title, type: String
field :visibility, type: String, default: "public"
field :status, type: String, default: "draft"
field :location, type: Array # [longitude, latitude]
field :hashtags, type: Array, default: []
field :mentions, type: Array, default: []
field :media_urls, type: Array, default: []
field :language, type: String, default: "en"
field :scheduled_at, type: DateTime
belongs_to :user
# Basic validations
validates :content, presence: true, length: { maximum: 280 }
validates :visibility, inclusion: { in: %w[public private friends] }
validates :status, inclusion: { in: %w[draft published scheduled archived] }
validates :language, inclusion: { in: %w[en es fr de] }
# Custom validations
validate :content_appropriate
validate :hashtags_format
validate :mentions_valid
validate :location_format
validate :scheduled_at_future
validate :media_limit
# Callbacks
before_validation :extract_hashtags_and_mentions # run before validation so the hashtag/mention checks see the parsed values
before_create :set_default_title
after_create :notify_followers
after_save :update_trending_hashtags
private
def content_appropriate
return if content.blank?
inappropriate_words = %w[spam scam fraud]
if inappropriate_words.any? { |word| content.downcase.include?(word) }
errors.add(:content, "contains inappropriate content")
end
end
def hashtags_format
return if hashtags.blank?
hashtags.each do |hashtag|
unless hashtag.match?(/\A#[a-zA-Z0-9_]+\z/)
errors.add(:hashtags, "must start with # and contain only letters, numbers, and underscores")
end
end
end
def mentions_valid
return if mentions.blank?
mentions.each do |mention|
unless mention.match?(/\A@[a-zA-Z0-9_]+\z/)
errors.add(:mentions, "must start with @ and contain only letters, numbers, and underscores")
end
# Check if mentioned user exists
username = mention[1..-1]
unless User.where(username: username).exists?
errors.add(:mentions, "user @#{username} does not exist")
end
end
end
def location_format
return if location.blank?
unless location.is_a?(Array) && location.length == 2 && location.all? { |c| c.is_a?(Numeric) }
errors.add(:location, "must be an array of two numbers: [longitude, latitude]")
return # don't destructure an invalid value below
end
longitude, latitude = location
unless longitude.between?(-180, 180) && latitude.between?(-90, 90)
errors.add(:location, "coordinates are out of range")
end
end
def scheduled_at_future
return if scheduled_at.blank?
if scheduled_at <= Time.current
errors.add(:scheduled_at, "must be in the future")
end
end
def media_limit
return if media_urls.blank?
if media_urls.length > 10
errors.add(:media_urls, "cannot have more than 10 media items")
end
media_urls.each do |url|
unless url.match?(/\Ahttps?:\/\/.+/)
errors.add(:media_urls, "must be valid URLs")
end
end
end
def extract_hashtags_and_mentions
return if content.blank?
self.hashtags = content.scan(/#\w+/).map(&:downcase)
self.mentions = content.scan(/@\w+/).map(&:downcase)
end
def set_default_title
return if title.present?
self.title = content.truncate(50) if content.present?
end
def notify_followers
return unless status == "published"
NotificationJob.perform_later(self)
end
def update_trending_hashtags
return if hashtags.blank?
TrendingHashtagsJob.perform_later(hashtags)
end
end
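The extraction in `extract_hashtags_and_mentions` is plain string scanning, so it is easy to check on its own. A standalone sketch (the free-floating method name is illustrative):

```ruby
# Same scanning logic as the model's extract_hashtags_and_mentions callback.
def extract_tags(content)
  {
    hashtags: content.scan(/#\w+/).map(&:downcase),
    mentions: content.scan(/@\w+/).map(&:downcase)
  }
end

extract_tags("Shipping #Rails tips with @Alice and #MongoDB")
# => { hashtags: ["#rails", "#mongodb"], mentions: ["@alice"] }
```

Note that the scanned values keep their `#`/`@` prefixes, which is exactly what the `hashtags_format` and `mentions_valid` regexes expect.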
Performance & Optimization
Index Types and Usage
class User
include Mongoid::Document
field :email, type: String
field :username, type: String
field :name, type: String
field :bio, type: String # referenced by the text index below
field :phone, type: String # referenced by the sparse index below
field :status, type: String, default: "active"
field :created_at, type: DateTime
field :last_login_at, type: DateTime
field :location, type: Array # [longitude, latitude]
field :tags, type: Array, default: []
field :metadata, type: Hash, default: {}
# Single field indexes
index({ email: 1 }, { unique: true })
index({ username: 1 }, { unique: true })
index({ status: 1 })
index({ created_at: -1 })
index({ last_login_at: -1 })
# Compound indexes (order matters!)
index({ status: 1, created_at: -1 })
index({ email: 1, status: 1 })
index({ status: 1, last_login_at: -1 })
# Text search indexes
index({ name: "text", bio: "text" })
# Geospatial indexes
index({ location: "2dsphere" })
# Array indexes
index({ tags: 1 })
# Sparse indexes (skip null values)
index({ phone: 1 }, { sparse: true })
# TTL indexes (auto-delete after time)
index({ created_at: 1 }, { expire_after_seconds: 86400 }) # 24 hours
# Partial indexes (only index documents matching a condition; note the
# snake_case option key in the Ruby driver)
index({ email: 1 }, { partial_filter_expression: { status: "active" } })
# Background indexes (the background option is ignored as of MongoDB 4.2,
# where all index builds use an optimized process)
index({ username: 1 }, { background: true })
end
Index Management
# Create all indexes for a model
User.create_indexes
# Create indexes for all models (also available as rake db:mongoid:create_indexes)
Mongoid::Tasks::Database.create_indexes
# Drop all indexes for a model
User.remove_indexes
# Check existing indexes
User.collection.indexes.each do |index|
puts "Index: #{index['name']}"
puts "Keys: #{index['key']}"
puts "Options: #{index['options']}"
puts "---"
end
# Create indexes with specific options
User.collection.indexes.create_one(
{ email: 1, status: 1 },
{
background: true,
name: "email_status_idx"
}
)
# Drop specific index
User.collection.indexes.drop_one("email_status_idx")
# Check index usage statistics
User.collection.aggregate([
{ "$indexStats" => {} }
])
Index Best Practices
- Compound Index Order: Most selective field first
- Covered Queries: Include all queried fields in index
- Avoid Over-Indexing: Each index has write overhead
- Background Indexing: Use for large collections
- Monitor Usage: Remove unused indexes
- TTL Indexes: For time-based data cleanup
- Partial Indexes: For conditional queries
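The compound-index ordering advice can be made concrete with the prefix rule: an index can serve a query whose equality fields form a leading prefix of the index keys. A deliberately simplified illustration (equality filters only; the real query planner also handles ranges, sorts, and field reordering with more nuance):

```ruby
# Returns true when the queried fields (in any order) are exactly the
# first N keys of the index -- the "prefix rule" for compound indexes.
def serves_equality_query?(index_keys, query_fields)
  index_keys.first(query_fields.length).sort == query_fields.sort
end

index = [:status, :created_at]
serves_equality_query?(index, [:status])               # => true  (leading prefix)
serves_equality_query?(index, [:created_at, :status])  # => true  (full key set)
serves_equality_query?(index, [:created_at])           # => false (skips :status)
```

This is why an index on `{ status: 1, created_at: -1 }` covers the `status`-only queries above, while a query filtering only on `created_at` would need its own index.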
Query Performance Best Practices
# Use projection to limit returned fields
users = User.where(status: "active").only(:name, :email)
# Use limit for large result sets
recent_users = User.order(created_at: -1).limit(100)
# Use skip with limit for pagination
page_users = User.order(created_at: -1).skip(offset).limit(per_page)
# Use covered queries (every returned field is present in the index)
# Good: status and created_at are both in the compound index
User.where(status: "active").only(:status, :created_at)
# Avoid: :name is not in the index, forcing document fetches
User.where(status: "active").only(:status, :created_at, :name)
# Note: _id must also be excluded (or included in the index) for a query to be fully covered
# Use compound queries efficiently
# Both forms below build the same selector -- Mongoid merges chained where clauses.
# What matters for performance is that the queried fields match a compound index
# (here: { status: 1, created_at: -1 })
User.where(status: "active", :created_at.gte => 1.week.ago)
User.where(status: "active").where(:created_at.gte => 1.week.ago)
# Use aggregation for complex queries
top_users = User.collection.aggregate([
{ "$match" => { status: "active" } },
{ "$group" => {
"_id" => "$status",
"count" => { "$sum" => 1 }
}},
{ "$sort" => { "count" => -1 } }
])
# Use bulk operations for large datasets
# (update_many updates every matching document; update_one would touch only the first)
User.collection.bulk_write([
{ update_many: { filter: { status: "pending" }, update: { "$set" => { status: "active" } } } },
{ update_many: { filter: { status: "inactive" }, update: { "$set" => { status: "deleted" } } } }
])
Query Analysis and Optimization
# Analyze query performance
explanation = User.where(status: "active").explain
# Check whether the query used an index
# (the winning-plan structure varies with query shape and server version)
puts "Uses index: #{explanation.dig('queryPlanner', 'winningPlan', 'inputStage', 'indexName')}"
# Execution stats appear when explain runs with executionStats verbosity
puts "Execution time: #{explanation.dig('executionStats', 'executionTimeMillis')}ms"
# Check documents examined vs returned
puts "Docs examined: #{explanation.dig('executionStats', 'totalDocsExamined')}"
puts "Docs returned: #{explanation.dig('executionStats', 'nReturned')}"
# Profile slow queries
Mongoid.default_client.database.command(profile: 2, slowms: 100)
# Monitor query performance
class QueryLogger
def self.log(query, duration)
Rails.logger.info "Query: #{query} took #{duration}ms"
end
end
# Use in models
class User
include Mongoid::Document
def self.with_logging(label = "User query")
start_time = Time.current
result = yield
duration = ((Time.current - start_time) * 1000).round
QueryLogger.log(label, duration) # Mongoid has no to_sql; pass a descriptive label instead
result
end
end
# Analyze slow queries captured by the profiler
# (the profile command only configures profiling; captured queries
# live in the system.profile collection)
slow_queries = Mongoid.default_client.database["system.profile"].find(
millis: { "$gt" => 100 }
).sort(ts: -1).limit(10)
# Check index usage
index_stats = User.collection.aggregate([
{ "$indexStats" => {} }
])
# Monitor collection stats via the collStats command
# (the Ruby driver has no Collection#stats helper)
collection_stats = User.collection.database.command(collStats: User.collection.name).first
puts "Collection size: #{collection_stats['size']} bytes"
puts "Document count: #{collection_stats['count']}"
puts "Average document size: #{collection_stats['avgObjSize']} bytes"
Eager Loading Strategies
# Avoid N+1 queries with eager loading
# Bad: N+1 queries
users = User.all
users.each do |user|
puts "#{user.name} has #{user.posts.size} posts" # one query per user
end
# Good: Eager loading
users = User.includes(:posts).all
users.each do |user|
puts "#{user.name} has #{user.posts.size} posts" # uses the preloaded documents
end
# Note: use .size or .length here -- .count always issues a database query,
# even when the association is already loaded
# Multiple associations
users = User.includes(:posts, :comments, :roles).all
# Nested eager loading
users = User.includes(posts: :comments).all
# Conditional eager loading
users = User.includes(:posts).where(:status => "active")
# Polymorphic eager loading
comments = Comment.includes(:commentable, :user).all
# Custom eager loading with scopes
users = User.includes(:posts).where(:status => "active")
users.each do |user|
puts "#{user.name}: #{user.posts.published.count} published posts"
end
# Use projection with eager loading
users = User.includes(:posts).only(:name, :email).all
Performance Monitoring
What is Performance Monitoring? Performance monitoring involves tracking and analyzing database operations to identify bottlenecks, optimize queries, and ensure your application meets performance requirements. Effective monitoring provides insights into how your MongoDB database is performing and helps you make data-driven optimization decisions.
Why Monitor Performance? Performance monitoring is essential for:
- Proactive Optimization: Identify issues before they become problems
- Capacity Planning: Understand resource usage and plan for growth
- User Experience: Ensure fast response times for users
- Cost Optimization: Use resources efficiently and reduce costs
- Debugging: Quickly identify and resolve performance issues
Monitoring Tools: MongoDB provides several monitoring capabilities:
- Query Profiler: Capture and analyze slow queries
- Server Status: Monitor database server health
- Connection Pool: Track connection usage and availability
- Custom Metrics: Build application-specific monitoring
Monitoring Strategy: Effective monitoring follows these principles:
- Set Baselines: Establish normal performance metrics
- Monitor Trends: Track performance over time
- Alert on Anomalies: Get notified of performance issues
- Regular Reviews: Periodically analyze monitoring data
# Enable query profiling
Mongoid.default_client.database.command(profile: 2, slowms: 100)
# Check profiling status
profile_status = Mongoid.default_client.database.command(profile: -1)
# Get slow query logs from the system.profile collection
slow_queries = Mongoid.default_client.database["system.profile"].find(
millis: { "$gt" => 100 }
)
# Monitor database operations
class DatabaseMonitor
def self.log_operation(operation, duration, collection = nil)
Rails.logger.info "[DB] #{operation} on #{collection} took #{duration}ms"
end
def self.track_query(query, duration)
Rails.logger.info "[QUERY] #{query} took #{duration}ms"
end
end
# Custom monitoring middleware
class MongoidMonitoring
def self.track_operation(operation)
start_time = Time.current
result = yield
duration = ((Time.current - start_time) * 1000).round
DatabaseMonitor.log_operation(operation, duration)
result
end
end
# Use in models
class User
include Mongoid::Document
def self.with_monitoring
MongoidMonitoring.track_operation("user_query") do
yield
end
end
end
# Monitor connection usage via serverStatus (the Ruby driver does not
# expose a public pool-stats hash; subscribe to CMAP monitoring events
# for per-pool detail)
server_status = Mongoid.default_client.database.command(serverStatus: 1).first
puts "Current connections: #{server_status['connections']['current']}"
puts "Available connections: #{server_status['connections']['available']}"
# Monitor server status (command results are enumerable; take the first document)
server_status = Mongoid.default_client.database.command(serverStatus: 1).first
puts "Uptime: #{server_status['uptime']} seconds"
puts "Connections: #{server_status['connections']['current']}"
puts "Operations: #{server_status['opcounters']}"
Performance Metrics
What are Performance Metrics? Performance metrics are quantitative measurements that help you understand how your MongoDB database is performing. These metrics provide insights into query efficiency, resource usage, and overall database health. Tracking the right metrics helps you make informed optimization decisions.
Key Metrics to Track: Focus on these essential performance metrics:
- Query Performance: Execution time, documents examined vs returned
- Index Efficiency: Hit ratios, index usage patterns
- Resource Usage: Memory, CPU, disk I/O, connection pool
- Collection Statistics: Document count, size, growth patterns
- Error Rates: Failed operations, timeout frequency
Collection Metrics: Collection-level metrics provide insights into data characteristics:
- Document Count: Total number of documents in the collection
- Storage Size: Total disk space used by the collection
- Average Document Size: Helps understand data structure efficiency
- Index Count: Number of indexes and their total size
Index Metrics: Index usage metrics help optimize query performance:
- Access Patterns: How often each index is used ($indexStats reports per-index operation counts)
- Unused Indexes: Indexes with few or no recorded accesses are candidates for removal
- Selectivity: How well an index narrows the documents a query must examine
- Storage Impact: Disk space used by indexes
# Collection statistics via the collStats command
collection_stats = User.collection.database.command(collStats: User.collection.name).first
puts "Collection: #{collection_stats['ns']}"
puts "Document count: #{collection_stats['count']}"
puts "Total size: #{collection_stats['size']} bytes"
puts "Average document size: #{collection_stats['avgObjSize']} bytes"
puts "Index count: #{collection_stats['nindexes']}"
puts "Total index size: #{collection_stats['totalIndexSize']} bytes"
# Index statistics
index_stats = User.collection.aggregate([
{ "$indexStats" => {} }
])
# $indexStats reports accesses.ops and accesses.since; it does not
# provide a hit/miss breakdown
index_stats.each do |index|
puts "Index: #{index['name']}"
puts "Ops since tracking began: #{index['accesses']['ops']}"
puts "Tracking since: #{index['accesses']['since']}"
end
# Query performance metrics
class QueryMetrics
def self.track_query_performance
start_time = Time.current
result = yield
duration = ((Time.current - start_time) * 1000).round
Rails.logger.info "[METRICS] Query took #{duration}ms"
# Store metrics for analysis
QueryMetric.create(
duration: duration,
timestamp: Time.current,
collection: result.class.name.downcase
)
result
end
end
# Database health check
class DatabaseHealthCheck
def self.perform
begin
# Test connection
Mongoid.default_client.database.command(ping: 1)
# Test basic operations
test_user = User.create(name: "Health Check", email: "[email protected]")
test_user.destroy
{ status: "healthy", message: "Database is responding normally" }
rescue => e
{ status: "unhealthy", message: e.message }
end
end
end
Bulk Operations
What are Bulk Operations? Bulk operations allow you to perform multiple database operations in a single request, significantly improving performance compared to individual operations. Instead of making separate network calls for each document, bulk operations batch multiple operations together, reducing network overhead and improving throughput.
Why Use Bulk Operations? Bulk operations provide several performance benefits:
- Network Efficiency: Reduce network round trips between application and database
- Atomic Operations: Multiple operations can be executed atomically
- Better Throughput: Handle large datasets more efficiently
- Reduced Overhead: Less per-operation overhead
- Batch Processing: Ideal for data migration and batch updates
Types of Bulk Operations: MongoDB supports several bulk operation types:
- Bulk Insert: Insert multiple documents at once
- Bulk Update: Update multiple documents with a single operation
- Bulk Upsert: Update existing documents or insert new ones
- Bulk Delete: Remove multiple documents efficiently
When to Use Bulk Operations: Consider bulk operations for:
- Data Migration: Moving large amounts of data
- Batch Processing: Processing large datasets
- Data Cleanup: Removing or updating many documents
- Initial Data Loading: Setting up test or production data
- Periodic Updates: Scheduled maintenance operations
# Bulk insert (note: collection-level operations bypass Mongoid
# validations, callbacks, and timestamps)
users_data = [
{ name: "John", email: "[email protected]", status: "active" },
{ name: "Jane", email: "[email protected]", status: "active" },
{ name: "Bob", email: "[email protected]", status: "active" }
]
User.collection.insert_many(users_data)
# Bulk update
User.collection.update_many(
{ status: "pending" },
{ "$set" => { status: "active", updated_at: Time.current } }
)
# Bulk upsert
User.collection.update_many(
{ email: "[email protected]" },
{ "$set" => { name: "John Updated", updated_at: Time.current } },
{ upsert: true }
)
# Bulk delete
User.collection.delete_many({ status: "inactive" })
# Using Mongoid for bulk operations
User.where(status: "pending").update_all(status: "active")
# Batch processing (Mongoid has no find_in_batches; criteria are
# Enumerable, so each_slice provides a simple batching loop)
User.where(status: "pending").each_slice(1000) do |batch|
batch.each do |user|
user.update(status: "active")
end
end
# Bulk write operations (update_many applies to every matching document;
# update_one would touch only the first match per filter)
User.collection.bulk_write([
{
update_many: {
filter: { status: "pending" },
update: { "$set" => { status: "active" } }
}
},
{
update_many: {
filter: { status: "inactive" },
update: { "$set" => { status: "deleted" } }
}
}
])
Caching Strategies
What is Caching? Caching is a technique that stores frequently accessed data in fast-access storage to improve application performance. Instead of repeatedly querying the database for the same data, caching stores the results in memory or a fast storage system, dramatically reducing response times and database load.
Why Use Caching? Caching provides several performance benefits:
- Faster Response Times: Retrieve data from memory instead of disk
- Reduced Database Load: Fewer queries to the database
- Better Scalability: Handle more users with the same resources
- Improved User Experience: Faster page loads and interactions
- Cost Efficiency: Reduce database resource requirements
Caching Strategies: Different caching approaches work for different scenarios:
- Application-Level Caching: Cache data in your Rails application
- Database Query Caching: Cache query results
- Object Caching: Cache entire objects or computed values
- Fragment Caching: Cache parts of views or templates
- CDN Caching: Cache static assets and content
Cache Invalidation: Managing cache freshness is crucial:
- Time-Based Expiration: Automatically expire cache entries
- Version-Based Invalidation: Use version numbers to invalidate cache
- Event-Based Invalidation: Invalidate cache when data changes
- Manual Invalidation: Explicitly clear cache when needed
# Redis caching for frequently accessed data
class User
include Mongoid::Document
field :email, type: String
field :name, type: String
field :status, type: String
# Cache frequently accessed data
def cached_profile
Rails.cache.fetch("user_profile_#{id}", expires_in: 1.hour) do
{
name: name,
email: email,
status: status,
post_count: posts.count,
last_activity: last_login_at
}
end
end
# Cache with versioning
def cached_data
Rails.cache.fetch("user_#{id}_v#{cache_version}", expires_in: 30.minutes) do
# Expensive computation
calculate_user_stats
end
end
private
def cache_version
updated_at.to_i
end
def calculate_user_stats
{
total_posts: posts.count,
total_comments: comments.count,
engagement_score: calculate_engagement_score
}
end
def calculate_engagement_score
# Complex calculation
(posts.count * 2) + comments.count
end
end
# Fragment caching for views
# In a view: <% cache user do %> ... <% end %>
# For collections: <%= render partial: 'user_profile', collection: @users, cached: true %>
# Cache invalidation
class User
after_save :invalidate_cache
after_destroy :invalidate_cache
private
def invalidate_cache
Rails.cache.delete("user_profile_#{id}")
Rails.cache.delete("user_#{id}_v#{cache_version}")
end
end
# Background job for cache warming
class CacheWarmingJob < ApplicationJob
queue_as :default
def perform
User.active.each do |user| # Mongoid cursors stream results; there is no find_each
user.cached_profile # Warm cache
end
end
end
Connection Pool Optimization
What is Connection Pooling? Connection pooling is a technique that maintains a pool of database connections that can be reused by multiple requests. Instead of creating a new connection for each database operation, the application borrows a connection from the pool, uses it, and returns it when done. This significantly reduces the overhead of establishing new connections.
Why Use Connection Pooling? Connection pooling provides several performance benefits:
- Reduced Connection Overhead: Avoid the cost of creating new connections
- Better Resource Utilization: Efficiently manage database connections
- Improved Response Times: Faster database operations
- Scalability: Handle more concurrent users efficiently
- Connection Management: Automatic cleanup and health monitoring
Connection Pool Parameters: Key configuration options include:
- Max Pool Size: Maximum number of connections in the pool
- Min Pool Size: Minimum number of connections to maintain
- Max Idle Time: How long connections can remain idle
- Wait Queue Timeout: How long to wait for an available connection
- Server Selection Timeout: Timeout for server selection
Optimization Strategies: Effective connection pool management involves:
- Right-Size the Pool: Match pool size to your application needs
- Monitor Usage: Track connection pool utilization
- Handle Failures: Implement proper error handling and retry logic
- Health Checks: Monitor connection health and remove bad connections
# Configure connection pool
# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_development
      options:
        server_selection_timeout: 5   # seconds
        max_pool_size: 10
        min_pool_size: 2
        max_idle_time: 300            # seconds
        wait_queue_timeout: 5         # seconds (Ruby driver timeouts are in seconds, not ms)
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 5
        read: { mode: :secondary }
        write: { w: 1, j: true }
# Monitor connection usage (server-side view via serverStatus; the Ruby
# driver does not expose a public pool-stats hash)
class ConnectionPoolMonitor
def self.stats
server_status = Mongoid.default_client.database.command(serverStatus: 1).first
connections = server_status["connections"]
{
active_connections: connections["current"],
available_connections: connections["available"],
total_connections: connections["current"] + connections["available"]
}
end
def self.health_check
stats = self.stats
if stats[:active_connections] > stats[:total_connections] * 0.8
Rails.logger.warn "High connection pool usage: #{stats[:active_connections]}/#{stats[:total_connections]}"
end
stats
end
end
# Connection pool monitoring job
class ConnectionPoolMonitoringJob < ApplicationJob
queue_as :default
def perform
stats = ConnectionPoolMonitor.stats
Rails.logger.info "Connection pool stats: #{stats}"
# Alert if pool is nearly full
if stats[:active_connections] > stats[:total_connections] * 0.9
# Send alert
AlertService.send_alert("High MongoDB connection pool usage")
end
end
end
E-commerce Performance Optimization
E-commerce Performance Challenges: E-commerce applications face unique performance challenges due to high user traffic, complex product catalogs, and real-time inventory management. Users expect fast page loads, accurate search results, and seamless checkout experiences, making performance optimization critical for business success.
Key Performance Requirements: E-commerce applications must handle:
- High Concurrency: Multiple users browsing and purchasing simultaneously
- Complex Queries: Product searches with multiple filters and sorting
- Real-Time Inventory: Accurate stock levels across multiple locations
- Fast Search: Quick product discovery and recommendations
- Order Processing: Efficient checkout and payment processing
Optimization Strategies: Effective e-commerce optimization involves:
- Strategic Indexing: Indexes for common search and filter patterns
- Product Caching: Cache frequently accessed product data
- Bulk Operations: Efficient inventory and order updates
- Query Optimization: Optimize complex product search queries
- Performance Monitoring: Track key metrics and user experience
User Experience Impact: Performance directly affects business metrics:
- Conversion Rates: Faster sites convert more visitors to customers
- Search Quality: Better search results improve product discovery
- Inventory Accuracy: Real-time stock levels prevent overselling
- Mobile Performance: Optimized for mobile shopping experiences
class Product
include Mongoid::Document
field :name, type: String
field :description, type: String # referenced by the text index below
field :price, type: Float
field :category, type: String
field :status, type: String, default: "active"
field :inventory, type: Integer, default: 0
field :tags, type: Array, default: []
field :created_at, type: DateTime
has_many :reviews # used by the cached rating helpers below
# Optimized indexes for e-commerce queries
index({ category: 1, status: 1 })
index({ price: 1 })
index({ tags: 1 })
index({ status: 1, created_at: -1 })
index({ name: "text", description: "text" })
# Compound indexes for common query patterns
index({ category: 1, price: 1, status: 1 })
index({ status: 1, inventory: 1 })
# Performance-optimized scopes
scope :active, -> { where(status: "active") }
scope :in_stock, -> { where(:inventory.gt => 0) }
scope :by_category, ->(category) { where(category: category) }
scope :price_range, ->(min, max) { where(:price.gte => min, :price.lte => max) }
# Cached methods for expensive operations
def cached_average_rating
Rails.cache.fetch("product_#{id}_rating", expires_in: 1.hour) do
reviews.avg(:rating) || 0
end
end
def cached_review_count
Rails.cache.fetch("product_#{id}_review_count", expires_in: 30.minutes) do
reviews.count
end
end
# Bulk operations for inventory updates
def self.update_inventory_bulk(product_ids, quantities)
bulk_operations = product_ids.map.with_index do |product_id, index|
{
update_one: {
filter: { _id: product_id },
update: { "$inc" => { inventory: quantities[index] } }
}
}
end
collection.bulk_write(bulk_operations)
end
# Performance monitoring
def self.with_performance_logging
start_time = Time.current
result = yield
duration = ((Time.current - start_time) * 1000).round
Rails.logger.info "[PERF] Product query took #{duration}ms"
result
end
end
class Order
include Mongoid::Document
field :order_number, type: String
field :total_amount, type: Float
field :status, type: String, default: "pending"
field :created_at, type: DateTime
belongs_to :user
has_many :order_items
# Optimized indexes for order queries
index({ user_id: 1, created_at: -1 })
index({ status: 1, created_at: -1 })
index({ order_number: 1 }, { unique: true })
# Aggregation for order analytics
def self.sales_analytics(start_date, end_date)
collection.aggregate([
{ "$match" => {
created_at: { "$gte" => start_date, "$lte" => end_date },
status: "completed"
}},
{ "$group" => {
"_id" => {
"year" => { "$year" => "$created_at" },
"month" => { "$month" => "$created_at" },
"day" => { "$dayOfMonth" => "$created_at" }
},
"total_sales" => { "$sum" => "$total_amount" },
"order_count" => { "$sum" => 1 }
}},
{ "$sort" => { "_id" => 1 } }
])
end
# Bulk order processing
def self.process_bulk_orders(order_ids)
bulk_operations = order_ids.map do |order_id|
{
update_one: {
filter: { _id: order_id, status: "pending" },
update: { "$set" => { status: "processing", processed_at: Time.current } }
}
}
end
collection.bulk_write(bulk_operations)
end
end
Social Media Performance Optimization
Social Media Performance Challenges: Social media platforms face unique performance challenges due to massive user bases, real-time content updates, and complex engagement patterns. Users expect instant content delivery, real-time notifications, and seamless interactions across multiple devices and platforms.
Key Performance Requirements: Social media applications must handle:
- Real-Time Content: Instant posting and content delivery
- High Engagement: Likes, comments, shares, and interactions
- Content Discovery: Personalized feeds and recommendations
- Geographic Features: Location-based content and services
- Scalable Architecture: Handle millions of concurrent users
Optimization Strategies: Effective social media optimization involves:
- Content Caching: Cache popular posts and user feeds
- Geospatial Indexing: Optimize location-based queries
- Engagement Tracking: Efficient like and comment processing
- Feed Optimization: Fast personalized content delivery
- Real-Time Features: Live updates and notifications
User Experience Impact: Performance directly affects user engagement:
- Content Discovery: Fast feeds improve user engagement
- Real-Time Interaction: Instant likes and comments
- Mobile Experience: Optimized for mobile browsing
- Content Relevance: Better recommendations through performance
class Post
include Mongoid::Document
field :content, type: String
field :visibility, type: String, default: "public"
field :status, type: String, default: "published"
field :created_at, type: DateTime
field :hashtags, type: Array, default: []
field :location, type: Array # [longitude, latitude]
belongs_to :user
has_many :comments
has_many :likes
# Optimized indexes for social media queries
index({ user_id: 1, created_at: -1 })
index({ visibility: 1, created_at: -1 })
index({ hashtags: 1 })
index({ location: "2dsphere" })
index({ status: 1, created_at: -1 })
# Text search for content
index({ content: "text" })
# Performance-optimized scopes
scope :public_posts, -> { where(visibility: "public", status: "published") }
scope :recent, -> { order(created_at: -1) }
scope :by_user, ->(user) { where(user: user) }
scope :with_hashtag, ->(hashtag) { where(hashtags: hashtag) }
# Cached engagement metrics
def cached_engagement_score
Rails.cache.fetch("post_#{id}_engagement", expires_in: 15.minutes) do
like_count + comment_count * 2
end
end
def like_count
Rails.cache.fetch("post_#{id}_likes", expires_in: 5.minutes) do
likes.count
end
end
def comment_count
Rails.cache.fetch("post_#{id}_comments", expires_in: 5.minutes) do
comments.count
end
end
# Feed generation with optimization
def self.user_feed(user, limit = 20)
followed_user_ids = user.followed_users.pluck(:id)
collection.aggregate([
{ "$match" => {
"$or" => [
{ user_id: { "$in" => followed_user_ids } },
{ user_id: user.id }
],
visibility: "public",
status: "published"
}},
{ "$lookup" => {
"from" => "users",
"localField" => "user_id",
"foreignField" => "_id",
"as" => "user"
}},
{ "$unwind" => "$user" },
{ "$sort" => { created_at: -1 } },
{ "$limit" => limit },
{ "$project" => {
"content" => 1,
"created_at" => 1,
"hashtags" => 1,
"user.name" => 1,
"user.username" => 1
}}
])
end
# Trending hashtags with caching
def self.trending_hashtags(hours = 24)
Rails.cache.fetch("trending_hashtags_#{hours}", expires_in: 10.minutes) do
collection.aggregate([
{ "$match" => {
created_at: { "$gte" => hours.hours.ago },
visibility: "public"
}},
{ "$unwind" => "$hashtags" },
{ "$group" => {
"_id" => "$hashtags",
"count" => { "$sum" => 1 }
}},
{ "$sort" => { "count" => -1 } },
{ "$limit" => 10 }
])
end
end
# Bulk operations for engagement
def self.update_engagement_metrics
collection.aggregate([
{ "$lookup" => {
"from" => "likes",
"localField" => "_id",
"foreignField" => "post_id",
"as" => "likes"
}},
{ "$lookup" => {
"from" => "comments",
"localField" => "_id",
"foreignField" => "post_id",
"as" => "comments"
}},
{ "$addFields" => {
"like_count" => { "$size" => "$likes" },
"comment_count" => { "$size" => "$comments" },
"engagement_score" => {
"$add" => [
{ "$size" => "$likes" },
{ "$multiply" => [{ "$size" => "$comments" }, 2] }
]
}
}},
{ "$out" => "posts_with_engagement" }
])
end
end
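The cached counters above can serve stale data for up to five minutes after a new like or comment arrives. A common companion pattern is to delete the cache key from the child model's callbacks. A runnable sketch of that invalidation pattern, using a tiny in-memory stand-in for `Rails.cache` (in the app this would be `Rails.cache.delete` inside `Like`/`Comment` `after_create`/`after_destroy` callbacks):

```ruby
# Minimal in-memory cache mimicking the fetch/delete subset of Rails.cache.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store.key?(key) ? @store[key] : (@store[key] = yield)
  end

  def delete(key)
    @store.delete(key)
  end
end

CACHE = TinyCache.new

def like_count(post_id, likes)
  CACHE.fetch("post_#{post_id}_likes") { likes.count }
end

# What Like#after_create / #after_destroy would call:
def invalidate_like_count(post_id)
  CACHE.delete("post_#{post_id}_likes")
end

likes = [1, 2]
like_count(1, likes)          # caches 2
likes << 3
like_count(1, likes)          # still 2 -- stale until expiry or invalidation
invalidate_like_count(1)
like_count(1, likes)          # recomputes: 3
```

Explicit invalidation keeps counts fresh while the TTL remains as a safety net.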
Deployment & Production
Production Configuration
What is Production Configuration? Production configuration involves setting up MongoDB and Rails for a live, high-traffic environment. Unlike development, production environments require careful attention to security, performance, reliability, and scalability. The configuration must handle real user traffic, protect sensitive data, and maintain high availability.
Why Production Configuration Matters: Proper production configuration is critical for:
- Security: Protect data and prevent unauthorized access
- Performance: Optimize for high traffic and fast response times
- Reliability: Ensure consistent uptime and data integrity
- Scalability: Handle growth in users and data volume
- Monitoring: Track system health and performance
Key Configuration Areas: Production setup involves several critical areas:
- Connection Settings: Optimize connection pools and timeouts
- Security Settings: SSL/TLS, authentication, and access controls
- Performance Tuning: Read/write concerns and preferences
- Error Handling: Retry logic and failure recovery
- Logging: Appropriate log levels and monitoring
Environment Differences: Production differs from development in several ways:
- Security Requirements: Stricter authentication and encryption
- Performance Demands: Higher connection limits and optimizations
- Monitoring Needs: Comprehensive logging and alerting
- Reliability Focus: Redundancy and failover capabilities
# config/mongoid.yml
production:
clients:
default:
uri: <%= ENV['MONGODB_URI'] %>
options:
server_selection_timeout: 5
max_pool_size: 20
min_pool_size: 5
max_idle_time: 300
wait_queue_timeout: 2500
read:
  mode: :secondary_preferred
write_concern: { w: 1, j: true }
ssl: true
ssl_ca_cert: <%= ENV['MONGODB_SSL_CA_CERT'] %>
ssl_cert: <%= ENV['MONGODB_SSL_CERT'] %>
ssl_key: <%= ENV['MONGODB_SSL_KEY'] %>
retry_writes: true
retry_reads: true
max_connecting: 10
heartbeat_frequency: 10
socket_timeout: 5
connect_timeout: 10
staging:
clients:
default:
uri: <%= ENV['MONGODB_STAGING_URI'] %>
options:
server_selection_timeout: 5
max_pool_size: 10
min_pool_size: 2
ssl: true
read:
  mode: :secondary_preferred
write_concern: { w: 1, j: true }
# Environment-specific configuration
# config/environments/production.rb
Rails.application.configure do
# ... other configuration ...
# MongoDB logging -- keep production at INFO. Verbose query logging
# (Logger::DEBUG) belongs in config/environments/development.rb; a
# Rails.env.development? branch can never run inside production.rb.
Mongoid.logger = Rails.logger
Mongoid.logger.level = Logger::INFO
Mongo::Logger.logger = Rails.logger
Mongo::Logger.logger.level = Logger::INFO
end
Environment Variables
What are Environment Variables? Environment variables are configuration values that are set outside of your application code and can be accessed by your application at runtime. They provide a secure and flexible way to manage configuration across different environments without hardcoding sensitive information in your code.
Why Use Environment Variables? Environment variables offer several advantages:
- Security: Keep sensitive data out of source code
- Flexibility: Different configurations for different environments
- Deployment Safety: No risk of committing secrets to version control
- Scalability: Easy to manage across multiple servers
- Compliance: Meet security and audit requirements
Key Environment Variables: Essential variables for MongoDB production:
- Database Connection: MongoDB URI and connection parameters
- Security Credentials: SSL certificates and authentication
- Performance Settings: Connection pool sizes and timeouts
- Read/Write Preferences: Database consistency settings
- Monitoring: Log levels and monitoring configurations
Best Practices: Effective environment variable management involves:
- Secure Storage: Use secure vaults for sensitive data
- Documentation: Document all required variables
- Validation: Validate required variables at startup
- Backup Strategy: Secure backup of configuration
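The "validate required variables at startup" practice above can be a few lines in `config/application.rb` or an initializer. A sketch using the variable names from the `.env` examples in this section (`missing_mongodb_vars` is a hypothetical helper):

```ruby
# Names of the environment variables this section's mongoid.yml expects.
REQUIRED_MONGODB_VARS = %w[
  MONGODB_URI
  MONGODB_SSL_CA_CERT
  MONGODB_SSL_CERT
  MONGODB_SSL_KEY
].freeze

# Returns the subset of required variables that are unset or blank.
def missing_mongodb_vars(env = ENV)
  REQUIRED_MONGODB_VARS.select { |name| env[name].to_s.strip.empty? }
end

# In an initializer, fail the boot instead of failing the first query:
# missing = missing_mongodb_vars
# abort "Missing MongoDB env vars: #{missing.join(', ')}" if missing.any?
```

Failing at boot gives a clear error instead of a connection timeout deep in request handling.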
# .env.production
MONGODB_URI=mongodb+srv://username:[email protected]/myapp_production
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem
MONGODB_READ_PREFERENCE=secondary
MONGODB_WRITE_CONCERN=majority
MONGODB_MAX_POOL_SIZE=20
MONGODB_MIN_POOL_SIZE=5
# .env.staging
MONGODB_STAGING_URI=mongodb+srv://username:[email protected]/myapp_staging
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem
# Docker environment variables
# docker-compose.yml
version: '3.8'
services:
web:
build: .
environment:
- MONGODB_URI=mongodb://admin:password@mongodb:27017/myapp_production?authSource=admin
- RAILS_ENV=production
depends_on:
- mongodb
mongodb:
image: mongo:6.0
environment:
- MONGO_INITDB_ROOT_USERNAME=admin
- MONGO_INITDB_ROOT_PASSWORD=password
volumes:
- mongodb_data:/data/db
- ./mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro
volumes:
mongodb_data:
Security Configuration
What is Security Configuration? Security configuration involves setting up authentication, authorization, encryption, and access controls for your MongoDB database in production. This includes user management, SSL/TLS encryption, network security, and audit logging to protect your data and comply with security requirements.
Why Security Configuration Matters: Proper security configuration is essential for:
- Data Protection: Prevent unauthorized access to sensitive data
- Compliance: Meet industry and regulatory requirements
- Risk Mitigation: Reduce the risk of data breaches
- Audit Requirements: Maintain audit trails for security events
- Business Continuity: Protect against security incidents
Key Security Components: Production security involves several layers:
- Authentication: User credentials and identity verification
- Authorization: Role-based access controls and permissions
- Encryption: SSL/TLS for data in transit and at rest
- Network Security: Firewall rules and network isolation
- Audit Logging: Track access and changes for compliance
Security Best Practices: Effective security implementation follows:
- Principle of Least Privilege: Grant minimum necessary permissions
- Defense in Depth: Multiple layers of security controls
- Regular Updates: Keep security configurations current
- Monitoring: Continuous security monitoring and alerting
// MongoDB security configuration
// mongo-init.js
db.createUser({
user: "app_user",
pwd: "secure_password",
roles: [
{ role: "readWrite", db: "myapp_production" },
{ role: "dbAdmin", db: "myapp_production" }
]
})
// Create indexes for security
db.users.createIndex({ "email": 1 }, { unique: true })
db.users.createIndex({ "username": 1 }, { unique: true })
# Enable authentication and TLS
# /etc/mongod.conf
security:
  authorization: enabled
net:
  port: 27017
  bindIp: 127.0.0.1
  # The net.ssl options were renamed to net.tls in MongoDB 4.2
  tls:
    mode: requireTLS
    certificateKeyFile: /path/to/mongodb.pem
    CAFile: /path/to/ca.crt
# Firewall configuration
# Allow only application server IPs
sudo ufw allow from 10.0.0.0/8 to any port 27017
sudo ufw allow from 172.16.0.0/12 to any port 27017
sudo ufw allow from 192.168.0.0/16 to any port 27017
Authentication & Authorization
What is Authentication & Authorization? Authentication verifies the identity of users or applications trying to access the database, while authorization determines what actions they can perform. Together, they form the foundation of database security, ensuring that only authorized users can access specific data and perform allowed operations.
Why Authentication & Authorization Matter: Proper access control is critical for:
- Data Security: Prevent unauthorized access to sensitive information
- Compliance: Meet regulatory requirements for data protection
- Audit Trails: Track who accessed what and when
- Risk Management: Minimize the impact of security incidents
- Business Continuity: Protect against data breaches and loss
Authentication Methods: MongoDB supports several authentication approaches:
- Username/Password: Traditional credential-based authentication
- X.509 Certificates: Certificate-based authentication for applications
- LDAP Integration: Enterprise directory integration
- OAuth/SAML: External identity provider integration
Authorization Strategies: Effective authorization involves:
- Role-Based Access: Assign permissions based on user roles
- Database-Level Permissions: Control access to specific databases
- Collection-Level Permissions: Fine-grained access control
- Operation-Level Permissions: Control specific operations (read, write, etc.)
// User management for MongoDB (run in mongosh)
// Create application user
use myapp_production
db.createUser({
user: "app_user",
pwd: "secure_password_here",
roles: [
{ role: "readWrite", db: "myapp_production" },
{ role: "dbAdmin", db: "myapp_production" }
]
})
// Create read-only user for analytics
db.createUser({
user: "analytics_user",
pwd: "analytics_password",
roles: [
{ role: "read", db: "myapp_production" }
]
})
// Create admin user for maintenance
use admin
db.createUser({
user: "admin_user",
pwd: "admin_password",
roles: [
{ role: "userAdminAnyDatabase", db: "admin" },
{ role: "dbAdminAnyDatabase", db: "admin" },
{ role: "clusterAdmin", db: "admin" }
]
})
// Role-based access control
// Custom roles for specific operations
db.createRole({
role: "data_analyst",
privileges: [
{ resource: { db: "myapp_production", collection: "" }, actions: ["find"] },
{ resource: { db: "myapp_production", collection: "users" }, actions: ["find"] }
],
roles: []
})
// Assign role to user
db.grantRolesToUser("analytics_user", [{ role: "data_analyst", db: "myapp_production" }])
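The two users created above map naturally onto separate Mongoid clients, so analytics code physically cannot write even if it tries. A mongoid.yml sketch under that assumption (`MONGODB_ANALYTICS_URI` is a hypothetical variable holding a URI with the `analytics_user` credentials):

```yaml
# config/mongoid.yml (sketch) -- one client per database user
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>            # connects as app_user (readWrite)
    analytics:
      uri: <%= ENV['MONGODB_ANALYTICS_URI'] %>  # connects as analytics_user (read)
```

Reporting code then opts into the restricted client with `Model.with(client: :analytics) { |klass| ... }`.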
Network Security
# Network security configuration
# MongoDB configuration file
# /etc/mongod.conf
# Network settings
net:
  port: 27017
  bindIp: 127.0.0.1,10.0.0.5  # Listen only on specific interfaces
  maxIncomingConnections: 100
  # TLS configuration (net.tls replaced net.ssl in MongoDB 4.2)
  tls:
    mode: requireTLS
    certificateKeyFile: /etc/ssl/mongodb.pem
    CAFile: /etc/ssl/ca.crt
    allowConnectionsWithoutCertificates: false
# Security settings
security:
authorization: enabled
keyFile: /etc/mongodb/keyfile
clusterAuthMode: keyFile
# Firewall rules
# Allow only application servers
sudo ufw allow from 10.0.0.0/8 to any port 27017
sudo ufw allow from 172.16.0.0/12 to any port 27017
# Block external access
sudo ufw deny 27017
# VPN configuration for remote access
# Only allow connections through VPN
sudo ufw allow from 10.8.0.0/24 to any port 27017
# Rate limiting
# Limit connections per IP
sudo iptables -A INPUT -p tcp --dport 27017 -m connlimit --connlimit-above 10 -j DROP
Data Encryption
# Encryption at rest (MongoDB Enterprise only)
# /etc/mongod.conf
security:
  enableEncryption: true
  encryptionKeyFile: /etc/mongodb/keyfile
# Application-level encryption
class User
include Mongoid::Document
field :email, type: String
field :encrypted_ssn, type: String
field :encrypted_credit_card, type: String
# Plaintext values live only in memory; just the encrypted fields persist
attr_accessor :ssn, :credit_card
# Encrypt sensitive data before saving
before_save :encrypt_sensitive_data
after_find :decrypt_sensitive_data
private
IV_LENGTH = 12   # bytes; the standard GCM nonce size
TAG_LENGTH = 16  # bytes; the GCM authentication tag
def encrypt_sensitive_data
  self.encrypted_ssn = encrypt_field(ssn) if ssn.present?
  self.encrypted_credit_card = encrypt_field(credit_card) if credit_card.present?
end
def decrypt_sensitive_data
  self.ssn = decrypt_field(encrypted_ssn) if encrypted_ssn.present?
  self.credit_card = decrypt_field(encrypted_credit_card) if encrypted_credit_card.present?
end
def encrypt_field(value)
  return nil if value.blank?
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = Rails.application.credentials.mongodb_encryption_key
  iv = cipher.random_iv # GCM requires a fresh random IV per message
  encrypted = cipher.update(value) + cipher.final
  # Persist the IV and auth tag alongside the ciphertext
  Base64.strict_encode64(iv + encrypted + cipher.auth_tag)
end
def decrypt_field(encrypted_value)
  return nil if encrypted_value.blank?
  decoded = Base64.strict_decode64(encrypted_value)
  iv = decoded[0, IV_LENGTH]
  auth_tag = decoded[-TAG_LENGTH..]
  ciphertext = decoded[IV_LENGTH...-TAG_LENGTH]
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = Rails.application.credentials.mongodb_encryption_key
  cipher.iv = iv
  cipher.auth_tag = auth_tag
  cipher.update(ciphertext) + cipher.final
end
end
# Credentials management (edit with: bin/rails credentials:edit)
# config/credentials.yml.enc stores the key itself, not an ERB reference:
# mongodb_encryption_key: <raw 32-byte secret, e.g. from SecureRandom.bytes(32)>
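The AES-GCM field helpers can be exercised outside Rails. A standalone sketch of the same round trip, with the random per-message IV that GCM requires stored alongside the ciphertext (`FieldCrypto` is a hypothetical module name; a raw 32-byte key stands in for the credentials value):

```ruby
require 'openssl'
require 'base64'

module FieldCrypto
  IV_LENGTH = 12   # standard GCM nonce size
  TAG_LENGTH = 16  # GCM authentication tag size

  def self.encrypt(value, key)
    cipher = OpenSSL::Cipher.new('aes-256-gcm')
    cipher.encrypt
    cipher.key = key
    iv = cipher.random_iv # fresh random IV per message
    ciphertext = cipher.update(value) + cipher.final
    # Layout: IV || ciphertext || auth tag, Base64-encoded for storage
    Base64.strict_encode64(iv + ciphertext + cipher.auth_tag)
  end

  def self.decrypt(blob, key)
    decoded = Base64.strict_decode64(blob)
    iv = decoded[0, IV_LENGTH]
    auth_tag = decoded[-TAG_LENGTH..]
    ciphertext = decoded[IV_LENGTH...-TAG_LENGTH]
    cipher = OpenSSL::Cipher.new('aes-256-gcm')
    cipher.decrypt
    cipher.key = key
    cipher.iv = iv
    cipher.auth_tag = auth_tag
    cipher.update(ciphertext) + cipher.final
  end
end

key = OpenSSL::Random.random_bytes(32)
token = FieldCrypto.encrypt("123-45-6789", key)
FieldCrypto.decrypt(token, key) # => "123-45-6789"
```

Because the IV is random, encrypting the same value twice yields different ciphertexts, and any tampering with the stored blob makes decryption raise rather than return corrupted plaintext.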
Health Monitoring
What is Health Monitoring? Health monitoring involves continuously checking the status and performance of your MongoDB database to ensure it's operating correctly and efficiently. This includes monitoring connection status, performance metrics, storage utilization, and replication health to proactively identify and resolve issues before they impact users.
Why Health Monitoring Matters: Comprehensive monitoring is essential for:
- Proactive Issue Detection: Identify problems before they affect users
- Performance Optimization: Track metrics to optimize database performance
- Capacity Planning: Monitor growth trends and plan for scaling
- Uptime Assurance: Ensure high availability and reliability
- Compliance Requirements: Meet audit and regulatory requirements
Key Monitoring Areas: Effective health monitoring covers:
- Connection Health: Database connectivity and response times
- Performance Metrics: Query performance and resource utilization
- Storage Health: Disk usage, fragmentation, and growth
- Replication Status: Replica set health and synchronization
- Security Events: Authentication failures and access patterns
Monitoring Strategy: Effective monitoring implementation involves:
- Automated Checks: Regular automated health assessments
- Alert Thresholds: Set appropriate alerting levels
- Historical Tracking: Maintain metrics for trend analysis
- Response Procedures: Clear escalation and response processes
# Database health monitoring
class DatabaseHealthMonitor
def self.check_health
{
connection: check_connection,
performance: check_performance,
storage: check_storage,
replication: check_replication
}
end
def self.check_connection
begin
Mongoid.default_client.database.command(ping: 1)
{ status: "healthy", message: "Database is responding" }
rescue => e
{ status: "unhealthy", message: e.message }
end
end
def self.check_performance
stats = Mongoid.default_client.database.command(dbStats: 1).first
{
collections: stats['collections'],
data_size: stats['dataSize'],
storage_size: stats['storageSize'],
indexes: stats['indexes'],
index_size: stats['indexSize']
}
end
def self.check_storage
stats = Mongoid.default_client.database.command(dbStats: 1).first
total_size = stats['storageSize']
data_size = stats['dataSize']
fragmentation = ((total_size - data_size) / total_size.to_f * 100).round(2)
{
total_size: total_size,
data_size: data_size,
fragmentation_percentage: fragmentation,
status: fragmentation > 50 ? "warning" : "healthy"
}
end
def self.check_replication
begin
status = Mongoid.default_client.database.command(replSetGetStatus: 1).first
{
status: status['ok'] == 1 ? "healthy" : "unhealthy",
members: status['members'].map { |m| m['stateStr'] },
primary: status['members'].find { |m| m['state'] == 1 }&.dig('name')
}
rescue => e
{ status: "unhealthy", message: e.message }
end
end
end
# Monitoring job
class DatabaseMonitoringJob < ApplicationJob
queue_as :default
def perform
health = DatabaseHealthMonitor.check_health
# Log health status
Rails.logger.info "Database health: #{health}"
# Send alerts for unhealthy status
if health[:connection][:status] == "unhealthy"
AlertService.send_alert("Database connection failed: #{health[:connection][:message]}")
end
if health[:storage][:status] == "warning"
AlertService.send_alert("High database fragmentation: #{health[:storage][:fragmentation_percentage]}%")
end
# Store metrics
DatabaseMetric.create(
timestamp: Time.current,
connection_status: health[:connection][:status],
storage_fragmentation: health[:storage][:fragmentation_percentage],
performance_metrics: health[:performance]
)
end
end
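The per-check results from `DatabaseHealthMonitor.check_health` still need to be rolled into a single verdict, e.g. for a `/health` endpoint that answers 200 or 503. A sketch of that rollup (`overall_status` is a hypothetical helper; checks without a `:status` key, like the performance metrics, are ignored):

```ruby
# Rank statuses so the worst one wins; unknown strings are treated as unhealthy.
STATUS_RANK = { "healthy" => 0, "warning" => 1, "unhealthy" => 2 }.freeze

def overall_status(health)
  worst = health.values
                .map { |check| check[:status] }
                .compact
                .max_by { |status| STATUS_RANK.fetch(status, 2) }
  worst || "healthy"
end

# In a controller action:
#   health = DatabaseHealthMonitor.check_health
#   render json: health,
#          status: overall_status(health) == "unhealthy" ? 503 : 200
```

A load balancer or orchestrator can then probe this endpoint and pull an unhealthy node out of rotation.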
Performance Monitoring
What is Performance Monitoring? Performance monitoring tracks key metrics that indicate how well your MongoDB database is performing. This includes query execution times, resource utilization, throughput, and response times to identify bottlenecks and optimize database performance for better user experience.
Why Performance Monitoring Matters: Performance monitoring is critical for:
- User Experience: Fast response times improve user satisfaction
- Capacity Planning: Understand current usage and plan for growth
- Cost Optimization: Use resources efficiently and reduce costs
- Problem Resolution: Quickly identify and fix performance issues
- Business Impact: Performance directly affects business metrics
Key Performance Metrics: Essential metrics to monitor include:
- Query Performance: Execution times and throughput
- Resource Utilization: CPU, memory, and disk usage
- Connection Pool: Connection usage and availability
- Index Efficiency: Index hit rates and usage patterns
- Error Rates: Failed operations and timeout frequency
Monitoring Implementation: Effective performance monitoring involves:
- Real-Time Monitoring: Track metrics in real-time
- Historical Analysis: Maintain data for trend analysis
- Alerting: Notify when thresholds are exceeded
- Dashboard Visualization: Clear metrics presentation
# Query performance monitoring
class QueryPerformanceMonitor
def self.track_query(query, duration, collection = nil)
Rails.logger.info "[QUERY] #{query} on #{collection} took #{duration}ms"
# Store slow queries
if duration > 100
SlowQuery.create(
query: query,
duration: duration,
collection: collection,
timestamp: Time.current
)
end
# Update metrics
QueryMetric.create(
query: query,
duration: duration,
collection: collection,
timestamp: Time.current
)
end
def self.analyze_slow_queries
slow_queries = SlowQuery.where(:timestamp.gte => 1.hour.ago)
slow_queries.group_by(&:collection).each do |collection, queries|
avg_duration = queries.map(&:duration).sum / queries.length
Rails.logger.warn "Slow queries detected in #{collection}: avg #{avg_duration}ms"
if avg_duration > 500
AlertService.send_alert("High average query time in #{collection}: #{avg_duration}ms")
end
end
end
end
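Rather than calling `track_query` by hand at each call site, the Ruby driver's command monitoring API can feed it automatically: a subscriber receives every command's name and duration. The subscriber below is plain Ruby, duck-typed on the event's `command_name`/`duration`, so it can be exercised without a server (the subscription line assumes the `mongo` gem is loaded):

```ruby
# Receives driver command events; duration arrives in seconds.
class CommandDurationSubscriber
  def initialize(&handler)
    @handler = handler # called with (command_name, duration_ms)
  end

  def started(_event); end

  def succeeded(event)
    @handler.call(event.command_name, (event.duration * 1000).round)
  end

  def failed(event)
    @handler.call(event.command_name, (event.duration * 1000).round)
  end
end

# In an initializer:
# Mongo::Monitoring::Global.subscribe(
#   Mongo::Monitoring::COMMAND,
#   CommandDurationSubscriber.new do |name, ms|
#     QueryPerformanceMonitor.track_query(name, ms)
#   end
# )
```

This captures every find, aggregate, and write, including those issued by Mongoid internals that manual instrumentation would miss.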
# Connection pool monitoring
class ConnectionPoolMonitor
  # NOTE: the Ruby driver exposes no aggregated `pool_stats` call; this
  # sketch reads each server's pool directly. The robust approach is to
  # subscribe to the driver's CMAP connection pool monitoring events.
  def self.stats
    Mongoid.default_client.cluster.servers.map do |server|
      pool = server.pool
      {
        address: server.address.to_s,
        size: pool.size,
        max_size: pool.max_size
      }
    end
  end
  def self.check_pool_health
    pool_stats = stats
    pool_stats.each do |s|
      if s[:size] >= s[:max_size] * 0.8
        AlertService.send_alert("High connection pool usage on #{s[:address]}: #{s[:size]}/#{s[:max_size]}")
      end
    end
    pool_stats
  end
end
# Index usage monitoring
class IndexUsageMonitor
  def self.analyze_index_usage
    database = Mongoid.default_client.database
    database.collection_names.each do |collection_name|
      index_stats = database[collection_name].aggregate([
        { "$indexStats" => {} }
      ])
      index_stats.each do |index|
        # $indexStats reports accesses.ops (operations that used the index)
        # and accesses.since; it does not expose a hit/miss ratio
        ops = index['accesses']['ops'].to_i
        if ops.zero? && index['name'] != '_id_'
          Rails.logger.warn "Unused index #{index['name']} in #{collection_name} (candidate for removal)"
        end
      end
    end
  end
end
Backup Strategies
What are Backup Strategies? Backup strategies involve creating and managing copies of your MongoDB data to protect against data loss, corruption, or accidental deletion. A comprehensive backup strategy includes regular automated backups, secure storage, testing procedures, and recovery plans to ensure business continuity in case of disasters.
Why Backup Strategies Matter: Proper backup strategies are critical for:
- Data Protection: Safeguard against data loss and corruption
- Business Continuity: Ensure rapid recovery from disasters
- Compliance Requirements: Meet regulatory backup requirements
- Risk Mitigation: Reduce the impact of system failures
- Peace of Mind: Confidence in data safety and recovery
Backup Types: Different backup approaches serve different purposes:
- Full Backups: Complete database snapshots
- Incremental Backups: Only changed data since last backup
- Point-in-Time Backups: Consistent snapshots at specific times
- Continuous Backups: Real-time data protection
Backup Best Practices: Effective backup strategies include:
- Automated Scheduling: Regular automated backup processes
- Secure Storage: Encrypted backups in multiple locations
- Testing Procedures: Regular backup restoration testing
- Retention Policies: Clear backup retention and cleanup
# Automated backup script
#!/bin/bash
# backup_mongodb.sh
# Configuration
BACKUP_DIR="/backups/mongodb"
DATE=$(date +%Y%m%d_%H%M%S)
DB_NAME="myapp_production"
RETENTION_DAYS=30
# Create backup directory
mkdir -p $BACKUP_DIR
# Perform backup
mongodump \
--uri="mongodb+srv://username:[email protected]/$DB_NAME" \
--out="$BACKUP_DIR/backup_$DATE" \
--gzip
# Compress backup
tar -czf "$BACKUP_DIR/backup_$DATE.tar.gz" -C "$BACKUP_DIR" "backup_$DATE"
# Remove uncompressed backup
rm -rf "$BACKUP_DIR/backup_$DATE"
# Upload to cloud storage
aws s3 cp "$BACKUP_DIR/backup_$DATE.tar.gz" "s3://my-backup-bucket/mongodb/"
# Clean up old backups
find $BACKUP_DIR -name "backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete
# Log backup completion
echo "Backup completed: backup_$DATE.tar.gz" >> /var/log/mongodb_backup.log
# Rails backup task
# lib/tasks/mongodb.rake
namespace :mongodb do
desc "Create database backup"
task backup: :environment do
backup_dir = Rails.root.join("backups")
timestamp = Time.current.strftime("%Y%m%d_%H%M%S")
backup_path = backup_dir.join("backup_#{timestamp}")
# Create backup directory
FileUtils.mkdir_p(backup_path)
# Perform backup
system("mongodump --uri='#{ENV['MONGODB_URI']}' --out='#{backup_path}' --gzip")
# Compress backup
system("tar -czf '#{backup_path}.tar.gz' -C '#{backup_dir}' 'backup_#{timestamp}'")
# Remove uncompressed backup
FileUtils.rm_rf(backup_path)
puts "Backup created: #{backup_path}.tar.gz"
end
desc "Restore database from backup"
task :restore, [:backup_file] => :environment do |task, args|
backup_file = args[:backup_file]
unless backup_file
puts "Usage: rake mongodb:restore[backup_file.tar.gz]"
exit 1
end
backup_path = Rails.root.join("backups", backup_file)
unless File.exist?(backup_path)
puts "Backup file not found: #{backup_path}"
exit 1
end
# Extract backup
system("tar -xzf '#{backup_path}' -C '#{Rails.root.join('backups')}'")
# Restore database
extracted_dir = backup_path.to_s.gsub('.tar.gz', '')
system("mongorestore --uri='#{ENV['MONGODB_URI']}' --drop '#{extracted_dir}'")
# Clean up extracted files
FileUtils.rm_rf(extracted_dir)
puts "Database restored from: #{backup_file}"
end
end
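The retention step of the shell script (`find ... -mtime +$RETENTION_DAYS -delete`) relies on file mtimes, which an S3 listing does not preserve. The same policy can be applied from the `backup_YYYYMMDD_HHMMSS.tar.gz` names the script produces; a sketch (`expired_backups` is a hypothetical helper):

```ruby
require 'time'

# Returns the backup filenames whose embedded timestamp is older than the
# retention window; names without a matching timestamp are left alone.
def expired_backups(filenames, retention_days, now: Time.now)
  cutoff = now - retention_days * 86_400
  filenames.select do |name|
    stamp = name[/backup_(\d{8}_\d{6})/, 1]
    next false unless stamp
    Time.strptime(stamp, "%Y%m%d_%H%M%S") < cutoff
  end
end
```

The returned list can be handed to an S3 delete call or `File.delete`, keeping local and cloud retention in sync.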
Recovery Procedures
What are Recovery Procedures? Recovery procedures are step-by-step processes for restoring your MongoDB database from backups in case of data loss, corruption, or system failures. These procedures ensure that you can quickly and safely restore your data to a consistent state and resume normal operations with minimal downtime.
Why Recovery Procedures Matter: Proper recovery procedures are essential for:
- Minimize Downtime: Quick restoration reduces business impact
- Data Integrity: Ensure restored data is consistent and complete
- Business Continuity: Maintain operations during disasters
- Risk Management: Reduce the impact of data loss incidents
- Compliance: Meet recovery time objectives (RTO)
Recovery Types: Different recovery scenarios require different approaches:
- Full Recovery: Complete database restoration from backup
- Point-in-Time Recovery: Restore to a specific moment
- Partial Recovery: Restore specific collections or data
- Disaster Recovery: Complete system restoration
Recovery Best Practices: Effective recovery procedures include:
- Documented Procedures: Clear, step-by-step recovery instructions
- Testing: Regular recovery testing and validation
- Automation: Automated recovery scripts and processes
- Validation: Post-recovery verification and testing
# Disaster recovery script
#!/bin/bash
# disaster_recovery.sh
# Configuration
BACKUP_S3_BUCKET="my-backup-bucket"
BACKUP_DIR="/backups/mongodb"
DB_NAME="myapp_production"
# Get latest backup from S3
LATEST_BACKUP=$(aws s3 ls "s3://$BACKUP_S3_BUCKET/mongodb/" | sort | tail -1 | awk '{print $4}')
if [ -z "$LATEST_BACKUP" ]; then
echo "No backup found in S3"
exit 1
fi
echo "Restoring from backup: $LATEST_BACKUP"
# Download backup from S3
aws s3 cp "s3://$BACKUP_S3_BUCKET/mongodb/$LATEST_BACKUP" "$BACKUP_DIR/"
# Extract backup
tar -xzf "$BACKUP_DIR/$LATEST_BACKUP" -C "$BACKUP_DIR"
# Restore database (keep the glob outside the quotes so it expands)
mongorestore \
--uri="mongodb+srv://username:[email protected]/$DB_NAME" \
--drop \
"$BACKUP_DIR"/backup_*
# Clean up
rm -rf "$BACKUP_DIR"/backup_*
echo "Disaster recovery completed"
# Rails recovery task
# lib/tasks/mongodb.rake
namespace :mongodb do
desc "Perform disaster recovery"
task disaster_recovery: :environment do
puts "Starting disaster recovery..."
# Stop application
system("sudo systemctl stop rails-app")
# Get latest backup from cloud storage
backup_file = get_latest_backup_from_s3
if backup_file.nil?
puts "No backup found in S3"
exit 1
end
puts "Restoring from backup: #{backup_file}"
# Download and restore backup
download_backup_from_s3(backup_file)
restore_database_from_backup(backup_file)
# Verify restoration
if verify_database_restoration
puts "Disaster recovery completed successfully"
# Restart application
system("sudo systemctl start rails-app")
else
puts "Disaster recovery failed"
exit 1
end
end
# Helper stubs -- fill in with your S3 client of choice
private
def get_latest_backup_from_s3
# Implementation to get latest backup from S3
# Returns backup filename or nil
end
def download_backup_from_s3(backup_file)
# Implementation to download backup from S3
end
def restore_database_from_backup(backup_file)
# Implementation to restore database
end
def verify_database_restoration
# Implementation to verify restoration
# Returns true if successful, false otherwise
end
end
# Point-in-time recovery
class PointInTimeRecovery
def self.recover_to_timestamp(timestamp)
# Get oplog entries up to timestamp
oplog_entries = get_oplog_entries_until(timestamp)
# Apply oplog entries to restore to point in time
apply_oplog_entries(oplog_entries)
puts "Recovered database to #{timestamp}"
end
# Bare `private` does not apply to `def self.` methods; use private_class_method
def self.get_oplog_entries_until(timestamp)
  # Implementation to get oplog entries
end
def self.apply_oplog_entries(entries)
  # Implementation to apply oplog entries
end
private_class_method :get_oplog_entries_until, :apply_oplog_entries
end
Docker Deployment
What is Docker Deployment? Docker deployment involves packaging your Rails application and MongoDB database into containers that can be easily deployed, scaled, and managed. Docker provides consistency across environments, simplifies deployment processes, and enables efficient resource utilization through containerization.
Why Use Docker Deployment? Docker deployment offers several advantages:
- Environment Consistency: Same environment across development and production
- Easy Scaling: Simple horizontal scaling with container orchestration
- Isolation: Applications run in isolated environments
- Portability: Easy deployment across different platforms
- Resource Efficiency: Better resource utilization than traditional VMs
Docker Components: A complete Docker deployment includes:
- Application Container: Rails application with all dependencies
- Database Container: MongoDB with proper configuration
- Web Server: Nginx for load balancing and SSL termination
- Cache Container: Redis for session and cache storage
- Networking: Container communication and external access
Deployment Best Practices: Effective Docker deployment involves:
- Multi-Stage Builds: Optimize container size and security
- Environment Variables: Secure configuration management
- Health Checks: Monitor container health and availability
- Persistent Storage: Proper data persistence for databases
# Docker Compose for production
# docker-compose.prod.yml
version: '3.8'
services:
web:
build: .
environment:
- RAILS_ENV=production
- MONGODB_URI=mongodb://admin:secure_password@mongodb:27017/myapp_production?authSource=admin
- REDIS_URL=redis://redis:6379/0
ports:
- "3000:3000"
depends_on:
- mongodb
- redis
restart: unless-stopped
volumes:
- ./logs:/app/logs
networks:
- app-network
mongodb:
image: mongo:6.0
environment:
- MONGO_INITDB_ROOT_USERNAME=admin
- MONGO_INITDB_ROOT_PASSWORD=secure_password
volumes:
- mongodb_data:/data/db
- ./mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro
- ./mongodb.conf:/etc/mongod.conf:ro
ports:
- "27017:27017"
restart: unless-stopped
networks:
- app-network
redis:
image: redis:7-alpine
command: redis-server --appendonly yes
volumes:
- redis_data:/data
restart: unless-stopped
networks:
- app-network
nginx:
image: nginx:alpine
ports:
- "80:80"
- "443:443"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
- ./ssl:/etc/nginx/ssl:ro
depends_on:
- web
restart: unless-stopped
networks:
- app-network
volumes:
mongodb_data:
redis_data:
networks:
app-network:
driver: bridge
# MongoDB configuration for Docker
# mongodb.conf
storage:
  dbPath: /data/db
  # Journaling is always on in modern MongoDB; storage.journal.enabled
  # is deprecated in 6.0 and removed in later releases.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
net:
  port: 27017
  bindIp: 0.0.0.0
security:
  authorization: enabled
# Kubernetes deployment
# k8s/mongodb-deployment.yaml
# Note: a Deployment with replicas: 3 would give three independent mongod
# processes fighting over one PersistentVolumeClaim, not a MongoDB replica
# set. Replicated MongoDB on Kubernetes is normally run as a StatefulSet
# (stable per-member identity and one volume each) plus replica-set
# initialization; a single-replica Deployment is shown here for brevity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:6.0
          ports:
            - containerPort: 27017
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: username
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: password
          volumeMounts:
            - name: mongodb-data
              mountPath: /data/db
            - name: mongodb-config
              mountPath: /etc/mongod.conf
              subPath: mongod.conf
      volumes:
        - name: mongodb-data
          persistentVolumeClaim:
            claimName: mongodb-pvc
        - name: mongodb-config
          configMap:
            name: mongodb-config
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb
  ports:
    - port: 27017
      targetPort: 27017
  type: ClusterIP
CI/CD Pipeline
What is CI/CD Pipeline? A CI/CD (Continuous Integration/Continuous Deployment) pipeline automates testing, building, and deploying your Rails application with MongoDB. It catches quality problems early and provides reliable, repeatable deployments with minimal human intervention.
Why Use CI/CD Pipeline? CI/CD pipelines provide several benefits:
- Automated Testing: Run tests automatically on every code change
- Faster Deployment: Reduce time from code to production
- Quality Assurance: Catch issues early in development
- Consistent Deployments: Repeatable and reliable deployment process
- Rollback Capability: Quick rollback to previous versions
Pipeline Stages: A complete CI/CD pipeline includes:
- Code Analysis: Static code analysis and security checks
- Testing: Unit, integration, and end-to-end tests
- Building: Create deployable artifacts
- Deployment: Automated deployment to staging/production
- Monitoring: Post-deployment health checks
Database Considerations: CI/CD with MongoDB requires special attention:
- Migration Testing: Test database migrations in CI
- Data Integrity: Ensure migrations don't break existing data
- Rollback Strategy: Plan for migration rollbacks
- Environment Isolation: Separate test and production databases
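The environment-isolation item deserves to be mechanical rather than manual: the database name should be derived from the pipeline stage so CI can never point at production data. A minimal sketch (the `mongodb_uri_for` helper, host, and `myapp` base name are all illustrative):

```ruby
# Derive an isolated database URI per stage; CI exports the result as
# MONGODB_URI so the test suite always gets its own database.
def mongodb_uri_for(stage, host: "localhost:27017", base: "myapp")
  unless %w[development test staging production].include?(stage)
    raise ArgumentError, "unknown stage: #{stage}" # fail fast on typos like "prod"
  end
  "mongodb://#{host}/#{base}_#{stage}"
end

puts mongodb_uri_for("test")
```

Rejecting unknown stage names up front is the point of the `ArgumentError`: a misspelled stage should break the pipeline loudly instead of silently creating a stray database.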
# GitHub Actions workflow
# .github/workflows/deploy.yml
name: Deploy to Production
on:
  push:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      mongodb:
        image: mongo:6.0
        env:
          MONGO_INITDB_ROOT_USERNAME: admin
          MONGO_INITDB_ROOT_PASSWORD: password
        options: >-
          --health-cmd "mongosh --eval 'db.runCommand(\"ping\").ok'"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 27017:27017
    steps:
      - uses: actions/checkout@v3
      - name: Set up Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: 3.2.0
          bundler-cache: true  # runs bundle install and caches gems
      - name: Run tests
        env:
          # Root users live in the admin database, so authSource=admin is required
          MONGODB_URI: mongodb://admin:password@localhost:27017/test?authSource=admin
        run: |
          # Mongoid creates databases lazily, so there is no db:create step;
          # building the indexes is enough to prepare the test database.
          bundle exec rake db:mongoid:create_indexes
          bundle exec rspec
      - name: Run security checks
        run: |
          bundle exec brakeman
          bundle exec bundle-audit check --update
  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to production
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
          MONGODB_URI: ${{ secrets.MONGODB_URI }}
        run: |
          # Deploy to production server (details depend on your hosting setup)
          echo "Deploying to production..."
          # Run database migrations
          bundle exec rake db:migrate
          # Restart application
          sudo systemctl restart rails-app
          # Health check
          curl -f http://localhost/health || exit 1
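The single `curl -f` at the end of the workflow fails if the app is still booting when the probe fires. A more forgiving post-deploy smoke test retries before giving up; here is a sketch in plain Ruby (the `healthy?` helper and the `/health` path are illustrative, matching the curl check above):

```ruby
require "net/http"
require "uri"

# Retry the health endpoint a few times; a freshly restarted app server
# may need a moment before it accepts connections.
def healthy?(url, attempts: 3, wait: 1)
  uri = URI(url)
  attempts.times do
    begin
      return true if Net::HTTP.get_response(uri).is_a?(Net::HTTPSuccess)
    rescue StandardError
      # connection refused while the server restarts: fall through and retry
    end
    sleep wait
  end
  false
end

# In the deploy step you might run:
#   ruby -e '... exit(healthy?("http://localhost/health") ? 0 : 1)'
```

Exiting non-zero when the check ultimately fails keeps the pipeline's rollback semantics identical to `curl -f ... || exit 1`.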
# Database migration script
# lib/tasks/deploy.rake
namespace :deploy do
  desc "Run database migrations"
  task migrate: :environment do
    puts "Running database migrations..."
    # Backup before migration
    Rake::Task['mongodb:backup'].invoke
    # Run migrations
    Rake::Task['db:migrate'].invoke
    # Verify migration
    if verify_migration_success
      puts "Migration completed successfully"
    else
      puts "Migration failed, rolling back..."
      Rake::Task['mongodb:restore'].invoke
      exit 1
    end
  end

  desc "Rollback database migration"
  task rollback: :environment do
    puts "Rolling back database migration..."
    # Restore from backup
    Rake::Task['mongodb:restore'].invoke
    puts "Rollback completed"
  end
end

# Helper methods in a .rake file are defined at the top level and are
# global; `private` has no effect inside a namespace block, so it is omitted.
def verify_migration_success
  # Implementation to verify migration
  # Returns true if successful, false otherwise
end
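One way the unimplemented `verify_migration_success` could work is to compare per-collection document counts captured before and after the migration. A testable sketch with the counts injected (the `migration_intact?` name and the invariant "no documents lost" are assumptions; your migrations may need different checks):

```ruby
# A migration passes this check if no collection lost documents.
# before_counts/after_counts map collection name => document count,
# e.g. gathered via Mongoid.default_client[name].count_documents({}).
def migration_intact?(before_counts, after_counts)
  before_counts.all? do |collection, expected|
    after_counts.fetch(collection, 0) >= expected
  end
end

before = { "users" => 100, "projects" => 40 }
after  = { "users" => 100, "projects" => 42 }  # new documents are fine
puts migration_intact?(before, after)
```

Inside the rake task, `verify_migration_success` would capture the "before" snapshot prior to invoking `db:migrate` and pass both snapshots to this check.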
Real-World Examples
🎯 Project Features
- User Management: Registration, authentication, and profile management
- Project Organization: Create and manage multiple projects
- Task Management: Create, assign, and track tasks with priorities
- Real-time Updates: Activity feeds and notifications
🏗️ Project Setup
Let's start by setting up our Rails application with MongoDB. We'll use Mongoid as our ODM and include essential gems for authentication and API handling.
# Gemfile
source 'https://rubygems.org'
gem 'rails', '~> 7.0'
gem 'mongoid', '~> 8.0'
gem 'bcrypt', '~> 3.1'
gem 'jwt', '~> 2.2'
gem 'pundit', '~> 2.1'
gem 'kaminari', '~> 1.2'
# gem 'ransack', '~> 4.0'  # ransack depends on ActiveRecord and does not work with Mongoid

group :development, :test do
  gem 'rspec-rails', '~> 6.0'
  gem 'factory_bot_rails', '~> 6.2'
  gem 'faker', '~> 3.1'
end
# config/application.rb
require_relative "boot"

# With --skip-active-record we load the railties individually instead of
# "rails/all", since ActiveRecord is replaced by Mongoid.
require "rails"
require "action_controller/railtie"
require "action_mailer/railtie"
require "active_job/railtie"
require "action_cable/engine"
require "rails/test_unit/railtie"

Bundler.require(*Rails.groups)

module TaskManager
  class Application < Rails::Application
    config.load_defaults 7.0
    config.api_only = true
    # API-only apps strip cookie middleware; add it back before the session store
    config.middleware.use ActionDispatch::Cookies
    config.middleware.use ActionDispatch::Session::CookieStore
  end
end
👤 User Model
Simple User model with basic authentication fields and project relationship.
# app/models/user.rb
class User
  include Mongoid::Document
  include Mongoid::Timestamps

  field :email, type: String
  field :username, type: String
  field :password_digest, type: String

  # Project names its belongs_to side :owner, so the inverse must be
  # declared here or Mongoid cannot resolve the relation.
  has_many :projects, inverse_of: :owner, dependent: :destroy

  validates :email, presence: true, uniqueness: true
  validates :username, presence: true, uniqueness: true

  index({ email: 1 }, { unique: true })
  index({ username: 1 }, { unique: true })
end
📁 Project Model
Project model with basic fields and user relationship.
# app/models/project.rb
class Project
  include Mongoid::Document
  include Mongoid::Timestamps

  field :name, type: String
  field :description, type: String
  field :status, type: String, default: 'active'

  belongs_to :owner, class_name: 'User', inverse_of: :projects

  validates :name, presence: true
  validates :status, inclusion: { in: %w[active archived] }

  index({ owner_id: 1, status: 1 })
  index({ name: "text", description: "text" })

  scope :active, -> { where(status: 'active') }
  scope :owned_by, ->(user) { where(owner: user) }
end
🔍 Basic Queries
Simple queries to test the relationship between User and Project models.
# Rails console examples
# Create a user
user = User.create!(email: "[email protected]", username: "john_doe")
# Create a project for the user
project = Project.create!(
  name: "My First Project",
  description: "A sample project",
  owner: user
)
# Find user's projects
user.projects.count
user.projects.where(status: "active")
# Find project owner
project.owner.email
# Search projects by name
Project.where(name: /First/)
# Get all active projects
Project.active.count
🚀 Local Development & Testing
Let's set up and run our Task Management app locally. This will show you how to get the application running on your machine and test all the features.
# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/task_manager_dev
      options:
        server_selection_timeout: 5
test:
  clients:
    default:
      uri: mongodb://localhost:27017/task_manager_test
      options:
        server_selection_timeout: 5
📦 Setup Steps
Follow these steps to get the application running locally:
# 1. Install MongoDB locally
# macOS (using Homebrew)
brew tap mongodb/brew
brew install mongodb-community
brew services start mongodb/brew/mongodb-community
# Ubuntu/Debian
# The legacy distro "mongodb" package has been removed from current
# releases; install mongodb-org from MongoDB's own apt repository
# (see https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-ubuntu/)
sudo apt-get install -y mongodb-org
sudo systemctl start mongod
# Windows
# Download and install from https://www.mongodb.com/try/download/community
# 2. Create Rails app
rails new task_manager --api --skip-active-record
cd task_manager
# 3. Add gems to Gemfile
gem 'mongoid', '~> 8.0'
gem 'bcrypt', '~> 3.1'
gem 'jwt', '~> 2.2'
gem 'kaminari', '~> 1.2'
# 4. Install gems
bundle install
# 5. Initialize Mongoid
rails g mongoid:config
# 6. Generate models
rails g model User email:string username:string password_digest:string
rails g model Project name:string description:string status:string owner:references
rails g model Task title:string description:string status:string project:references creator:references assignee:references
# 7. Run migrations (Mongoid doesn't use migrations, but we'll create indexes)
rails runner "User.create_indexes"
rails runner "Project.create_indexes"
rails runner "Task.create_indexes"
# 8. Start the server
rails server
🧪 Testing the Application
Let's test our application using Rails console and create some sample data to see how everything works together.
# Start Rails console
rails console
# Create a test user
user = User.create!(
  email: "[email protected]",
  username: "john_doe"
)
# Create a project
project = Project.create!(
  name: "My First Project",
  description: "A sample project to test our app",
  owner: user
)
# Test queries
puts "User projects: #{user.projects.count}"
puts "Project owner: #{project.owner.email}"
# Test search
search_results = Project.where(name: /First/)
puts "Search results: #{search_results.count}"
# Test relationships
puts "User has #{user.projects.count} projects"
puts "Project belongs to: #{project.owner.username}"
🔍 Database Exploration
Explore the MongoDB database directly to understand how data is stored:
# Connect to MongoDB shell
mongosh
# Switch to our database
use task_manager_dev
# View collections
show collections
# View users
db.users.find().pretty()
# View projects
db.projects.find().pretty()
# Count documents
db.users.countDocuments()
db.projects.countDocuments()
# View indexes
db.users.getIndexes()
db.projects.getIndexes()
# Find projects by status
db.projects.find({status: "active"}).pretty()
# Search projects by name
db.projects.find({name: {$regex: "First"}}).pretty()
🐛 Debugging Tips
Common issues and how to debug them when running locally:
# Check if MongoDB is running
ps aux | grep mongod
# Check MongoDB connection
rails runner "puts Mongoid.default_client.database.name"
# View Rails logs
tail -f log/development.log
# Reset database (if needed)
rails runner "Mongoid.purge!"
# Check model validations
rails runner "user = User.new; puts user.valid?; puts user.errors.full_messages"
# Test model associations
rails runner "user = User.first; puts user.projects.count"
# Monitor database queries
# Add to an initializer (e.g. config/initializers/mongoid.rb):
Mongoid.logger = Logger.new($stdout)
Mongo::Logger.logger = Logger.new($stdout)  # driver-level query logging
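For finer-grained monitoring than a logger, the Ruby driver exposes a command-monitoring API: any object responding to `started`/`succeeded`/`failed` can be registered with `Mongo::Monitoring::Global.subscribe(Mongo::Monitoring::COMMAND, subscriber)`. Below is a sketch of such a subscriber; the `QueryLogger` class name is illustrative, and the event is stubbed with a Struct so the formatting logic can be exercised without the mongo gem:

```ruby
# A minimal command subscriber: logs each command as "database.command".
# Real events (Mongo::Monitoring::Event::CommandStarted etc.) respond to
# database_name and command_name just like the stand-in Struct below.
class QueryLogger
  def started(event)
    puts format_event(event)
  end

  def succeeded(_event); end

  def failed(event)
    puts "FAILED #{format_event(event)}"
  end

  def format_event(event)
    "#{event.database_name}.#{event.command_name}"
  end
end

# Stand-in for a driver event, used only for local demonstration
Event = Struct.new(:database_name, :command_name)
puts QueryLogger.new.format_event(Event.new("task_manager_dev", "find"))
```

In an initializer you would register it once with `Mongo::Monitoring::Global.subscribe(Mongo::Monitoring::COMMAND, QueryLogger.new)` before any client is created.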
Reference & Commands
Command | Description | Example |
---|---|---|
User.create | Create and save a document | User.create(name: "John", email: "[email protected]") |
User.find | Find by ID | User.find("507f1f77bcf86cd799439011") |
User.where | Find by criteria | User.where(active: true) |
User.count | Count documents | User.count |