MongoDB and Rails: A Complete Developer Guide

Complete guide to NoSQL database integration

MongoDB is a popular NoSQL document database that stores data in flexible, JSON-like documents. When integrated with Rails using Mongoid, it provides a powerful alternative to traditional relational databases.

Fundamentals

What is MongoDB? MongoDB stores data as flexible, JSON-like documents rather than as rows in tables. Unlike traditional relational databases, it doesn’t require a predefined schema, making it a natural fit for applications with evolving data structures.
Key Features & Architecture

Document-Oriented Storage

MongoDB stores data in BSON (Binary JSON) format, which is a binary representation of JSON documents. This allows for flexible, schema-less data storage where each document can have different fields.

// Example MongoDB Document
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "John Doe",
  "email": "[email protected]",
  "age": 30,
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zip": "10001"
  },
  "interests": ["programming", "music", "travel"],
  "created_at": ISODate("2024-01-15T10:30:00Z")
}
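On the Ruby side, the driver hands back each document as a BSON::Document, which behaves like a Hash with string keys. A plain-Ruby sketch of the same document (the ObjectId and ISODate are simplified to strings here for illustration; the driver actually returns BSON::ObjectId and Time instances):

```ruby
# The example document as a plain Ruby hash with string keys.
doc = {
  "_id" => "507f1f77bcf86cd799439011",
  "name" => "John Doe",
  "email" => "[email protected]",
  "age" => 30,
  "address" => {
    "street" => "123 Main St",
    "city" => "New York",
    "zip" => "10001"
  },
  "interests" => ["programming", "music", "travel"]
}

# Nested fields are plain nested hash/array access:
puts doc["address"]["city"]   # => New York
puts doc["interests"].first   # => programming
```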

Core Features

  • Document-Oriented: Data is stored in BSON (Binary JSON) format with rich data types
  • Schema-Less: No predefined structure required – documents can evolve over time
  • Scalable: Horizontal scaling with sharding across multiple servers
  • High Performance: In-memory processing, efficient indexing, and fast queries
  • Rich Query Language: Support for complex queries, aggregation pipelines, and geospatial queries
  • Replication: Built-in replication for high availability and data redundancy
  • GridFS: File storage system for large files

MongoDB vs Relational Databases

Aspect         | MongoDB                                                  | Relational Databases
Data Model     | Document-oriented (JSON-like)                            | Table-based (rows and columns)
Schema         | Flexible, schema-less                                    | Rigid, predefined schema
Relationships  | Embedded documents or references                         | Foreign keys and joins
Scaling        | Horizontal (sharding)                                    | Typically vertical (bigger hardware)
Transactions   | Single-document atomicity; multi-document ACID since 4.0 | Full ACID support
Query Language | MongoDB Query Language (MQL)                             | SQL

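The Relationships row is the one that changes day-to-day modeling the most. A plain-Ruby sketch with hypothetical order data, contrasting one self-contained document against normalized rows joined through a key:

```ruby
# Document model: the order carries its line items inside it.
order_doc = {
  "_id" => 1,
  "customer" => "Ada",
  "items" => [
    { "sku" => "A-1", "qty" => 2, "price" => 9.99 },
    { "sku" => "B-7", "qty" => 1, "price" => 4.50 }
  ]
}

# Relational model: two "tables", joined through order_id.
orders      = [{ id: 1, customer: "Ada" }]
order_items = [
  { order_id: 1, sku: "A-1", qty: 2, price: 9.99 },
  { order_id: 1, sku: "B-7", qty: 1, price: 4.50 }
]

# One read vs. a join: both arrive at the same total.
doc_total = order_doc["items"].sum { |i| i["qty"] * i["price"] }
rel_total = order_items.select { |i| i[:order_id] == 1 }
                       .sum { |i| i[:qty] * i[:price] }
puts(doc_total == rel_total)  # => true
```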
Why Use MongoDB with Rails?

Advantages for Rails Applications

✅ Development Benefits

  • Rapid Prototyping: No schema migrations needed for initial development
  • Flexible Data Models: Easy to evolve data structures as requirements change
  • JSON-like Structure: Natural fit for Rails hashes and JSON APIs
  • Rich Data Types: Support for arrays, nested objects, and complex data structures
  • Mongoid ORM: Rails-like interface with familiar ActiveRecord patterns

✅ Performance Benefits

  • Read-Heavy Workloads: Excellent performance for applications with more reads than writes
  • Horizontal Scaling: Can scale across multiple servers easily
  • In-Memory Processing: Fast queries with proper indexing
  • Aggregation Pipeline: Powerful data processing capabilities
  • Geospatial Queries: Built-in support for location-based features

⚠️ Considerations & Trade-offs

  • No Referential Integrity: Must handle relationships in application code
  • Learning Curve: Different query patterns and data modeling concepts
  • Transaction Trade-offs: Multi-document ACID transactions exist (MongoDB 4.0+), but they add overhead and are best used sparingly
  • Storage Overhead: Document storage can be less space-efficient than normalized tables
  • Deployment Complexity: Different hosting and management considerations
When to Choose MongoDB

Ideal Use Cases

  • Content Management Systems: Flexible content structures with varying fields
  • Real-time Analytics: Fast aggregation and reporting on large datasets
  • IoT Applications: Time-series data and sensor readings
  • Mobile Apps: JSON APIs and flexible data models
  • E-commerce Catalogs: Product data with varying attributes
  • Social Media Platforms: User-generated content with complex relationships
  • Logging Systems: High-volume write operations

When to Avoid MongoDB

Consider alternatives when you need:
  • Complex multi-document transactions
  • Strict referential integrity requirements
  • Heavy write workloads with complex relationships
  • Existing SQL-based reporting systems
  • Team with limited NoSQL experience
MongoDB Architecture Overview

Core Components

Mongod (Database Server)

The primary database process that handles data storage, queries, and data management. It manages the data files and provides the database interface.

Mongos (Query Router)

Acts as a query router for sharded clusters. It routes client requests to the appropriate shards and aggregates results.

Config Servers

Store metadata and configuration settings for sharded clusters. They maintain information about data distribution across shards.

Data Organization

Database Structure:
└── Database (e.g., myapp_development)
    ├── Collection (e.g., users)
    │   ├── Document 1
    │   ├── Document 2
    │   └── Document 3
    ├── Collection (e.g., products)
    │   ├── Document 1
    │   └── Document 2
    └── Collection (e.g., orders)
        ├── Document 1
        └── Document 2

Indexing Strategy

MongoDB uses B-tree indexes for efficient query performance. Indexes can be created on single fields, compound fields, or special types like text and geospatial indexes.

// Common Index Types
// Single field index
db.users.createIndex({ "email": 1 })

// Compound index
db.users.createIndex({ "email": 1, "created_at": -1 })

// Text index
db.products.createIndex({ "name": "text", "description": "text" })

// Geospatial index
db.locations.createIndex({ "location": "2dsphere" })
Mongoid ORM Overview

What is Mongoid?

Mongoid is the official MongoDB ODM (Object Document Mapper) for Ruby. It provides a Rails-like interface for working with MongoDB documents, similar to how ActiveRecord works with relational databases.

Key Features

  • ActiveRecord-like Interface: Familiar methods like find, where, create
  • Validations: Rails-style validations for document integrity
  • Associations: Support for embedded and referenced relationships
  • Callbacks: Lifecycle hooks like before_save, after_create
  • Scopes: Reusable query chains
  • Indexing: Declarative index definitions
  • Serialization: Custom serialization for complex data types

Basic Mongoid Model

class User
  include Mongoid::Document
  include Mongoid::Timestamps
  
  # Field definitions
  field :email, type: String
  field :name, type: String
  field :age, type: Integer
  field :active, type: Boolean, default: true
  
  # Validations
  validates :email, presence: true, uniqueness: true
  validates :name, presence: true
  
  # Indexes
  index({ email: 1 }, { unique: true })
  
  # Scopes
  scope :active, -> { where(active: true) }
  scope :adults, -> { where(:age.gte => 18) }
  
  # Instance methods
  def full_name
    "#{name} (#{email})"
  end
end

Mongoid vs ActiveRecord Comparison

Feature       | Mongoid                                         | ActiveRecord
Data Storage  | BSON documents                                  | Relational tables
Schema        | Dynamic fields                                  | Predefined columns
Relationships | Embedded + referenced                           | Foreign keys + joins
Queries       | MongoDB Query Language                          | SQL
Migrations    | Not needed                                      | Required for schema changes
Transactions  | Multi-document since MongoDB 4.0 (with caveats) | Full ACID support

Setup & Configuration

Setup Overview: This section covers MongoDB installation, Rails application configuration, and best practices for different environments. Follow the steps in order for a complete setup.
📋 Prerequisites & System Requirements

System Requirements

  • Operating System: macOS 10.14+, Ubuntu 18.04+, CentOS 7+, Windows 10+
  • Memory: Minimum 4GB RAM (8GB+ recommended for production)
  • Storage: SSD recommended for better performance
  • Ruby: Ruby 2.7+ with Rails 6.0+
  • Network: Port 27017 available for MongoDB

Required Tools

# Verify Ruby and Rails versions
ruby --version  # Should be 2.7+
rails --version # Should be 6.0+

# Check if MongoDB is already installed
mongod --version

# Verify network connectivity (nc is more commonly available than telnet)
nc -zv localhost 27017
🔧 MongoDB Installation

macOS Installation

Recommended Method: Using Homebrew for easy installation and updates.
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Add MongoDB tap
brew tap mongodb/brew

# Install MongoDB Community Edition
brew install mongodb-community

# Start MongoDB service
brew services start mongodb/brew/mongodb-community

# Verify installation (MongoDB 6.0 ships mongosh; the legacy mongo shell is gone)
mongod --version
mongosh --version

Ubuntu/Debian Installation

Note: Replace ‘focal’ with your Ubuntu codename (e.g., ‘bionic’ for Ubuntu 18.04). On Ubuntu 22.04+, apt-key is deprecated; place the key in /etc/apt/keyrings and reference it with a signed-by option instead.
# Import MongoDB public GPG key
wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | sudo apt-key add -

# Add MongoDB repository
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list

# Update package database
sudo apt-get update

# Install MongoDB
sudo apt-get install -y mongodb-org

# Start MongoDB service
sudo systemctl start mongod
sudo systemctl enable mongod

# Verify installation
mongod --version

Windows Installation

# Download MongoDB Community Server from:
# https://www.mongodb.com/try/download/community

# Extract to C:\mongodb
# Create data directory
mkdir C:\data\db

# Start MongoDB (from C:\mongodb\bin)
mongod --dbpath C:\data\db

# Or install as a Windows service
mongod --install --dbpath C:\data\db --logpath C:\mongodb\log\mongod.log

Docker Installation (Alternative)

# Pull MongoDB image
docker pull mongo:6.0

# Run MongoDB container
docker run -d \
  --name mongodb \
  -p 27017:27017 \
  -v mongodb_data:/data/db \
  mongo:6.0

# Connect to MongoDB
docker exec -it mongodb mongosh
🚀 Rails Application Setup

Step 1: Add Mongoid to Gemfile

# Gemfile
source 'https://rubygems.org'

# ... other gems ...

# MongoDB ODM -- pulls in the 'mongo' driver and 'bson' gems automatically
gem 'mongoid', '~> 8.0'

# Note: the legacy 'bson_ext' C-extension gem targets the old 1.x driver and
# is neither needed nor compatible with modern 'bson', which ships its own
# native extensions. GUI clients such as MongoDB Compass are standalone
# applications, not gems.

Step 2: Install Dependencies

# Install gems
bundle install

# Verify Mongoid installation
rails runner "puts Mongoid::VERSION"

Step 3: Generate Configuration

# Generate Mongoid configuration file
rails generate mongoid:config

# This creates:
# - config/mongoid.yml
# - config/application.rb (updates)

Step 4: Configure Database

# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_development
      options:
        server_selection_timeout: 5
        max_pool_size: 5
        min_pool_size: 1
        max_idle_time: 300
        wait_queue_timeout: 2500

test:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_test
      options:
        server_selection_timeout: 5
        max_pool_size: 5
        min_pool_size: 1

production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        # read/write concerns are nested hashes in mongoid.yml;
        # :secondary_preferred falls back to the primary if no secondary is up
        read:
          mode: :secondary_preferred
        write:
          w: 1
          j: true

Step 5: Update Application Configuration

# config/application.rb
require_relative "boot"
require "rails"
# Pick the frameworks you want:
require "action_controller/railtie"
require "action_mailer/railtie"
require "action_view/railtie"
require "action_cable/engine"
require "rails/test_unit/railtie"
# require "active_record/railtie" # Comment out if using MongoDB only

module MyApp
  class Application < Rails::Application
    # ... other configuration ...
    
    # Initialize Mongoid
    config.mongoid.logger = Rails.logger
  end
end
⚙️ Environment-Specific Configuration

Development Environment

Local Development Setup

# .env.development
MONGODB_URI=mongodb://localhost:27017/myapp_development
MONGODB_USERNAME=
MONGODB_PASSWORD=

# Enable query logging in development
# config/environments/development.rb
config.mongoid.logger = Rails.logger
config.mongoid.logger.level = Logger::DEBUG

Test Environment

Test Database Configuration

# config/environments/test.rb
config.mongoid.logger = Rails.logger
config.mongoid.logger.level = Logger::INFO

# Optional: clean the test database between examples
# (e.g., with the database_cleaner-mongoid gem). Note that
# mongodb-memory-server is an npm package for Node.js suites, not a gem.

Production Environment

Security Best Practices: Always use environment variables for production credentials.
# Production environment variables
MONGODB_URI=mongodb+srv://username:[email protected]/myapp_production
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem

# Enhanced production configuration
# config/mongoid.yml (production section)
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        # read/write concerns are nested hashes in mongoid.yml
        read:
          mode: :secondary_preferred
        write:
          w: 1
          j: true
        ssl: true
        ssl_ca_cert: <%= ENV['MONGODB_SSL_CA_CERT'] %>
        ssl_cert: <%= ENV['MONGODB_SSL_CERT'] %>
        ssl_key: <%= ENV['MONGODB_SSL_KEY'] %>
        retry_writes: true
        retry_reads: true
🔍 Verification & Testing

Database Connection Test

# Test MongoDB connection
rails runner "puts 'MongoDB connected!' if Mongoid.default_client.database"

# Test basic operations
rails runner "
  user = User.create(name: 'Test User', email: '[email protected]')
  puts 'User created: ' + user.name
  user.destroy
  puts 'Test completed successfully!'
"

Health Check Script

# lib/tasks/mongodb.rake
namespace :mongodb do
  desc "Check MongoDB connection and basic operations"
  task health_check: :environment do
    begin
      # Test connection
      client = Mongoid.default_client
      database = client.database
      
      puts "✅ MongoDB Connection: OK"
      puts "✅ Database: #{database.name}"
      
      # Test write operation
      test_collection = database.collection('health_check')
      test_collection.insert_one({ test: true, timestamp: Time.current })
      
      puts "✅ Write Operation: OK"
      
      # Test read operation
      result = test_collection.find({ test: true }).first
      puts "✅ Read Operation: OK"
      
      # Cleanup
      test_collection.delete_many({ test: true })
      puts "✅ Cleanup: OK"
      
      puts "\n🎉 MongoDB Health Check: PASSED"
      
    rescue => e
      puts "❌ MongoDB Health Check: FAILED"
      puts "Error: #{e.message}"
      exit 1
    end
  end
end

Performance Testing

# Test connection pool and performance
rails runner "
  start_time = Time.current
  
  # Test bulk operations
  100.times do |i|
    User.create(name: \"User #{i}\", email: \"user#{i}@example.com\")
  end
  
  end_time = Time.current
  puts \"Created 100 users in #{end_time - start_time} seconds\"
  
  # Test queries (criteria are lazy; call .to_a to actually run the query)
  start_time = Time.current
  users = User.where(:name => /User/).limit(10).to_a
  end_time = Time.current
  puts \"Queried #{users.size} users in #{end_time - start_time} seconds\"
  
  # Cleanup
  User.delete_all
  puts \"Cleanup completed\"
"
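Time.current reads the wall clock, which can jump (NTP adjustments) mid-benchmark. For elapsed-time measurements, a monotonic clock is steadier; a small plain-Ruby helper sketch:

```ruby
# Monotonic clocks are unaffected by wall-clock adjustments,
# which makes them the right tool for measuring elapsed time.
def measure
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

elapsed = measure { 100_000.times { |i| i * 2 } }
puts format("Loop took %.4f seconds", elapsed)
```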
🛠️ Troubleshooting

Common Issues & Solutions

Connection Refused

Problem: Cannot connect to MongoDB server

# Check if MongoDB is running
brew services list | grep mongodb
sudo systemctl status mongod

# Start MongoDB if not running
brew services start mongodb/brew/mongodb-community
sudo systemctl start mongod

# Check port availability
lsof -i :27017

Authentication Errors

Problem: Authentication failed

# Check connection string format
# Correct: mongodb://username:password@host:port/database
# Wrong: mongodb://host:port/database

# Test connection without authentication first
# Then gradually add authentication requirements

SSL/TLS Issues

Problem: SSL certificate validation errors

# For development, you can disable certificate verification
# (NOT recommended for production); the Ruby driver option is ssl_verify
options:
  ssl: true
  ssl_verify: false

# For production, ensure proper certificate paths
ssl_ca_cert: /path/to/ca-certificate.crt
ssl_cert: /path/to/client-certificate.crt
ssl_key: /path/to/client-key.pem

Debugging Commands

# Check MongoDB logs (Homebrew uses /usr/local/var/log on Intel Macs
# and /opt/homebrew/var/log on Apple Silicon)
tail -f /usr/local/var/log/mongodb/mongo.log
sudo tail -f /var/log/mongodb/mongod.log

# Check Rails logs for MongoDB queries
tail -f log/development.log | grep -i mongo

# Test MongoDB shell connection
mongosh
# or
mongo

# Check Mongoid configuration
rails runner "puts Mongoid.clients"
rails runner "puts Mongoid.default_client.database.name"

Models & Data Modeling

Data Modeling Overview: MongoDB's document-oriented approach requires different modeling strategies than relational databases. This section covers field types, relationships, indexing, and best practices for designing efficient data structures.
🏗️ Basic Model Structure

Creating Your First Model

# app/models/user.rb
class User
  include Mongoid::Document
  include Mongoid::Timestamps
  
  # Field definitions
  field :email, type: String
  field :name, type: String
  field :age, type: Integer
  field :active, type: Boolean, default: true
  field :preferences, type: Hash, default: {}
  field :tags, type: Array, default: []
  
  # Validations
  validates :email, presence: true, uniqueness: true
  validates :name, presence: true
  validates :age, numericality: { greater_than: 0, less_than: 150 }
  
  # Indexes
  index({ email: 1 }, { unique: true })
  index({ name: 1 })
  index({ active: 1, created_at: -1 })
  
  # Scopes
  scope :active, -> { where(active: true) }
  scope :adults, -> { where(:age.gte => 18) }
  scope :recent, -> { order(created_at: -1) }
  
  # Instance methods
  def full_name
    "#{name} (#{email})"
  end
  
  def adult?
    age >= 18
  end
end

Model Components Explained

  • Mongoid::Document: Makes the class a MongoDB document
  • Mongoid::Timestamps: Adds created_at and updated_at fields
  • field: Defines document fields with types and options
  • validates: Rails-style validations for data integrity
  • index: Creates database indexes for performance
  • scope: Reusable query chains

Field Types Reference

Mongoid Type | Ruby Type            | Description       | Example
String       | String               | Text data         | "John Doe"
Integer      | Integer              | Whole numbers     | 25
Float        | Float                | Decimal numbers   | 99.99
Boolean      | TrueClass/FalseClass | True/false values | true
Date         | Date                 | Date without time | Date.new(2024, 1, 15)
DateTime     | DateTime             | Date with time    | DateTime.current
Array        | Array                | List of values    | ["ruby", "rails"]
Hash         | Hash                 | Key-value pairs   | {theme: "dark"}
ObjectId     | BSON::ObjectId       | MongoDB ObjectId  | BSON::ObjectId.new

⚙️ Advanced Field Options

Field Configuration Options

class Product
  include Mongoid::Document
  
  # Basic field with type
  field :name, type: String
  
  # Field with default value
  field :price, type: Float, default: 0.0
  
  # Field with custom getter/setter
  field :slug, type: String
  
  def slug=(value)
    super(value&.downcase&.gsub(/\s+/, '-'))
  end
  
  # Field with localization
  field :description, type: String, localize: true
  
  # Field with custom serialization
  field :metadata, type: Hash, default: {}
  
  # Virtual attributes (not stored in database)
  attr_accessor :temporary_note
  
  # Custom field methods
  def display_price
    "$#{price.round(2)}"
  end
  
  def expensive?
    price > 100
  end
end
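The slug= setter above is plain string work, so it can be sanity-checked without Mongoid. A standalone sketch of the same normalization (the free-standing slugify name is just for illustration):

```ruby
# Same normalization as the slug= setter: lowercase, then collapse
# each run of whitespace into a single hyphen. Safe on nil input.
def slugify(value)
  value&.downcase&.gsub(/\s+/, '-')
end

puts slugify("Blue Suede  Shoes")  # => blue-suede-shoes
puts slugify(nil).inspect          # => nil
```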

Field Options Reference

Option    | Type    | Description                         | Example
type      | Class   | Data type for the field             | type: String
default   | Any     | Default value for new documents     | default: true
localize  | Boolean | Enable localization for the field   | localize: true
as        | Symbol  | Alias for the field name            | as: :title
overwrite | Boolean | Overwrite existing field definition | overwrite: true

Custom Field Types

# Custom field type for Money
# (modern Mongoid uses the mongoize/demongoize contract; the old
# Mongoid::Fields::Serializable module was removed after Mongoid 2.x)
class Money
  attr_reader :amount, :currency

  def initialize(amount = 0, currency = 'USD')
    @amount = amount.to_f
    @currency = currency
  end

  # Instance method: convert this object into a BSON-storable value
  def mongoize
    { 'amount' => @amount, 'currency' => @currency }
  end

  def to_s
    "#{@currency} #{@amount}"
  end

  class << self
    # Convert a stored value back into a Money instance
    def demongoize(object)
      object.nil? ? nil : Money.new(object['amount'], object['currency'])
    end

    # Convert any assigned value into its storable form
    def mongoize(object)
      object.is_a?(Money) ? object.mongoize : object
    end

    # Convert values used in query criteria
    def evolve(object)
      mongoize(object)
    end
  end
end

# Usage in model
class Product
  include Mongoid::Document
  
  field :price, type: Money, default: -> { Money.new(0) }
end
📊 Data Modeling Strategies

Embedded vs Referenced Documents

Embedded Documents (One-to-Few)

Use when the related data is small, doesn't change frequently, and is always accessed together with the parent.

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  
  # Embedded one-to-one
  embeds_one :profile
  
  # Embedded one-to-many
  embeds_many :addresses
end

class Profile
  include Mongoid::Document
  
  field :bio, type: String
  field :avatar_url, type: String
  field :location, type: String
  
  embedded_in :user
end

class Address
  include Mongoid::Document
  
  field :street, type: String
  field :city, type: String
  field :state, type: String
  field :zip_code, type: String
  field :primary, type: Boolean, default: false
  
  embedded_in :user
end
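With this layout, a user, their profile, and their addresses persist as a single document in the users collection. A plain-Ruby sketch of the stored shape (field values hypothetical):

```ruby
# Shape of one persisted user with an embedded profile and addresses.
user_doc = {
  "email" => "[email protected]",
  "name" => "Ada Lovelace",
  "profile" => { "bio" => "Analyst", "location" => "London" },
  "addresses" => [
    { "street" => "123 Main St", "city" => "New York", "primary" => true },
    { "street" => "9 Side Ave",  "city" => "Boston",   "primary" => false }
  ]
}

# The whole aggregate loads with one query; picking the primary address
# is plain array work rather than a join:
primary = user_doc["addresses"].find { |a| a["primary"] }
puts primary["city"]  # => New York
```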

Referenced Documents (One-to-Many/Many-to-Many)

Use when data is large, changes frequently, or needs to be shared across multiple documents.

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  
  # Referenced one-to-many
  has_many :posts
  
  # Referenced many-to-many
  has_and_belongs_to_many :roles
end

class Post
  include Mongoid::Document
  
  field :title, type: String
  field :content, type: String
  field :published_at, type: DateTime
  
  belongs_to :user
  
  validates :title, presence: true
end

class Role
  include Mongoid::Document
  
  field :name, type: String
  field :description, type: String
  
  has_and_belongs_to_many :users
end

When to Use Each Strategy

Strategy   | Use When                             | Advantages                            | Disadvantages
Embedded   | Small data, always accessed together | Fast reads, atomic updates            | Document size limits, no sharing
Referenced | Large data, shared across documents  | Flexible, reusable, smaller documents | Multiple queries, no atomic updates

🔍 Indexing Strategies

Index Types and Usage

class User
  include Mongoid::Document
  
  field :email, type: String
  field :username, type: String
  field :name, type: String
  field :status, type: String, default: "active"
  field :created_at, type: DateTime
  field :last_login_at, type: DateTime
  field :location, type: Array # [longitude, latitude]
  field :tags, type: Array, default: []
  field :metadata, type: Hash, default: {}
  
  # Single field indexes
  index({ email: 1 }, { unique: true })
  index({ username: 1 }, { unique: true })
  index({ status: 1 })
  index({ created_at: -1 })
  index({ last_login_at: -1 })
  
  # Compound indexes (order matters!)
  index({ status: 1, created_at: -1 })
  index({ email: 1, status: 1 })
  index({ status: 1, last_login_at: -1 })
  
  # Text search indexes
  index({ name: "text", bio: "text" })
  
  # Geospatial indexes
  index({ location: "2dsphere" })
  
  # Array indexes
  index({ tags: 1 })
  
  # Sparse indexes (skip documents that lack the field; assumes a phone field)
  index({ phone: 1 }, { sparse: true })
  
  # TTL indexes (auto-delete after time)
  index({ created_at: 1 }, { expire_after_seconds: 86400 }) # 24 hours
  
  # Partial indexes (only index documents matching a filter)
  index({ email: 1 }, { partial_filter_expression: { status: "active" } })
  
  # Background indexes for large collections
  index({ username: 1 }, { background: true })
end

Index Management

# Create all indexes for a model
User.create_indexes

# Create indexes for all models
Mongoid.create_indexes

# Drop all indexes for a model
User.remove_indexes

# Check existing indexes
User.collection.indexes.each do |index|
  puts "Index: #{index['name']}"
  puts "Keys: #{index['key']}"
  puts "Options: #{index['options']}"
  puts "---"
end

# Create indexes with specific options
User.collection.indexes.create_one(
  { email: 1, status: 1 },
  { 
    background: true,
    name: "email_status_idx"
  }
)

# Drop specific index
User.collection.indexes.drop_one("email_status_idx")

# Check index usage statistics
User.collection.aggregate([
  { "$indexStats" => {} }
])

Index Best Practices

  • Compound Index Order: Most selective field first
  • Covered Queries: Include all queried fields in index
  • Avoid Over-Indexing: Each index has write overhead
  • Background Indexing: Use for large collections
  • Monitor Usage: Remove unused indexes
  • TTL Indexes: For time-based data cleanup
  • Partial Indexes: For conditional queries
🎯 Scopes and Query Methods

Defining Scopes

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :age, type: Integer
  field :status, type: String
  field :created_at, type: DateTime
  
  # Basic scopes
  scope :active, -> { where(status: "active") }
  scope :inactive, -> { where(status: "inactive") }
  scope :adults, -> { where(:age.gte => 18) }
  scope :recent, -> { order(created_at: -1) }
  
  # Scopes with parameters
  scope :older_than, ->(age) { where(:age.gt => age) }
  scope :created_after, ->(date) { where(:created_at.gt => date) }
  
  # Chained scopes
  scope :active_adults, -> { active.adults }
  scope :recent_active, -> { active.recent }
  
  # Scopes with complex logic (escape user input before interpolating
  # into a regex, or metacharacters become part of the pattern)
  scope :search, ->(query) {
    any_of(
      { name: /#{Regexp.escape(query)}/i },
      { email: /#{Regexp.escape(query)}/i }
    )
  }
end

# Usage
User.active.adults.recent.limit(10)
User.search("john").older_than(25)
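One caveat when interpolating user input into a regex, as a search scope does: metacharacters in the query become part of the pattern unless escaped. Regexp.escape neutralizes them; a plain-Ruby demonstration:

```ruby
# Without escaping, "." in user input matches any character.
query   = "j.hn"
raw     = /#{query}/i
escaped = /#{Regexp.escape(query)}/i

puts raw.match?("john")      # => true  (the "." matched the "o")
puts escaped.match?("john")  # => false
puts escaped.match?("j.hn")  # => true  (only the literal string matches)
```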

Custom Query Methods

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :age, type: Integer
  field :location, type: Array
  
  # Class methods for complex queries
  def self.find_by_email_domain(domain)
    where(email: /@#{Regexp.escape(domain)}$/)
  end
  
  def self.find_nearby(lat, lng, radius_km = 10)
    where(
      location: {
        "$near" => {
          "$geometry" => {
            "type" => "Point",
            "coordinates" => [lng, lat]
          },
          "$maxDistance" => radius_km * 1000
        }
      }
    )
  end
  
  def self.age_distribution
    collection.aggregate([
      { "$group" => {
        "_id" => "$age",
        "count" => { "$sum" => 1 }
      }},
      { "$sort" => { "_id" => 1 } }
    ])
  end
  
  # Instance methods
  def display_name
    name.present? ? name : email
  end
  
  def age_group
    case age
    when 0..17 then "minor"
    when 18..25 then "young_adult"
    when 26..64 then "adult"
    else "senior"
    end
  end
end
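The age_group method is pure Ruby, so its range logic can be exercised standalone (bucket names as in the model above):

```ruby
# Same bucketing as User#age_group, as a free-standing method.
def age_group(age)
  case age
  when 0..17  then "minor"
  when 18..25 then "young_adult"
  when 26..64 then "adult"
  else "senior"
  end
end

puts age_group(16)  # => minor
puts age_group(18)  # => young_adult
puts age_group(40)  # => adult
puts age_group(70)  # => senior
```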
🔄 Callbacks and Lifecycle

Model Callbacks

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :slug, type: String
  field :last_login_at, type: DateTime
  
  # Before callbacks
  before_create :generate_slug
  before_save :normalize_email
  before_update :track_changes
  
  # After callbacks
  after_create :send_welcome_email
  after_save :update_search_index
  after_destroy :cleanup_related_data
  
  # Around callbacks
  around_save :log_operation_time
  
  private
  
  def generate_slug
    self.slug = name.parameterize
  end
  
  def normalize_email
    self.email = email.downcase.strip
  end
  
  def track_changes
    Rails.logger.info "User #{id} changed: #{changes.keys.join(', ')}"
  end
  
  def send_welcome_email
    UserMailer.welcome(self).deliver_later
  end
  
  def update_search_index
    SearchIndexJob.perform_later(self)
  end
  
  def cleanup_related_data
    Post.where(user_id: id).destroy_all
  end
  
  def log_operation_time
    start_time = Time.current
    yield
    Rails.logger.info "Operation took #{Time.current - start_time} seconds"
  end
end

Callback Types

Callback       | Triggered When             | Common Uses
before_create  | Before document is created | Generate slugs, set defaults
after_create   | After document is created  | Send notifications, create related data
before_save    | Before any save operation  | Normalize data, validate custom rules
after_save     | After any save operation   | Update search indexes, cache invalidation
before_destroy | Before document is deleted | Validate deletion, backup data
after_destroy  | After document is deleted  | Cleanup related data, audit logging

📝 Real-World Examples

E-commerce Product Model

class Product
  include Mongoid::Document
  include Mongoid::Timestamps
  
  # Basic fields
  field :name, type: String
  field :description, type: String
  field :sku, type: String
  field :price, type: Float
  field :cost, type: Float
  field :status, type: String, default: "draft"
  field :category, type: String
  field :tags, type: Array, default: []
  field :inventory, type: Integer, default: 0
  field :weight, type: Float
  field :dimensions, type: Hash, default: {}
  
  # Complex fields
  field :metadata, type: Hash, default: {}
  field :seo_data, type: Hash, default: {}
  field :pricing_tiers, type: Array, default: []
  
  # Embedded documents
  embeds_many :variants
  embeds_many :images
  embeds_many :reviews
  
  # Referenced associations
  belongs_to :brand
  has_many :order_items
  
  # Validations
  validates :name, presence: true
  validates :sku, presence: true, uniqueness: true
  validates :price, numericality: { greater_than: 0 }
  validates :inventory, numericality: { greater_than_or_equal_to: 0 }
  
  # Indexes
  index({ sku: 1 }, { unique: true })
  index({ name: "text", description: "text" })
  index({ category: 1, status: 1 })
  index({ price: 1 })
  index({ tags: 1 })
  index({ brand_id: 1 }) # brand is referenced, so index the foreign key
  
  # Scopes
  scope :active, -> { where(status: "active") }
  scope :in_stock, -> { where(:inventory.gt => 0) }
  scope :by_category, ->(category) { where(category: category) }
  scope :price_range, ->(min, max) { where(:price.gte => min, :price.lte => max) }
  scope :featured, -> { where("metadata.featured" => true) }
  
  # Callbacks
  before_save :update_seo_slug
  after_save :update_search_index
  
  # Instance methods
  def available?
    active? && inventory > 0
  end
  
  def profit_margin
    return 0 if cost.to_f.zero? # guards against nil as well as 0
    ((price - cost) / cost * 100).round(2)
  end
  
  def update_inventory(quantity)
    inc(inventory: quantity) # Mongoid's atomic $inc
  end
  
  def primary_image
    # find_by raises when nothing matches, so use where(...).first
    images.where(primary: true).first || images.first
  end
  
  def average_rating
    return 0 if reviews.empty?
    reviews.sum { |r| r.rating.to_f } / reviews.size
  end
  
  private
  
  def update_seo_slug
    self.seo_data = seo_data.merge(
      slug: name.parameterize,
      title: "#{name} - #{brand&.name}",
      description: description.truncate(160)
    )
  end
  
  def update_search_index
    SearchIndexJob.perform_later(self)
  end
end
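The profit_margin calculation is plain arithmetic: margin relative to cost, not to price. A standalone check with hypothetical numbers:

```ruby
# Margin relative to cost: (price - cost) / cost * 100.
def profit_margin(price, cost)
  return 0 if cost.to_f.zero?
  ((price - cost) / cost * 100).round(2)
end

puts profit_margin(150.0, 100.0)  # => 50.0  (sells for 1.5x cost)
puts profit_margin(9.99, 0.0)     # => 0     (guard against division by zero)
```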

Social Media Post Model

class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  
  # Basic fields
  field :content, type: String
  field :visibility, type: String, default: "public"
  field :status, type: String, default: "published"
  field :location, type: Array # [longitude, latitude]
  field :language, type: String, default: "en"
  
  # Complex fields
  field :media_urls, type: Array, default: []
  field :hashtags, type: Array, default: []
  field :mentions, type: Array, default: []
  field :metadata, type: Hash, default: {}
  
  # Embedded documents
  embeds_many :comments
  embeds_many :reactions
  
  # Referenced associations
  belongs_to :user
  has_many :shares
  # (hashtags live in the Array field above; a has_and_belongs_to_many
  # association named :hashtags would collide with that field)
  
  # Validations
  validates :content, presence: true, length: { maximum: 280 }
  validates :visibility, inclusion: { in: %w[public private friends] }
  
  # Indexes
  index({ user_id: 1, created_at: -1 })
  index({ visibility: 1, created_at: -1 })
  index({ location: "2dsphere" })
  index({ hashtags: 1 })
  
  # Scopes
  scope :public_posts, -> { where(visibility: "public") }
  scope :recent, -> { order(created_at: -1) }
  scope :by_user, ->(user) { where(user: user) }
  scope :with_media, -> { where(:media_urls.ne => []) }
  scope :trending, -> { where(:created_at.gte => 24.hours.ago) }
  
  # Callbacks
  before_save :extract_hashtags_and_mentions
  after_create :notify_followers
  
  # Instance methods
  def like_count
    reactions.where(type: "like").count
  end
  
  def comment_count
    comments.count
  end
  
  def share_count
    shares.count
  end
  
  def engagement_rate
    total_engagement = like_count + comment_count + share_count
    user.followers_count > 0 ? (total_engagement.to_f / user.followers_count * 100).round(2) : 0
  end
  
  def can_be_viewed_by?(viewer)
    return true if visibility == "public"
    return true if user == viewer
    return true if visibility == "friends" && user.friends.include?(viewer)
    false
  end
  
  private
  
  def extract_hashtags_and_mentions
    self.hashtags = content.scan(/#\w+/).map(&:downcase)
    self.mentions = content.scan(/@\w+/).map(&:downcase)
  end
  
  def notify_followers
    NotificationJob.perform_later(self)
  end
end

Queries & Aggregation

Query Overview: MongoDB provides powerful querying capabilities through Mongoid. This section covers basic queries, complex conditions, aggregation pipelines, and performance optimization techniques for efficient data retrieval.
🔍 Basic Query Operations

Finding Documents

# Find all documents
users = User.all

# Find by ID (ObjectId)
user = User.find("507f1f77bcf86cd799439011")
user = User.find(BSON::ObjectId.from_string("507f1f77bcf86cd799439011"))

# Find by field value
user = User.find_by(email: "[email protected]")
users = User.where(status: "active")

# Find first/last documents
first_user = User.first
last_user = User.last
first_active = User.where(status: "active").first

# Find with multiple conditions
active_adults = User.where(:age.gte => 18, status: "active")
recent_users = User.where(:created_at.gte => 1.week.ago)
premium_users = User.where(:subscription_type.in => ["premium", "enterprise"])

# Find with OR conditions
users = User.any_of(
  { email: /@gmail\.com$/ },
  { email: /@yahoo\.com$/ }
)

# Find with NOT conditions
non_gmail_users = User.where(:email.nin => [/@gmail\.com$/])

# Find with EXISTS conditions
users_with_phone = User.where(:phone.exists => true)
users_without_bio = User.where(:bio.exists => false)

Query Operators Reference

Operator | Description           | Example                                     | MongoDB Equivalent
eq       | Equal to              | where(status: "active")                     | $eq
ne       | Not equal to          | where(:status.ne => "inactive")             | $ne
gt       | Greater than          | where(:age.gt => 18)                        | $gt
gte      | Greater than or equal | where(:age.gte => 18)                       | $gte
lt       | Less than             | where(:age.lt => 65)                        | $lt
lte      | Less than or equal    | where(:age.lte => 65)                       | $lte
in       | In array              | where(:status.in => ["active", "pending"])  | $in
nin      | Not in array          | where(:status.nin => ["deleted", "banned"]) | $nin
exists   | Field exists          | where(:phone.exists => true)                | $exists
regex    | Regular expression    | where(:email => /@gmail\.com$/)             | $regex

Limiting and Sorting

# Limit results
recent_users = User.order(created_at: -1).limit(10)

# Skip documents (pagination)
page_2_users = User.order(created_at: -1).skip(20).limit(10)

# Sort by multiple fields
users = User.order(:status.asc, :created_at.desc)

# Sort by embedded fields
users = User.order("profile.age" => -1)

# Distinct values
unique_statuses = User.distinct(:status)
unique_domains = User.distinct(:email).map { |email| email.split('@').last }.uniq

# Count documents
total_users = User.count
active_count = User.where(status: "active").count
recent_count = User.where(:created_at.gte => 1.day.ago).count
🔗 Advanced Query Conditions

Complex Query Operators

# Array operators
users_with_tags = User.where(:tags.all => ["ruby", "rails"])
users_with_any_tag = User.where(:tags.in => ["ruby", "rails"])
users_without_tags = User.where(:tags.nin => ["php", "java"])

# Array element matching
users_with_first_tag = User.where("tags.0" => "ruby")
users_with_size = User.where(:tags.with_size => 3)

# Object/Embedded document queries
users_with_city = User.where("address.city" => "New York")
users_with_coordinates = User.where("location.0" => { "$gte" => -74, "$lte" => -73 })

# Nested object queries
premium_users = User.where("subscription.plan" => "premium")
active_premium = User.where("subscription.plan" => "premium", "subscription.status" => "active")

# Date range queries
today_users = User.where(:created_at.gte => Date.current.beginning_of_day)
this_week_users = User.where(:created_at.gte => Date.current.beginning_of_week)
last_month_users = User.where(:created_at.gte => 1.month.ago)

# Text search (requires a text index)
search_results = User.where("$text" => { "$search" => "john developer" })

# Sorting by relevance requires the $meta projection at the driver level
search_with_score = User.collection.find(
  { "$text" => { "$search" => "ruby rails" } },
  projection: { score: { "$meta" => "textScore" } }
).sort(score: { "$meta" => "textScore" })

Logical Operators

# AND conditions (default)
active_adults = User.where(:status => "active", :age.gte => 18)

# OR conditions
gmail_or_yahoo = User.any_of(
  { email: /@gmail\.com$/ },
  { email: /@yahoo\.com$/ }
)

# NOR conditions (neither condition true)
not_gmail_not_yahoo = User.nor(
  { email: /@gmail\.com$/ },
  { email: /@yahoo\.com$/ }
)

# Complex logical combinations
complex_query = User.where(:status => "active").any_of(
  { :age.gte => 18, :verified => true },
  { :age.gte => 21, :verified => false }
).nor(
  { email: /@spam\.com$/ }
)

Geospatial Queries

# Near query (requires 2dsphere index)
nearby_users = User.where(
  location: {
    "$near" => {
      "$geometry" => {
        "type" => "Point",
        "coordinates" => [-73.935242, 40.730610] # NYC coordinates
      },
      "$maxDistance" => 5000 # 5km radius
    }
  }
)

# Within polygon
polygon_users = User.where(
  location: {
    "$geoWithin" => {
      "$geometry" => {
        "type" => "Polygon",
        "coordinates" => [[
          [-74, 40], [-74, 41], [-73, 41], [-73, 40], [-74, 40]
        ]]
      }
    }
  }
)

# Intersects with polygon
intersecting_users = User.where(
  location: {
    "$geoIntersects" => {
      "$geometry" => {
        "type" => "Polygon",
        "coordinates" => [[
          [-74, 40], [-74, 41], [-73, 41], [-73, 40], [-74, 40]
        ]]
      }
    }
  }
)
📊 Aggregation Pipeline

Basic Aggregation

# Simple aggregation
result = User.collection.aggregate([
  { "$match" => { status: "active" } },
  { "$group" => {
    "_id" => "$age_group",
    "count" => { "$sum" => 1 },
    "avg_age" => { "$avg" => "$age" }
  }},
  { "$sort" => { "_id" => 1 } }
])

# Mongoid criteria do not accept a raw pipeline; go through the collection
# and express the filter as a $match stage
result = User.collection.aggregate([
  { "$match" => { "status" => "active" } },
  { "$group" => {
    "_id" => "$age_group",
    "count" => { "$sum" => 1 }
  }}
])

Aggregation Stages

Stage      | Description         | Example
$match     | Filter documents    | { "$match" => { status: "active" } }
$group     | Group by field      | { "$group" => { "_id" => "$category", "count" => { "$sum" => 1 } } }
$sort      | Sort results        | { "$sort" => { "count" => -1 } }
$limit     | Limit results       | { "$limit" => 10 }
$skip      | Skip documents      | { "$skip" => 20 }
$project   | Select fields       | { "$project" => { name: 1, email: 1, _id: 0 } }
$lookup    | Join collections    | { "$lookup" => { from: "posts", localField: "_id", foreignField: "user_id", as: "posts" } }
$unwind    | Deconstruct arrays  | { "$unwind" => "$tags" }
$addFields | Add computed fields | { "$addFields" => { "full_name" => { "$concat" => ["$first_name", " ", "$last_name"] } } }

Advanced Aggregation Examples

# User statistics by age group
age_stats = User.collection.aggregate([
  { "$addFields" => {
    "age_group" => {
      "$switch" => {
        "branches" => [
          { "case" => { "$lt" => ["$age", 18] }, "then" => "minor" },
          { "case" => { "$lt" => ["$age", 25] }, "then" => "young_adult" },
          { "case" => { "$lt" => ["$age", 65] }, "then" => "adult" }
        ],
        "default" => "senior"
      }
    }
  }},
  { "$group" => {
    "_id" => "$age_group",
    "count" => { "$sum" => 1 },
    "avg_age" => { "$avg" => "$age" },
    "min_age" => { "$min" => "$age" },
    "max_age" => { "$max" => "$age" }
  }},
  { "$sort" => { "count" => -1 } }
])

# User activity timeline
activity_timeline = User.collection.aggregate([
  { "$match" => { "created_at" => { "$gte" => 30.days.ago } } },
  { "$group" => {
    "_id" => {
      "year" => { "$year" => "$created_at" },
      "month" => { "$month" => "$created_at" },
      "day" => { "$dayOfMonth" => "$created_at" }
    },
    "new_users" => { "$sum" => 1 }
  }},
  { "$sort" => { "_id" => 1 } }
])

# Top users by post count
top_users = User.collection.aggregate([
  { "$lookup" => {
    "from" => "posts",
    "localField" => "_id",
    "foreignField" => "user_id",
    "as" => "posts"
  }},
  { "$addFields" => {
    "post_count" => { "$size" => "$posts" }
  }},
  { "$match" => { "post_count" => { "$gt" => 0 } } },
  { "$sort" => { "post_count" => -1 } },
  { "$limit" => 10 },
  { "$project" => {
    "name" => 1,
    "email" => 1,
    "post_count" => 1,
    "_id" => 0
  }}
])

Aggregation Operators

Category    | Operators                                 | Description
Arithmetic  | $add, $subtract, $multiply, $divide, $mod | Mathematical operations
Comparison  | $eq, $ne, $gt, $gte, $lt, $lte            | Value comparisons
Logical     | $and, $or, $not, $nor                     | Logical operations
String      | $concat, $substr, $toLower, $toUpper      | String manipulation
Date        | $year, $month, $dayOfMonth, $hour         | Date/time operations
Array       | $size, $push, $addToSet, $first, $last    | Array operations
Conditional | $cond, $switch, $ifNull                   | Conditional logic
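Several of these operators can be combined inside a single stage. A small illustrative sketch — the field names (first_name, last_name, created_at) are assumptions, not part of any schema defined in this guide:

```ruby
# Pipeline mixing string ($concat, $toUpper) and date ($year) operators.
pipeline = [
  { "$addFields" => {
    "display_name" => { "$toUpper" => { "$concat" => ["$first_name", " ", "$last_name"] } },
    "signup_year"  => { "$year" => "$created_at" }
  }},
  { "$project" => { "display_name" => 1, "signup_year" => 1, "_id" => 0 } }
]

# Executed against the model's collection handle:
#   User.collection.aggregate(pipeline).to_a
```

Building the pipeline as a plain Ruby array keeps it easy to compose and test before handing it to the driver.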
⚡ Performance Optimization

Query Performance Best Practices

# Use indexes for frequently queried fields
class User
  include Mongoid::Document
  
  field :email, type: String
  field :status, type: String
  field :created_at, type: DateTime
  
  # Create compound indexes for common query patterns
  index({ email: 1 }, { unique: true })
  index({ status: 1, created_at: -1 })
  index({ "profile.age" => 1, status: 1 })
  
  # Text search index
  index({ name: "text", bio: "text" })
  
  # Geospatial index
  index({ location: "2dsphere" })
end

# Use covered queries (every field the query filters on and returns is in the index)
# Good: status and created_at are both covered by the { status: 1, created_at: -1 } index
User.where(status: "active").only(:status, :created_at)

# Avoid: :name is not in the index, so MongoDB must fetch each document
User.where(status: "active").only(:status, :created_at, :name)

# Use projection to limit returned fields
users = User.where(status: "active").only(:name, :email)

# Use limit for large result sets
recent_users = User.order(created_at: -1).limit(100)

# Use skip with limit for pagination
page_users = User.order(created_at: -1).skip(offset).limit(per_page)

Query Analysis and Optimization

# Analyze query performance; pass a verbosity to include executionStats (Mongoid 7.5+)
explanation = User.where(status: "active").explain(:execution_stats)

# Check if the query uses an index (plan shape varies; a full COLLSCAN has no indexName)
puts "Uses index: #{explanation.dig('queryPlanner', 'winningPlan', 'inputStage', 'indexName')}"

# Check execution time
puts "Execution time: #{explanation['executionStats']['executionTimeMillis']}ms"

# Check documents examined vs returned
puts "Docs examined: #{explanation['executionStats']['totalDocsExamined']}"
puts "Docs returned: #{explanation['executionStats']['nReturned']}"

# Profile slow queries
Mongoid.default_client.database.command(profile: 2, slowms: 100)

# Monitor query performance
class QueryLogger
  def self.log(query, duration)
    Rails.logger.info "Query: #{query} took #{duration}ms"
  end
end

# Use in models (Mongoid has no #to_sql, so log a caller-supplied label)
class User
  include Mongoid::Document
  
  def self.with_logging(label = name)
    start_time = Time.current
    result = yield
    duration = ((Time.current - start_time) * 1000).round
    QueryLogger.log(label, duration)
    result
  end
end

Bulk Operations

# Bulk insert
users_data = [
  { name: "John", email: "[email protected]" },
  { name: "Jane", email: "[email protected]" },
  { name: "Bob", email: "[email protected]" }
]

User.collection.insert_many(users_data)

# Bulk update
User.collection.update_many(
  { status: "pending" },
  { "$set" => { status: "active", updated_at: Time.current } }
)

# Upsert (update_one with upsert: true inserts the document when no match exists)
User.collection.update_one(
  { email: "[email protected]" },
  { "$set" => { name: "John Updated", updated_at: Time.current } },
  upsert: true
)

# Bulk delete
User.collection.delete_many({ status: "inactive" })

# Using Mongoid for bulk operations
User.where(status: "pending").update_all(status: "active")

# Batch processing (Mongoid has no find_in_batches; stream results in
# driver-controlled batches instead)
User.where(status: "pending").no_timeout.batch_size(1000).each do |user|
  user.update(status: "active")
end
🎯 Real-World Query Examples

E-commerce Analytics

# Sales analytics
sales_analytics = Order.collection.aggregate([
  { "$match" => { 
    created_at: { "$gte" => 30.days.ago },
    status: "completed"
  }},
  { "$group" => {
    "_id" => {
      "year" => { "$year" => "$created_at" },
      "month" => { "$month" => "$created_at" },
      "day" => { "$dayOfMonth" => "$created_at" }
    },
    "total_sales" => { "$sum" => "$total_amount" },
    "order_count" => { "$sum" => 1 },
    "avg_order_value" => { "$avg" => "$total_amount" }
  }},
  { "$sort" => { "_id" => 1 } }
])

# Product performance
product_performance = Product.collection.aggregate([
  { "$lookup" => {
    "from" => "order_items",
    "localField" => "_id",
    "foreignField" => "product_id",
    "as" => "orders"
  }},
  { "$addFields" => {
    "total_sold" => { "$sum" => "$orders.quantity" },
    "total_revenue" => { "$sum" => { "$multiply" => ["$orders.quantity", "$orders.price"] } }
  }},
  { "$match" => { "total_sold" => { "$gt" => 0 } } },
  { "$sort" => { "total_revenue" => -1 } },
  { "$limit" => 10 }
])

# Customer segmentation
customer_segments = User.collection.aggregate([
  { "$lookup" => {
    "from" => "orders",
    "localField" => "_id",
    "foreignField" => "user_id",
    "as" => "orders"
  }},
  { "$addFields" => {
    "total_spent" => { "$sum" => "$orders.total_amount" },
    "order_count" => { "$size" => "$orders" },
    "avg_order_value" => { "$avg" => "$orders.total_amount" }
  }},
  { "$addFields" => {
    "segment" => {
      "$switch" => {
        "branches" => [
          { "case" => { "$gte" => ["$total_spent", 1000] }, "then" => "premium" },
          { "case" => { "$gte" => ["$total_spent", 500] }, "then" => "regular" }
        ],
        "default" => "new"
      }
    }
  }},
  { "$group" => {
    "_id" => "$segment",
    "count" => { "$sum" => 1 },
    "avg_spent" => { "$avg" => "$total_spent" }
  }}
])

Social Media Analytics

# Engagement analytics
engagement_analytics = Post.collection.aggregate([
  { "$match" => { 
    created_at: { "$gte" => 7.days.ago },
    visibility: "public"
  }},
  { "$addFields" => {
    "total_engagement" => { 
      "$add" => [
        { "$size" => "$reactions" },
        { "$size" => "$comments" },
        { "$ifNull" => ["$share_count", 0] } # shares are referenced; this field may be absent
      ]
    }
  }},
  { "$group" => {
    "_id" => {
      "year" => { "$year" => "$created_at" },
      "month" => { "$month" => "$created_at" },
      "day" => { "$dayOfMonth" => "$created_at" }
    },
    "total_posts" => { "$sum" => 1 },
    "total_engagement" => { "$sum" => "$total_engagement" },
    "avg_engagement" => { "$avg" => "$total_engagement" }
  }},
  { "$sort" => { "_id" => 1 } }
])

# Trending hashtags
trending_hashtags = Post.collection.aggregate([
  { "$match" => { 
    created_at: { "$gte" => 24.hours.ago },
    visibility: "public"
  }},
  { "$unwind" => "$hashtags" },
  { "$group" => {
    "_id" => "$hashtags",
    "count" => { "$sum" => 1 },
    "total_engagement" => { "$sum" => { "$add" => [
      { "$size" => "$reactions" },
      { "$size" => "$comments" }
    ]}}
  }},
  { "$sort" => { "count" => -1 } },
  { "$limit" => 10 }
])

# User activity timeline
user_activity = User.collection.aggregate([
  { "$lookup" => {
    "from" => "posts",
    "localField" => "_id",
    "foreignField" => "user_id",
    "as" => "posts"
  }},
  { "$addFields" => {
    "post_count" => { "$size" => "$posts" },
    "last_post_date" => { "$max" => "$posts.created_at" }
  }},
  { "$addFields" => {
    "activity_level" => {
      "$switch" => {
        "branches" => [
          { "case" => { "$gte" => ["$post_count", 10] }, "then" => "high" },
          { "case" => { "$gte" => ["$post_count", 5] }, "then" => "medium" }
        ],
        "default" => "low"
      }
    }
  }},
  { "$group" => {
    "_id" => "$activity_level",
    "count" => { "$sum" => 1 },
    "avg_posts" => { "$avg" => "$post_count" }
  }}
])

Content Management Queries

# Content search with relevance
content_search = Article.collection.aggregate([
  { "$match" => { 
    "$text" => { "$search" => "ruby rails mongodb" },
    status: "published"
  }},
  { "$addFields" => {
    "relevance_score" => { "$meta" => "textScore" }
  }},
  { "$sort" => { "relevance_score" => -1 } },
  { "$limit" => 20 }
])

# Category content distribution
category_distribution = Article.collection.aggregate([
  { "$match" => { status: "published" } },
  { "$group" => {
    "_id" => "$category",
    "article_count" => { "$sum" => 1 },
    "total_views" => { "$sum" => "$view_count" },
    "avg_rating" => { "$avg" => "$rating" }
  }},
  { "$sort" => { "article_count" => -1 } }
])

# Author performance
author_performance = User.collection.aggregate([
  { "$lookup" => {
    "from" => "articles",
    "localField" => "_id",
    "foreignField" => "author_id",
    "as" => "articles"
  }},
  { "$addFields" => {
    "published_articles" => {
      "$size" => {
        "$filter" => {
          "input" => "$articles",
          "cond" => { "$eq" => ["$$this.status", "published"] }
        }
      }
    },
    "total_views" => { "$sum" => "$articles.view_count" },
    "avg_rating" => { "$avg" => "$articles.rating" }
  }},
  { "$match" => { "published_articles" => { "$gt" => 0 } } },
  { "$sort" => { "total_views" => -1 } },
  { "$limit" => 10 }
])

Associations & Relationships

Relationships Overview: MongoDB supports both embedded and referenced relationships. This section covers all relationship types, when to use each approach, and best practices for designing efficient data structures in MongoDB.
📦 Embedded Relationships

One-to-One Embedding

What is One-to-One Embedding? This is when you embed a single document within another document. It's perfect for data that belongs exclusively to the parent and is always accessed together. Think of it like a user having one profile or one set of preferences.

When to Use: Use one-to-one embedding when the embedded data is small, doesn't change frequently, and is always accessed together with the parent document. Examples include user profiles, preferences, settings, or configuration data.

Benefits: Fast reads (no additional queries), atomic updates, and simple data access. The embedded document is stored directly within the parent document, so there's no need for joins.

Limitations: The embedded document cannot be shared between multiple parents, and the parent document size increases. MongoDB has a 16MB document size limit.

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  
  embeds_one :profile
  embeds_one :preferences
  
  validates :email, presence: true, uniqueness: true
end

class Profile
  include Mongoid::Document
  
  field :bio, type: String
  field :avatar_url, type: String
  field :location, type: String
  field :website, type: String
  field :birth_date, type: Date
  
  embedded_in :user
  
  validates :bio, length: { maximum: 500 }
  
  def display_location
    location.presence || "Not specified"
  end
end

class Preferences
  include Mongoid::Document
  
  field :theme, type: String, default: "light"
  field :language, type: String, default: "en"
  field :notifications, type: Hash, default: {}
  field :privacy_settings, type: Hash, default: {}
  
  embedded_in :user
  
  def notification_enabled?(type)
    notifications[type.to_s] == true
  end
end

# Usage examples
user = User.create(email: "[email protected]", name: "John Doe")
user.create_profile(
  bio: "Ruby developer passionate about clean code",
  location: "New York, NY",
  website: "https://johndoe.dev"
)
user.create_preferences(
  theme: "dark",
  language: "en",
  notifications: { email: true, push: false, sms: true }
)

# Accessing embedded documents
user.profile.bio # => "Ruby developer passionate about clean code"
user.preferences.notification_enabled?(:email) # => true
user.profile.display_location # => "New York, NY"

One-to-Many Embedding

What is One-to-Many Embedding? This allows you to embed multiple documents within a parent document. It's like having a collection of related items that belong exclusively to one parent. Think of a user having multiple addresses, phone numbers, or social media accounts.

When to Use: Use one-to-many embedding when you have a collection of related data that:

  • Always belongs to one parent (never shared)
  • Is relatively small in size
  • Is accessed together with the parent
  • Doesn't change frequently
Common examples include addresses, phone numbers, social accounts, or order items.

Benefits: Atomic updates (all embedded documents are updated together), fast reads (no additional queries), and simple data access. You can also query embedded documents directly using MongoDB's dot notation.

Considerations: Be mindful of document size limits (16MB), and remember that embedded documents cannot be shared between parents. For large collections or frequently changing data, consider using referenced relationships instead.

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  
  embeds_many :addresses
  embeds_many :phone_numbers
  embeds_many :social_accounts
  
  validates :email, presence: true, uniqueness: true
end

class Address
  include Mongoid::Document
  
  field :street, type: String
  field :city, type: String
  field :state, type: String
  field :zip_code, type: String
  field :country, type: String, default: "USA"
  field :primary, type: Boolean, default: false
  field :address_type, type: String, default: "home" # home, work, billing
  
  embedded_in :user
  
  validates :street, :city, :state, presence: true
  
  scope :primary, -> { where(primary: true) }
  scope :by_type, ->(type) { where(address_type: type) }
  
  def full_address
    [street, city, state, zip_code, country].compact.join(", ")
  end
  
  def make_primary!
    user.addresses.update_all(primary: false)
    update!(primary: true)
  end
end

class PhoneNumber
  include Mongoid::Document
  
  field :number, type: String
  field :type, type: String, default: "mobile" # mobile, home, work
  field :primary, type: Boolean, default: false
  field :verified, type: Boolean, default: false
  
  embedded_in :user
  
  validates :number, presence: true, format: { with: /\A\+?[\d\s\-\(\)]+\z/ }
  
  scope :verified, -> { where(verified: true) }
  scope :primary, -> { where(primary: true) }
end

class SocialAccount
  include Mongoid::Document
  
  field :platform, type: String # twitter, linkedin, github
  field :username, type: String
  field :url, type: String
  field :verified, type: Boolean, default: false
  
  embedded_in :user
  
  validates :platform, :username, presence: true
  
  scope :verified, -> { where(verified: true) }
  scope :by_platform, ->(platform) { where(platform: platform) }
end

# Usage examples
user = User.create(email: "[email protected]", name: "Jane Smith")

# Add addresses
user.addresses.create(
  street: "123 Main St",
  city: "New York",
  state: "NY",
  zip_code: "10001",
  primary: true
)

user.addresses.create(
  street: "456 Work Ave",
  city: "New York",
  state: "NY",
  zip_code: "10002",
  address_type: "work"
)

# Add phone numbers
user.phone_numbers.create(
  number: "+1-555-123-4567",
  type: "mobile",
  primary: true,
  verified: true
)

# Add social accounts
user.social_accounts.create(
  platform: "github",
  username: "janesmith",
  url: "https://github.com/janesmith",
  verified: true
)

# Querying embedded documents
primary_address = user.addresses.primary.first
verified_phones = user.phone_numbers.verified
github_account = user.social_accounts.by_platform("github").first

Embedded Document Best Practices

When to Use Embedded Documents: Embedded documents are perfect for data that has a strong parent-child relationship and is always accessed together. They provide excellent performance for read operations since all data is retrieved in a single query.

Size Considerations: Keep embedded documents small and manageable. MongoDB has a 16MB document size limit, so avoid embedding large arrays or complex nested structures. If your embedded data grows large, consider moving to referenced relationships.

Access Patterns: Use embedded documents when the data is always accessed together with the parent. If you frequently need to access embedded data independently, consider using referenced relationships instead.

Update Patterns: Embedded documents support atomic updates, which means all embedded documents are updated together. This is great for data consistency but can be inefficient if you only need to update a single embedded document.
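The single-element update concern above maps to MongoDB's positional "$" operator. A minimal driver-level sketch — the id string is a placeholder, and Mongoid issues a similarly targeted $set when you call set on an embedded document:

```ruby
# Update one element of the embedded addresses array without rewriting it all.
# The positional "$" resolves to the index of the first array element
# matching the filter. The id string below is a placeholder.
user_id = "507f1f77bcf86cd799439011"
filter  = { "_id" => user_id, "addresses.address_type" => "home" }
update  = { "$set" => { "addresses.$.primary" => true } }

# With the raw collection handle this becomes:
#   User.collection.update_one(filter, update)
```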

Best Practices Summary:

  • Use for small, related data: Addresses, phone numbers, preferences
  • Always accessed together: Profile with user, addresses with user
  • Limited size: Avoid embedding large arrays or complex nested structures
  • Atomic updates: Updates to embedded documents are atomic
  • No sharing: Embedded documents cannot be shared between parents
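To keep an eye on the 16MB cap mentioned above, the serialized size of a document can be monitored. A hedged sketch — the helper and threshold are illustrative; in a Mongoid app the byte count would come from user.as_document.to_bson.length:

```ruby
# MongoDB caps a single BSON document at 16 MB; warn well before the limit.
MAX_BSON_SIZE = 16 * 1024 * 1024

def near_document_limit?(bson_bytes, threshold: 0.8)
  bson_bytes > (MAX_BSON_SIZE * threshold)
end

# In a Mongoid app, measure a document's serialized size with:
#   user.as_document.to_bson.length
near_document_limit?(1_000_000)          # small document, far from the cap
near_document_limit?(15 * 1024 * 1024)   # nearing the cap: consider references
```

Crossing such a threshold is a good signal to move large embedded arrays into referenced collections.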

Embedded Document Queries

Querying Embedded Documents: MongoDB provides powerful querying capabilities for embedded documents. You can query embedded fields using dot notation, which allows you to search within nested structures efficiently. This is one of the key advantages of MongoDB's document model.

Dot Notation: Use dot notation to access nested fields. For example, "addresses.city" refers to the city field within the addresses array. This works for both single embedded documents and arrays of embedded documents.

Array Queries: When querying arrays of embedded documents, you can use operators like $exists to check for the presence of elements, or array indices to access specific positions. This is particularly useful for finding users with multiple addresses or specific types of embedded documents.

Complex Queries: You can combine multiple conditions on embedded documents to create sophisticated queries. This allows you to find documents that match specific criteria across different embedded structures, making MongoDB very flexible for complex data relationships.

# Find users with specific embedded document criteria
users_with_ny_address = User.where("addresses.city" => "New York")
users_with_verified_phone = User.where("phone_numbers.verified" => true)
users_with_github = User.where("social_accounts.platform" => "github")

# Find users with primary address in specific state
users_in_california = User.where("addresses.primary" => true, "addresses.state" => "CA")

# Find users with multiple addresses
users_with_multiple_addresses = User.where("addresses.1" => { "$exists" => true })

# Find users with verified social accounts
users_with_verified_social = User.where("social_accounts.verified" => true)

# Complex embedded queries
users_with_complete_profile = User.where(
  "profile.bio" => { "$exists" => true, "$ne" => "" },
  "addresses.primary" => true,
  "phone_numbers.verified" => true
)
🔗 Referenced Relationships

One-to-Many Relationships

What are Referenced Relationships? Referenced relationships use document IDs to link documents across different collections. Unlike embedded relationships, the related documents are stored separately and referenced by their IDs. This is similar to foreign keys in relational databases.

When to Use Referenced Relationships: Use referenced relationships when:

  • Data is large or complex
  • Documents need to be shared between multiple parents
  • Data changes frequently
  • You need to query related documents independently
  • You want to avoid hitting document size limits

Benefits: Referenced relationships provide flexibility and scalability. They allow you to:

  • Share documents between multiple parents
  • Query related documents independently
  • Handle large datasets efficiently
  • Update related documents without affecting the parent

Considerations: Referenced relationships require additional queries to fetch related data, which can lead to N+1 query problems. Use eager loading (includes) to optimize performance when you need to access related documents.

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :status, type: String, default: "active"
  
  # One-to-many associations
  has_many :posts
  has_many :comments
  has_many :orders
  has_many :notifications
  
  validates :email, presence: true, uniqueness: true
  
  # Scopes for related data
  scope :with_posts, -> { includes(:posts) }
  scope :active_with_posts, -> { where(status: "active").includes(:posts) }
  
  def post_count
    posts.count
  end
  
  def recent_posts(limit = 5)
    posts.order(created_at: -1).limit(limit)
  end
  
  def total_comments
    comments.count
  end
end

class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :title, type: String
  field :content, type: String
  field :status, type: String, default: "draft"
  field :published_at, type: DateTime
  field :view_count, type: Integer, default: 0
  field :tags, type: Array, default: []
  
  belongs_to :user
  
  has_many :comments
  has_many :likes
  
  validates :title, presence: true
  validates :content, presence: true, length: { minimum: 10 }
  
  scope :published, -> { where(status: "published") }
  scope :recent, -> { order(created_at: -1) }
  scope :popular, -> { order(view_count: -1) }
  
  def publish!
    update!(status: "published", published_at: Time.current)
  end
  
  def increment_view_count!
    inc(view_count: 1) # Mongoid's atomic $inc
  end
end

class Comment
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :content, type: String
  field :status, type: String, default: "approved"
  field :rating, type: Integer
  
  belongs_to :user
  belongs_to :post
  
  validates :content, presence: true, length: { minimum: 2 }
  validates :rating, numericality: { greater_than: 0, less_than: 6 }, allow_nil: true
  
  scope :approved, -> { where(status: "approved") }
  scope :recent, -> { order(created_at: -1) }
end

# Usage examples
user = User.create(email: "[email protected]", name: "John Author")

# Create posts
post1 = user.posts.create(
  title: "Getting Started with MongoDB",
  content: "MongoDB is a powerful NoSQL database...",
  tags: ["mongodb", "nosql", "database"]
)

post2 = user.posts.create(
  title: "Advanced Rails Patterns",
  content: "In this post, we'll explore advanced Rails patterns...",
  tags: ["rails", "ruby", "patterns"]
)

# Add comments to posts
other_user = User.create(email: "[email protected]", name: "Jane Reader")
other_user.comments.create(
  post: post1,
  content: "Great article! Very helpful.",
  rating: 5
)

# Query relationships
user.posts.published.count # => 0 (posts are drafts)
user.posts.first.publish! # => Publish the first post
user.posts.published.count # => 1

# Eager loading to avoid N+1 queries
users_with_posts = User.includes(:posts).where(:id.in => [user.id, other_user.id])
users_with_posts.each do |u|
  puts "#{u.name} has #{u.posts.count} posts"
end

Many-to-Many Relationships

What are Many-to-Many Relationships? Many-to-many relationships allow documents to be associated with multiple other documents in both directions. This is useful when you have complex relationships where both sides can have multiple connections. Think of users having multiple roles, or users following multiple other users.

When to Use Many-to-Many Relationships: Use many-to-many relationships when:

  • Both sides of the relationship can have multiple connections
  • You need to query relationships from both directions
  • The relationship data is simple (just IDs)
  • You want to avoid creating intermediate documents

Implementation Options: MongoDB offers two main approaches for many-to-many relationships:

  • has_and_belongs_to_many: Simple links stored as arrays of IDs on both sides
  • An explicit join model: Mongoid has no native has_many :through, so relationships that carry extra data use a separate intermediate collection

Performance Considerations: Many-to-many relationships can become complex to query efficiently. Consider using indexes on the relationship arrays and be mindful of array size limits. For very large relationships, consider using a separate collection as an intermediate table.
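A relationship that must carry extra data is therefore usually modeled as its own collection. A hedged sketch of the link-document shape — the names (Membership, role, joined_at) are illustrative, not part of this guide's schema:

```ruby
# One document in a hypothetical "memberships" collection: the link itself
# carries data (role, joined_at) that a bare HABTM ID array cannot hold.
membership = {
  "user_id"   => "507f1f77bcf86cd799439011", # placeholder ObjectId strings
  "group_id"  => "507f191e810c19729de860ea",
  "role"      => "moderator",
  "joined_at" => Time.now
}

# The matching Mongoid model would be a plain document with two belongs_to
# sides and a unique compound index, e.g.:
#   class Membership
#     include Mongoid::Document
#     belongs_to :user
#     belongs_to :group
#     field :role, type: String, default: "member"
#     index({ user_id: 1, group_id: 1 }, { unique: true })
#   end
#
# Traversal then takes two steps:
#   group_ids = Membership.where(user: user).pluck(:group_id)
#   Group.where(:id.in => group_ids)
```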

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  
  # Many-to-many associations
  has_and_belongs_to_many :roles
  has_and_belongs_to_many :groups
  has_and_belongs_to_many :followed_users, class_name: "User", inverse_of: :followers
  has_and_belongs_to_many :followers, class_name: "User", inverse_of: :followed_users
  
  validates :email, presence: true, uniqueness: true
  
  def admin?
    roles.any? { |role| role.name == "admin" }
  end
  
  def follow!(user)
    followed_users << user unless followed_users.include?(user)
  end
  
  def unfollow!(user)
    followed_users.delete(user)
  end
  
  def following?(user)
    followed_users.include?(user)
  end
end

class Role
  include Mongoid::Document
  
  field :name, type: String
  field :description, type: String
  field :permissions, type: Array, default: []
  
  has_and_belongs_to_many :users
  
  validates :name, presence: true, uniqueness: true
  
  scope :active, -> { where(:name.nin => ["deleted"]) }
  
  def has_permission?(permission)
    permissions.include?(permission.to_s)
  end
end

class Group
  include Mongoid::Document
  
  field :name, type: String
  field :description, type: String
  field :privacy, type: String, default: "public" # public, private, secret
  
  has_and_belongs_to_many :users
  has_many :posts
  
  validates :name, presence: true
  
  scope :public_groups, -> { where(privacy: "public") }
  
  def member_count
    users.count
  end
  
  def add_member(user)
    users << user unless users.include?(user)
  end
  
  def remove_member(user)
    users.delete(user)
  end
end

# Usage examples
# Create roles
admin_role = Role.create(name: "admin", permissions: ["manage_users", "manage_content"])
moderator_role = Role.create(name: "moderator", permissions: ["moderate_content"])
user_role = Role.create(name: "user", permissions: ["create_content"])

# Create groups
ruby_group = Group.create(name: "Ruby Developers", description: "Ruby programming community")
rails_group = Group.create(name: "Rails Developers", description: "Rails framework community")

# Create users and assign roles
user1 = User.create(email: "admin@example.com", name: "Admin User")
user2 = User.create(email: "moderator@example.com", name: "Moderator User")
user3 = User.create(email: "user@example.com", name: "Regular User")

user1.roles << admin_role
user2.roles << moderator_role
user3.roles << user_role

# Add users to groups
ruby_group.add_member(user1)
ruby_group.add_member(user2)
rails_group.add_member(user1)
rails_group.add_member(user3)

# Follow relationships
user1.follow!(user2)
user1.follow!(user3)
user2.follow!(user1)

# Query relationships
User.where(:role_ids.in => [admin_role.id]).count # => 1 (HABTM stores the ids in role_ids)
ruby_group.users.count # => 2
user1.followed_users.count # => 2
user1.followers.count # => 1

# Check permissions
user1.admin? # => true
user2.admin? # => false
admin_role.has_permission?("manage_users") # => true

Referenced Relationship Best Practices

When to Use Referenced Relationships: Referenced relationships are ideal for data that needs to be shared, queried independently, or updated frequently. They provide the flexibility to handle complex data relationships while maintaining good performance characteristics.

Performance Optimization: The key to good performance with referenced relationships is proper indexing and eager loading. Always create indexes on foreign key fields and use includes() to avoid N+1 query problems when accessing related documents.

Data Independence: Referenced relationships allow documents to exist independently. This is perfect for data that can be shared between multiple parents or needs to be queried and updated independently of its parent document.

Scalability Considerations: While referenced relationships provide flexibility, they can become performance bottlenecks if not properly optimized. Monitor query performance and consider denormalization for frequently accessed data.
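A common denormalization is a counter cache: store a posts_count integer on the user document and adjust it when a post is created or destroyed, instead of counting the posts collection on every read. A plain-Ruby sketch of the idea (the field and method names are illustrative, not Mongoid API):

```ruby
# Counter-cache sketch: the repeated read-side query (counting children)
# is replaced by an integer field kept up to date on the write side.
class UserDoc
  attr_accessor :posts_count

  def initialize
    @posts_count = 0
  end
end

class PostDoc
  # In a real app this increment would live in a create callback.
  def self.create_for(user)
    post = new
    user.posts_count += 1 # denormalized write
    post
  end
end

user = UserDoc.new
3.times { PostDoc.create_for(user) }
```

The trade-off is the usual one for denormalization: reads become a field access, but every write path that touches posts must remember to maintain the counter.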

Best Practices Summary:

  • Use for large or shared data: Posts, comments, orders
  • Independent entities: Data that can exist on its own
  • Frequent updates: Data that changes often
  • Eager loading: Use includes() to avoid N+1 queries
  • Index foreign keys: Always index referenced fields
🔄 Polymorphic Associations

Polymorphic Relationships

What are Polymorphic Relationships? Polymorphic relationships allow a document to belong to multiple different types of documents. This is useful when you have a common behavior or feature that can be applied to different types of content. Think of comments that can be attached to posts, articles, videos, or any other content type.

When to Use Polymorphic Relationships: Use polymorphic relationships when:

  • You have common behavior across different document types
  • You want to avoid creating separate collections for each relationship
  • The related data has the same structure regardless of the parent type
  • You need to query related data across different parent types

Implementation: Polymorphic relationships in MongoDB use two fields:

  • commentable_type: Stores the class name of the parent
  • commentable_id: Stores the ID of the parent document

Together these two fields tell Mongoid which collection and which document to reference.

Performance Considerations: Polymorphic relationships can be slower to query than regular relationships because MongoDB needs to check multiple collections. Use proper indexes on both the type and ID fields, and consider eager loading to optimize performance.
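The two-field scheme can be sketched in plain Ruby: resolving a polymorphic reference means mapping the stored class name to a collection, then looking up the id inside it (no MongoDB needed for the sketch; the lookup tables stand in for collections):

```ruby
# Sketch of polymorphic resolution: commentable_type selects the
# collection, commentable_id selects the document inside it.
POST_DOCS    = { "p1" => "MongoDB Guide" }
ARTICLE_DOCS = { "a1" => "Rails Best Practices" }

COLLECTIONS_BY_TYPE = { "Post" => POST_DOCS, "Article" => ARTICLE_DOCS }

def resolve_commentable(commentable_type, commentable_id)
  collection = COLLECTIONS_BY_TYPE.fetch(commentable_type) # which collection?
  collection.fetch(commentable_id)                         # which document?
end

comment = { commentable_type: "Post", commentable_id: "p1" }
```

This also shows why both fields must be indexed together: a query for "comments on this post" filters on the (type, id) pair, not on either field alone.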

class Comment
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :content, type: String
  field :status, type: String, default: "approved"
  
  # Polymorphic association
  belongs_to :commentable, polymorphic: true
  
  belongs_to :user
  
  validates :content, presence: true
  
  scope :approved, -> { where(status: "approved") }
  scope :recent, -> { order(created_at: -1) }
end

class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :title, type: String
  field :content, type: String
  field :status, type: String, default: "draft"
  
  belongs_to :user
  has_many :comments, as: :commentable
  
  validates :title, presence: true
end

class Article
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :title, type: String
  field :content, type: String
  field :category, type: String
  
  belongs_to :author, class_name: "User"
  has_many :comments, as: :commentable
  
  validates :title, presence: true
end

class Video
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :title, type: String
  field :url, type: String
  field :duration, type: Integer
  
  belongs_to :creator, class_name: "User"
  has_many :comments, as: :commentable
  
  validates :title, :url, presence: true
end

# Usage examples
author = User.create(email: "author@example.com", name: "Content Author")
user = User.create(email: "commenter@example.com", name: "Commenter")

# Create different content types
post = Post.create(
  title: "MongoDB Guide",
  content: "Complete guide to MongoDB...",
  user: author
)

article = Article.create(
  title: "Rails Best Practices",
  content: "Learn Rails best practices...",
  category: "Programming",
  author: author
)

video = Video.create(
  title: "MongoDB Tutorial",
  url: "https://youtube.com/watch?v=abc123",
  duration: 1800,
  creator: author
)

# Add comments to different content types
user.comments.create(
  commentable: post,
  content: "Great post about MongoDB!"
)

user.comments.create(
  commentable: article,
  content: "Very helpful article!"
)

user.comments.create(
  commentable: video,
  content: "Excellent tutorial!"
)

# Query polymorphic relationships
post.comments.count # => 1
article.comments.count # => 1
video.comments.count # => 1

# Find all comments by a user
user.comments.includes(:commentable).each do |comment|
  puts "Comment on #{comment.commentable.class.name}: #{comment.content}"
end

# Find comments on specific content types
Comment.where(commentable_type: "Post").count # => 1
Comment.where(commentable_type: "Article").count # => 1
Comment.where(commentable_type: "Video").count # => 1

Polymorphic Best Practices

Design Considerations: When implementing polymorphic relationships, ensure that all parent types have a consistent interface. This makes it easier to work with the polymorphic association and reduces the complexity of your code.

Performance Optimization: Polymorphic relationships require careful indexing to maintain good performance. Always create compound indexes on both the type and ID fields, and use eager loading to avoid N+1 queries when accessing polymorphic associations.

Type Safety: Since polymorphic relationships store class names as strings, it's important to validate the commentable_type values and handle cases where the referenced class might not exist or might have been renamed.

Use Cases: Polymorphic relationships are perfect for features that can be applied to multiple types of content, such as comments, likes, attachments, or any other shared behavior across different document types.
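For type safety, a simple guard is to validate commentable_type against an allowlist before ever constantizing it, so a renamed class or a hostile value fails as a validation error instead of raising NameError deep in a view. A plain-Ruby sketch (the allowlist and helper name are illustrative):

```ruby
# Allowlist check for polymorphic type strings: only known class
# names are ever accepted (and therefore ever constantized).
ALLOWED_COMMENTABLE_TYPES = %w[Post Article Video].freeze

def safe_commentable_type(type)
  return type if ALLOWED_COMMENTABLE_TYPES.include?(type)

  nil # caller treats nil as a validation error
end
```

In a Mongoid model this would typically back an inclusion validation on the commentable_type field.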

Best Practices Summary:

  • Use for shared behavior: Comments, likes, attachments
  • Consistent interface: All commentable objects should have similar methods
  • Index polymorphic fields: Index both commentable_type and commentable_id
  • Eager loading: Use includes() with polymorphic associations
  • Type safety: Validate commentable_type values
⚡ Performance & Optimization

Eager Loading Strategies

What is Eager Loading? Eager loading is a technique that loads related documents in a single query instead of making separate queries for each relationship. This is crucial for avoiding the N+1 query problem, where accessing related data results in many additional database queries.

The N+1 Query Problem: When you have a collection of documents and need to access their related data, without eager loading, MongoDB will make one query for the main documents and then one additional query for each document to fetch its related data. This can result in hundreds or thousands of queries for large datasets.

When to Use Eager Loading: Use eager loading whenever you know you'll need to access related data for multiple documents. This is especially important when:

  • Displaying lists with related data
  • Iterating through documents and accessing relationships
  • Building complex views that need multiple related documents

Performance Impact: Eager loading can dramatically improve performance by reducing the number of database queries. However, it does load more data into memory, so use it judiciously and only for relationships you actually need.
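The difference is easy to quantify with a query counter: the lazy loop issues 1 + N queries, while eager loading issues one query per collection regardless of N. A plain-Ruby simulation of the two access patterns (the in-memory "collections" and query log are illustrative):

```ruby
# In-memory stand-ins for the users and posts collections.
USERS = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }]
POSTS = [{ user_id: 1 }, { user_id: 1 }, { user_id: 2 }]
QUERY_LOG = []

def run_query(label, result)
  QUERY_LOG << label
  result
end

# Lazy access: one query for the users, then one more per user (1 + N)
def lazy_counts
  users = run_query("users.find", USERS)
  users.map do |u|
    run_query("posts.find(user_id: #{u[:id]})",
              POSTS.count { |p| p[:user_id] == u[:id] })
  end
end

# Eager loading: one query per collection (2 total), grouped in memory
def eager_counts
  users = run_query("users.find", USERS)
  posts = run_query("posts.find(user_id $in ids)", POSTS)
  grouped = posts.group_by { |p| p[:user_id] }
  users.map { |u| (grouped[u[:id]] || []).size }
end

QUERY_LOG.clear
lazy_counts
lazy_total = QUERY_LOG.size # 1 + 2 users = 3 queries

QUERY_LOG.clear
eager_counts
eager_total = QUERY_LOG.size # always 2 queries
```

With 2 users the gap is small; with 10,000 users it is 10,001 queries versus 2.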

# Avoid N+1 queries with eager loading
# Bad: N+1 queries
users = User.all
users.each do |user|
  puts "#{user.name} has #{user.posts.size} posts" # one posts query per user
end

# Good: Eager loading
users = User.includes(:posts).all
users.each do |user|
  # Use .size here: it reads the eager-loaded documents, whereas .count
  # always issues a fresh database count and would defeat the eager load
  puts "#{user.name} has #{user.posts.size} posts" # no additional queries
end

# Multiple associations
users = User.includes(:posts, :comments, :roles).all

# Nested eager loading
users = User.includes(posts: :comments).all

# Conditional eager loading
users = User.includes(:posts).where(:status => "active")

# Polymorphic eager loading
# Note: some Mongoid versions cannot eager-load a polymorphic belongs_to
# and raise Mongoid::Errors::EagerLoad; check your version before relying on this
comments = Comment.includes(:commentable, :user).all

# Custom eager loading with scopes
users = User.includes(:posts).where(:status => "active")
users.each do |user|
  puts "#{user.name}: #{user.posts.published.count} published posts"
end

Indexing for Relationships

Why Index Relationships? Proper indexing is crucial for maintaining good performance when working with relationships in MongoDB. Without proper indexes, queries that involve relationships can become very slow, especially as your data grows.

Key Indexing Strategies: When working with relationships, you should create indexes on:

  • Foreign key fields: Always index the fields that reference other documents
  • Polymorphic fields: Index both the type and ID fields for polymorphic relationships
  • Compound indexes: Create compound indexes for frequently used query combinations
  • Array fields: Index array fields that store relationship IDs

Performance Considerations: Indexes improve query performance but add overhead to write operations. Monitor your index usage and remove unused indexes. For polymorphic relationships, consider creating separate indexes for each type if you frequently query by specific types.

Index Maintenance: Regularly review your indexes to ensure they're being used effectively. MongoDB provides tools to analyze index usage and identify unused or inefficient indexes.
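Whether a compound index can serve a query comes down to a prefix rule: an index on { status: 1, user_id: 1 } supports queries on status alone or on status plus user_id, but not on user_id alone. A simplified plain-Ruby sketch of that check (real index selection also weighs sort order and selectivity):

```ruby
# Simplified prefix rule for compound indexes: a query's equality
# fields can use the index only if they form a prefix of the index
# key list (order among the equality fields themselves doesn't matter).
def index_serves?(index_keys, query_fields)
  prefix = index_keys.first(query_fields.size)
  prefix.sort == query_fields.sort
end

compound = [:status, :user_id] # like index({ status: 1, user_id: 1 })
```

This is why the Post model above declares both { user_id: 1 } and { status: 1, user_id: 1 }: the compound index alone would not serve lookups by user_id only.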

class User
  include Mongoid::Document
  
  field :email, type: String
  field :status, type: String
  
  has_many :posts
  has_many :comments
  
  # Index foreign keys in related models
  index({ email: 1 }, { unique: true })
  index({ status: 1 })
end

class Post
  include Mongoid::Document
  
  field :title, type: String
  field :status, type: String
  field :user_id, type: BSON::ObjectId
  
  belongs_to :user
  
  # Index foreign key
  index({ user_id: 1 })
  index({ status: 1, user_id: 1 })
  index({ created_at: -1, user_id: 1 })
end

class Comment
  include Mongoid::Document
  
  field :content, type: String
  field :user_id, type: BSON::ObjectId
  field :post_id, type: BSON::ObjectId
  field :commentable_type, type: String
  field :commentable_id, type: BSON::ObjectId
  
  belongs_to :user
  belongs_to :post
  belongs_to :commentable, polymorphic: true
  
  # Index foreign keys
  index({ user_id: 1 })
  index({ post_id: 1 })
  index({ commentable_type: 1, commentable_id: 1 })
  index({ created_at: -1, user_id: 1 })
end

# Create all indexes (equivalent to: rake db:mongoid:create_indexes)
Mongoid::Tasks::Database.create_indexes

Relationship Query Optimization

# Use projection to limit returned fields
users = User.only(:name, :email).includes(:posts)

# Use scopes for common queries
class User
  include Mongoid::Document
  
  has_many :posts
  
  # MongoDB find queries cannot join, so "posts.created_at" is not
  # queryable from the users collection; collect the ids from the
  # posts side instead
  scope :with_recent_posts, -> {
    where(:id.in => Post.where(:created_at.gte => 1.week.ago).distinct(:user_id))
  }
  
  # Aggregation pipelines return raw BSON documents, not Mongoid
  # criteria, so expose them as class methods rather than scopes
  def self.with_post_count
    collection.aggregate([
      { "$lookup" => {
        "from" => "posts",
        "localField" => "_id",
        "foreignField" => "user_id",
        "as" => "posts"
      }},
      { "$addFields" => { "post_count" => { "$size" => "$posts" } } },
      { "$match" => { "post_count" => { "$gt" => 0 } } }
    ])
  end
end

# Batch processing for large datasets
# (Mongoid has no find_in_batches; control the cursor batch size instead)
User.all.batch_size(1000).each do |user|
  user.posts.includes(:comments).each do |post|
    # Process post and comments
  end
end

# Use aggregation for complex relationship queries
top_users = User.collection.aggregate([
  { "$lookup" => {
    "from" => "posts",
    "localField" => "_id",
    "foreignField" => "user_id",
    "as" => "posts"
  }},
  { "$addFields" => {
    "post_count" => { "$size" => "$posts" },
    "total_views" => { "$sum" => "$posts.view_count" }
  }},
  { "$match" => { "post_count" => { "$gt" => 0 } } },
  { "$sort" => { "total_views" => -1 } },
  { "$limit" => 10 }
])
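What $lookup computes is a left outer join: for each user, attach the array of posts whose user_id matches the user's _id, after which $addFields, $match, and $sort operate on the joined documents. The pipeline above can be rendered in plain Ruby over in-memory documents (illustrative only; the server runs this inside the aggregation engine):

```ruby
# Plain-Ruby rendering of the $lookup / $addFields / $match / $sort stages
users = [{ _id: 1 }, { _id: 2 }]
posts = [
  { user_id: 1, view_count: 10 },
  { user_id: 1, view_count: 5 },
  { user_id: 2, view_count: 7 }
]

joined = users.map do |u|
  user_posts = posts.select { |p| p[:user_id] == u[:_id] } # $lookup: left outer join
  u.merge(
    posts: user_posts,
    post_count: user_posts.size,                           # $addFields with $size
    total_views: user_posts.sum { |p| p[:view_count] }     # $addFields with $sum
  )
end

top_users = joined
  .select { |u| u[:post_count] > 0 }                       # $match
  .sort_by { |u| -u[:total_views] }                        # $sort (descending)
```

The key difference from doing this in application code is that the server performs the join and aggregation near the data, returning only the final top-N documents.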
🎯 Real-World Examples

E-commerce Relationships

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :status, type: String, default: "active"
  
  # Embedded relationships
  embeds_one :profile
  embeds_many :addresses
  embeds_many :payment_methods
  
  # Referenced relationships
  has_many :orders
  has_many :reviews
  has_many :wishlist_items
  has_and_belongs_to_many :favorite_categories
  
  validates :email, presence: true, uniqueness: true
  
  def total_spent
    orders.completed.sum(:total_amount)
  end
  
  def average_order_value
    completed_orders = orders.completed
    return 0 if completed_orders.empty?
    completed_orders.sum(:total_amount) / completed_orders.count
  end
end

class Order
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :order_number, type: String
  field :status, type: String, default: "pending"
  field :total_amount, type: Float
  field :shipping_address, type: Hash
  field :billing_address, type: Hash
  
  belongs_to :user
  has_many :order_items
  has_many :payments
  
  validates :order_number, presence: true, uniqueness: true
  
  scope :completed, -> { where(status: "completed") }
  scope :pending, -> { where(status: "pending") }
  
  def complete!
    update!(status: "completed")
  end
  
  def total_items
    order_items.sum(:quantity)
  end
end

class Product
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :name, type: String
  field :description, type: String
  field :price, type: Float
  field :sku, type: String
  field :category, type: String
  field :tags, type: Array, default: []
  field :inventory, type: Integer, default: 0
  
  belongs_to :brand
  has_many :order_items
  has_many :reviews
  has_many :wishlist_items
  
  validates :name, :price, :sku, presence: true
  validates :sku, uniqueness: true
  
  scope :in_stock, -> { where(:inventory.gt => 0) }
  scope :by_category, ->(category) { where(category: category) }
  
  def available?
    inventory > 0
  end
  
  def average_rating
    reviews.any? ? reviews.avg(:rating) : 0
  end
end

class Review
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :rating, type: Integer
  field :title, type: String
  field :content, type: String
  field :verified_purchase, type: Boolean, default: false
  
  belongs_to :user
  belongs_to :product
  
  validates :rating, presence: true, numericality: { greater_than: 0, less_than: 6 }
  validates :content, presence: true, length: { minimum: 10 }
  
  scope :verified, -> { where(verified_purchase: true) }
  scope :recent, -> { order(created_at: -1) }
end

# Usage examples
user = User.create(email: "customer@example.com", name: "John Customer")

# Add embedded data
user.create_profile(
  phone: "+1-555-123-4567",
  preferences: { newsletter: true, marketing: false }
)

user.addresses.create(
  street: "123 Main St",
  city: "New York",
  state: "NY",
  zip_code: "10001",
  primary: true
)

# Create products and orders
product1 = Product.create(
  name: "Ruby Programming Book",
  price: 29.99,
  sku: "BOOK-RUBY-001",
  category: "Books",
  inventory: 50
)

product2 = Product.create(
  name: "Rails Framework Guide",
  price: 39.99,
  sku: "BOOK-RAILS-001",
  category: "Books",
  inventory: 30
)

# Create order
order = user.orders.create(
  order_number: "ORD-#{Time.current.to_i}",
  total_amount: 69.98,
  shipping_address: user.addresses.where(primary: true).first.attributes,
  billing_address: user.addresses.where(primary: true).first.attributes
)

# Add items to order
order.order_items.create(
  product: product1,
  quantity: 1,
  price: product1.price
)

order.order_items.create(
  product: product2,
  quantity: 1,
  price: product2.price
)

# Add review
user.reviews.create(
  product: product1,
  rating: 5,
  title: "Excellent book!",
  content: "This book helped me learn Ruby programming from scratch.",
  verified_purchase: true
)

# Query relationships
user.orders.completed.count # => 0 (order is still pending)
order.complete!
user.orders.completed.count # => 1
user.total_spent # => 69.98
user.average_order_value # => 69.98
product1.average_rating # => 5.0

Social Network Relationships

class User
  include Mongoid::Document
  
  field :username, type: String
  field :email, type: String
  field :name, type: String
  field :bio, type: String
  field :status, type: String, default: "active"
  
  # Embedded relationships
  embeds_one :profile
  embeds_many :social_links
  
  # Referenced relationships
  has_many :posts
  has_many :comments
  has_many :messages_sent, class_name: "Message", foreign_key: "sender_id"
  has_many :messages_received, class_name: "Message", foreign_key: "recipient_id"
  
  # Many-to-many relationships
  has_and_belongs_to_many :followed_users, class_name: "User", inverse_of: :followers
  has_and_belongs_to_many :followers, class_name: "User", inverse_of: :followed_users
  has_and_belongs_to_many :groups
  
  validates :username, :email, presence: true, uniqueness: true
  
  def follow!(user)
    followed_users << user unless followed_users.include?(user)
  end
  
  def unfollow!(user)
    followed_users.delete(user)
  end
  
  def following?(user)
    followed_users.include?(user)
  end
  
  def feed_posts
    # followed_user_ids is the id array Mongoid stores for the HABTM,
    # so no extra query is needed to collect the ids
    Post.where(:user_id.in => followed_user_ids + [id])
        .order(created_at: -1)
        .includes(:user, :comments)
  end
end

class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :content, type: String
  field :visibility, type: String, default: "public"
  field :location, type: Array # [longitude, latitude]
  field :tags, type: Array, default: []
  
  belongs_to :user
  has_many :comments
  has_many :likes
  has_many :shares
  
  validates :content, presence: true, length: { maximum: 1000 }
  
  scope :public_posts, -> { where(visibility: "public") }
  scope :recent, -> { order(created_at: -1) }
  
  def like_count
    likes.count
  end
  
  def comment_count
    comments.count
  end
  
  def share_count
    shares.count
  end
end

class Group
  include Mongoid::Document
  
  field :name, type: String
  field :description, type: String
  field :privacy, type: String, default: "public"
  field :rules, type: Array, default: []
  
  has_and_belongs_to_many :users
  has_many :posts
  
  validates :name, presence: true
  
  scope :public_groups, -> { where(privacy: "public") }
  
  def member_count
    users.count
  end
  
  def add_member(user)
    users << user unless users.include?(user)
  end
  
  def remove_member(user)
    users.delete(user)
  end
  
  def member?(user)
    users.include?(user)
  end
end

class Message
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :content, type: String
  field :read, type: Boolean, default: false
  field :message_type, type: String, default: "text" # text, image, file
  
  belongs_to :sender, class_name: "User"
  belongs_to :recipient, class_name: "User"
  
  validates :content, presence: true
  
  scope :unread, -> { where(read: false) }
  scope :recent, -> { order(created_at: -1) }
  
  def mark_as_read!
    update!(read: true)
  end
end

# Usage examples
user1 = User.create(username: "john_doe", email: "john@example.com", name: "John Doe")
user2 = User.create(username: "jane_smith", email: "jane@example.com", name: "Jane Smith")
user3 = User.create(username: "bob_wilson", email: "bob@example.com", name: "Bob Wilson")

# Follow relationships
user1.follow!(user2)
user1.follow!(user3)
user2.follow!(user1)

# Create group
ruby_group = Group.create(
  name: "Ruby Developers",
  description: "Community for Ruby developers",
  privacy: "public"
)

ruby_group.add_member(user1)
ruby_group.add_member(user2)

# Create posts
post1 = user1.posts.create(
  content: "Just learned about MongoDB associations!",
  visibility: "public",
  tags: ["mongodb", "rails", "learning"]
)

post2 = user2.posts.create(
  content: "Great post, John! MongoDB is amazing.",
  visibility: "public",
  tags: ["mongodb", "agreement"]
)

# Add comments
user2.comments.create(
  post: post1,
  content: "Thanks for sharing this!"
)

# Send messages
user1.messages_sent.create(
  recipient: user2,
  content: "Hey Jane, check out my new post about MongoDB!"
)

# Query relationships
user1.followed_users.count # => 2
user1.followers.count # => 1
user1.feed_posts.count # => 2 (own post plus posts from followed users)
ruby_group.member_count # => 2
user2.messages_received.unread.count # => 1

Validations & Callbacks

Data Integrity Overview: Validations ensure data quality and consistency, while callbacks provide hooks for custom logic during document lifecycle events. This section covers all validation types, custom validations, and callback strategies.
✅ Basic Validations

Core Validation Types

What are Validations? Validations are rules that ensure data integrity and quality before documents are saved to the database. They act as a safety net, preventing invalid or inconsistent data from being stored. Validations run before save operations and can prevent documents from being persisted if they don't meet the specified criteria.

Why Use Validations? Validations provide several important benefits:

  • Data Integrity: Ensure only valid data is stored
  • Business Rules: Enforce application-specific requirements
  • Error Prevention: Catch issues early in the development cycle
  • User Experience: Provide clear feedback about validation errors
  • Security: Prevent malicious or malformed data

Validation Types: MongoDB with Mongoid provides a comprehensive set of validation types:

  • Presence: Ensures fields are not blank or nil
  • Uniqueness: Prevents duplicate values across documents
  • Format: Validates against regular expressions or patterns
  • Length: Ensures string fields meet size requirements
  • Numericality: Validates numeric ranges and types
  • Inclusion/Exclusion: Restricts values to specific sets

Validation Timing: Validations run at specific points in the document lifecycle:

  • Before Save: Validations run before any save operation
  • Before Update: Can be configured to run before updates
  • Conditional: Can be made conditional based on other fields
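The core mechanic behind all of this is simple: before a save, each rule runs, failures accumulate in an errors collection, and the save is aborted if any error exists. A stripped-down plain-Ruby sketch of that lifecycle (not Mongoid API; MiniModel and its rules are illustrative):

```ruby
# Minimal validation lifecycle: run the rules, collect the errors,
# and block the save when any rule failed.
class MiniModel
  attr_reader :errors, :email

  def initialize(email)
    @email = email
    @errors = []
  end

  def valid?
    @errors = [] # validations always start from a clean slate
    errors << "email can't be blank" if email.to_s.strip.empty?
    errors << "email is invalid" unless email.to_s.include?("@")
    errors.empty?
  end

  # save runs validations first and refuses to persist invalid documents
  def save
    return false unless valid?

    true # persistence would happen here
  end
end
```

Mongoid follows the same shape via ActiveModel: valid? populates document.errors, and save returns false (or save! raises) when validation fails.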

class User
  include Mongoid::Document
  
  field :email, type: String
  field :name, type: String
  field :age, type: Integer
  field :username, type: String
  field :bio, type: String
  field :website, type: String
  field :status, type: String, default: "active"
  
  # Presence validations
  validates :email, presence: true
  validates :name, presence: true
  validates :username, presence: true
  
  # Uniqueness validations
  validates :email, uniqueness: true
  validates :username, uniqueness: { case_sensitive: false }
  
  # Format validations
  validates :email, format: { with: URI::MailTo::EMAIL_REGEXP }
  validates :website, format: { with: URI::regexp(%w[http https]), allow_blank: true }
  
  # Length validations
  validates :name, length: { minimum: 2, maximum: 50 }
  validates :bio, length: { maximum: 500 }
  validates :username, length: { minimum: 3, maximum: 20 }
  
  # Numerical validations
  validates :age, numericality: { 
    greater_than: 0, 
    less_than: 150,
    only_integer: true,
    allow_nil: true
  }
  
  # Inclusion validations
  validates :status, inclusion: { 
    in: %w[active inactive suspended deleted],
    message: "must be active, inactive, suspended, or deleted"
  }
  
  # Exclusion validations
  validates :username, exclusion: { 
    in: %w[admin root system],
    message: "cannot be a reserved username"
  }
  
  # Custom validation methods
  validate :username_format
  validate :age_consistency
  
  private
  
  def username_format
    return if username.blank?
    
    unless username.match?(/\A[a-zA-Z0-9_]+\z/)
      errors.add(:username, "can only contain letters, numbers, and underscores")
    end
  end
  
  def age_consistency
    return if age.blank?
    
    if age < 13 && status == "active"
      errors.add(:age, "must be at least 13 for active accounts")
    end
  end
end

Validation Options Reference

Validation Options Overview: Each validation type supports various options that allow you to customize the validation behavior. Understanding these options helps you create more precise and flexible validation rules that match your specific business requirements.

Common Options: Most validations support these common options:

  • message: Custom error message for the validation
  • allow_blank: Skip validation if field is blank
  • allow_nil: Skip validation if field is nil
  • if: Only run validation if condition is met
  • unless: Skip validation if condition is met

Performance Considerations: Some validations, like uniqueness, can be expensive on large collections. Consider using database indexes to improve performance, and be mindful of validation complexity when dealing with high-volume data.
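In particular, the uniqueness validation issues a find on every save, and it cannot guarantee uniqueness under concurrent writes: two requests can both pass the check before either inserts. A unique index is the real backstop. A plain-Ruby sketch of the index acting as the last line of defense (illustrative; real enforcement happens in the MongoDB server, which reports an E11000 duplicate key error):

```ruby
# Sketch: the application-level check can pass for two concurrent
# writers, but a unique index rejects the second insert at write time.
class UniqueIndex
  def initialize
    @seen = {}
  end

  def insert!(key)
    raise "E11000 duplicate key" if @seen.key?(key)

    @seen[key] = true
  end
end

index = UniqueIndex.new
index.insert!("a@b.com")

duplicate_rejected =
  begin
    index.insert!("a@b.com")
    false
  rescue RuntimeError
    true
  end
```

This is why the models above pair validates :email, uniqueness: true with index({ email: 1 }, { unique: true }): the validation gives friendly errors, the index gives correctness.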

Validation options reference:

  • presence (true): Field must be present. Example: validates :email, presence: true
  • uniqueness (scope, case_sensitive): Field must be unique. Example: validates :email, uniqueness: true
  • format (with, allow_blank): Field must match a regex. Example: validates :email, format: { with: /\A[^@\s]+@[^@\s]+\z/ }
  • length (minimum, maximum, is): String length constraints. Example: validates :name, length: { minimum: 2 }
  • numericality (greater_than, less_than, only_integer): Number constraints. Example: validates :age, numericality: { greater_than: 0 }
  • inclusion (in, message): Value must be in a list. Example: validates :status, inclusion: { in: %w[active inactive] }
  • exclusion (in, message): Value must not be in a list. Example: validates :username, exclusion: { in: %w[admin] }
🔧 Advanced Validations

Custom Validations

What are Custom Validations? Custom validations allow you to implement complex business logic that goes beyond the built-in validation types. They give you complete control over the validation process and enable you to create sophisticated validation rules that are specific to your application's requirements.

When to Use Custom Validations: Use custom validations when:

  • You need complex business logic validation
  • Multiple fields need to be validated together
  • You need conditional validation based on other fields
  • Built-in validations don't cover your specific requirements
  • You need to validate against external data or services

Implementation Patterns: Custom validations follow these common patterns:

  • Single Field Validation: Validate one field with complex logic
  • Cross-Field Validation: Validate relationships between multiple fields
  • Conditional Validation: Only validate under certain conditions
  • External Validation: Validate against external APIs or services

Error Handling: Custom validations use the errors.add method to add validation errors. You can add errors to specific fields or to the base object, and you can provide custom error messages that are user-friendly and informative.

class Product
  include Mongoid::Document
  
  field :name, type: String
  field :price, type: Float
  field :sku, type: String
  field :category, type: String
  field :tags, type: Array, default: []
  field :metadata, type: Hash, default: {}
  
  validates :name, presence: true
  validates :price, numericality: { greater_than: 0 }
  validates :sku, presence: true, uniqueness: true
  
  # Custom validations
  validate :sku_format
  validate :price_consistency
  validate :tags_limit
  validate :metadata_structure
  
  private
  
  def sku_format
    return if sku.blank?
    
    unless sku.match?(/\A[A-Z]{2,3}-\d{3,6}\z/)
      errors.add(:sku, "must be in format: XX-123 or XXX-123456")
    end
  end
  
  def price_consistency
    return if price.blank?
    
    # Check if price is reasonable for category
    case category
    when "electronics"
      if price < 10
        errors.add(:price, "electronics must cost at least $10")
      end
    when "books"
      if price > 200
        errors.add(:price, "books cannot cost more than $200")
      end
    end
  end
  
  def tags_limit
    return if tags.blank?
    
    if tags.length > 10
      errors.add(:tags, "cannot have more than 10 tags")
    end
    
    if tags.any? { |tag| tag.length > 20 }
      errors.add(:tags, "each tag must be 20 characters or less")
    end
  end
  
  def metadata_structure
    return if metadata.blank?
    
    required_keys = ["brand", "weight", "dimensions"]
    missing_keys = required_keys - metadata.keys
    
    if missing_keys.any?
      errors.add(:metadata, "must include: #{missing_keys.join(', ')}")
    end
  end
end

Conditional Validations

What are Conditional Validations? Conditional validations allow you to apply validation rules only under specific circumstances. This is useful when certain fields should only be validated based on the state of other fields or the document's current condition. Conditional validations make your validation logic more flexible and context-aware.

When to Use Conditional Validations: Use conditional validations when:

  • Certain fields are only required in specific states
  • Validation rules change based on other field values
  • You want to avoid unnecessary validation overhead
  • Different business rules apply in different contexts
  • You need to validate optional fields only when they're provided

Conditional Methods: Conditional validations can use different types of conditions:

  • Symbol: Reference to a method that returns true/false
  • Lambda: Inline condition using a lambda or proc
  • String: String representation of a method call
  • Proc: Custom proc object for complex conditions
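A plain-Ruby sketch of how the `if:` option decides whether a validation runs: a Symbol names a predicate method, while a Proc or lambda is evaluated in the record's context. `MiniValidator` and `MiniOrder` are illustrative stand-ins, not Mongoid API:

```ruby
# Minimal presence validator that honors an if: condition
class MiniValidator
  def initialize(field, if_condition:)
    @field = field
    @condition = if_condition
  end

  # Adds an error only when the condition holds and the field is blank
  def validate(record, errors)
    return errors unless condition_met?(record)

    value = record.public_send(@field)
    errors << "#{@field} can't be blank" if value.nil? || value.to_s.strip.empty?
    errors
  end

  private

  def condition_met?(record)
    case @condition
    when Symbol then record.public_send(@condition)    # if: :requires_payment?
    when Proc   then record.instance_exec(&@condition) # if: -> { ... }
    else true
    end
  end
end

MiniOrder = Struct.new(:status, :total_amount, :payment_method) do
  def requires_payment?
    status != "cancelled" && total_amount > 0
  end
end

validator = MiniValidator.new(:payment_method, if_condition: :requires_payment?)
validator.validate(MiniOrder.new("pending", 25.0, nil), [])
# pending paid order with no payment method -> one error
validator.validate(MiniOrder.new("cancelled", 25.0, nil), [])
# cancelled order -> condition fails, validation skipped entirely
```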

Performance Benefits: Conditional validations can improve performance by avoiding unnecessary validation checks. They also make your validation logic more readable and maintainable by clearly expressing when each validation should apply.

class Order
  include Mongoid::Document
  
  field :order_number, type: String
  field :total_amount, type: Float
  field :status, type: String, default: "pending"
  field :payment_method, type: String
  field :shipping_address, type: Hash
  field :billing_address, type: Hash
  
  validates :order_number, presence: true, uniqueness: true
  validates :total_amount, numericality: { greater_than: 0 }
  
  # Conditional validations
  validates :payment_method, presence: true, if: :requires_payment?
  validates :shipping_address, presence: true, if: :requires_shipping?
  validates :billing_address, presence: true, if: :requires_billing?
  
  # Conditional validation with custom method
  validate :payment_method_valid, if: :requires_payment?
  validate :address_format, if: :requires_shipping?
  
  private
  
  def requires_payment?
    status != "cancelled" && total_amount > 0
  end
  
  def requires_shipping?
    status != "cancelled" && !digital_product?
  end
  
  def requires_billing?
    status != "cancelled"
  end
  
  def digital_product?
    # Logic to determine if product is digital
    false
  end
  
  def payment_method_valid
    valid_methods = %w[credit_card paypal stripe]
    
    unless valid_methods.include?(payment_method)
      errors.add(:payment_method, "must be one of: #{valid_methods.join(', ')}")
    end
  end
  
  def address_format
    required_fields = %w[street city state zip_code]
    
    # Iterate the required fields (not the provided keys) so that
    # missing keys are flagged, not just blank values
    required_fields.each do |field|
      if shipping_address.blank? || shipping_address[field].blank?
        errors.add(:shipping_address, "#{field} is required")
      end
    end
  end
end

Cross-Field Validations

class User
  include Mongoid::Document
  
  field :email, type: String
  field :email_confirmation, type: String
  field :password, type: String
  field :password_confirmation, type: String
  field :birth_date, type: Date
  field :registration_date, type: Date
  field :premium_expires_at, type: DateTime
  
  validates :email, presence: true, format: { with: URI::MailTo::EMAIL_REGEXP }
  validates :password, presence: true, length: { minimum: 8 }
  validates :birth_date, presence: true
  
  # Cross-field validations
  validate :email_confirmation_matches
  validate :password_confirmation_matches
  validate :birth_date_reasonable
  validate :premium_expiry_after_registration
  
  private
  
  def email_confirmation_matches
    return if email.blank? || email_confirmation.blank?
    
    unless email == email_confirmation
      errors.add(:email_confirmation, "doesn't match email")
    end
  end
  
  def password_confirmation_matches
    return if password.blank? || password_confirmation.blank?
    
    unless password == password_confirmation
      errors.add(:password_confirmation, "doesn't match password")
    end
  end
  
  def birth_date_reasonable
    return if birth_date.blank?
    
    if birth_date > Date.current
      errors.add(:birth_date, "cannot be in the future")
    end
    
    if birth_date < 100.years.ago.to_date
      errors.add(:birth_date, "seems too old")
    end
  end
  
  def premium_expiry_after_registration
    return if premium_expires_at.blank? || registration_date.blank?
    
    if premium_expires_at <= registration_date
      errors.add(:premium_expires_at, "must be after registration date")
    end
  end
end
🔄 Callbacks & Lifecycle

Model Callbacks

class User
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :email, type: String
  field :name, type: String
  field :username, type: String
  field :slug, type: String
  field :last_login_at, type: DateTime
  field :login_count, type: Integer, default: 0
  field :status, type: String, default: "active"
  
  validates :email, presence: true, uniqueness: true
  validates :name, presence: true
  
  # Before callbacks
  before_create :generate_username
  before_save :normalize_email
  before_update :track_changes
  before_validation :generate_slug
  
  # After callbacks
  after_create :send_welcome_email
  after_save :update_search_index
  after_destroy :cleanup_related_data
  
  # Around callbacks
  around_save :log_operation_time
  
  private
  
  def generate_username
    return if username.present?
    
    base_username = name.parameterize
    candidate = base_username
    counter = 1
    
    # Try the bare name first, then append a counter until it's unique
    while User.where(username: candidate).exists?
      candidate = "#{base_username}#{counter}"
      counter += 1
    end
    
    self.username = candidate
  end
  
  def normalize_email
    self.email = email.downcase.strip if email.present?
  end
  
  def track_changes
    changed_fields = changes.keys - %w[updated_at]
    if changed_fields.any?
      Rails.logger.info "User #{id} changed: #{changed_fields.join(', ')}"
    end
  end
  
  def generate_slug
    return if slug.present?
    self.slug = name.parameterize
  end
  
  def send_welcome_email
    UserMailer.welcome(self).deliver_later
  end
  
  def update_search_index
    SearchIndexJob.perform_later(self)
  end
  
  def cleanup_related_data
    # Clean up related posts, comments, etc.
    Post.where(user_id: id).destroy_all
    Comment.where(user_id: id).destroy_all
  end
  
  def log_operation_time
    start_time = Time.current
    yield
    duration = ((Time.current - start_time) * 1000).round
    Rails.logger.info "User save operation took #{duration}ms"
  end
end

Callback Types and Usage

Callback Lifecycle: Understanding when each callback runs is crucial for implementing the right logic at the right time. Callbacks follow a specific order during document operations, and knowing this order helps you avoid common pitfalls and implement effective business logic.

Before vs After Callbacks: The choice between before and after callbacks depends on your needs:

  • Before Callbacks: Use when you need to modify the document or prevent the operation
  • After Callbacks: Use for side effects that don't affect the current operation
  • Around Callbacks: Use when you need to wrap the entire operation with custom logic

Common Patterns: Each callback type has established patterns and best practices:

  • Data Preparation: Use before callbacks for normalization and defaults
  • Side Effects: Use after callbacks for notifications and external updates
  • Performance Monitoring: Use around callbacks for timing and logging
  • Cleanup: Use after callbacks for resource cleanup and maintenance
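The nesting described above can be sketched in plain Ruby (no Mongoid required): the around callback wraps the whole operation, the before callback runs first inside it, and the after callback runs last. All names here are illustrative, not Mongoid internals:

```ruby
class MiniDocument
  attr_reader :events

  def initialize
    @events = []
  end

  # Mimics a save: around wraps before -> persist -> after
  def save
    around_save do
      before_save
      persist
      after_save
    end
  end

  private

  def around_save
    @events << :around_start # e.g. start a timer
    yield                    # the wrapped save operation
    @events << :around_end   # e.g. log the elapsed time
  end

  def before_save
    @events << :before_save  # normalize data, set defaults
  end

  def persist
    @events << :persist      # the actual database write
  end

  def after_save
    @events << :after_save   # side effects: jobs, search indexes
  end
end

doc = MiniDocument.new
doc.save
doc.events
# => [:around_start, :before_save, :persist, :after_save, :around_end]
```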

Callback | Triggered When | Common Uses | Example
before_create | Before document is created | Generate slugs, set defaults | before_create :generate_slug
after_create | After document is created | Send notifications, create related data | after_create :send_welcome_email
before_save | Before any save operation | Normalize data, validate custom rules | before_save :normalize_email
after_save | After any save operation | Update search indexes, cache invalidation | after_save :update_search_index
before_update | Before document is updated | Track changes, validate updates | before_update :track_changes
after_update | After document is updated | Notify subscribers, update related data | after_update :notify_subscribers
before_destroy | Before document is deleted | Validate deletion, backup data | before_destroy :validate_deletion
after_destroy | After document is deleted | Cleanup related data, audit logging | after_destroy :cleanup_related_data
around_save | Around save operations | Performance monitoring, transactions | around_save :log_operation_time
🎯 Real-World Examples

E-commerce Product Validations

Real-World Application: E-commerce applications require sophisticated validation rules to ensure data quality and business rule compliance. Product data must be accurate, complete, and consistent to provide a good user experience and maintain inventory integrity.

Business Requirements: E-commerce products have specific validation needs:

  • Pricing Rules: Prices must be positive and follow business logic
  • Inventory Management: Stock levels must be tracked accurately
  • Product Identification: SKUs must be unique and follow patterns
  • Category Classification: Products must belong to valid categories
  • Metadata Requirements: Essential product information must be complete

Validation Strategy: The validation approach combines multiple techniques:

  • Basic Validations: Presence, format, and range checks
  • Custom Validations: Complex business logic and cross-field validation
  • Conditional Validations: Different rules for different product states
  • Performance Optimizations: Efficient validation for high-volume data

User Experience: Good validation provides clear, actionable error messages that help users understand what needs to be fixed. This improves data quality and reduces support requests.

class Product
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :name, type: String
  field :description, type: String
  field :price, type: Float
  field :cost, type: Float
  field :sku, type: String
  field :category, type: String
  field :status, type: String, default: "draft"
  field :inventory, type: Integer, default: 0
  field :weight, type: Float
  field :dimensions, type: Hash, default: {}
  field :tags, type: Array, default: []
  field :metadata, type: Hash, default: {}
  
  # Basic validations
  validates :name, presence: true, length: { minimum: 2, maximum: 100 }
  validates :price, numericality: { greater_than: 0 }
  validates :sku, presence: true, uniqueness: true
  validates :category, presence: true, inclusion: { in: %w[electronics books clothing digital] }
  validates :status, inclusion: { in: %w[draft active inactive archived] }
  validates :inventory, numericality: { greater_than_or_equal_to: 0, only_integer: true }
  
  # Custom validations
  validate :sku_format
  validate :price_vs_cost
  validate :weight_required_for_shipping
  validate :dimensions_format
  validate :tags_limit
  
  # Callbacks
  before_save :normalize_name
  before_create :generate_sku_if_missing
  after_save :update_search_index
  after_destroy :cleanup_related_data
  
  private
  
  def sku_format
    return if sku.blank?
    
    unless sku.match?(/\A[A-Z]{2,3}-\d{3,6}\z/)
      errors.add(:sku, "must be in format: XX-123 or XXX-123456")
    end
  end
  
  def price_vs_cost
    return if price.blank? || cost.blank?
    
    if price < cost
      errors.add(:price, "cannot be less than cost")
    end
    
    if price > cost * 10
      errors.add(:price, "markup seems too high")
    end
  end
  
  def weight_required_for_shipping
    return if category == "digital"
    
    if weight.blank? || weight <= 0
      errors.add(:weight, "is required for physical products")
    end
  end
  
  def dimensions_format
    return if dimensions.blank?
    
    required_keys = %w[length width height]
    missing_keys = required_keys - dimensions.keys
    
    if missing_keys.any?
      errors.add(:dimensions, "must include: #{missing_keys.join(', ')}")
    end
    
    dimensions.each do |key, value|
      unless value.is_a?(Numeric) && value > 0
        errors.add(:dimensions, "#{key} must be a positive number")
      end
    end
  end
  
  def tags_limit
    return if tags.blank?
    
    if tags.length > 10
      errors.add(:tags, "cannot have more than 10 tags")
    end
    
    if tags.any? { |tag| tag.length > 20 }
      errors.add(:tags, "each tag must be 20 characters or less")
    end
  end
  
  def normalize_name
    self.name = name.titleize if name.present?
  end
  
  def generate_sku_if_missing
    return if sku.present?
    
    prefix = category.upcase[0..2]
    # Note: a count-based suffix can collide once products are deleted;
    # prefer a sequence or random suffix in production
    counter = Product.where(category: category).count + 1
    self.sku = "#{prefix}-#{counter.to_s.rjust(3, '0')}"
  end
  
  def update_search_index
    SearchIndexJob.perform_later(self)
  end
  
  def cleanup_related_data
    OrderItem.where(product_id: id).destroy_all
    Review.where(product_id: id).destroy_all
  end
end

Social Media Post Validations

Content Moderation Challenges: Social media platforms require sophisticated validation rules to ensure content quality, user safety, and platform integrity. Posts must be validated for content appropriateness, format compliance, and user engagement features.

Validation Requirements: Social media posts have unique validation needs:

  • Content Safety: Filter inappropriate or harmful content
  • Format Compliance: Ensure hashtags and mentions follow proper format
  • User Engagement: Validate mentions and hashtags for real users
  • Media Management: Control media attachments and file types
  • Scheduling Logic: Ensure scheduled posts are set for future dates

Content Processing: Social media posts often require automatic processing:

  • Hashtag Extraction: Automatically detect and format hashtags
  • Mention Validation: Verify that mentioned users exist
  • Content Analysis: Check for inappropriate language or spam
  • Media Validation: Ensure media files are valid and accessible
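The hashtag and mention extraction described above reduces to two regex scans; a standalone plain-Ruby sketch of the logic the model's extraction callback uses:

```ruby
# Pull hashtags (#word) and mentions (@word) out of post content,
# normalizing both to lowercase
def extract_entities(content)
  {
    hashtags: content.scan(/#\w+/).map(&:downcase),
    mentions: content.scan(/@\w+/).map(&:downcase)
  }
end

extract_entities("Loving #Ruby and #Rails, thanks @Alice!")
# hashtags: ["#ruby", "#rails"], mentions: ["@alice"]
```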

User Experience: Social media validation must balance content quality with user convenience. Clear error messages help users understand why their posts were rejected, while automatic processing reduces the burden on users.

class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  
  field :content, type: String
  field :title, type: String
  field :visibility, type: String, default: "public"
  field :status, type: String, default: "draft"
  field :location, type: Array # [longitude, latitude]
  field :hashtags, type: Array, default: []
  field :mentions, type: Array, default: []
  field :media_urls, type: Array, default: []
  field :language, type: String, default: "en"
  field :scheduled_at, type: DateTime
  
  belongs_to :user
  
  # Basic validations
  validates :content, presence: true, length: { maximum: 280 }
  validates :visibility, inclusion: { in: %w[public private friends] }
  validates :status, inclusion: { in: %w[draft published scheduled archived] }
  validates :language, inclusion: { in: %w[en es fr de] }
  
  # Custom validations
  validate :content_appropriate
  validate :hashtags_format
  validate :mentions_valid
  validate :location_format
  validate :scheduled_at_future
  validate :media_limit
  
  # Callbacks
  # Extract before validation so the extracted values are what get validated
  before_validation :extract_hashtags_and_mentions
  before_create :set_default_title
  after_create :notify_followers
  after_save :update_trending_hashtags
  
  private
  
  def content_appropriate
    return if content.blank?
    
    inappropriate_words = %w[spam scam fraud]
    if inappropriate_words.any? { |word| content.downcase.include?(word) }
      errors.add(:content, "contains inappropriate content")
    end
  end
  
  def hashtags_format
    return if hashtags.blank?
    
    hashtags.each do |hashtag|
      unless hashtag.match?(/\A#[a-zA-Z0-9_]+\z/)
        errors.add(:hashtags, "must start with # and contain only letters, numbers, and underscores")
      end
    end
  end
  
  def mentions_valid
    return if mentions.blank?
    
    mentions.each do |mention|
      unless mention.match?(/\A@[a-zA-Z0-9_]+\z/)
        errors.add(:mentions, "must start with @ and contain only letters, numbers, and underscores")
      end
      
      # Check if mentioned user exists
      username = mention[1..-1]
      unless User.where(username: username).exists?
        errors.add(:mentions, "user @#{username} does not exist")
      end
    end
  end
  
  def location_format
    return if location.blank?
    
    unless location.is_a?(Array) && location.length == 2
      errors.add(:location, "must be an array with [longitude, latitude]")
      return # don't destructure an invalid value
    end
    
    longitude, latitude = location
    unless longitude.between?(-180, 180) && latitude.between?(-90, 90)
      errors.add(:location, "coordinates are invalid")
    end
  end
  
  def scheduled_at_future
    return if scheduled_at.blank?
    
    if scheduled_at <= Time.current
      errors.add(:scheduled_at, "must be in the future")
    end
  end
  
  def media_limit
    return if media_urls.blank?
    
    if media_urls.length > 10
      errors.add(:media_urls, "cannot have more than 10 media items")
    end
    
    media_urls.each do |url|
      unless url.match?(/\Ahttps?:\/\/.+/)
        errors.add(:media_urls, "must be valid URLs")
      end
    end
  end
  
  def extract_hashtags_and_mentions
    return if content.blank?
    
    self.hashtags = content.scan(/#\w+/).map(&:downcase)
    self.mentions = content.scan(/@\w+/).map(&:downcase)
  end
  
  def set_default_title
    return if title.present?
    self.title = content.truncate(50) if content.present?
  end
  
  def notify_followers
    return unless status == "published"
    NotificationJob.perform_later(self)
  end
  
  def update_trending_hashtags
    return if hashtags.blank?
    TrendingHashtagsJob.perform_later(hashtags)
  end
end

Performance & Optimization

Performance Overview: MongoDB performance optimization involves strategic indexing, query optimization, monitoring, and best practices. This section covers comprehensive strategies for maximizing MongoDB performance in Rails applications.
🔍 Indexing Strategies

Index Types and Usage

class User
  include Mongoid::Document
  
  field :email, type: String
  field :username, type: String
  field :name, type: String
  field :status, type: String, default: "active"
  field :created_at, type: DateTime
  field :last_login_at, type: DateTime
  field :location, type: Array # [longitude, latitude]
  field :tags, type: Array, default: []
  field :metadata, type: Hash, default: {}
  field :bio, type: String   # used by the text index below
  field :phone, type: String # used by the sparse index below
  
  # Single field indexes
  index({ email: 1 }, { unique: true })
  index({ username: 1 }, { unique: true })
  index({ status: 1 })
  index({ created_at: -1 })
  index({ last_login_at: -1 })
  
  # Compound indexes (order matters!)
  index({ status: 1, created_at: -1 })
  index({ email: 1, status: 1 })
  index({ status: 1, last_login_at: -1 })
  
  # Text search indexes
  index({ name: "text", bio: "text" })
  
  # Geospatial indexes
  index({ location: "2dsphere" })
  
  # Array indexes
  index({ tags: 1 })
  
  # Sparse indexes (skip null values)
  index({ phone: 1 }, { sparse: true })
  
  # TTL indexes (auto-delete after time)
  index({ created_at: 1 }, { expire_after_seconds: 86400 }) # 24 hours
  
  # Partial indexes (only index documents matching a filter)
  index({ email: 1 }, { partial_filter_expression: { status: "active" } })
  
  # Background indexes (the option is ignored as of MongoDB 4.2,
  # which always builds indexes without blocking)
  index({ username: 1 }, { background: true })
end

Index Management

# Create all indexes for a model
User.create_indexes

# Create indexes for all models (also exposed as rake db:mongoid:create_indexes)
Mongoid::Tasks::Database.create_indexes

# Drop all indexes for a model
User.remove_indexes

# Check existing indexes (options such as unique appear as top-level keys)
User.collection.indexes.each do |index|
  puts "Index: #{index['name']}"
  puts "Keys: #{index['key']}"
  puts "Unique: #{index['unique'] || false}"
  puts "---"
end

# Create indexes with specific options
User.collection.indexes.create_one(
  { email: 1, status: 1 },
  { 
    background: true,
    name: "email_status_idx"
  }
)

# Drop specific index
User.collection.indexes.drop_one("email_status_idx")

# Check index usage statistics
User.collection.aggregate([
  { "$indexStats" => {} }
])

Index Best Practices

  • Compound Index Order: Most selective field first
  • Covered Queries: Include all queried fields in index
  • Avoid Over-Indexing: Each index has write overhead
  • Background Indexing: Use for large collections
  • Monitor Usage: Remove unused indexes
  • TTL Indexes: For time-based data cleanup
  • Partial Indexes: For conditional queries
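Why compound-index key order matters can be modeled with a sorted array standing in for a `{ status: 1, created_at: -1 }` B-tree: entries sharing the leading field form one contiguous, pre-sorted run, so an equality-on-status plus most-recent-first query reads a single range with no extra sort step. A toy plain-Ruby model:

```ruby
# Entries sorted the way the compound index would store them:
# by status ascending, then created_at descending
entries = [
  { status: "active",  created_at: 3 },
  { status: "active",  created_at: 1 },
  { status: "pending", created_at: 2 }
]

# Equality on the leading field selects a contiguous run
# (a real B-tree would seek straight to it)
run = entries.select { |e| e[:status] == "active" }

run.map { |e| e[:created_at] }
# already newest-first, so no in-memory sort is needed

# By contrast, a query on created_at alone gains nothing: matching
# entries are scattered across every status group
```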
⚡ Query Optimization

Query Performance Best Practices

# Use projection to limit returned fields
users = User.where(status: "active").only(:name, :email)

# Use limit for large result sets
recent_users = User.order(created_at: -1).limit(100)

# Use skip with limit for pagination
page_users = User.order(created_at: -1).skip(offset).limit(per_page)

# Use covered queries (all projected fields come from the index)
# Good: status and created_at are both in the compound index
# (a fully covered query also needs _id excluded from the projection
# at the driver level)
User.where(status: "active").only(:status, :created_at)

# Avoid: :name is not in the index, so documents must still be fetched
User.where(status: "active").only(:status, :created_at, :name)

# Use compound queries efficiently
# Good: Uses compound index
User.where(status: "active", :created_at.gte => 1.week.ago)

# Note: chained where clauses merge into a single query in Mongoid,
# so this form is equivalent; prefer the combined form for readability
User.where(status: "active").where(:created_at.gte => 1.week.ago)

# Use aggregation for complex queries
top_users = User.collection.aggregate([
  { "$match" => { status: "active" } },
  { "$group" => {
    "_id" => "$status",
    "count" => { "$sum" => 1 }
  }},
  { "$sort" => { "count" => -1 } }
])

# Use bulk operations for large datasets
User.collection.bulk_write([
  { update_one: { filter: { status: "pending" }, update: { "$set" => { status: "active" } } } },
  { update_one: { filter: { status: "inactive" }, update: { "$set" => { status: "deleted" } } } }
])

Query Analysis and Optimization

# Analyze query performance (request execution statistics)
explanation = User.where(status: "active").explain(:execution_stats)

# Check if the query uses an index (dig avoids errors on collection scans)
plan = explanation.dig("queryPlanner", "winningPlan")
puts "Uses index: #{plan.dig('inputStage', 'indexName') || 'COLLSCAN'}"

# Check execution time
puts "Execution time: #{explanation.dig('executionStats', 'executionTimeMillis')}ms"

# Check documents examined vs returned (ideally close to a 1:1 ratio)
puts "Docs examined: #{explanation.dig('executionStats', 'totalDocsExamined')}"
puts "Docs returned: #{explanation.dig('executionStats', 'nReturned')}"

# Capture operations slower than 100ms in the profiler
Mongoid.default_client.database.command(profile: 1, slowms: 100)

# Monitor query performance
class QueryLogger
  def self.log(query, duration)
    Rails.logger.info "Query: #{query} took #{duration}ms"
  end
end

# Use in models (Mongoid has no to_sql, so pass a label describing the query)
class User
  include Mongoid::Document
  
  def self.with_logging(label = "User query")
    start_time = Time.current
    result = yield
    duration = ((Time.current - start_time) * 1000).round
    QueryLogger.log(label, duration)
    result
  end
end

# Read captured slow queries back from the system.profile collection
slow_queries = Mongoid.default_client.database["system.profile"]
  .find(millis: { "$gt" => 100 })
  .to_a

# Check index usage
index_stats = User.collection.aggregate([
  { "$indexStats" => {} }
])

# Monitor collection stats via the collStats command
collection_stats = User.collection.database.command(collStats: User.collection.name).first
puts "Collection size: #{collection_stats['size']} bytes"
puts "Document count: #{collection_stats['count']}"
puts "Average document size: #{collection_stats['avgObjSize']} bytes"

Eager Loading Strategies

# Avoid N+1 queries with eager loading
# (use .size rather than .count: association .count always issues a
# query, even when the association is already loaded)
# Bad: one posts query per user
users = User.all
users.each do |user|
  puts "#{user.name} has #{user.posts.size} posts" # extra query per user
end

# Good: eager loading fetches all posts in one additional query
users = User.includes(:posts).to_a
users.each do |user|
  puts "#{user.name} has #{user.posts.size} posts" # uses preloaded documents
end

# Multiple associations
users = User.includes(:posts, :comments, :roles).all

# Nested eager loading (requires a recent Mongoid version; older
# releases accept only a flat list of association names)
users = User.includes(posts: :comments).all

# Conditional eager loading
users = User.includes(:posts).where(:status => "active")

# Polymorphic eager loading
comments = Comment.includes(:commentable, :user).all

# Custom eager loading with scopes
users = User.includes(:posts).where(:status => "active")
users.each do |user|
  puts "#{user.name}: #{user.posts.published.count} published posts"
end

# Use projection with eager loading
users = User.includes(:posts).only(:name, :email).all
📊 Monitoring & Profiling

Performance Monitoring

What is Performance Monitoring? Performance monitoring involves tracking and analyzing database operations to identify bottlenecks, optimize queries, and ensure your application meets performance requirements. Effective monitoring provides insights into how your MongoDB database is performing and helps you make data-driven optimization decisions.

Why Monitor Performance? Performance monitoring is essential for:

  • Proactive Optimization: Identify issues before they become problems
  • Capacity Planning: Understand resource usage and plan for growth
  • User Experience: Ensure fast response times for users
  • Cost Optimization: Use resources efficiently and reduce costs
  • Debugging: Quickly identify and resolve performance issues

Monitoring Tools: MongoDB provides several monitoring capabilities:

  • Query Profiler: Capture and analyze slow queries
  • Server Status: Monitor database server health
  • Connection Pool: Track connection usage and availability
  • Custom Metrics: Build application-specific monitoring

Monitoring Strategy: Effective monitoring follows these principles:

  • Set Baselines: Establish normal performance metrics
  • Monitor Trends: Track performance over time
  • Alert on Anomalies: Get notified of performance issues
  • Regular Reviews: Periodically analyze monitoring data

# Enable profiling for operations slower than 100ms
Mongoid.default_client.database.command(profile: 1, slowms: 100)

# Check profiling status
profile_status = Mongoid.default_client.database.command(profile: -1).first

# Read slow query logs from the system.profile collection
slow_queries = Mongoid.default_client.database["system.profile"]
  .find(millis: { "$gt" => 100 })
  .to_a

# Monitor database operations
class DatabaseMonitor
  def self.log_operation(operation, duration, collection = nil)
    Rails.logger.info "[DB] #{operation} on #{collection} took #{duration}ms"
  end
  
  def self.track_query(query, duration)
    Rails.logger.info "[QUERY] #{query} took #{duration}ms"
  end
end

# Custom monitoring middleware
class MongoidMonitoring
  def self.track_operation(operation)
    start_time = Time.current
    result = yield
    duration = ((Time.current - start_time) * 1000).round
    DatabaseMonitor.log_operation(operation, duration)
    result
  end
end

# Use in models
class User
  include Mongoid::Document
  
  def self.with_monitoring
    MongoidMonitoring.track_operation("user_query") do
      yield
    end
  end
end

# Monitor connection pool activity via the driver's CMAP events
# (supported by the mongo gem 2.9+)
class PoolEventSubscriber
  def published(event)
    Rails.logger.debug "[POOL] #{event.class}"
  end
end

Mongo::Monitoring::Global.subscribe(
  Mongo::Monitoring::CONNECTION_POOL,
  PoolEventSubscriber.new
)

# Monitor server status
server_status = Mongoid.default_client.database.command(serverStatus: 1).first
puts "Uptime: #{server_status['uptime']} seconds"
puts "Connections: #{server_status['connections']['current']}"
puts "Operations: #{server_status['opcounters']}"

Performance Metrics

What are Performance Metrics? Performance metrics are quantitative measurements that help you understand how your MongoDB database is performing. These metrics provide insights into query efficiency, resource usage, and overall database health. Tracking the right metrics helps you make informed optimization decisions.

Key Metrics to Track: Focus on these essential performance metrics:

  • Query Performance: Execution time, documents examined vs returned
  • Index Efficiency: Hit ratios, index usage patterns
  • Resource Usage: Memory, CPU, disk I/O, connection pool
  • Collection Statistics: Document count, size, growth patterns
  • Error Rates: Failed operations, timeout frequency

Collection Metrics: Collection-level metrics provide insights into data characteristics:

  • Document Count: Total number of documents in the collection
  • Storage Size: Total disk space used by the collection
  • Average Document Size: Helps understand data structure efficiency
  • Index Count: Number of indexes and their total size

Index Metrics: Index usage metrics help optimize query performance:

  • Access Patterns: How often each index is used ($indexStats reports per-index operation counts)
  • Unused Indexes: Indexes with few or no accesses are candidates for removal
  • Selectivity: How well an index narrows down matching documents
  • Storage Impact: Disk space used by indexes

# Collection statistics via the collStats command
collection_stats = User.collection.database.command(collStats: User.collection.name).first

puts "Collection: #{collection_stats['ns']}"
puts "Document count: #{collection_stats['count']}"
puts "Total size: #{collection_stats['size']} bytes"
puts "Average document size: #{collection_stats['avgObjSize']} bytes"
puts "Index count: #{collection_stats['nindexes']}"
puts "Total index size: #{collection_stats['totalIndexSize']} bytes"

# Index statistics
index_stats = User.collection.aggregate([
  { "$indexStats" => {} }
])

index_stats.each do |index|
  # $indexStats reports operation counts, not hit/miss ratios
  puts "Index: #{index['name']}"
  puts "Operations: #{index['accesses']['ops']}"
  puts "Tracked since: #{index['accesses']['since']}"
end

# Query performance metrics
class QueryMetrics
  def self.track_query_performance
    start_time = Time.current
    result = yield
    duration = ((Time.current - start_time) * 1000).round
    
    Rails.logger.info "[METRICS] Query took #{duration}ms"
    
    # Store metrics for later analysis (QueryMetric is an app-defined model)
    QueryMetric.create(
      duration: duration,
      timestamp: Time.current,
      collection: result.class.name.downcase
    )
    
    result
  end
end

# Database health check
class DatabaseHealthCheck
  def self.perform
    begin
      # Test connection
      Mongoid.default_client.database.command(ping: 1)
      
      # Test basic operations
      test_user = User.create(name: "Health Check", email: "health_check@example.com")
      test_user.destroy
      
      { status: "healthy", message: "Database is responding normally" }
    rescue => e
      { status: "unhealthy", message: e.message }
    end
  end
end
🚀 Advanced Optimization

Bulk Operations

What are Bulk Operations? Bulk operations allow you to perform multiple database operations in a single request, significantly improving performance compared to individual operations. Instead of making separate network calls for each document, bulk operations batch multiple operations together, reducing network overhead and improving throughput.

Why Use Bulk Operations? Bulk operations provide several performance benefits:

  • Network Efficiency: Reduce network round trips between application and database
  • Atomic Operations: Multiple operations can be executed atomically
  • Better Throughput: Handle large datasets more efficiently
  • Reduced Overhead: Less per-operation overhead
  • Batch Processing: Ideal for data migration and batch updates

Types of Bulk Operations: MongoDB supports several bulk operation types:

  • Bulk Insert: Insert multiple documents at once
  • Bulk Update: Update multiple documents with a single operation
  • Bulk Upsert: Update existing documents or insert new ones
  • Bulk Delete: Remove multiple documents efficiently

When to Use Bulk Operations: Consider bulk operations for:

  • Data Migration: Moving large amounts of data
  • Batch Processing: Processing large datasets
  • Data Cleanup: Removing or updating many documents
  • Initial Data Loading: Setting up test or production data
  • Periodic Updates: Scheduled maintenance operations

# Bulk insert
users_data = [
  { name: "John", email: "john@example.com", status: "active" },
  { name: "Jane", email: "jane@example.com", status: "active" },
  { name: "Bob", email: "bob@example.com", status: "active" }
]

User.collection.insert_many(users_data)

# Bulk update
User.collection.update_many(
  { status: "pending" },
  { "$set" => { status: "active", updated_at: Time.current } }
)

# Bulk upsert (updates the matching document, or inserts one if none matches)
User.collection.update_many(
  { email: "john@example.com" },
  { "$set" => { name: "John Updated", updated_at: Time.current } },
  { upsert: true }
)

# Bulk delete
User.collection.delete_many({ status: "inactive" })

# Using Mongoid for bulk operations
User.where(status: "pending").update_all(status: "active")

# Batch processing (Mongoid criteria are enumerable, so each_slice
# processes the cursor in fixed-size batches; find_in_batches is
# ActiveRecord-only)
User.where(status: "pending").each_slice(1000) do |batch|
  batch.each do |user|
    user.update(status: "active")
  end
end

# Bulk write operations
User.collection.bulk_write([
  {
    update_one: {
      filter: { status: "pending" },
      update: { "$set" => { status: "active" } }
    }
  },
  {
    update_one: {
      filter: { status: "inactive" },
      update: { "$set" => { status: "deleted" } }
    }
  }
])
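
For very large migrations, an unbounded operations array can exhaust memory. A minimal sketch (pure Ruby, no driver calls) of building `update_one` operations in bounded slices — `flush` here is a stand-in for the real `collection.bulk_write(ops)` call shown above:

```ruby
# Build update_one operations in bounded slices so a migration over
# millions of ids never holds the full operation array in memory.
BATCH_SIZE = 1000

def build_op(id)
  { update_one: {
      filter: { _id: id },
      update: { "$set" => { status: "active" } } } }
end

def bulk_update_in_batches(ids, batch_size: BATCH_SIZE, &flush)
  ids.each_slice(batch_size) do |slice|
    flush.call(slice.map { |id| build_op(id) })
  end
end

# Example: 2500 ids are flushed as batches of 1000, 1000, 500
batches = []
bulk_update_in_batches(1..2500) { |ops| batches << ops.size }
```

In the real job, the block body would be `User.collection.bulk_write(ops)`.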

Caching Strategies

What is Caching? Caching is a technique that stores frequently accessed data in fast-access storage to improve application performance. Instead of repeatedly querying the database for the same data, caching stores the results in memory or a fast storage system, dramatically reducing response times and database load.

Why Use Caching? Caching provides several performance benefits:

  • Faster Response Times: Retrieve data from memory instead of disk
  • Reduced Database Load: Fewer queries to the database
  • Better Scalability: Handle more users with the same resources
  • Improved User Experience: Faster page loads and interactions
  • Cost Efficiency: Reduce database resource requirements

Caching Strategies: Different caching approaches work for different scenarios:

  • Application-Level Caching: Cache data in your Rails application
  • Database Query Caching: Cache query results
  • Object Caching: Cache entire objects or computed values
  • Fragment Caching: Cache parts of views or templates
  • CDN Caching: Cache static assets and content

Cache Invalidation: Managing cache freshness is crucial:

  • Time-Based Expiration: Automatically expire cache entries
  • Version-Based Invalidation: Use version numbers to invalidate cache
  • Event-Based Invalidation: Invalidate cache when data changes
  • Manual Invalidation: Explicitly clear cache when needed
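
Version-based invalidation is easiest to see with a plain Hash standing in for `Rails.cache` (an illustrative sketch, not the Rails API): bumping the version changes the key, so stale entries are simply never read again and can expire on their own.

```ruby
# A Hash stands in for Rails.cache to show why version-based keys work.
cache = {}

def fetch(cache, key)
  cache[key] ||= yield
end

record = { id: 7, version: 1, name: "old name" }

key = "user_#{record[:id]}_v#{record[:version]}"
fetch(cache, key) { record[:name] }           # caches "old name"

record[:name] = "new name"
record[:version] += 1                         # any write bumps the version

key = "user_#{record[:id]}_v#{record[:version]}"
fresh = fetch(cache, key) { record[:name] }   # miss -> recomputes "new name"
```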

# Redis caching for frequently accessed data
class User
  include Mongoid::Document
  include Mongoid::Timestamps  # provides updated_at, used by cache_version

  field :email, type: String
  field :name, type: String
  field :status, type: String
  field :last_login_at, type: Time

  has_many :posts
  has_many :comments
  
  # Cache frequently accessed data
  def cached_profile
    Rails.cache.fetch("user_profile_#{id}", expires_in: 1.hour) do
      {
        name: name,
        email: email,
        status: status,
        post_count: posts.count,
        last_activity: last_login_at
      }
    end
  end
  
  # Cache with versioning
  def cached_data
    Rails.cache.fetch("user_#{id}_v#{cache_version}", expires_in: 30.minutes) do
      # Expensive computation
      calculate_user_stats
    end
  end
  
  private
  
  def cache_version
    updated_at.to_i
  end
  
  def calculate_user_stats
    {
      total_posts: posts.count,
      total_comments: comments.count,
      engagement_score: calculate_engagement_score
    }
  end
  
  def calculate_engagement_score
    # Complex calculation
    (posts.count * 2) + comments.count
  end
end

# Fragment caching for views
# In view: <%= render partial: 'user_profile', cached: true %>

# Cache invalidation
class User
  after_save :invalidate_cache
  after_destroy :invalidate_cache
  
  private
  
  def invalidate_cache
    Rails.cache.delete("user_profile_#{id}")
    # Versioned keys ("user_#{id}_v...") need no explicit delete: a new
    # updated_at yields a new key, and stale entries expire via their TTL
  end
end

# Background job for cache warming
class CacheWarmingJob < ApplicationJob
  queue_as :default
  
  def perform
    User.active.find_each do |user|
      user.cached_profile # Warm cache
    end
  end
end

Connection Pool Optimization

What is Connection Pooling? Connection pooling is a technique that maintains a pool of database connections that can be reused by multiple requests. Instead of creating a new connection for each database operation, the application borrows a connection from the pool, uses it, and returns it when done. This significantly reduces the overhead of establishing new connections.

Why Use Connection Pooling? Connection pooling provides several performance benefits:

  • Reduced Connection Overhead: Avoid the cost of creating new connections
  • Better Resource Utilization: Efficiently manage database connections
  • Improved Response Times: Faster database operations
  • Scalability: Handle more concurrent users efficiently
  • Connection Management: Automatic cleanup and health monitoring

Connection Pool Parameters: Key configuration options include:

  • Max Pool Size: Maximum number of connections in the pool
  • Min Pool Size: Minimum number of connections to maintain
  • Max Idle Time: How long connections can remain idle
  • Wait Queue Timeout: How long to wait for an available connection
  • Server Selection Timeout: Timeout for server selection
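
A common sizing rule of thumb (an assumption, not a driver requirement) is one connection per application thread per process, plus a little headroom for console sessions and background work:

```ruby
# Rule-of-thumb pool sizing: one connection per thread, plus headroom.
# The parameter names are illustrative -- map them to your own
# Puma/Sidekiq settings.
def suggested_max_pool_size(web_threads:, job_threads: 0, headroom: 2)
  web_threads + job_threads + headroom
end

suggested_max_pool_size(web_threads: 5)                   # => 7
suggested_max_pool_size(web_threads: 5, job_threads: 10)  # => 17
```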

Optimization Strategies: Effective connection pool management involves:

  • Right-Size the Pool: Match pool size to your application needs
  • Monitor Usage: Track connection pool utilization
  • Handle Failures: Implement proper error handling and retry logic
  • Health Checks: Monitor connection health and remove bad connections

# Configure connection pool
# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/myapp_development
      options:
        server_selection_timeout: 5
        max_pool_size: 10
        min_pool_size: 2
        max_idle_time: 300
        wait_queue_timeout: 2500

production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        read:
          mode: :secondary
        write_concern: { w: 1, j: true }

# Monitor connection pool
class ConnectionPoolMonitor
  def self.stats
    client = Mongoid.default_client
    # The driver keeps one pool per known server; aggregate across them.
    # (Pool introspection methods vary across mongo driver versions --
    # verify these against the gem version you run.)
    pools = client.cluster.servers.map(&:pool)
    total = pools.sum(&:size)
    available = pools.sum(&:available_count)

    {
      active_connections: total - available,
      available_connections: available,
      total_connections: total
    }
  end
  
  def self.health_check
    stats = self.stats
    
    if stats[:active_connections] > stats[:total_connections] * 0.8
      Rails.logger.warn "High connection pool usage: #{stats[:active_connections]}/#{stats[:total_connections]}"
    end
    
    stats
  end
end

# Connection pool monitoring job
class ConnectionPoolMonitoringJob < ApplicationJob
  queue_as :default
  
  def perform
    stats = ConnectionPoolMonitor.stats
    Rails.logger.info "Connection pool stats: #{stats}"
    
    # Alert if pool is nearly full
    if stats[:active_connections] > stats[:total_connections] * 0.9
      # Send alert
      AlertService.send_alert("High MongoDB connection pool usage")
    end
  end
end

🎯 Real-World Examples

E-commerce Performance Optimization

E-commerce Performance Challenges: E-commerce applications face unique performance challenges due to high user traffic, complex product catalogs, and real-time inventory management. Users expect fast page loads, accurate search results, and seamless checkout experiences, making performance optimization critical for business success.

Key Performance Requirements: E-commerce applications must handle:

  • High Concurrency: Multiple users browsing and purchasing simultaneously
  • Complex Queries: Product searches with multiple filters and sorting
  • Real-Time Inventory: Accurate stock levels across multiple locations
  • Fast Search: Quick product discovery and recommendations
  • Order Processing: Efficient checkout and payment processing

Optimization Strategies: Effective e-commerce optimization involves:

  • Strategic Indexing: Indexes for common search and filter patterns
  • Product Caching: Cache frequently accessed product data
  • Bulk Operations: Efficient inventory and order updates
  • Query Optimization: Optimize complex product search queries
  • Performance Monitoring: Track key metrics and user experience

User Experience Impact: Performance directly affects business metrics:

  • Conversion Rates: Faster sites convert more visitors to customers
  • Search Quality: Better search results improve product discovery
  • Inventory Accuracy: Real-time stock levels prevent overselling
  • Mobile Performance: Optimized for mobile shopping experiences

class Product
  include Mongoid::Document

  field :name, type: String
  field :description, type: String
  field :price, type: Float
  field :category, type: String
  field :status, type: String, default: "active"
  field :inventory, type: Integer, default: 0
  field :tags, type: Array, default: []
  field :created_at, type: DateTime

  has_many :reviews
  
  # Optimized indexes for e-commerce queries
  index({ category: 1, status: 1 })
  index({ price: 1 })
  index({ tags: 1 })
  index({ status: 1, created_at: -1 })
  index({ name: "text", description: "text" })
  
  # Compound indexes for common query patterns
  index({ category: 1, price: 1, status: 1 })
  index({ status: 1, inventory: 1 })
  
  # Performance-optimized scopes
  scope :active, -> { where(status: "active") }
  scope :in_stock, -> { where(:inventory.gt => 0) }
  scope :by_category, ->(category) { where(category: category) }
  scope :price_range, ->(min, max) { where(:price.gte => min, :price.lte => max) }
  
  # Cached methods for expensive operations
  def cached_average_rating
    Rails.cache.fetch("product_#{id}_rating", expires_in: 1.hour) do
      reviews.avg(:rating) || 0
    end
  end
  
  def cached_review_count
    Rails.cache.fetch("product_#{id}_review_count", expires_in: 30.minutes) do
      reviews.count
    end
  end
  
  # Bulk operations for inventory updates
  def self.update_inventory_bulk(product_ids, quantities)
    bulk_operations = product_ids.map.with_index do |product_id, index|
      {
        update_one: {
          filter: { _id: product_id },
          update: { "$inc" => { inventory: quantities[index] } }
        }
      }
    end
    
    collection.bulk_write(bulk_operations)
  end
  
  # Performance monitoring
  def self.with_performance_logging
    start_time = Time.current
    result = yield
    duration = ((Time.current - start_time) * 1000).round
    
    Rails.logger.info "[PERF] Product query took #{duration}ms"
    result
  end
end
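
The scopes above compose (`Product.active.in_stock.by_category("books")`), but for ad-hoc search endpoints a raw filter hash is sometimes handier. A hedged sketch of a params-to-filter builder — the hash shape matches what `Product.collection.find` accepts, while the param names are assumptions for illustration:

```ruby
# Translate faceted-search params into a single MongoDB filter hash.
# Only facets present in params contribute clauses.
def product_filter(params)
  filter = { status: "active" }
  filter[:category] = params[:category] if params[:category]

  if params[:min_price] || params[:max_price]
    price = {}
    price["$gte"] = params[:min_price] if params[:min_price]
    price["$lte"] = params[:max_price] if params[:max_price]
    filter[:price] = price
  end

  filter[:tags] = { "$in" => params[:tags] } if params[:tags]&.any?
  filter
end

product_filter(category: "books", min_price: 10)
# => { status: "active", category: "books", price: { "$gte" => 10 } }
```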

class Order
  include Mongoid::Document
  
  field :order_number, type: String
  field :total_amount, type: Float
  field :status, type: String, default: "pending"
  field :created_at, type: DateTime
  
  belongs_to :user
  has_many :order_items
  
  # Optimized indexes for order queries
  index({ user_id: 1, created_at: -1 })
  index({ status: 1, created_at: -1 })
  index({ order_number: 1 }, { unique: true })
  
  # Aggregation for order analytics
  def self.sales_analytics(start_date, end_date)
    collection.aggregate([
      { "$match" => {
        created_at: { "$gte" => start_date, "$lte" => end_date },
        status: "completed"
      }},
      { "$group" => {
        "_id" => {
          "year" => { "$year" => "$created_at" },
          "month" => { "$month" => "$created_at" },
          "day" => { "$dayOfMonth" => "$created_at" }
        },
        "total_sales" => { "$sum" => "$total_amount" },
        "order_count" => { "$sum" => 1 }
      }},
      { "$sort" => { "_id" => 1 } }
    ])
  end
  
  # Bulk order processing
  def self.process_bulk_orders(order_ids)
    bulk_operations = order_ids.map do |order_id|
      {
        update_one: {
          filter: { _id: order_id, status: "pending" },
          update: { "$set" => { status: "processing", processed_at: Time.current } }
        }
      }
    end
    
    collection.bulk_write(bulk_operations)
  end
end

Social Media Performance Optimization

Social Media Performance Challenges: Social media platforms face unique performance challenges due to massive user bases, real-time content updates, and complex engagement patterns. Users expect instant content delivery, real-time notifications, and seamless interactions across multiple devices and platforms.

Key Performance Requirements: Social media applications must handle:

  • Real-Time Content: Instant posting and content delivery
  • High Engagement: Likes, comments, shares, and interactions
  • Content Discovery: Personalized feeds and recommendations
  • Geographic Features: Location-based content and services
  • Scalable Architecture: Handle millions of concurrent users

Optimization Strategies: Effective social media optimization involves:

  • Content Caching: Cache popular posts and user feeds
  • Geospatial Indexing: Optimize location-based queries
  • Engagement Tracking: Efficient like and comment processing
  • Feed Optimization: Fast personalized content delivery
  • Real-Time Features: Live updates and notifications

User Experience Impact: Performance directly affects user engagement:

  • Content Discovery: Fast feeds improve user engagement
  • Real-Time Interaction: Instant likes and comments
  • Mobile Experience: Optimized for mobile browsing
  • Content Relevance: Better recommendations through performance

class Post
  include Mongoid::Document
  
  field :content, type: String
  field :visibility, type: String, default: "public"
  field :status, type: String, default: "published"
  field :created_at, type: DateTime
  field :hashtags, type: Array, default: []
  field :location, type: Array # [longitude, latitude]
  
  belongs_to :user
  has_many :comments
  has_many :likes
  
  # Optimized indexes for social media queries
  index({ user_id: 1, created_at: -1 })
  index({ visibility: 1, created_at: -1 })
  index({ hashtags: 1 })
  index({ location: "2dsphere" })
  index({ status: 1, created_at: -1 })
  
  # Text search for content
  index({ content: "text" })
  
  # Performance-optimized scopes
  scope :public_posts, -> { where(visibility: "public", status: "published") }
  scope :recent, -> { order(created_at: -1) }
  scope :by_user, ->(user) { where(user: user) }
  scope :with_hashtag, ->(hashtag) { where(hashtags: hashtag) }
  
  # Cached engagement metrics
  def cached_engagement_score
    Rails.cache.fetch("post_#{id}_engagement", expires_in: 15.minutes) do
      like_count + comment_count * 2
    end
  end
  
  def like_count
    Rails.cache.fetch("post_#{id}_likes", expires_in: 5.minutes) do
      likes.count
    end
  end
  
  def comment_count
    Rails.cache.fetch("post_#{id}_comments", expires_in: 5.minutes) do
      comments.count
    end
  end
  
  # Feed generation with optimization
  def self.user_feed(user, limit = 20)
    followed_user_ids = user.followed_users.pluck(:id)
    
    collection.aggregate([
      { "$match" => {
        "$or" => [
          { user_id: { "$in" => followed_user_ids } },
          { user_id: user.id }
        ],
        visibility: "public",
        status: "published"
      }},
      { "$lookup" => {
        "from" => "users",
        "localField" => "user_id",
        "foreignField" => "_id",
        "as" => "user"
      }},
      { "$unwind" => "$user" },
      { "$sort" => { created_at: -1 } },
      { "$limit" => limit },
      { "$project" => {
        "content" => 1,
        "created_at" => 1,
        "hashtags" => 1,
        "user.name" => 1,
        "user.username" => 1
      }}
    ])
  end
  
  # Trending hashtags with caching
  def self.trending_hashtags(hours = 24)
    Rails.cache.fetch("trending_hashtags_#{hours}", expires_in: 10.minutes) do
      collection.aggregate([
        { "$match" => {
          created_at: { "$gte" => hours.hours.ago },
          visibility: "public"
        }},
        { "$unwind" => "$hashtags" },
        { "$group" => {
          "_id" => "$hashtags",
          "count" => { "$sum" => 1 }
        }},
        { "$sort" => { "count" => -1 } },
        { "$limit" => 10 }
      ])
    end
  end
  
  # Bulk operations for engagement
  def self.update_engagement_metrics
    collection.aggregate([
      { "$lookup" => {
        "from" => "likes",
        "localField" => "_id",
        "foreignField" => "post_id",
        "as" => "likes"
      }},
      { "$lookup" => {
        "from" => "comments",
        "localField" => "_id",
        "foreignField" => "post_id",
        "as" => "comments"
      }},
      { "$addFields" => {
        "like_count" => { "$size" => "$likes" },
        "comment_count" => { "$size" => "$comments" },
        "engagement_score" => {
          "$add" => [
            { "$size" => "$likes" },
            { "$multiply" => [{ "$size" => "$comments" }, 2] }
          ]
        }
      }},
      { "$out" => "posts_with_engagement" }
    ])
  end
end
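
What the `trending_hashtags` pipeline computes can be mirrored in plain Ruby, which is handy for unit-testing the expected output shape without a database:

```ruby
# Plain-Ruby equivalent of the $unwind/$group/$sort/$limit stages in
# trending_hashtags: count tag occurrences, most frequent first.
def trending(posts, limit: 10)
  posts.flat_map { |p| p[:hashtags] }            # $unwind
       .tally                                    # $group with $sum
       .sort_by { |_tag, count| -count }         # $sort
       .first(limit)                             # $limit
       .map { |tag, count| { "_id" => tag, "count" => count } }
end

posts = [
  { hashtags: %w[ruby rails] },
  { hashtags: %w[ruby mongodb] },
  { hashtags: %w[ruby] }
]
top = trending(posts, limit: 2)
# top.first => { "_id" => "ruby", "count" => 3 }
```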

Deployment & Production

Production Overview: Deploying MongoDB with Rails in production requires careful configuration, security considerations, monitoring, and deployment strategies. This section covers all aspects of production deployment and maintenance.

⚙️ Environment Configuration

Production Configuration

What is Production Configuration? Production configuration involves setting up MongoDB and Rails for a live, high-traffic environment. Unlike development, production environments require careful attention to security, performance, reliability, and scalability. The configuration must handle real user traffic, protect sensitive data, and maintain high availability.

Why Production Configuration Matters: Proper production configuration is critical for:

  • Security: Protect data and prevent unauthorized access
  • Performance: Optimize for high traffic and fast response times
  • Reliability: Ensure consistent uptime and data integrity
  • Scalability: Handle growth in users and data volume
  • Monitoring: Track system health and performance

Key Configuration Areas: Production setup involves several critical areas:

  • Connection Settings: Optimize connection pools and timeouts
  • Security Settings: SSL/TLS, authentication, and access controls
  • Performance Tuning: Read/write concerns and preferences
  • Error Handling: Retry logic and failure recovery
  • Logging: Appropriate log levels and monitoring

Environment Differences: Production differs from development in several ways:

  • Security Requirements: Stricter authentication and encryption
  • Performance Demands: Higher connection limits and optimizations
  • Monitoring Needs: Comprehensive logging and alerting
  • Reliability Focus: Redundancy and failover capabilities

# config/mongoid.yml
production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 20
        min_pool_size: 5
        max_idle_time: 300
        wait_queue_timeout: 2500
        read:
          mode: :secondary
        write_concern: { w: 1, j: true }
        ssl: true
        ssl_ca_cert: <%= ENV['MONGODB_SSL_CA_CERT'] %>
        ssl_cert: <%= ENV['MONGODB_SSL_CERT'] %>
        ssl_key: <%= ENV['MONGODB_SSL_KEY'] %>
        retry_writes: true
        retry_reads: true
        max_connecting: 10
        heartbeat_frequency: 10
        socket_timeout: 5
        connect_timeout: 10

staging:
  clients:
    default:
      uri: <%= ENV['MONGODB_STAGING_URI'] %>
      options:
        server_selection_timeout: 5
        max_pool_size: 10
        min_pool_size: 2
        ssl: true
        read:
          mode: :secondary
        write_concern: { w: 1, j: true }

# Environment-specific configuration
# config/environments/production.rb
Rails.application.configure do
  # ... other configuration ...
  
  # MongoDB logging: Mongoid's model-level log plus the driver's
  # command-level log (Mongo::Logger)
  Mongoid.logger = Rails.logger
  Mongoid.logger.level = Logger::INFO
  Mongo::Logger.logger = Rails.logger
  Mongo::Logger.logger.level = Logger::INFO
  # For query-level DEBUG output, raise these levels in
  # config/environments/development.rb instead
end

Environment Variables

What are Environment Variables? Environment variables are configuration values that are set outside of your application code and can be accessed by your application at runtime. They provide a secure and flexible way to manage configuration across different environments without hardcoding sensitive information in your code.

Why Use Environment Variables? Environment variables offer several advantages:

  • Security: Keep sensitive data out of source code
  • Flexibility: Different configurations for different environments
  • Deployment Safety: No risk of committing secrets to version control
  • Scalability: Easy to manage across multiple servers
  • Compliance: Meet security and audit requirements

Key Environment Variables: Essential variables for MongoDB production:

  • Database Connection: MongoDB URI and connection parameters
  • Security Credentials: SSL certificates and authentication
  • Performance Settings: Connection pool sizes and timeouts
  • Read/Write Preferences: Database consistency settings
  • Monitoring: Log levels and monitoring configurations

Best Practices: Effective environment variable management involves:

  • Secure Storage: Use secure vaults for sensitive data
  • Documentation: Document all required variables
  • Validation: Validate required variables at startup
  • Backup Strategy: Secure backup of configuration
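
The validate-at-startup point can be a few lines in an initializer. A sketch — the variable list is an assumption based on this guide's examples:

```ruby
# Fail fast at boot if a required variable is missing, instead of
# failing later on the first database call. Suitable for an initializer.
REQUIRED_VARS = %w[MONGODB_URI MONGODB_SSL_CA_CERT].freeze

def validate_env!(env = ENV)
  missing = REQUIRED_VARS.reject { |name| env[name].to_s.strip != "" }
  return true if missing.empty?

  raise "Missing required environment variables: #{missing.join(', ')}"
end

validate_env!("MONGODB_URI" => "mongodb://localhost:27017/app",
              "MONGODB_SSL_CA_CERT" => "/etc/ssl/ca.crt")  # => true
```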

# .env.production
MONGODB_URI=mongodb+srv://username:[email protected]/myapp_production
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem
MONGODB_READ_PREFERENCE=secondary
MONGODB_WRITE_CONCERN=majority
MONGODB_MAX_POOL_SIZE=20
MONGODB_MIN_POOL_SIZE=5

# .env.staging
MONGODB_STAGING_URI=mongodb+srv://username:[email protected]/myapp_staging
MONGODB_SSL_CA_CERT=/path/to/ca-certificate.crt
MONGODB_SSL_CERT=/path/to/client-certificate.crt
MONGODB_SSL_KEY=/path/to/client-key.pem

# Docker environment variables
# docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    environment:
      - MONGODB_URI=mongodb://mongodb:27017/myapp_production
      - RAILS_ENV=production
    depends_on:
      - mongodb
  
  mongodb:
    image: mongo:6.0
    environment:
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=password
    volumes:
      - mongodb_data:/data/db
      - ./mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro

volumes:
  mongodb_data:
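
One hardening step often added to a setup like this (a sketch; adjust the image and command to your deployment): a container healthcheck, so the web service only starts once mongod is actually answering pings rather than merely running.

```yaml
# Additions to docker-compose.yml: gate `web` on a passing mongod ping
services:
  mongodb:
    image: mongo:6.0
    healthcheck:
      test: ["CMD", "mongosh", "--quiet", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5

  web:
    depends_on:
      mongodb:
        condition: service_healthy
```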

Security Configuration

What is Security Configuration? Security configuration involves setting up authentication, authorization, encryption, and access controls for your MongoDB database in production. This includes user management, SSL/TLS encryption, network security, and audit logging to protect your data and comply with security requirements.

Why Security Configuration Matters: Proper security configuration is essential for:

  • Data Protection: Prevent unauthorized access to sensitive data
  • Compliance: Meet industry and regulatory requirements
  • Risk Mitigation: Reduce the risk of data breaches
  • Audit Requirements: Maintain audit trails for security events
  • Business Continuity: Protect against security incidents

Key Security Components: Production security involves several layers:

  • Authentication: User credentials and identity verification
  • Authorization: Role-based access controls and permissions
  • Encryption: SSL/TLS for data in transit and at rest
  • Network Security: Firewall rules and network isolation
  • Audit Logging: Track access and changes for compliance

Security Best Practices: Effective security implementation follows:

  • Principle of Least Privilege: Grant minimum necessary permissions
  • Defense in Depth: Multiple layers of security controls
  • Regular Updates: Keep security configurations current
  • Monitoring: Continuous security monitoring and alerting

# MongoDB security configuration
# mongo-init.js
db.createUser({
  user: "app_user",
  pwd: "secure_password",
  roles: [
    { role: "readWrite", db: "myapp_production" },
    { role: "dbAdmin", db: "myapp_production" }
  ]
})

# Enforce uniqueness on identity fields
db.users.createIndex({ "email": 1 }, { unique: true })
db.users.createIndex({ "username": 1 }, { unique: true })

# Enable authentication
# /etc/mongod.conf
security:
  authorization: enabled

# Network and TLS configuration
net:
  port: 27017
  bindIp: 127.0.0.1
  tls:
    mode: requireTLS
    certificateKeyFile: /path/to/mongodb.pem
    CAFile: /path/to/ca.crt

# Firewall configuration
# Allow only application server IPs
sudo ufw allow from 10.0.0.0/8 to any port 27017
sudo ufw allow from 172.16.0.0/12 to any port 27017
sudo ufw allow from 192.168.0.0/16 to any port 27017

🔒 Security Best Practices

Authentication & Authorization

What is Authentication & Authorization? Authentication verifies the identity of users or applications trying to access the database, while authorization determines what actions they can perform. Together, they form the foundation of database security, ensuring that only authorized users can access specific data and perform allowed operations.

Why Authentication & Authorization Matter: Proper access control is critical for:

  • Data Security: Prevent unauthorized access to sensitive information
  • Compliance: Meet regulatory requirements for data protection
  • Audit Trails: Track who accessed what and when
  • Risk Management: Minimize the impact of security incidents
  • Business Continuity: Protect against data breaches and loss

Authentication Methods: MongoDB supports several authentication approaches:

  • Username/Password: Traditional credential-based authentication
  • X.509 Certificates: Certificate-based authentication for applications
  • LDAP Integration: Enterprise directory integration
  • OAuth/SAML: External identity provider integration

Authorization Strategies: Effective authorization involves:

  • Role-Based Access: Assign permissions based on user roles
  • Database-Level Permissions: Control access to specific databases
  • Collection-Level Permissions: Fine-grained access control
  • Operation-Level Permissions: Control specific operations (read, write, etc.)

# User management for MongoDB
# Create application user
use myapp_production
db.createUser({
  user: "app_user",
  pwd: "secure_password_here",
  roles: [
    { role: "readWrite", db: "myapp_production" },
    { role: "dbAdmin", db: "myapp_production" }
  ]
})

# Create read-only user for analytics
db.createUser({
  user: "analytics_user",
  pwd: "analytics_password",
  roles: [
    { role: "read", db: "myapp_production" }
  ]
})

# Create admin user for maintenance
use admin
db.createUser({
  user: "admin_user",
  pwd: "admin_password",
  roles: [
    { role: "userAdminAnyDatabase", db: "admin" },
    { role: "dbAdminAnyDatabase", db: "admin" },
    { role: "clusterAdmin", db: "admin" }
  ]
})

# Role-based access control
# Custom roles for specific operations
db.createRole({
  role: "data_analyst",
  privileges: [
    { resource: { db: "myapp_production", collection: "" }, actions: ["find"] },
    { resource: { db: "myapp_production", collection: "users" }, actions: ["find"] }
  ],
  roles: []
})

# Assign role to user
db.grantRolesToUser("analytics_user", [{ role: "data_analyst", db: "myapp_production" }])

Network Security

# Network security configuration
# MongoDB configuration file
# /etc/mongod.conf

# Network and TLS settings
net:
  port: 27017
  bindIp: 127.0.0.1,10.0.0.5  # Only listen on specific interfaces
  maxIncomingConnections: 100
  tls:
    mode: requireTLS
    certificateKeyFile: /etc/ssl/mongodb.pem
    CAFile: /etc/ssl/ca.crt
    allowConnectionsWithoutCertificates: false

# Security settings
security:
  authorization: enabled
  keyFile: /etc/mongodb/keyfile
  clusterAuthMode: keyFile

# Firewall rules
# Allow only application servers
sudo ufw allow from 10.0.0.0/8 to any port 27017
sudo ufw allow from 172.16.0.0/12 to any port 27017

# Block external access
sudo ufw deny 27017

# VPN configuration for remote access
# Only allow connections through VPN
sudo ufw allow from 10.8.0.0/24 to any port 27017

# Rate limiting
# Limit connections per IP
sudo iptables -A INPUT -p tcp --dport 27017 -m connlimit --connlimit-above 10 -j DROP

Data Encryption

# Encryption at rest (MongoDB Enterprise, mongod.conf)
security:
  enableEncryption: true
  encryptionKeyFile: /etc/mongodb/keyfile

# Application-level encryption
class User
  include Mongoid::Document

  field :email, type: String
  field :encrypted_ssn, type: String
  field :encrypted_credit_card, type: String

  # Plaintext values live only in memory, never in the database
  attr_accessor :ssn, :credit_card

  # Encrypt sensitive data before saving
  before_save :encrypt_sensitive_data
  after_find :decrypt_sensitive_data

  private

  def encrypt_sensitive_data
    self.encrypted_ssn = encrypt_field(ssn) if ssn.present?
    self.encrypted_credit_card = encrypt_field(credit_card) if credit_card.present?
  end

  def decrypt_sensitive_data
    self.ssn = decrypt_field(encrypted_ssn) if encrypted_ssn.present?
    self.credit_card = decrypt_field(encrypted_credit_card) if encrypted_credit_card.present?
  end

  # AES-256-GCM with a random per-value IV; the IV and 16-byte auth tag
  # are stored alongside the ciphertext (IV | ciphertext | tag)
  def encrypt_field(value)
    return nil if value.blank?

    cipher = OpenSSL::Cipher.new('aes-256-gcm')
    cipher.encrypt
    cipher.key = encryption_key
    iv = cipher.random_iv # 12 bytes for GCM

    encrypted = cipher.update(value) + cipher.final
    Base64.strict_encode64(iv + encrypted + cipher.auth_tag)
  end

  def decrypt_field(encrypted_value)
    return nil if encrypted_value.blank?

    decoded = Base64.strict_decode64(encrypted_value)
    iv = decoded[0, 12]
    auth_tag = decoded[-16..]
    encrypted = decoded[12...-16]

    cipher = OpenSSL::Cipher.new('aes-256-gcm')
    cipher.decrypt
    cipher.key = encryption_key
    cipher.iv = iv
    cipher.auth_tag = auth_tag # tampering makes cipher.final raise

    cipher.update(encrypted) + cipher.final
  end

  def encryption_key
    # Credentials hold a 64-char hex string; decode to 32 raw bytes
    [Rails.application.credentials.mongodb_encryption_key].pack("H*")
  end
end
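
The AES-256-GCM helpers can be sanity-checked as a standalone round trip using only the Ruby standard library (a throwaway random key here stands in for the Rails credential):

```ruby
require "openssl"
require "base64"

# Throwaway random key; in the app this comes from Rails credentials
KEY = OpenSSL::Random.random_bytes(32)

def encrypt_field(value, key = KEY)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv                       # 12-byte IV, unique per call
  ciphertext = cipher.update(value) + cipher.final
  Base64.strict_encode64(iv + ciphertext + cipher.auth_tag)
end

def decrypt_field(encoded, key = KEY)
  raw = Base64.strict_decode64(encoded)
  iv, auth_tag = raw[0, 12], raw[-16..]
  ciphertext = raw[12...-16]

  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = auth_tag                  # tampering makes final raise
  cipher.update(ciphertext) + cipher.final
end

token = encrypt_field("123-45-6789")
decrypt_field(token)  # => "123-45-6789"
```

Because the IV is random, encrypting the same value twice yields different ciphertexts, which prevents equality-based inference on stored data.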

# Credentials management
# Generate a key once with SecureRandom.hex(32), then store it via
# `bin/rails credentials:edit`; the decrypted credentials.yml.enc
# then contains:
#
#   mongodb_encryption_key: "<64-char hex string>"

📊 Monitoring & Alerting

Health Monitoring

What is Health Monitoring? Health monitoring involves continuously checking the status and performance of your MongoDB database to ensure it's operating correctly and efficiently. This includes monitoring connection status, performance metrics, storage utilization, and replication health to proactively identify and resolve issues before they impact users.

Why Health Monitoring Matters: Comprehensive monitoring is essential for:

  • Proactive Issue Detection: Identify problems before they affect users
  • Performance Optimization: Track metrics to optimize database performance
  • Capacity Planning: Monitor growth trends and plan for scaling
  • Uptime Assurance: Ensure high availability and reliability
  • Compliance Requirements: Meet audit and regulatory requirements

Key Monitoring Areas: Effective health monitoring covers:

  • Connection Health: Database connectivity and response times
  • Performance Metrics: Query performance and resource utilization
  • Storage Health: Disk usage, fragmentation, and growth
  • Replication Status: Replica set health and synchronization
  • Security Events: Authentication failures and access patterns

Monitoring Strategy: Effective monitoring implementation involves:

  • Automated Checks: Regular automated health assessments
  • Alert Thresholds: Set appropriate alerting levels
  • Historical Tracking: Maintain metrics for trend analysis
  • Response Procedures: Clear escalation and response processes

# Database health monitoring
class DatabaseHealthMonitor
  def self.check_health
    {
      connection: check_connection,
      performance: check_performance,
      storage: check_storage,
      replication: check_replication
    }
  end
  
  def self.check_connection
    begin
      Mongoid.default_client.database.command(ping: 1)
      { status: "healthy", message: "Database is responding" }
    rescue => e
      { status: "unhealthy", message: e.message }
    end
  end
  
  def self.check_performance
    stats = Mongoid.default_client.database.command(dbStats: 1).first
    
    {
      collections: stats['collections'],
      data_size: stats['dataSize'],
      storage_size: stats['storageSize'],
      indexes: stats['indexes'],
      index_size: stats['indexSize']
    }
  end
  
  def self.check_storage
    stats = Mongoid.default_client.database.command(dbStats: 1).first
    
    total_size = stats['storageSize']
    data_size = stats['dataSize']
    # Guard against a brand-new database reporting zero storage
    fragmentation = total_size.zero? ? 0.0 : ((total_size - data_size) / total_size.to_f * 100).round(2)
    
    {
      total_size: total_size,
      data_size: data_size,
      fragmentation_percentage: fragmentation,
      status: fragmentation > 50 ? "warning" : "healthy"
    }
  end
  
  def self.check_replication
    begin
      # replSetGetStatus must be run against the admin database
      status = Mongoid.default_client.use(:admin).database.command(replSetGetStatus: 1).first
      
      {
        status: status['ok'] == 1 ? "healthy" : "unhealthy",
        members: status['members'].map { |m| m['stateStr'] },
        primary: status['members'].find { |m| m['state'] == 1 }&.dig('name')
      }
    rescue => e
      # Standalone servers raise here (no replica set configured)
      { status: "unhealthy", message: e.message }
    end
  end
end
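The `check_storage` method above derives a fragmentation percentage from `storageSize` versus `dataSize`. As a standalone sketch of that arithmetic (the sample numbers are made up, not real server stats):

```ruby
# Fragmentation estimate: the share of allocated storage not occupied
# by live data. A zero storage_size is guarded to avoid NaN.
def fragmentation_pct(storage_size, data_size)
  return 0.0 if storage_size.zero?
  ((storage_size - data_size) / storage_size.to_f * 100).round(2)
end

puts fragmentation_pct(1_000_000, 400_000)  # 60.0
puts fragmentation_pct(0, 0)                # 0.0
```

With 1 MB allocated and 400 KB of live data, 60% of the storage is overhead, well past the 50% warning threshold used above.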

# Monitoring job
class DatabaseMonitoringJob < ApplicationJob
  queue_as :default
  
  def perform
    health = DatabaseHealthMonitor.check_health
    
    # Log health status
    Rails.logger.info "Database health: #{health}"
    
    # Send alerts for unhealthy status
    if health[:connection][:status] == "unhealthy"
      AlertService.send_alert("Database connection failed: #{health[:connection][:message]}")
    end
    
    if health[:storage][:status] == "warning"
      AlertService.send_alert("High database fragmentation: #{health[:storage][:fragmentation_percentage]}%")
    end
    
    # Store metrics
    DatabaseMetric.create(
      timestamp: Time.current,
      connection_status: health[:connection][:status],
      storage_fragmentation: health[:storage][:fragmentation_percentage],
      performance_metrics: health[:performance]
    )
  end
end
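The job above logs the combined health hash check by check. A small roll-up helper (hypothetical, not part of the job) can collapse it into a single status for alert routing, assuming each sub-check returns a `:status` key:

```ruby
# Overall status: unhealthy if any sub-check reports unhealthy,
# warning if any reports warning, otherwise healthy.
def overall_status(health)
  statuses = health.values.map { |check| check[:status] }
  return "unhealthy" if statuses.include?("unhealthy")
  return "warning" if statuses.include?("warning")
  "healthy"
end

health = {
  connection: { status: "healthy" },
  storage: { status: "warning" }
}
puts overall_status(health)  # warning
```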

Performance Monitoring

What is Performance Monitoring? Performance monitoring tracks key metrics that indicate how well your MongoDB database is performing. This includes query execution times, resource utilization, throughput, and response times to identify bottlenecks and optimize database performance for better user experience.

Why Performance Monitoring Matters: Performance monitoring is critical for:

  • User Experience: Fast response times improve user satisfaction
  • Capacity Planning: Understand current usage and plan for growth
  • Cost Optimization: Use resources efficiently and reduce costs
  • Problem Resolution: Quickly identify and fix performance issues
  • Business Impact: Performance directly affects business metrics

Key Performance Metrics: Essential metrics to monitor include:

  • Query Performance: Execution times and throughput
  • Resource Utilization: CPU, memory, and disk usage
  • Connection Pool: Connection usage and availability
  • Index Efficiency: Index hit rates and usage patterns
  • Error Rates: Failed operations and timeout frequency

Monitoring Implementation: Effective performance monitoring involves:

  • Real-Time Monitoring: Track metrics in real-time
  • Historical Analysis: Maintain data for trend analysis
  • Alerting: Notify when thresholds are exceeded
  • Dashboard Visualization: Clear metrics presentation

# Query performance monitoring
class QueryPerformanceMonitor
  def self.track_query(query, duration, collection = nil)
    Rails.logger.info "[QUERY] #{query} on #{collection} took #{duration}ms"
    
    # Store slow queries
    if duration > 100
      SlowQuery.create(
        query: query,
        duration: duration,
        collection: collection,
        timestamp: Time.current
      )
    end
    
    # Update metrics
    QueryMetric.create(
      query: query,
      duration: duration,
      collection: collection,
      timestamp: Time.current
    )
  end
  
  def self.analyze_slow_queries
    slow_queries = SlowQuery.where(:timestamp.gte => 1.hour.ago)
    
    slow_queries.group_by(&:collection).each do |collection, queries|
      avg_duration = (queries.sum(&:duration).to_f / queries.length).round(2)
      
      Rails.logger.warn "Slow queries detected in #{collection}: avg #{avg_duration}ms"
      
      if avg_duration > 500
        AlertService.send_alert("High average query time in #{collection}: #{avg_duration}ms")
      end
    end
  end
end
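`track_query` above assumes callers already know each query's duration. A minimal, framework-free way to measure a block's elapsed time (hypothetical `time_query` helper) uses a monotonic clock:

```ruby
# Time a block with a monotonic clock and flag it when it exceeds
# a slow-query threshold (milliseconds).
def time_query(label, slow_threshold_ms: 100)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round(2)
  warn "[SLOW] #{label}: #{elapsed_ms}ms" if elapsed_ms > slow_threshold_ms
  [result, elapsed_ms]
end

result, ms = time_query("users.find") { sleep 0.01; :done }
```

In a real app the measured duration would be handed to `QueryPerformanceMonitor.track_query`; a monotonic clock is used because wall-clock time can jump during NTP adjustments.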

# Connection pool monitoring
# (The Ruby driver does not expose a cluster-wide pool_stats API;
# the server-side serverStatus command reports connection counts instead.)
class ConnectionPoolMonitor
  def self.stats
    status = Mongoid.default_client.database.command(serverStatus: 1).first
    connections = status['connections']
    
    {
      active_connections: connections['current'],
      available_connections: connections['available'],
      total_connections: connections['current'] + connections['available']
    }
  end
  
  def self.check_pool_health
    stats = self.stats
    
    if stats[:active_connections] > stats[:total_connections] * 0.8
      AlertService.send_alert("High connection usage: #{stats[:active_connections]}/#{stats[:total_connections]}")
    end
    
    stats
  end
end

# Index usage monitoring
class IndexUsageMonitor
  def self.analyze_index_usage
    collections = Mongoid.default_client.database.collection_names
    
    collections.each do |collection_name|
      index_stats = Mongoid.default_client.database.collection(collection_name).aggregate([
        { "$indexStats" => {} }
      ])
      
      # $indexStats exposes accesses.ops (operations that used the index)
      # and accesses.since (when counting began)
      index_stats.each do |index|
        next if index['name'] == '_id_'
        
        if index['accesses']['ops'].to_i == 0
          Rails.logger.warn "Unused index #{index['name']} on #{collection_name} (no operations since #{index['accesses']['since']})"
        end
      end
    end
  end
end
🔄 Backup & Recovery

Backup Strategies

What are Backup Strategies? Backup strategies involve creating and managing copies of your MongoDB data to protect against data loss, corruption, or accidental deletion. A comprehensive backup strategy includes regular automated backups, secure storage, testing procedures, and recovery plans to ensure business continuity in case of disasters.

Why Backup Strategies Matter: Proper backup strategies are critical for:

  • Data Protection: Safeguard against data loss and corruption
  • Business Continuity: Ensure rapid recovery from disasters
  • Compliance Requirements: Meet regulatory backup requirements
  • Risk Mitigation: Reduce the impact of system failures
  • Peace of Mind: Confidence in data safety and recovery

Backup Types: Different backup approaches serve different purposes:

  • Full Backups: Complete database snapshots
  • Incremental Backups: Only changed data since last backup
  • Point-in-Time Backups: Consistent snapshots at specific times
  • Continuous Backups: Real-time data protection

Backup Best Practices: Effective backup strategies include:

  • Automated Scheduling: Regular automated backup processes
  • Secure Storage: Encrypted backups in multiple locations
  • Testing Procedures: Regular backup restoration testing
  • Retention Policies: Clear backup retention and cleanup

# Automated backup script
#!/bin/bash
# backup_mongodb.sh

# Configuration
BACKUP_DIR="/backups/mongodb"
DATE=$(date +%Y%m%d_%H%M%S)
DB_NAME="myapp_production"
RETENTION_DAYS=30

# Create backup directory
mkdir -p $BACKUP_DIR

# Perform backup
mongodump \
  --uri="mongodb+srv://<user>:<password>@<cluster>.mongodb.net/$DB_NAME" \
  --out="$BACKUP_DIR/backup_$DATE" \
  --gzip

# Compress backup
tar -czf "$BACKUP_DIR/backup_$DATE.tar.gz" -C "$BACKUP_DIR" "backup_$DATE"

# Remove uncompressed backup
rm -rf "$BACKUP_DIR/backup_$DATE"

# Upload to cloud storage
aws s3 cp "$BACKUP_DIR/backup_$DATE.tar.gz" "s3://my-backup-bucket/mongodb/"

# Clean up old backups
find $BACKUP_DIR -name "backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete

# Log backup completion
echo "Backup completed: backup_$DATE.tar.gz" >> /var/log/mongodb_backup.log

# Rails backup task
# lib/tasks/mongodb.rake
namespace :mongodb do
  desc "Create database backup"
  task backup: :environment do
    backup_dir = Rails.root.join("backups")
    timestamp = Time.current.strftime("%Y%m%d_%H%M%S")
    backup_path = backup_dir.join("backup_#{timestamp}")
    
    # Create backup directory
    FileUtils.mkdir_p(backup_path)
    
    # Perform backup
    system("mongodump --uri='#{ENV['MONGODB_URI']}' --out='#{backup_path}' --gzip")
    
    # Compress backup
    system("tar -czf '#{backup_path}.tar.gz' -C '#{backup_dir}' 'backup_#{timestamp}'")
    
    # Remove uncompressed backup
    FileUtils.rm_rf(backup_path)
    
    puts "Backup created: #{backup_path}.tar.gz"
  end
  
  desc "Restore database from backup"
  task :restore, [:backup_file] => :environment do |task, args|
    backup_file = args[:backup_file]
    
    unless backup_file
      puts "Usage: rake mongodb:restore[backup_file.tar.gz]"
      exit 1
    end
    
    backup_path = Rails.root.join("backups", backup_file)
    
    unless File.exist?(backup_path)
      puts "Backup file not found: #{backup_path}"
      exit 1
    end
    
    # Extract backup
    system("tar -xzf '#{backup_path}' -C '#{Rails.root.join('backups')}'")
    
    # Restore database
    extracted_dir = backup_path.to_s.gsub('.tar.gz', '')
    system("mongorestore --uri='#{ENV['MONGODB_URI']}' --drop '#{extracted_dir}'")
    
    # Clean up extracted files
    FileUtils.rm_rf(extracted_dir)
    
    puts "Database restored from: #{backup_file}"
  end
end
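The shell script prunes old archives with `find ... -mtime +N -delete`. The same retention sweep in plain Ruby (hypothetical `prune_backups` helper, no Rails required) looks like:

```ruby
require "fileutils"

# Delete backup archives older than retention_days, mirroring
# `find $BACKUP_DIR -name "backup_*.tar.gz" -mtime +N -delete`.
def prune_backups(dir, retention_days: 30, now: Time.now)
  cutoff = now - retention_days * 24 * 60 * 60
  Dir.glob(File.join(dir, "backup_*.tar.gz"))
     .select { |path| File.mtime(path) < cutoff }
     .each { |path| FileUtils.rm_f(path) }
end
```

This could be invoked at the end of the `mongodb:backup` task so every backup run also enforces the retention policy.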

Recovery Procedures

What are Recovery Procedures? Recovery procedures are step-by-step processes for restoring your MongoDB database from backups in case of data loss, corruption, or system failures. These procedures ensure that you can quickly and safely restore your data to a consistent state and resume normal operations with minimal downtime.

Why Recovery Procedures Matter: Proper recovery procedures are essential for:

  • Minimize Downtime: Quick restoration reduces business impact
  • Data Integrity: Ensure restored data is consistent and complete
  • Business Continuity: Maintain operations during disasters
  • Risk Management: Reduce the impact of data loss incidents
  • Compliance: Meet recovery time objectives (RTO)

Recovery Types: Different recovery scenarios require different approaches:

  • Full Recovery: Complete database restoration from backup
  • Point-in-Time Recovery: Restore to a specific moment
  • Partial Recovery: Restore specific collections or data
  • Disaster Recovery: Complete system restoration

Recovery Best Practices: Effective recovery procedures include:

  • Documented Procedures: Clear, step-by-step recovery instructions
  • Testing: Regular recovery testing and validation
  • Automation: Automated recovery scripts and processes
  • Validation: Post-recovery verification and testing

# Disaster recovery script
#!/bin/bash
# disaster_recovery.sh

# Configuration
BACKUP_S3_BUCKET="my-backup-bucket"
BACKUP_DIR="/backups/mongodb"
DB_NAME="myapp_production"

# Get latest backup from S3
LATEST_BACKUP=$(aws s3 ls "s3://$BACKUP_S3_BUCKET/mongodb/" | sort | tail -1 | awk '{print $4}')

if [ -z "$LATEST_BACKUP" ]; then
  echo "No backup found in S3"
  exit 1
fi

echo "Restoring from backup: $LATEST_BACKUP"

# Download backup from S3
aws s3 cp "s3://$BACKUP_S3_BUCKET/mongodb/$LATEST_BACKUP" "$BACKUP_DIR/"

# Extract backup
tar -xzf "$BACKUP_DIR/$LATEST_BACKUP" -C "$BACKUP_DIR"

# Restore database
mongorestore \
  --uri="mongodb+srv://<user>:<password>@<cluster>.mongodb.net/$DB_NAME" \
  --drop \
  "$BACKUP_DIR"/backup_*

# Clean up (the glob must sit outside the quotes to expand)
rm -rf "$BACKUP_DIR"/backup_*

echo "Disaster recovery completed"
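The script picks the newest archive with `sort | tail -1`; because the names embed a `YYYYMMDD_HHMMSS` timestamp, lexicographic order equals chronological order. A small Ruby check of that selection logic (hypothetical `latest_backup` helper):

```ruby
# Latest-backup selection: timestamped names sort lexicographically
# in the same order as their dates, so max picks the newest.
def latest_backup(names)
  names.grep(/\Abackup_\d{8}_\d{6}\.tar\.gz\z/).max
end

names = [
  "backup_20240101_000000.tar.gz",
  "backup_20240315_120000.tar.gz",
  "notes.txt"
]
puts latest_backup(names)  # backup_20240315_120000.tar.gz
```

The regexp also filters out unrelated files, so a stray object in the bucket cannot be selected for restore.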

# Rails recovery task
# lib/tasks/mongodb.rake
namespace :mongodb do
  desc "Perform disaster recovery"
  task disaster_recovery: :environment do
    puts "Starting disaster recovery..."
    
    # Stop application
    system("sudo systemctl stop rails-app")
    
    # Get latest backup from cloud storage
    backup_file = get_latest_backup_from_s3
    
    if backup_file.nil?
      puts "No backup found in S3"
      exit 1
    end
    
    puts "Restoring from backup: #{backup_file}"
    
    # Download and restore backup
    download_backup_from_s3(backup_file)
    restore_database_from_backup(backup_file)
    
    # Verify restoration
    if verify_database_restoration
      puts "Disaster recovery completed successfully"
      
      # Restart application
      system("sudo systemctl start rails-app")
    else
      puts "Disaster recovery failed"
      exit 1
    end
  end
  
  # NOTE: `def` inside a Rake namespace defines top-level methods;
  # `private` offers no namespacing here, so it is omitted.
  
  def get_latest_backup_from_s3
    # Implementation to get latest backup from S3
    # Returns backup filename or nil
  end
  
  def download_backup_from_s3(backup_file)
    # Implementation to download backup from S3
  end
  
  def restore_database_from_backup(backup_file)
    # Implementation to restore database
  end
  
  def verify_database_restoration
    # Implementation to verify restoration
    # Returns true if successful, false otherwise
  end
end

# Point-in-time recovery
class PointInTimeRecovery
  def self.recover_to_timestamp(timestamp)
    # Get oplog entries up to timestamp
    oplog_entries = get_oplog_entries_until(timestamp)
    
    # Apply oplog entries to restore to point in time
    apply_oplog_entries(oplog_entries)
    
    puts "Recovered database to #{timestamp}"
  end
  
  # NOTE: `private` does not apply to `def self.` methods;
  # use private_class_method to hide these helpers.
  
  def self.get_oplog_entries_until(timestamp)
    # Implementation to get oplog entries
  end
  
  def self.apply_oplog_entries(entries)
    # Implementation to apply oplog entries
  end
end
🚀 Deployment Strategies

Docker Deployment

What is Docker Deployment? Docker deployment involves packaging your Rails application and MongoDB database into containers that can be easily deployed, scaled, and managed. Docker provides consistency across environments, simplifies deployment processes, and enables efficient resource utilization through containerization.

Why Use Docker Deployment? Docker deployment offers several advantages:

  • Environment Consistency: Same environment across development and production
  • Easy Scaling: Simple horizontal scaling with container orchestration
  • Isolation: Applications run in isolated environments
  • Portability: Easy deployment across different platforms
  • Resource Efficiency: Better resource utilization than traditional VMs

Docker Components: A complete Docker deployment includes:

  • Application Container: Rails application with all dependencies
  • Database Container: MongoDB with proper configuration
  • Web Server: Nginx for load balancing and SSL termination
  • Cache Container: Redis for session and cache storage
  • Networking: Container communication and external access

Deployment Best Practices: Effective Docker deployment involves:

  • Multi-Stage Builds: Optimize container size and security
  • Environment Variables: Secure configuration management
  • Health Checks: Monitor container health and availability
  • Persistent Storage: Proper data persistence for databases

# Docker Compose for production
# docker-compose.prod.yml
version: '3.8'

services:
  web:
    build: .
    environment:
      - RAILS_ENV=production
      - MONGODB_URI=mongodb://admin:secure_password@mongodb:27017/myapp_production?authSource=admin
      - REDIS_URL=redis://redis:6379/0
    ports:
      - "3000:3000"
    depends_on:
      - mongodb
      - redis
    restart: unless-stopped
    volumes:
      - ./logs:/app/logs
    networks:
      - app-network

  mongodb:
    image: mongo:6.0
    environment:
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=secure_password
    volumes:
      - mongodb_data:/data/db
      - ./mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro
      - ./mongodb.conf:/etc/mongod.conf:ro
    ports:
      - "27017:27017"
    restart: unless-stopped
    networks:
      - app-network

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    restart: unless-stopped
    networks:
      - app-network

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - web
    restart: unless-stopped
    networks:
      - app-network

volumes:
  mongodb_data:
  redis_data:

networks:
  app-network:
    driver: bridge

# MongoDB configuration for Docker
# mongodb.conf
storage:
  dbPath: /data/db
  journal:
    enabled: true

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

net:
  port: 27017
  bindIp: 0.0.0.0

security:
  authorization: enabled

# Kubernetes deployment
# k8s/mongodb-deployment.yaml
# NOTE: a Deployment suits a single-node MongoDB only; a multi-member
# replica set needs a StatefulSet with one volume per member.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo:6.0
        ports:
        - containerPort: 27017
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: username
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: password
        volumeMounts:
        - name: mongodb-data
          mountPath: /data/db
        - name: mongodb-config
          mountPath: /etc/mongod.conf
          subPath: mongod.conf
      volumes:
      - name: mongodb-data
        persistentVolumeClaim:
          claimName: mongodb-pvc
      - name: mongodb-config
        configMap:
          name: mongodb-config

---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb
  ports:
  - port: 27017
    targetPort: 27017
  type: ClusterIP

CI/CD Pipeline

What is CI/CD Pipeline? CI/CD (Continuous Integration/Continuous Deployment) pipeline automates the process of testing, building, and deploying your Rails application with MongoDB. It ensures code quality, automates testing, and provides reliable, repeatable deployments with minimal human intervention.

Why Use CI/CD Pipeline? CI/CD pipelines provide several benefits:

  • Automated Testing: Run tests automatically on every code change
  • Faster Deployment: Reduce time from code to production
  • Quality Assurance: Catch issues early in development
  • Consistent Deployments: Repeatable and reliable deployment process
  • Rollback Capability: Quick rollback to previous versions

Pipeline Stages: A complete CI/CD pipeline includes:

  • Code Analysis: Static code analysis and security checks
  • Testing: Unit, integration, and end-to-end tests
  • Building: Create deployable artifacts
  • Deployment: Automated deployment to staging/production
  • Monitoring: Post-deployment health checks

Database Considerations: CI/CD with MongoDB requires special attention:

  • Migration Testing: Test database migrations in CI
  • Data Integrity: Ensure migrations don't break existing data
  • Rollback Strategy: Plan for migration rollbacks
  • Environment Isolation: Separate test and production databases

# GitHub Actions workflow
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      mongodb:
        image: mongo:6.0
        env:
          MONGO_INITDB_ROOT_USERNAME: admin
          MONGO_INITDB_ROOT_PASSWORD: password
        options: >-
          --health-cmd "mongosh --eval 'db.runCommand(\"ping\").ok'"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 27017:27017

    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: 3.2.0
        bundler-cache: true
    
    - name: Install dependencies
      run: |
        sudo apt-get update
        sudo apt-get install -y mongodb-clients
    
    - name: Run tests
      env:
        MONGODB_URI: mongodb://admin:password@localhost:27017/test?authSource=admin
      run: |
        bundle install
        bundle exec rake db:mongoid:create_indexes
        bundle exec rspec
    
    - name: Run security checks
      run: |
        bundle exec brakeman
        bundle exec bundle-audit check --update

  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Deploy to production
      env:
        DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
        MONGODB_URI: ${{ secrets.MONGODB_URI }}
      run: |
        # Deploy to production server
        echo "Deploying to production..."
        
        # Mongoid has no schema migrations; ensure indexes exist instead
        bundle exec rake db:mongoid:create_indexes
        
        # Restart application
        sudo systemctl restart rails-app
        
        # Health check
        curl -f http://localhost/health || exit 1

# Database migration script
# lib/tasks/deploy.rake
namespace :deploy do
  desc "Run database migrations"
  task migrate: :environment do
    puts "Running database migrations..."
    
    # Backup before migration
    Rake::Task['mongodb:backup'].invoke
    
    # Apply index changes (Mongoid has no schema migrations)
    Rake::Task['db:mongoid:create_indexes'].invoke
    
    # Verify migration
    if verify_migration_success
      puts "Migration completed successfully"
    else
      puts "Migration failed, rolling back..."
      Rake::Task['mongodb:restore'].invoke
      exit 1
    end
  end
  
  desc "Rollback database migration"
  task rollback: :environment do
    puts "Rolling back database migration..."
    
    # Restore from backup
    Rake::Task['mongodb:restore'].invoke
    
    puts "Rollback completed"
  end
  
  # NOTE: helpers defined inside a Rake namespace are top-level methods
  
  def verify_migration_success
    # Implementation to verify migration
    # Returns true if successful, false otherwise
  end
end

Real-World Examples

📋 Task Management App - Complete Mini Project
Project Overview: We'll build a complete Task Management application using MongoDB with Rails. This project demonstrates real-world patterns including user authentication, task management, project organization, and collaboration features.

🎯 Project Features

  • User Management: Registration, authentication, and profile management
  • Project Organization: Create and manage multiple projects
  • Task Management: Create, assign, and track tasks with priorities
  • Real-time Updates: Activity feeds and notifications

🏗️ Project Setup

Let's start by setting up our Rails application with MongoDB. We'll use Mongoid as our ODM and include essential gems for authentication and API handling.

# Gemfile
source 'https://rubygems.org'

gem 'rails', '~> 7.0'
gem 'mongoid', '~> 8.0'
gem 'bcrypt', '~> 3.1'
gem 'jwt', '~> 2.2'
gem 'pundit', '~> 2.1'
gem 'kaminari', '~> 1.2'
gem 'ransack', '~> 4.0'

group :development, :test do
  gem 'rspec-rails', '~> 6.0'
  gem 'factory_bot_rails', '~> 6.2'
  gem 'faker', '~> 3.1'
end
# config/application.rb
require_relative "boot"

# Require railties individually -- "rails/all" would load Active Record,
# which this app skips in favor of Mongoid
require "rails"
require "active_model/railtie"
require "active_job/railtie"
require "action_controller/railtie"
require "action_mailer/railtie"
require "action_cable/engine"

Bundler.require(*Rails.groups)

module TaskManager
  class Application < Rails::Application
    config.load_defaults 7.0
    config.api_only = true

    # API-only apps strip session middleware; add it back for cookie sessions
    config.middleware.use ActionDispatch::Session::CookieStore
  end
end

👤 User Model

Simple User model with basic authentication fields and project relationship.

# app/models/user.rb
class User
  include Mongoid::Document
  include Mongoid::Timestamps

  field :email, type: String
  field :username, type: String
  field :password_digest, type: String

  has_many :projects, inverse_of: :owner, dependent: :destroy

  validates :email, presence: true, uniqueness: true
  validates :username, presence: true, uniqueness: true

  index({ email: 1 }, { unique: true })
  index({ username: 1 }, { unique: true })
end

📁 Project Model

Project model with basic fields and user relationship.

# app/models/project.rb
class Project
  include Mongoid::Document
  include Mongoid::Timestamps

  field :name, type: String
  field :description, type: String
  field :status, type: String, default: 'active'

  belongs_to :owner, class_name: 'User'

  validates :name, presence: true
  validates :status, inclusion: { in: %w[active archived] }

  index({ owner_id: 1, status: 1 })
  index({ name: "text", description: "text" })

  scope :active, -> { where(status: 'active') }
  scope :owned_by, ->(user) { where(owner: user) }
end

🔍 Basic Queries

Simple queries to test the relationship between User and Project models.

# Rails console examples
# Create a user
user = User.create!(email: "john@example.com", username: "john_doe")

# Create a project for the user
project = Project.create!(
  name: "My First Project",
  description: "A sample project",
  owner: user
)

# Find user's projects
user.projects.count
user.projects.where(status: "active")

# Find project owner
project.owner.email

# Search projects by name
Project.where(name: /First/)

# Get all active projects
Project.active.count
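`Project.where(name: /First/)` above passes an ordinary Ruby Regexp, which Mongoid sends to MongoDB as a `$regex` filter, so the matching semantics are the same as plain Ruby matching:

```ruby
# Regexp matching as used in the name search: case-sensitive by
# default, with the /i flag for case-insensitive matching.
names = ["My First Project", "Second Project", "first steps"]

matches    = names.grep(/First/)   # ["My First Project"]
ci_matches = names.grep(/First/i)  # ["My First Project", "first steps"]
```

Note that only patterns anchored at the start of the string (e.g. `/^First/`, case-sensitive) can use an index on `name`; unanchored patterns like the one above scan every document.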

🚀 Local Development & Testing

Let's set up and run our Task Management app locally. This will show you how to get the application running on your machine and test all the features.

# config/mongoid.yml
development:
  clients:
    default:
      uri: mongodb://localhost:27017/task_manager_dev
      options:
        server_selection_timeout: 5

test:
  clients:
    default:
      uri: mongodb://localhost:27017/task_manager_test
      options:
        server_selection_timeout: 5

📦 Setup Steps

Follow these steps to get the application running locally:

# 1. Install MongoDB locally
# macOS (using Homebrew)
brew tap mongodb/brew
brew install mongodb-community
brew services start mongodb/brew/mongodb-community

# Ubuntu/Debian (older releases; newer ones should use the
# mongodb-org packages from MongoDB's official repository)
sudo apt-get install -y mongodb
sudo systemctl start mongodb

# Windows
# Download and install from https://www.mongodb.com/try/download/community

# 2. Create Rails app
rails new task_manager --api --skip-active-record
cd task_manager

# 3. Add gems to Gemfile
gem 'mongoid', '~> 8.0'
gem 'bcrypt', '~> 3.1'
gem 'jwt', '~> 2.2'
gem 'kaminari', '~> 1.2'

# 4. Install gems
bundle install

# 5. Initialize Mongoid
rails g mongoid:config

# 6. Generate models
rails g model User email:string username:string password_digest:string
rails g model Project name:string description:string status:string owner:references
rails g model Task title:string description:string status:string project:references creator:references assignee:references

# 7. Create indexes (Mongoid doesn't use migrations)
rails runner "User.create_indexes"
rails runner "Project.create_indexes"
rails runner "Task.create_indexes"

# 8. Start the server
rails server

🧪 Testing the Application

Let's test our application using Rails console and create some sample data to see how everything works together.

# Start Rails console
rails console

# Create a test user
user = User.create!(
  email: "john@example.com",
  username: "john_doe"
)

# Create a project
project = Project.create!(
  name: "My First Project",
  description: "A sample project to test our app",
  owner: user
)

# Test queries
puts "User projects: #{user.projects.count}"
puts "Project owner: #{project.owner.email}"

# Test search
search_results = Project.where(name: /First/)
puts "Search results: #{search_results.count}"

# Test relationships
puts "User has #{user.projects.count} projects"
puts "Project belongs to: #{project.owner.username}"

🔍 Database Exploration

Explore the MongoDB database directly to understand how data is stored:

# Connect to MongoDB shell
mongosh

# Switch to our database
use task_manager_dev

# View collections
show collections

# View users
db.users.find().pretty()

# View projects
db.projects.find().pretty()

# Count documents
db.users.countDocuments()
db.projects.countDocuments()

# View indexes
db.users.getIndexes()
db.projects.getIndexes()

# Find projects by status
db.projects.find({status: "active"}).pretty()

# Search projects by name
db.projects.find({name: {$regex: "First"}}).pretty()

🐛 Debugging Tips

Common issues and how to debug them when running locally:

# Check if MongoDB is running
ps aux | grep mongod

# Check MongoDB connection
rails runner "puts Mongoid.default_client.database.name"

# View Rails logs
tail -f log/development.log

# Reset database (if needed)
rails runner "Mongoid.purge!"

# Check model validations
rails runner "user = User.new; puts user.valid?; puts user.errors.full_messages"

# Test model associations
rails runner "user = User.first; puts user.projects.count"

# Monitor database queries
# Add to config/application.rb for driver-level query logging:
Mongo::Logger.logger = Logger.new(STDOUT)
Mongo::Logger.logger.level = Logger::DEBUG

Reference & Commands

Mongoid Commands
Command     | Description                | Example
----------- | -------------------------- | ----------------------------------------------------
User.create | Create and save a document | User.create(name: "John", email: "john@example.com")
User.find   | Find by ID                 | User.find("507f1f77bcf86cd799439011")
User.where  | Find by criteria           | User.where(active: true)
User.count  | Count documents            | User.count
Best Practices Summary: Use embedded documents for one-to-few relationships, create indexes for common queries, implement proper pagination, and monitor query performance. Start with simple use cases and gradually explore advanced features.

