Active Record Tenanting

Note

This file will eventually become a complete "Rails Guide"-style document explaining Active Record tenanting with this gem.

In the meantime, it is a work-in-progress containing:

  • skeleton outline for documentation
  • functional roadmap represented as to-do checklists

1. What is Active Record Tenanted?

Tip

If you're not familiar with how Rails's built-in horizontal sharding works, it may be worth reading the Rails Guide on Multiple Databases with Active Record before proceeding.

Active Record Tenanted extends the Rails framework to enable an application to have many tenant-specific databases. It provides data isolation by logically separating each tenant's data, by providing safety mechanisms to help ensure safe usage of Active Record, and by modifying the behavior of many parts of Rails such as fragment caching, Active Job, Action Cable, Active Storage, Global ID, and database tasks. By providing integrated framework support for tenanting, Active Record Tenanted ensures that developers can write the majority of their code as if they were in a single-tenant application without putting tenant privacy and data security at risk.

1.1 Guiding design principles

The design of Active Record Tenanted is rooted in a few guiding principles in order to safely allow multiple tenants to share a Rails application instance:

  • Data "at rest" is persisted in a separate store for each tenant's data, isolated either physically or logically from other tenants.
  • Data "in transit" is only sent to users with authenticated access to the tenant instance.
  • All tenant-related code execution must happen within a well-defined isolated tenant context with controls around data access and transmission.

Another guiding principle, though, is:

  • Developing a multi-tenant Rails app should be as easy as developing a single-tenant app.

The goal is that developers will rarely need to think about managing tenant isolation.

1.2 High-level implementation

Active Record Tenanted extends Active Record to dynamically create a Connection Pool for a tenant on demand. It does this in a thread-safe way by relying heavily on Rails' horizontal sharding features.

It extends Rails' testing frameworks so that tests don't need to explicitly set up a tenant or otherwise be aware of tenanting (unless tenanting behavior is explicitly being tested).

It also provides integrations with Action Dispatch's Rack middleware, Action View Caching, Active Job, Action Cable, Turbo frames and streams, Active Storage, Action Mailbox, and Action Text to ensure that code is always aware of its "tenant context".

1.3 Concepts

A "tenant ID" is simply a string (or an integer) that uniquely identifies a subset of data. For example, this may be a subdomain, or a user-chosen name, or a foreign key into a customer database. It's used as part of the name of the database (e.g., the file path to a SQLite file on disk, or the name of a MySQL database) and so there are constraints on the tenant ID.

A "tenant context" refers to the "current tenant" during code execution. For code running in a Rails server, the tenant context is set automatically by Active Record Tenanted's middleware; but in other situations, such as in the Rails console, the context can be set by calling .with_tenant:

# When no tenant context is set, "current tenant" is nil:
ApplicationRecord.current_tenant   # => nil

ApplicationRecord.with_tenant("tenant-one") do
  # Inside this block, code runs within "tenant-one"'s context
  ApplicationRecord.current_tenant # => "tenant-one"
  User.current_tenant              # => "tenant-one"

  # ... and uses a connection to "tenant-one"'s database.
  User.connection_pool.db_config.database
  # => storage/tenants/development/tenant-one/db/main.sqlite3

  # ... so that SQL queries are executed on "tenant-one"'s database
  user = User.find(1)
  # User Load [tenant=tenant-one] (1.3ms)  SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ?  [["id", 1], ["LIMIT", 1]]
end

Note that a "tenant attribute" is set on every model instance to reflect the tenant to which that instance belongs:

ApplicationRecord.with_tenant("tenant-one") do
  user = User.find(1)
  user.tenant                      # => "tenant-one"
end

Access to the database without a tenant context raises an exception:

ApplicationRecord.current_tenant   # => nil
User.find(1)                       # raises ActiveRecord::Tenanted::NoTenantError
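
To make the idea of a block-scoped tenant context concrete, here is a minimal plain-Ruby sketch. It is not the gem's implementation (which builds on Rails' sharding API). The `TenantContext` module, its `require_tenant!` method, and its own `NoTenantError` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a tenant context; not the gem's implementation.
module TenantContext
  class NoTenantError < StandardError; end

  KEY = :sketch_current_tenant

  # Returns the tenant set for the current execution context, or nil.
  # Thread#[] storage is fiber-local, so each thread and fiber sees its own value.
  def self.current_tenant
    Thread.current[KEY]
  end

  # Runs the block with the given tenant, restoring the previous value
  # afterwards even if the block raises.
  def self.with_tenant(tenant)
    previous = Thread.current[KEY]
    Thread.current[KEY] = tenant.to_s
    yield
  ensure
    Thread.current[KEY] = previous
  end

  # Returns the current tenant or raises, mimicking how untenanted
  # database access is rejected.
  def self.require_tenant!
    current_tenant || raise(NoTenantError, "no tenant context is set")
  end
end

TenantContext.with_tenant("tenant-one") do
  puts TenantContext.require_tenant!   # prints: tenant-one
end
```

The `ensure` clause is the important design choice in this sketch: the previous context is restored no matter how the block exits, so a tenant never "leaks" into code that runs afterwards.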

1.4 Prior Art

Released in 2008, the acts_as_tenant gem allows an application to isolate tenant data that is commingled in a single database. It relies on Active Record associations and applies tenant constraints using scopes, and offers middleware tenant resolution. It provides very limited integration with the rest of the Rails framework.

In 2009, Guy Naor spoke at Acts As conference on Writing Multi-tenant Applications in Rails 2, which provides details on many aspects of multi-tenancy.

Released in 2011, the apartment gem extends Active Record to make dynamic connections to tenant-specific databases. It provides more substantial data isolation than acts_as_tenant. However, it relies on a primitive reconnection mechanism that pre-dates Rails 6.1's thread-safe sharding model. It also provides only limited integration with the rest of the Rails framework.

In December 2020, Rails 6.1 was released with support for horizontal sharding and multi-database. This functionality provided new thread-safe capabilities for connection switching in Rails.

In early 2025, Julik Tarkhanov published a tenanting implementation named "Shardine" that uses the Rails sharding API. However, it also provided very limited integration with the rest of the Rails framework.

1.5 Shared state (WIP)

  • when we talk about "integration with the rest of rails" we're talking about Rails' assumptions about shared state
  • some of these assumptions are now busted
    • database ids are no longer unique
    • global ids are no longer global
    • cache is no longer global
    • cable channels are no longer global
    • jobs are no longer global
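
The following plain-Ruby sketch illustrates why these assumptions break. It uses ordinary hashes in place of real databases and caches, and the `"tenant/users/1"` key format is a made-up example rather than the format the gem uses.

```ruby
# Two tenants, each with its own "users" table containing a record with id 1.
records = {
  "tenant-one" => { 1 => "Alice" },
  "tenant-two" => { 1 => "Bob" },
}

# A cache keyed only by record ID: the second write silently replaces the first.
unscoped_cache = {}
records.each { |_tenant, users| unscoped_cache["users/1"] = users[1] }
p unscoped_cache["users/1"]   # prints "Bob"

# A cache keyed by tenant and record ID keeps the two entries apart.
scoped_cache = {}
records.each { |tenant, users| scoped_cache["#{tenant}/users/1"] = users[1] }
p scoped_cache.keys           # prints ["tenant-one/users/1", "tenant-two/users/1"]
```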

2. Application Configuration

This gem offers an "omakase" configuration that specifies:

  1. All models inheriting from ApplicationRecord will be tenanted.
  2. The subdomain of the request will be used to determine the tenant context.

These defaults can be overridden using the configuration options:

  • config.active_record_tenanted.connection_class
  • config.active_record_tenanted.tenant_resolver

This gem also introduces behavior changes into Rails to accommodate tenanting. All of these behavior changes can be disabled by setting config.active_record_tenanted.connection_class to nil.

2.1 The Default Configuration

To install this gem into an application with the defaults, first add the gem:

--- a/Gemfile
+++ b/Gemfile
@@ -3,6 +3,7 @@ git_source(:bc) { |repo| "https://github.com/basecamp/#{repo}" }
 ruby file: ".ruby-version"

 gem "rails", github: "rails/rails", branch: "main"
+gem "activerecord-tenanted"

 # Assets & front end
 gem "importmap-rails"

Extend your ApplicationRecord models:

--- a/app/models/application_record.rb
+++ b/app/models/application_record.rb
@@ -1,3 +1,4 @@
 class ApplicationRecord < ActiveRecord::Base
   primary_abstract_class
+  tenanted
 end

Extend your database configuration:

--- a/config/database.yml
+++ b/config/database.yml
@@ -12,7 +12,8 @@ default: &default
 production:
   primary:
     <<: *default
-    database: storage/production.sqlite3
+    database: storage/production/%{tenant}/main.sqlite3
+    tenanted: true
   cable:
     <<: *default
     database: storage/production_cable.sqlite3

In this configuration, ApplicationRecord classes and instances will be extended with tenant behavior:

class User < ApplicationRecord ; end

ApplicationRecord.current_tenant   # => nil

ApplicationRecord.with_tenant("tenant-one") do
  ApplicationRecord.current_tenant # => "tenant-one"
  User.current_tenant              # => "tenant-one"
  user = User.find(1)
  user.tenant                      # => "tenant-one"
end

And in this configuration, the TenantSelector middleware will automatically set the tenant context based on the request subdomain. A request to tenant-one.example.com will resolve to tenant ID "tenant-one", and all code that runs in the application as part of request handling will automatically be in this context:

class BooksController < ApplicationController
  def index
    Book.current_tenant # => "tenant-one" for a request to "tenant-one.example.com"
  end
end

2.2 Configuring the Database

By default, Active Record Tenanted will connect ApplicationRecord to tenanted shards based on the primary database configuration.

This can be overridden with an argument to tenanted with the name of the database. For example, if the database.yml file contained this configuration:

production:
  primary:
    adapter: mysql2
    database: primary_db
  secondary:
    adapter: sqlite3
    database: "storage/tenants/%{tenant}/main.sqlite3"
    tenanted: true

then the models could be configured as follows:

class ApplicationRecord < ActiveRecord::Base
  primary_abstract_class
  tenanted "secondary"
end

This approach also works for a primary database that isn't named "primary":

production:
  tenant_db:
    adapter: sqlite3
    database: "storage/tenants/%{tenant}/main.sqlite3"
    tenanted: true
  secondary:
    adapter: mysql2
    database: primary_db

class ApplicationRecord < ActiveRecord::Base
  primary_abstract_class
  tenanted "tenant_db"
end
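
The `%{tenant}` placeholder in the database path uses the same named-reference syntax as Ruby format strings. The sketch below assumes the expansion behaves like `Kernel#format` and that tenant IDs containing path separators are rejected. The helper name `sketch_database_path` is hypothetical and is not part of the gem's API.

```ruby
# Hypothetical helper: expands a tenanted database path template.
def sketch_database_path(template, tenant)
  tenant = tenant.to_s
  # Reject empty names and path separators, which would change the directory layout.
  if tenant.empty? || tenant.include?("/") || tenant.include?("\\")
    raise ArgumentError, "invalid tenant ID: #{tenant.inspect}"
  end
  format(template, tenant: tenant)
end

puts sketch_database_path("storage/tenants/%{tenant}/main.sqlite3", "tenant-one")
# prints: storage/tenants/tenant-one/main.sqlite3
```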

2.3 Configuring max_connection_pools

By default, Active Record Tenanted will cap the number of tenanted connection pools at 50. Setting a limit on the number of "live" connection pools at any one time provides control over the number of file descriptors used for database connections. For SQLite databases, it's also an important control on the amount of memory used.

The cap on the number of connection pools is configurable in config/database.yml by setting a max_connection_pools parameter:

production:
  primary:
    adapter: sqlite3
    database: "storage/tenants/%{tenant}/main.sqlite3"
    tenanted: true
    max_connection_pools: 20

Active Record Tenanted will reap the least-recently-used connection pools when this limit is surpassed. Developers are encouraged to tune this parameter with care, since setting it too low may lead to increased request latency due to frequently re-establishing database connections, while setting it too high may consume precious file descriptors and memory resources.
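
The least-recently-used policy described above can be sketched in plain Ruby. This is an illustration of the general technique only, not the gem's code: `SketchPoolRegistry` is a hypothetical class, and the "pools" it stores are arbitrary objects rather than real connection pools (which would also need to be disconnected when evicted).

```ruby
# Hypothetical LRU registry; stored values stand in for connection pools.
class SketchPoolRegistry
  def initialize(max_pools)
    @max_pools = max_pools
    @pools = {}          # Ruby hashes preserve insertion order
    @lock = Mutex.new
  end

  # Returns the pool for a tenant, building it with the block on first use.
  def fetch(tenant)
    @lock.synchronize do
      pool = if @pools.key?(tenant)
        @pools.delete(tenant)      # removed here, re-inserted below as most recent
      else
        yield(tenant)
      end
      @pools[tenant] = pool
      # Evict from the front (least recently used) while over the cap.
      @pools.delete(@pools.first[0]) while @pools.size > @max_pools
      pool
    end
  end

  # Tenants with a live pool, least recently used first.
  def tenants
    @lock.synchronize { @pools.keys }
  end
end

registry = SketchPoolRegistry.new(2)
registry.fetch("a") { |t| "pool-#{t}" }
registry.fetch("b") { |t| "pool-#{t}" }
registry.fetch("a") { |t| "pool-#{t}" }   # "a" becomes most recently used
registry.fetch("c") { |t| "pool-#{t}" }   # over the cap, so "b" is evicted
p registry.tenants                         # prints ["a", "c"]
```

With a cap that is too low for the working set of tenants, a pattern like `a, b, c, a, b, c` would rebuild a pool on every access, which is the latency cost mentioned above.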

2.4 Configuring the Connection Class

By default, Active Record Tenanted assumes that ApplicationRecord is the tenanted abstract base class:

# Set this in an initializer if you're tenanting a connection class other than
# ApplicationRecord. This value indicates the connection class that this gem uses to integrate
# with a broad set of Rails subsystems, including:
#
# - Active Job
# - Active Storage
# - Action Cable
# - Action Dispatch middleware (Tenant Selector)
# - Test frameworks and fixtures
#
# Defaults to "ApplicationRecord", but this can be set to `nil` to turn off the integrations
# entirely.
config.active_record_tenanted.connection_class = "ApplicationRecord"

Applications may override this to tenant a different abstract connection class. For example, to connect some models to the "secondary" database in this configuration:

production:
  primary:
    adapter: mysql2
    database: primary_db
  secondary:
    adapter: sqlite3
    database: "storage/tenants/%{tenant}/main.sqlite3"
    tenanted: true

A new abstract connection class could be defined and configured as follows:

# define the abstract connection class
class TenantedApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  tenanted "secondary"
end

# concrete tenanted models inherit from TenantedApplicationRecord
class User < TenantedApplicationRecord ; end

# make sure the Rails integrations use the desired connection class
Rails.application.configure do
  config.active_record_tenanted.connection_class = "TenantedApplicationRecord"
end

2.5 Configuring the Tenant Resolver

Active Record Tenanted's default tenant resolver uses the request's subdomain:

# Set this to a lambda that takes a request object and returns the tenant name. It's used by:
#
# - Action Dispatch middleware (Tenant Selector)
# - Action Cable connections
#
# Defaults to the request subdomain.
config.active_record_tenanted.tenant_resolver = ->(request) { request.subdomain }

Applications may override this with their own lambda that wraps more complex tenant resolution logic. For example:

module TenantSlug
  def self.resolve(request)
    # complex behavior to pull the tenant out of the request path
  end
end

# configure Active Record Tenanted in an initializer
Rails.application.configure do
  config.active_record_tenanted.tenant_resolver = ->(request) { TenantSlug.resolve(request) }
end
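
As one possible way to fill in the `TenantSlug.resolve` placeholder above, the sketch below pulls the tenant from a path prefix. The `/t/<tenant>/` URL scheme, the allowed character set, and the `FakeRequest` struct are all assumptions made for illustration. How the gem treats a `nil` return value is not specified here.

```ruby
# Hypothetical resolver: takes the tenant from a "/t/<tenant>/..." path prefix.
module TenantSlug
  PATTERN = %r{\A/t/([a-z0-9][a-z0-9-]*)(?:/|\z)}

  # Returns the tenant slug, or nil when the path has no tenant prefix.
  def self.resolve(request)
    match = PATTERN.match(request.path)
    match && match[1]
  end
end

# Stand-in for a request object; a real resolver receives the Rails request.
FakeRequest = Struct.new(:path)

resolver = ->(request) { TenantSlug.resolve(request) }
p resolver.call(FakeRequest.new("/t/tenant-one/books"))  # prints "tenant-one"
p resolver.call(FakeRequest.new("/books"))               # prints nil
```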

2.6 Other Tenant Configuration

TODO:

  • talk about connection_class and disabling integrations
  • tenanted_rails_records
  • log_tenant_tag
  • default_tenant

2.7 Related Rails Configurations

TODO:

  • explain why we set some options
    • active_record.use_schema_cache_dump = true
    • active_record.check_schema_cache_dump_version = false

Documentation "work in progress"

Active Record API

Documentation outline:

  • setting the tenant
    • .with_tenant and .current_tenant=
      • and the callbacks for each, :with_tenant and :set_current_tenant
    • validation
      • invalid characters in a tenant name (which is database-dependent)
      • and how the application may want to do additional validation (e.g. ICANN subdomain restrictions)
    • #tenant is a readonly attribute on all tenanted model instances
    • .current_tenant returns the execution context for the model connection class
  • and what we do in this gem to help manage that "current tenant" state
  • logging
    • SQL query logs - see rdocs in connection_adapter.rb
      • set config.active_record.query_log_tags = [ :tenant ]
      • must also have config.active_record.query_log_tags_enabled = true
    • TaggedLogging and config.log_tenant_tag
    • suggest how to add to structured logs if people are doing that
  • migrations
    • create_tenant migrates the new database
    • but otherwise, creation of the connection pool for a tenant that has pending migrations will raise a PendingMigrationError
  • database rake tasks (where DBNAME is the name of the database configuration, e.g. primary)
    • db:migrate:DBNAME
      • dependency of db:migrate and db:prepare
      • it operates on all tenants by default
      • if there are no tenants it will create a database for the default tenant
      • the ARTENANT env var can be specified to run against a specific tenant
    • db:drop:DBNAME replaces db:drop:tenant
      • dependency of db:drop
      • it operates on all tenants by default
      • the ARTENANT env var can be specified to run against a specific tenant
    • db:reset:DBNAME replaces db:reset:tenant
      • dependency of db:reset
      • it operates on all tenants by default
      • the ARTENANT env var can be specified to run against a specific tenant
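
The "all tenants by default, one tenant when ARTENANT is set" rule from the outline above can be sketched as follows. `sketch_target_tenants` is a hypothetical helper written for illustration. It does not reproduce the gem's task code, and it ignores the default-tenant creation mentioned for `db:migrate:DBNAME`.

```ruby
# Hypothetical helper: picks the tenants a database task should operate on.
def sketch_target_tenants(all_tenants, env = ENV)
  requested = env["ARTENANT"]
  if requested.nil? || requested.empty?
    all_tenants    # no override: operate on every tenant
  else
    [requested]    # override: operate on just the named tenant
  end
end

p sketch_target_tenants(%w[tenant-one tenant-two], {})
# prints ["tenant-one", "tenant-two"]
p sketch_target_tenants(%w[tenant-one tenant-two], { "ARTENANT" => "tenant-two" })
# prints ["tenant-two"]
```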

TODO:

  • implement AR::Tenanted::DatabaseConfigurations::BaseConfig

    • create the specialized BaseConfig for tenanted: true databases
    • BaseConfig disables database tasks initially
    • BaseConfig raises if a connection is attempted
    • #database_path_for(tenant_name)
    • #tenants returns all the tenants on disk (for iteration)
    • raise an exception if tenant name contains a path separator
    • bucketed database paths
  • implement AR::Tenanted::DatabaseConfigurations::TenantConfig

    • make sure the logs include the tenant name (via #new_connection)
  • Active Record class methods

    • .tenanted
      • mixin Tenant
      • should error if self is not an abstract base class
      • Tenant.with_tenant and .current_tenant
      • Tenant#tenant
      • use a sentinel value to avoid needing a protoshard
      • tenant_config_name and .tenanted?
    • .tenanted_with
      • mixin Subtenant
      • should error if self is not an abstract base class or if target is not tenanted abstract base class
      • .tenanted?
      • #tenanted?
    • shared connection pools
    • all the creation and schema migration complications (we have existing tests for this)
      • read and write to the schema dump file
      • write to the schema cache dump file
      • make sure we read from the schema cache dump file when untenanted
      • test production eager loading of the schema cache from dump files
    • UntenantedConnectionPool should peek at its stack and if it happened during schema cache load, output a friendly message to let people know what to do
    • concrete class usage, e.g.: User.current_tenant= or User.with_tenant { ... }
    • make it OK to call with_tenant("foo") { with_tenant("foo") { ... } }
    • rename while_tenanted to with_tenant
    • introduce .with_each_tenant which is sugar for ApplicationRecord.tenants.each { ApplicationRecord.with_tenant(_1) { } }
  • tenant selector

    • rebuild AR::Tenanted::TenantSelector to take a proc
      • make sure it sets the tenant and prohibits shard swapping
      • or explicitly untenanted, we allow shard swapping
      • or else 404s if an unrecognized tenant
  • old Tenant singleton methods that need to be migrated to the AR model

    • .current_tenant
    • .current_tenant=
    • .tenant_exist?
    • .with_tenant
    • .create_tenant
      • which should roll back gracefully if it fails for some reason
    • .destroy_tenant
  • autoloading and configuration hooks

    • create a zeitwerk loader
    • install a load hook
  • database tasks

    • make db:migrate:__dbname__ migrate all the existing tenants
    • make db:migrate:__dbname__ ARTENANT=asdf run migrations on just that tenant
    • make db:drop:__dbname__ drop all the existing tenants
    • make db:drop:__dbname__ ARTENANT=asdf drop just that tenant
    • make db:migrate run db:migrate:__dbname__
    • make db:prepare run db:migrate:__dbname__
    • make db:drop run db:drop:__dbname__
    • make a decision on what output tasks should emit, and whether we need a separate verbose setting
    • use the database name instead of "tenant", e.g. "db:migrate:primary"
    • make the implicit migration opt-in
    • fully implement all the relevant database tasks - see #222
  • installation

    • install a variation on the default database.yml with primary tenanted and non-primary "global" untenanted
    • initializer: commented lines with default values and some docstrings
    • mailer URL defaults (setting %{tenant} for subdomain tenanting)
  • think about race conditions

  • pruning connections and connection pools

    • look into how to cap the number of connection pools, and prune them
  • integration test coverage

    • connection_class
      • fixture tenant
      • fixture tenant in parallel suite
      • clean up non-default tenants
      • integration test session host
      • integration test session verbs
    • fixtures are loaded
    • tenanted_rails_records
  • additional configuration

    • default_tenant (local only)

Tenanting in your application

Documentation outline:

  • introduce the basics
    • explain .tenanted and the ActiveRecord::Tenanted::Tenant module
    • explain .subtenant_of and the ActiveRecord::Tenanted::Subtenant module
    • explain .with_tenant, .with_each_tenant, .current_tenant=, and .current_tenant
    • demonstrate how to create a tenant, destroy a tenant, etc.
  • troubleshooting: what errors you might see in your app and how to deal with it
    • specifically when running untenanted

Testing

Documentation outline:

  • explain the concept of a default tenant
    • and that database connection is wrapped in a transaction
  • explain creating a new tenant
    • and how that database is NOT wrapped in a transaction during the test,
    • but those non-fixture databases will be cleaned up at the start of the test suite
  • explain without_tenant
  • example of:
    • unit test with fixtures
    • integration test
    • system test

TODO:

  • testing
    • a without_tenant test helper
    • set up test helper to default to a tenant named "test-tenant"
    • set up test helpers to deal with parallelized tests, too (e.g. "test-tenant-19")
    • set up integration tests to do the right things ...
      • set the domain name in integration tests
      • wrap the HTTP verbs with without_tenant
      • set the domain name in system tests
    • allow the creation of tenants within transactional tests

Caching

Documentation outline:

  • explain why we need to be careful
  • explain how active record objects' cache keys have tenanting built in
  • explain why we're not worried about collection caching and partial caching (?)
  • explain why we're not worried about russian doll caching
  • explain why calling Rails.cache directly requires care that it's either explicitly tenanted or global
  • explain why we're not worried about sql query caching (it belongs to the connection pool)

TODO:

  • make basic fragment caching work
  • investigate: is collection caching going to be tenanted properly
  • investigate: make sure the QueryCache executor is clearing query caches for tenanted pool
  • do we need to do some exploration on how to make sure all caching is tenanted?
    • I'm making the call not to pursue this. Rails.cache is a primitive. Just document it.

Action View Fragment Caching

Documentation outline:

  • explain how it works (cache keys)

TODO:

  • extend #cache_key on Base
  • extend #cache_key on Subtenant

Solid Cache

Documentation outline:

  • describe one-big-cache and cache-in-the-tenanted-database strategies
    • note that cache-in-the-tenanted-database means there is no global cache
    • note that cache-in-the-tenanted-database is not easily purgeable (today)
    • and so we recommend (?) one big cache in a dedicated database
  • how to configure Solid Cache for one-big-cache
  • how to configure Solid Cache for tenanted-cache

TODO:

  • upstream
    • feature: make shard swap prohibition database-specific

Action Cable

Documentation outline:

  • explain why we need to be careful
  • how to tenant a channel
    • make sure to call super if you override #connect
  • how the global id also contains the tenant
  • do we need to document each adapter?
    • async
    • test
    • solid_cable
    • redis?

TODO:

  • extend the base connection to support tenanting with a tenanted_connection method
  • reconsider the current API using tenanted_connection if we can figure out how to reliably wrap #connect
    • did this! prefer to force the app to call super() from #connect, it's simpler
  • test disconnection
    • ActionCable.server.remote_connections.where(current_tenant: "foo", current_user: User.find(1)).disconnect
    • can we make this easier to use by implying the current tenant?
  • add tenant to the action_cable logger tags
  • add integration testing around executing a command (similar to Job testing)

Turbo Rails

Documentation outline:

  • explain why we need to be careful
  • explain how it works (global IDs)

TODO:

  • extend to_global_id and friends for Base
  • extend to_global_id and friends for Subtenant
  • some testing around global id would be good here
  • system test of a broadcast update

Active Job

Documentation outline:

  • explain why we need to be careful
  • explain belt-and-suspenders of
    • ActiveJob including the current tenant,
    • and any passed record including the tenant in its global_id

TODO:

  • extend ActiveJob to set the tenant in perform_now
  • extend to_global_id and friends for Base
  • extend to_global_id and friends for Subtenant
  • create a tenanted GlobalID locator
  • inject the tenanted GlobalID locator as the default app locator
  • make sure the test helper perform_enqueued_jobs wraps everything in a without_tenant block

Active Storage

Documentation outline:

  • explain why we need to be careful
  • explain how it works
    • if connection_class is set, then Active Storage will insert the tenant into the blob key
      • and the disk service will include the tenant in the path on disk in the root location, like: 'foobar/ab/cd/abcd12345678abcd'
  • Disk Service can also have a tenanted root path, but it's optional
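
The example path above can be reproduced with a small sketch. It assumes that `foobar` is the tenant and `abcd12345678abcd` is the blob key, and that the two intermediate directories come from the first four characters of the key. `sketch_blob_path` is a hypothetical helper, not the gem's or Active Storage's API.

```ruby
# Hypothetical helper mirroring the on-disk layout in the example above.
def sketch_blob_path(tenant, key)
  # <tenant>/<first two chars of key>/<next two chars of key>/<key>
  File.join(tenant, key[0, 2], key[2, 2], key)
end

puts sketch_blob_path("foobar", "abcd12345678abcd")
# prints: foobar/ab/cd/abcd12345678abcd
```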

TODO:

  • extend Disk Service to change the path on disk
  • extend Blob to have tenanted keys

Action Mailer

Documentation outline:

  • explain how to configure the action mailer default host if needed, with a "%{tenant}" format specifier.

TODO:

  • Interpolate the tenant into a host set in config.action_mailer.default_url_options
  • Do we need to do something similar for the asset host?
    • I'm going to wait until someone needs it, because it's not trivial to hijack.
  • Do we need to do something similar for explicit host parameters to url helpers?
    • I don't think so.
    • I'm going to wait until someone needs it, because it's not trivial to hijack.

Action Mailbox

TODO:

  • I need a use case here around mail routing before I tackle it

Console

Documentation outline:

  • explain the concept of a "default tenant"
  • explain usage of the ARTENANT environment variable to control startup

Metrics

Some places we should add instrumentation:

  • Creating a new tenant database
  • Migrating a tenant database
  • Destroying a tenant database
  • Creating a tenanted connection pool
  • Reaping a tenanted connection pool