Active Record Tenanting

Note

This file will eventually become a complete "Rails Guide"-style document explaining Active Record tenanting with this gem.

In the meantime, it is a work-in-progress containing:

  • skeleton outline for documentation
  • functional roadmap represented as to-do checklists

Introduction

Tip

If you're not familiar with how Rails's built-in horizontal sharding works, it may be worth reading the Rails Guide on Multiple Databases with Active Record before proceeding.

Documentation outline:

Active Record

Configuration

Documentation outline:

  • how to configure database.yml

    • for tenanting a primary database
    • for tenanting a non-primary database
  • how to configure model classes and records

    • variations for primary or non-primary records
    • how to make a class that inherits from ActiveRecord::Base "subtenant" from a tenanted database
      • and note how we do it out of the box for Rails records
  • Rails configuration

    • explain why we set some options
      • active_record.use_schema_cache_dump = true
      • active_record.check_schema_cache_dump_version = false
    • explain gem railtie config options
      • connection_class
      • tenant_resolver
      • tenanted_rails_records
      • log_tenant_tag
    • demonstrate how to configure an app with subdomain tenanting
      • app.config.hosts
      • example TenantSelector config
    • demonstrate how to configure an app with root path tenanting
      • app.config.hosts
      • example TenantSelector config
  • migrations

    • create_tenant migrates the new database
    • but otherwise, creation of the connection pool for a tenant that has pending migrations will raise a PendingMigrationError
    • db:migrate will migrate all tenants
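
As a rough illustration of the two tenanting styles in the outline above, the sketch below resolves a tenant name from either the request host or the first path segment. The lambdas, the `BASE_DOMAIN` constant, and their string-in, string-out signatures are all assumptions made for this example; the callable the gem's `tenant_resolver` option expects may look different.

```ruby
# Hypothetical resolvers for illustration; neither is part of the gem.
# BASE_DOMAIN is an assumed value for the app's apex domain.
BASE_DOMAIN = "example.com"

# Subdomain tenanting: "acme.example.com" -> "acme".
# Returns nil for the apex domain, nested subdomains, or foreign hosts.
SUBDOMAIN_RESOLVER = lambda do |host|
  suffix = ".#{BASE_DOMAIN}"
  next nil unless host.end_with?(suffix)

  subdomain = host.delete_suffix(suffix)
  subdomain.empty? || subdomain.include?(".") ? nil : subdomain
end

# Root path tenanting: "/acme/projects" -> "acme".
# Returns nil when the path has no first segment.
ROOT_PATH_RESOLVER = lambda do |path|
  path.split("/").reject(&:empty?).first
end
```

A `nil` result is the case where a selector would have to decide between running untenanted and responding with a 404.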

TODO:

  • implement AR::Tenanted::DatabaseConfigurations::RootConfig

    • create the specialized RootConfig for tenanted: true databases
    • RootConfig disables database tasks initially
    • RootConfig raises if a connection is attempted
    • #database_path_for(tenant_name)
    • #tenants returns all the tenants on disk (for iteration)
    • raise an exception if tenant name contains a path separator
    • bucketed database paths
  • implement AR::Tenanted::DatabaseConfigurations::TenantConfig

    • make sure the logs include the tenant name (via #new_connection)
  • Active Record class methods

    • .tenanted
      • mixin Tenant
      • should error if self is not an abstract base class
      • Tenant.with_tenant and .current_tenant
      • Tenant#tenant
      • use a sentinel value to avoid needing a protoshard
      • tenant_config_name and .tenanted?
    • .tenanted_with
      • mixin Subtenant
      • should error if self is not an abstract base class or if target is not tenanted abstract base class
      • .tenanted?
      • #tenanted?
    • shared connection pools
    • all the creation and schema migration complications (we have existing tests for this)
      • read and write to the schema dump file
      • write to the schema cache dump file
      • make sure we read from the schema cache dump file when untenanted
      • test production eager loading of the schema cache from dump files
    • UntenantedConnectionPool should peek at its stack and if it happened during schema cache load, output a friendly message to let people know what to do
    • concrete class usage, e.g.: User.current_tenant= or User.with_tenant { ... }
    • make it OK to call with_tenant("foo") { with_tenant("foo") { ... } }
    • rename while_tenanted to with_tenant
    • introduce .with_each_tenant which is sugar for ApplicationRecord.tenants.each { ApplicationRecord.with_tenant(_1) { } }
  • tenant selector

    • rebuild AR::Tenanted::TenantSelector to take a proc
      • make sure it sets the tenant and prohibits shard swapping
      • or, if explicitly untenanted, allows shard swapping
      • or else returns a 404 for an unrecognized tenant
  • old Tenant singleton methods that need to be migrated to the AR model

    • .current_tenant
    • .current_tenant=
    • .tenant_exist?
    • .with_tenant
    • .create_tenant
      • which should roll back gracefully if it fails for some reason
    • .destroy_tenant
  • autoloading and configuration hooks

    • create a zeitwerk loader
    • install a load hook
  • database tasks

    • make db:migrate:tenant:all iterate over all the tenants on disk
    • make db:migrate:tenant ARTENANT=asdf run migrations on just that tenant
    • make db:migrate:tenant run migrations on development-tenant in dev
    • make db:migrate run db:migrate:tenant in dev
    • make db:prepare run db:migrate:tenant in dev
    • make a decision on what output tasks should emit, and whether we need a separate verbose setting
    • make the implicit migration opt-in
    • use the database name instead of "tenant", e.g. "db:migrate:primary"
    • fully implement all the relevant database tasks:
      • db:_dump
      • db:_dump:__name__
      • db:abort_if_pending_migrations
      • db:abort_if_pending_migrations:__name__
      • db:charset
      • db:check_protected_environments
      • db:collation
      • db:create
      • db:create:all
      • db:create:__name__
      • db:drop
      • db:drop:_unsafe
      • db:drop:all
      • db:drop:__name__
      • db:encryption:init
      • db:environment:set
      • db:fixtures:identify
      • db:fixtures:load
      • db:forward
      • db:install:migrations
      • db:load_config
      • db:migrate with support for VERSION
      • db:migrate:down with support for VERSION
      • db:migrate:down:__name__
      • db:migrate:__name__
      • db:migrate:redo with support for STEP and VERSION
      • db:migrate:redo:__name__
      • db:migrate:reset
      • db:migrate:status
      • db:migrate:status:__name__
      • db:migrate:up with support for VERSION
      • db:migrate:up:__name__
      • db:prepare
      • db:purge (see Known Issues below)
      • db:purge:all (see Known Issues below)
      • db:reset
      • db:reset:all
      • db:reset:__name__
      • db:rollback with support for STEP
      • db:rollback:__name__
      • db:schema:cache:clear
      • db:schema:cache:dump
      • db:schema:dump
      • db:schema:dump:__name__
      • db:schema:load
      • db:schema:load:__name__
      • db:seed
      • db:seed:replant
      • db:setup
      • db:setup:all
      • db:setup:__name__
      • db:test:load_schema
      • db:test:load_schema:__name__
      • db:test:prepare
      • db:test:prepare:__name__
      • db:test:purge
      • db:test:purge:__name__
      • db:truncate_all
      • db:version
      • db:version:__name__
  • installation

    • install a variation on the default database.yml with primary tenanted and non-primary "global" untenanted
    • initializer: commented lines with default values and some docstrings
    • mailer URL defaults (setting %{tenant} for subdomain tenanting)
  • think about race conditions

  • pruning connections and connection pools

    • look into whether the proposed Reaper changes will allow us to set appropriate connection min/max/timeouts
      • and if not, figure out how to prune unused/timed-out connections
    • we should also look into how to cap the number of connection pools, and prune them
  • integration test coverage

    • connection_class
      • fixture tenant
      • fixture tenant in parallel suite
      • clean up non-default tenants
      • integration test session host
      • integration test session verbs
    • fixtures are loaded
    • tenanted_rails_records
  • additional configuration

    • default_tenant (local only)
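
The `RootConfig` items above (`#database_path_for`, `#tenants`, and rejecting path separators) can be sketched in plain Ruby. The class name `TenantPathSketch` and the `%{tenant}` placeholder in the path pattern are assumptions for this example, not the gem's actual implementation or configuration format.

```ruby
# Hypothetical helper for illustration; not the gem's RootConfig.
class TenantPathSketch
  # pattern is an assumed format such as "storage/tenants/%{tenant}/main.sqlite3"
  def initialize(pattern)
    @pattern = pattern
  end

  # Builds the database path for one tenant, rejecting names that are
  # empty or that could escape the intended directory.
  def database_path_for(tenant_name)
    name = tenant_name.to_s
    separators = [File::SEPARATOR, File::ALT_SEPARATOR].compact
    if name.empty? || separators.any? { |sep| name.include?(sep) }
      raise ArgumentError, "invalid tenant name: #{tenant_name.inspect}"
    end

    format(@pattern, tenant: name)
  end

  # Lists tenant names by globbing for database files that match the pattern.
  def tenants
    prefix, suffix = @pattern.split("%{tenant}", 2)
    Dir.glob(format(@pattern, tenant: "*"))
       .map { |path| path.delete_prefix(prefix).delete_suffix(suffix) }
       .sort
  end
end
```

Validating the name before interpolating it keeps a tenant such as `"../other"` from resolving to a path outside the tenant directory.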

Tenanting in your application

Documentation outline:

  • introduce the basics
    • explain .tenanted and the ActiveRecord::Tenanted::Tenant module
    • explain .subtenant_of and the ActiveRecord::Tenanted::Subtenant module
    • explain .with_tenant, .with_each_tenant, .current_tenant=, and .current_tenant
    • demonstrate how to create a tenant, destroy a tenant, etc.
  • troubleshooting: what errors you might see in your app and how to deal with them
    • specifically when running untenanted
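
To make the intended block semantics concrete, here is a minimal plain-Ruby sketch of `with_tenant`, `current_tenant`, and `with_each_tenant`. The module `TenantContextSketch` is hypothetical and keeps no connection pools. It assumes that re-entering the same tenant is allowed (as the to-do list requests) and that switching to a different tenant inside a block is rejected, which this document only states for the tenant selector.

```ruby
# Hypothetical stand-in for a tenanted class; not the gem's implementation.
module TenantContextSketch
  class TenantSwapError < StandardError; end

  KEY = :tenant_context_sketch_current

  # The tenant for the current execution context, or nil when untenanted.
  def self.current_tenant
    Thread.current[KEY]
  end

  # Runs the block with the given tenant, restoring the previous state afterwards.
  # Nesting the same tenant is fine; switching tenants mid-block raises (assumption).
  def self.with_tenant(name)
    previous = current_tenant
    if previous && previous != name
      raise TenantSwapError, "already in tenant #{previous.inspect}, cannot switch to #{name.inspect}"
    end

    Thread.current[KEY] = name
    yield
  ensure
    Thread.current[KEY] = previous
  end

  # Sugar for iterating over tenants, one with_tenant block per tenant.
  def self.with_each_tenant(tenants)
    tenants.each { |name| with_tenant(name) { yield name } }
  end
end
```

Restoring the previous value in `ensure` means an exception inside the block cannot leave a stale tenant behind for later code in the same context.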

Testing

Documentation outline:

  • explain the concept of a default tenant
    • and that its database connection is wrapped in a transaction
  • explain creating a new tenant
    • and how that database is NOT wrapped in a transaction during the test,
    • but those non-fixture databases will be cleaned up at the start of the test suite
  • explain without_tenant
  • example of:
    • unit test with fixtures
    • integration test
    • system test

TODO:

  • testing
    • a without_tenant test helper
    • set up test helper to default to a tenant named "test-tenant"
    • set up test helpers to deal with parallelized tests, too (e.g. "test-tenant-19")
    • set up integration tests to do the right things ...
      • set the domain name in integration tests
      • wrap the HTTP verbs with without_tenant
      • set the domain name in system tests
    • allow the creation of tenants within transactional tests

Caching

Documentation outline:

  • explain why we need to be careful
  • explain how active record objects' cache keys have tenanting built in
  • explain why we're not worried about collection caching and partial caching (?)
  • explain why we're not worried about russian doll caching
  • explain why calling Rails.cache directly requires care that it's either explicitly tenanted or global
  • explain why we're not worried about sql query caching (it belongs to the connection pool)
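
The risk with calling a shared cache directly is easiest to see with a toy example. In the sketch below a plain Hash stands in for a store such as `Rails.cache`; the `tenanted_cache_key` helper and its `tenant/<name>/` key layout are assumptions for illustration, not something the gem provides.

```ruby
# Hypothetical helper; the "tenant/<name>/" key layout is an assumption.
def tenanted_cache_key(tenant, key)
  if tenant.nil? || tenant.to_s.empty?
    raise ArgumentError, "a tenant is required for a tenanted cache key"
  end

  "tenant/#{tenant}/#{key}"
end

# A plain Hash stands in for a shared cache store.
cache = {}

# Untenanted key: both tenants write the same entry, so the second overwrites the first.
cache["dashboard/stats"] = "stats for foo"
cache["dashboard/stats"] = "stats for bar"

# Tenanted keys: each tenant gets its own entry.
cache[tenanted_cache_key("foo", "dashboard/stats")] = "stats for foo"
cache[tenanted_cache_key("bar", "dashboard/stats")] = "stats for bar"
```

Raising on a missing tenant forces the caller to choose deliberately between a tenanted key and a global one.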

TODO:

  • make basic fragment caching work
  • investigate: is collection caching going to be tenanted properly
  • investigate: make sure the QueryCache executor is clearing query caches for tenanted pool
  • do we need to do some exploration on how to make sure all caching is tenanted?
    • I'm making the call not to pursue this. Rails.cache is a primitive. Just document it.

Action View Fragment Caching

Documentation outline:

  • explain how it works (cache keys)

TODO:

  • extend #cache_key on Base
  • extend #cache_key on Subtenant

Solid Cache

Documentation outline:

  • describe one-big-cache and cache-in-the-tenanted-database strategies
    • note that cache-in-the-tenanted-database means there is no global cache
    • note that cache-in-the-tenanted-database is not easily purgeable (today)
    • and so we recommend (?) one big cache in a dedicated database
  • how to configure Solid Cache for one-big-cache
  • how to configure Solid Cache for tenanted-cache

TODO:

  • upstream
    • feature: make shard swap prohibition database-specific

Action Cable

Documentation outline:

  • explain why we need to be careful
  • how to tenant a channel
    • make sure to call super if you override #connect
  • how the global id also contains the tenant
  • do we need to document each adapter?
    • async
    • test
    • solid_cable
    • redis?

TODO:

  • extend the base connection to support tenanting with a tenanted_connection method
  • reconsider the current API using tenanted_connection if we can figure out how to reliably wrap #connect
    • did this! prefer to force the app to call super() from #connect, it's simpler
  • test disconnection
    • ActionCable.server.remote_connections.where(current_tenant: "foo", current_user: User.find(1)).disconnect
    • can we make this easier to use by implying the current tenant?
  • add tenant to the action_cable logger tags
  • add integration testing around executing a command (similar to Job testing)

Turbo Rails

Documentation outline:

  • explain why we need to be careful
  • explain how it works (global IDs)

TODO:

  • extend to_global_id and friends for Base
  • extend to_global_id and friends for Subtenant
  • some testing around global id would be good here
  • system test of a broadcast update

Active Job

Documentation outline:

  • explain why we need to be careful
  • explain belt-and-suspenders of
    • ActiveJob including the current tenant,
    • and any passed record including the tenant in its global_id
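
The second half of that belt-and-suspenders check can be sketched with the standard library alone. The example assumes the tenant travels as a `tenant` query parameter on a `gid://` URI; how the gem actually encodes the tenant is not specified here, and the three helper methods are hypothetical.

```ruby
require "uri"

# Hypothetical helpers; the "?tenant=" encoding is an assumption for illustration.

# Builds a gid:// URI string that carries the tenant as a query parameter.
def tenanted_gid(app:, model:, id:, tenant:)
  query = URI.encode_www_form({ "tenant" => tenant })
  "gid://#{app}/#{model}/#{id}?#{query}"
end

# Extracts the tenant from a gid:// URI string, or nil if none is present.
def tenant_from_gid(gid)
  query = URI.parse(gid).query
  return nil if query.nil?

  URI.decode_www_form(query).to_h["tenant"]
end

# Refuses to resolve a record whose tenant differs from the job's tenant.
def verify_gid_tenant!(gid, current_tenant)
  gid_tenant = tenant_from_gid(gid)
  unless gid_tenant == current_tenant
    raise ArgumentError, "gid tenant #{gid_tenant.inspect} does not match current tenant #{current_tenant.inspect}"
  end

  gid
end
```

With both checks in place, a job enqueued for one tenant cannot silently load a record from another, even if one of the two mechanisms is bypassed.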

TODO:

  • extend ActiveJob to set the tenant in perform_now
  • extend to_global_id and friends for Base
  • extend to_global_id and friends for Subtenant
  • create a tenanted GlobalID locator
  • inject the tenanted GlobalID locator as the default app locator
  • make sure the test helper perform_enqueued_jobs wraps everything in a without_tenant block

Active Storage

Documentation outline:

  • explain why we need to be careful
  • explain how it works
    • if connection_class is set, then Active Storage will insert the tenant into the blob key
      • and the disk service will include the tenant in the path on disk in the root location, like: 'foobar/ab/cd/abcd12345678abcd'
  • Disk Service can also have a tenanted root path, but it's optional
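
The on-disk layout described above can be reproduced with a small sketch. It assumes the tenant is stored as a prefix of the blob key and that the disk service shards on the first four characters of the remaining key; both helper methods are hypothetical.

```ruby
# Hypothetical helpers; storing the tenant as a key prefix is an assumption.

# Prefixes a blob key with its tenant, e.g. "foobar/abcd12345678abcd".
def tenanted_blob_key(tenant, key)
  "#{tenant}/#{key}"
end

# Maps a tenanted blob key to a path under root, sharding on the first
# two pairs of characters of the untenanted key.
def disk_path_for(root, blob_key)
  tenant, key = blob_key.split("/", 2)
  File.join(root, tenant, key[0, 2], key[2, 2], key)
end
```

For example, `disk_path_for("storage", tenanted_blob_key("foobar", "abcd12345678abcd"))` yields `storage/foobar/ab/cd/abcd12345678abcd`, matching the path shape shown in the outline.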

TODO:

  • extend Disk Service to change the path on disk
  • extend Blob to have tenanted keys

Action Mailer

Documentation outline:

  • explain how to configure the action mailer default host if needed, with a "%{tenant}" format specifier.
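
The `%{tenant}` format specifier behaves like Ruby's built-in named format references, so the interpolation can be sketched as follows. The `tenanted_host` helper is hypothetical, and `example.com` is a placeholder domain.

```ruby
# Hypothetical helper showing how a "%{tenant}" host template could be expanded.
def tenanted_host(host_template, tenant)
  # Hosts without the specifier are global and pass through unchanged.
  return host_template unless host_template.include?("%{tenant}")

  if tenant.nil? || tenant.to_s.empty?
    raise ArgumentError, "host #{host_template.inspect} needs a tenant"
  end

  format(host_template, tenant: tenant)
end
```

Under these assumptions, `tenanted_host("%{tenant}.example.com", "foo")` returns `"foo.example.com"`, while a host without the specifier is returned as-is.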

TODO:

  • Interpolate the tenant into a host set in config.action_mailer.default_url_options
  • Do we need to do something similar for the asset host?
    • I'm going to wait until someone needs it, because it's not trivial to hijack.
  • Do we need to do something similar for explicit host parameters to url helpers?
    • I don't think so.
    • I'm going to wait until someone needs it, because it's not trivial to hijack.

Action Mailbox

TODO:

  • I need a use case here around mail routing before I tackle it

Console

Documentation outline:

  • explain the concept of a "default tenant"
  • explain usage of the ARTENANT environment variable to control startup
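
One plausible reading of the two points above is that `ARTENANT` overrides the default tenant when it is set. The sketch below assumes exactly that; the `console_tenant` helper is hypothetical, and `"development-tenant"` is borrowed from the database task notes earlier in this document.

```ruby
# Hypothetical helper: choose the tenant a console session starts in.
# Assumes ARTENANT, when set and non-empty, overrides the default tenant.
def console_tenant(env = ENV, default_tenant = "development-tenant")
  value = env["ARTENANT"]
  value.nil? || value.empty? ? default_tenant : value
end
```

Passing the environment in as an argument keeps the helper easy to exercise without mutating the real `ENV`.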