
TDS v3 Internals Redesign #183

Draft
mjaric wants to merge 14 commits into master from next

@mjaric mjaric commented Feb 27, 2026

TDS v3 Internals Redesign

Goal: Incrementally rewrite the TDS library internals to deliver a well-layered, well-tested codebase with
user-extensible types, clean protocol layers, correct TLS, and proper error handling, all without breaking the
public API.

Motivation

The current TDS library (v2.x) works but has fundamental architectural problems:

  • Monolithic type system - types.ex is 1800+ lines; adding one type requires 9-11 edits; no user extensibility
    without forking
  • TLS layer hacks - Tds.Tls GenServer has race conditions, unbounded buffers, and leaked processes
  • Implicit state machine - Process.put for hidden state, unguarded transitions, silent transaction corruption
  • No separation of concerns - packet framing, message construction, token parsing tangled together; binary
    macros in two modules with different endianness semantics
  • Ad-hoc error handling - multiple SQL errors silently discarded, unknown tokens crash, no error classification

Phase 1: Constants & Binary Macros - Single Source of Truth

  • Task 1.1 - Create Tds.Protocol.Constants (compile-time macros for all TDS protocol constants)
  • Task 1.2 - Create Tds.Protocol.Binary (unified binary macros with explicit endianness)
  • Task 1.3 - Migrate all modules to use Constants and Binary; fix prelogin :instopt decode bug
  • Task 1.4 - Remove old BinaryUtils and Grammar modules

Phase 2: Packet Framing Layer

  • Task 2.1 - Create Tds.Protocol.Packet (encode/decode TDS packet frames with bounded reassembly)
  • Task 2.2 - Migrate Messages and Protocol to use Packet

Phase 3: Extensible Type System

  • Task 3.1 - Define Tds.Type behaviour (type_codes, decode_metadata, decode, encode,
    param_descriptor, infer)
  • Task 3.2 - Create Tds.Type.Registry (TDS code <-> handler module mapping, user types checked first)
  • Task 3.3 - Implement built-in type handlers (integer, float, decimal, string, binary, boolean, datetime,
    uuid, money, xml)
  • Task 3.4 - Wire Type.Registry into Protocol, replacing the monolithic types.ex
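
A minimal sketch of how a type behaviour and a registry with user-first lookup could fit together. The callback signatures, the module names (`TypeSketch`, `BuiltinInt`, `UserInt`, `RegistrySketch`), and the map-based registry are assumptions for illustration; only a subset of the callbacks listed in Task 3.1 is shown, and the real `Tds.Type` contract may differ.

```elixir
defmodule TypeSketch do
  @moduledoc "Hypothetical, trimmed-down stand-in for the planned Tds.Type behaviour."
  @callback type_codes() :: [byte()]
  @callback decode(data :: binary(), metadata :: map()) :: term()
  @callback encode(value :: term(), metadata :: map()) :: iodata()
end

defmodule BuiltinInt do
  @behaviour TypeSketch

  # 0x38 is INT4TYPE, the fixed-length 4-byte integer in TDS.
  @impl true
  def type_codes, do: [0x38]

  @impl true
  def decode(<<value::little-signed-32>>, _metadata), do: value

  @impl true
  def encode(value, _metadata), do: <<value::little-signed-32>>
end

defmodule UserInt do
  @behaviour TypeSketch

  # A user-supplied handler that claims the same type code as the built-in one.
  @impl true
  def type_codes, do: [0x38]

  @impl true
  def decode(<<value::little-signed-32>>, _metadata), do: {:user_int, value}

  @impl true
  def encode({:user_int, value}, _metadata), do: <<value::little-signed-32>>
end

defmodule RegistrySketch do
  # Built-in handlers are inserted first, so a user handler registered for
  # the same code overwrites them. This is one way to give user types priority.
  def new(user_types, builtin_types) do
    for mod <- builtin_types ++ user_types, code <- mod.type_codes(), into: %{} do
      {code, mod}
    end
  end

  def fetch(registry, code), do: Map.fetch(registry, code)
end
```

With this layout, `RegistrySketch.new([UserInt], [BuiltinInt])` resolves code `0x38` to `UserInt`, and an unregistered code yields `:error` instead of crashing.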

Phase 4: Transport Abstraction

  • Task 4.1 - Create Tds.Transport behaviour and Transport.Tcp implementation
  • Task 4.2 - Implement Transport.Tls + TlsHandshake cb_info module (eliminates GenServer, fixes race
    conditions)
  • Task 4.3 - Migrate Protocol to use Transport

Phase 5: State Machine Cleanup

  • Task 5.1 - Create Protocol.State with explicit phase transitions (eliminates Process.put, guards
    invalid transitions)
  • Task 5.2 - Extract Protocol.Connection and Protocol.Execution modules
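
A rough sketch of what explicit, guarded phase transitions might look like. The phase names and the transition table are assumptions made for this example; the actual `Protocol.State` design is not spelled out here.

```elixir
defmodule StateSketch do
  defstruct phase: :disconnected

  # Hypothetical transition table: each phase lists the phases it may move to.
  @transitions %{
    disconnected: [:prelogin],
    prelogin: [:login, :disconnected],
    login: [:ready, :disconnected],
    ready: [:executing, :disconnected],
    executing: [:ready, :disconnected]
  }

  def new, do: %__MODULE__{}

  # Returns the updated state, or an error tuple when the move is not allowed.
  def transition(%__MODULE__{phase: from} = state, to) do
    if to in Map.get(@transitions, from, []) do
      {:ok, %{state | phase: to}}
    else
      {:error, {:invalid_transition, from, to}}
    end
  end
end
```

Because the phase lives in the struct rather than in the process dictionary, every transition is visible at the call site and an invalid one surfaces as a value the caller has to handle.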

Phase 6: Error Handling

  • Task 6.1 - Create error hierarchy (Tds.Error with :reason, :errors, :context, :original)
  • Task 6.2 - Aggregate multiple SQL errors, add operation context, classify errors

Phase 7: Login/Prelogin & Connection Init Cleanup

  • Task 7.1 - Fix Prelogin bugs (exhaustive encryption state match, replace inspect(self()) thread ID)
  • Task 7.2 - Fix Login7 (input validation, automatic offset calculation, FeatureExt groundwork)
  • Task 7.3 - Add session option validation with SQL injection prevention

Bonus

  • Task B.1 - Create mix tds.export_errors task for refreshing SQL Server error codes CSV

Quality Gates

  • All 269 existing tests pass after every commit
  • Code coverage analysis to identify blind spots (currently 68% overall)
  • Each new module has dedicated unit tests
  • Compile clean: mix compile --warnings-as-errors
  • Format clean: mix format --check-formatted
  • No references to old modules: grep -r "BinaryUtils\|Protocol.Grammar" lib/ returns empty
  • Public API unchanged: lib/tds.ex function signatures match v2.x
  • Full integration tests against SQL Server via Docker

This module will be used as the source of truth; it will also allow us to quickly navigate the code by searching for symbol usage.
… feature extensions

New tokens added (not in the original implementation):

- `offset` (0x78), `altmetadata` (0x88), `dataclassification` (0xA3), `tabname` (0xA4), `colinfo` (0xA5), `featureextack` (0xAE), `altrow` (0xD3), `sessionstate` (0xE4), `sspi` (0xED), `fedauthinfo` (0xEE)

Consolidates little-endian macros from `Tds.BinaryUtils` and big-endian and parameterized macros from `Tds.Protocol.Grammar` into a single module.

Establishes clear conventions for byte order, with little-endian as the default and `_be` suffixes for big-endian variants. This centralizes common binary encoding and decoding utilities, improving consistency and maintainability across the TDS protocol implementation.
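
The byte-order convention can be sketched as follows. The module and macro names (`BinarySketch`, `uint16`, `uint16_be`, `uint32`) are assumptions for illustration and may not match what `Tds.Protocol.Binary` actually exports; the technique itself (macros that expand to bitstring modifiers) is standard Elixir.

```elixir
defmodule BinarySketch do
  # Unsuffixed macros are little-endian, the default in this convention.
  defmacro uint16, do: quote(do: little - unsigned - 16)
  defmacro uint32, do: quote(do: little - unsigned - 32)

  # The _be suffix marks the big-endian variant explicitly.
  defmacro uint16_be, do: quote(do: big - unsigned - 16)
end

defmodule BinarySketchDemo do
  import BinarySketch

  # The macros expand at compile time inside the bitstring specifier.
  def encode_le(n), do: <<n::uint16()>>
  def encode_be(n), do: <<n::uint16_be()>>

  # They work the same way in patterns.
  def decode_le(<<n::uint16(), rest::binary>>), do: {n, rest}
  def decode_u32(<<n::uint32(), rest::binary>>), do: {n, rest}
end
```

For example, `BinarySketchDemo.encode_le(0x1234)` yields `<<0x34, 0x12>>`, while `BinarySketchDemo.encode_be(0x1234)` yields `<<0x12, 0x34>>`.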
Centralizes all TDS protocol constants (packet types, token types, data types, etc.) into `Tds.Protocol.Constants` and binary utility functions into `Tds.Protocol.Binary`. This refactor replaces hardcoded hexadecimal values with compile-time macros, enhancing maintainability and consistency across the codebase.
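
A minimal sketch of the compile-time constant idea, using a few of the token codes listed above. The module name `ConstantsSketch`, the `token/1` macro, and the `token_name/1` reverse lookup are hypothetical; the real `Tds.Protocol.Constants` API may be shaped differently.

```elixir
defmodule ConstantsSketch do
  # Token codes taken from the list of newly added tokens.
  @tokens [
    offset: 0x78,
    altmetadata: 0x88,
    featureextack: 0xAE,
    sessionstate: 0xE4,
    sspi: 0xED,
    fedauthinfo: 0xEE
  ]

  # Expands to an integer literal at compile time, so it can be used in
  # patterns and guards. An unknown name fails compilation with a KeyError.
  defmacro token(name) when is_atom(name) do
    Keyword.fetch!(@tokens, name)
  end

  # Reverse lookup generated from the same table (single source of truth).
  for {name, code} <- @tokens do
    def token_name(unquote(code)), do: unquote(name)
  end

  def token_name(_code), do: :unknown
end

defmodule TokenDemo do
  import ConstantsSketch, only: [token: 1]

  # Reads as a named token instead of a bare hexadecimal value.
  def classify(<<token(:sspi), _rest::binary>>), do: :sspi
  def classify(<<token(:featureextack), _rest::binary>>), do: :featureextack
  def classify(_other), do: :other
end
```

Searching the codebase for `token(:sspi)` then finds every place the token is used, which is harder to do reliably with a bare `0xED`.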

Corrects a bug in `Tds.Protocol.Prelogin` where the `:instopt` prelogin token was incorrectly decoded as an `:encryption` token.

@mjaric mjaric requested a review from josevalim February 27, 2026 11:48

Introduces `Tds.Protocol.Packet` for TDS packet framing.

Provides `encode/2` to split payloads into 4096-byte packets, each with an 8-byte header and up to 4088 bytes of data. Packet IDs start at 1 and wrap at 256.
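
The framing described above can be sketched roughly like this. It assumes the standard TDS header layout (type, status, big-endian length, SPID, packet ID, window) and an end-of-message status bit of `0x01` on the last packet; the module name `PacketSketch`, the argument order, and the iodata return shape are assumptions and may not match `Tds.Protocol.Packet.encode/2`.

```elixir
defmodule PacketSketch do
  @header_size 8
  @packet_size 4096
  @max_data @packet_size - @header_size
  @status_normal 0x00
  @status_eom 0x01

  # Splits the payload into packets of at most 4088 data bytes and returns
  # one iodata element per packet. Packet IDs start at 1.
  def encode(type, payload) when type in 0..255 do
    payload
    |> IO.iodata_to_binary()
    |> split()
    |> Enum.with_index(1)
    |> Enum.map(fn {{chunk, last?}, id} -> frame(type, chunk, last?, id) end)
  end

  # A full-size chunk followed by more data is not the last packet.
  defp split(<<chunk::binary-size(@max_data), rest::binary>>) when rest != <<>> do
    [{chunk, false} | split(rest)]
  end

  # Whatever remains (possibly empty) is the last packet.
  defp split(chunk), do: [{chunk, true}]

  defp frame(type, chunk, last?, id) do
    status = if last?, do: @status_eom, else: @status_normal
    length = @header_size + byte_size(chunk)
    # The packet ID is one byte, so it wraps around after 255.
    [<<type, status, length::big-16, 0::16, rem(id, 256), 0>>, chunk]
  end
end
```

A 5000-byte payload produces two packets in this sketch: a full 4096-byte packet with status `0x00`, followed by a 920-byte packet (912 data bytes plus the header) with the end-of-message bit set.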

Adds `decode_header/1` to parse 8-byte packet headers from incoming binaries.
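
For illustration, header parsing along these lines could look like the sketch below. The return shape (`{:ok, header_map, rest}` or `{:error, :incomplete_header}`) and the module name `HeaderSketch` are assumptions, not the confirmed signature of `decode_header/1`.

```elixir
defmodule HeaderSketch do
  # Parses the 8-byte TDS packet header; length and SPID are big-endian.
  def decode_header(
        <<type, status, length::big-16, spid::big-16, packet_id, window, rest::binary>>
      ) do
    header = %{
      type: type,
      status: status,
      length: length,
      spid: spid,
      packet_id: packet_id,
      window: window
    }

    {:ok, header, rest}
  end

  # Fewer than 8 bytes available: the caller needs to read more data.
  def decode_header(bin) when is_binary(bin), do: {:error, :incomplete_header}
end
```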
Introduces `Packet.reassemble/2` to reconstruct complete TDS messages from a stream of packets. This function correctly handles message fragmentation by:

*   Concatenating data from multiple packets.
*   Validating sequential packet IDs to ensure correct ordering, including wrap-around behavior.
*   Enforcing a configurable maximum payload size to mitigate denial-of-service risks.
*   Gracefully managing partial `recv` operations and multiple packets within a single read from the socket.
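
The points above can be sketched as a pure function over incoming bytes, with no socket involved. This is not the actual `Packet.reassemble/2`: the module name `ReassembleSketch`, the `new/1` and `feed/2` functions, the return tuples, and the 1,000,000-byte default limit are all assumptions chosen to keep the example self-contained.

```elixir
defmodule ReassembleSketch do
  @header_size 8
  @eom 0x01

  defstruct buffer: <<>>, chunks: [], size: 0, next_id: 1, max: 1_000_000

  # :max_payload_size bounds the reassembled message (default is hypothetical).
  def new(opts \\ []) do
    %__MODULE__{max: Keyword.get(opts, :max_payload_size, 1_000_000)}
  end

  # Feed bytes as they arrive; a read may hold a partial packet or several.
  def feed(%__MODULE__{} = state, bytes) when is_binary(bytes) do
    step(%{state | buffer: state.buffer <> bytes})
  end

  # A complete packet (header plus all of its data) is available.
  defp step(%{buffer: <<_type, status, len::big-16, _spid::16, id, _win, rest::binary>>} = st)
       when len >= @header_size and byte_size(rest) >= len - @header_size do
    data_len = len - @header_size
    <<data::binary-size(data_len), tail::binary>> = rest
    size = st.size + data_len

    cond do
      id != st.next_id ->
        {:error, {:unexpected_packet_id, id}}

      size > st.max ->
        {:error, :payload_too_large}

      Bitwise.band(status, @eom) == @eom ->
        {:ok, IO.iodata_to_binary(Enum.reverse([data | st.chunks])), tail}

      true ->
        # Expect the next sequential ID, wrapping around after 255.
        next = rem(id + 1, 256)
        step(%{st | buffer: tail, chunks: [data | st.chunks], size: size, next_id: next})
    end
  end

  # A declared length smaller than the header can never be valid.
  defp step(%{buffer: <<_::binary-size(2), len::big-16, _::binary>>})
       when len < @header_size,
       do: {:error, :invalid_length}

  # Not enough bytes yet: keep the state and wait for the next read.
  defp step(state), do: {:more, state}
end
```

A caller would loop on `{:more, state}`, passing each new `recv` result to `feed/2`, until it gets `{:ok, message, leftover}` or an error tuple. Any bytes after the end-of-message packet are returned as `leftover` rather than dropped.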
Improves code quality and readability within the packet reassembly module.

- Relocates type and constant definitions to the module top for better organization.
- Removes an unused parameter from the `extract_and_continue` function.
- Enhances documentation for the `:max_payload_size` option to show a clear default value.
- Replaces a nested conditional with pattern matching in `finish_or_continue` for cleaner, more idiomatic code.

Migrates all calls to `encode_packets` and `encode_header` in `Messages`, `Prelogin`, and `Login7` to use the unified `Tds.Protocol.Packet.encode/2` function.

Removes the now-unused `encode_header/4` and `encode_packets/3` from `Messages`, centralizing packet serialization logic and reducing duplication.
Adjusts `login7_test.exs` to correctly handle `iodata` returned by `Login7.encode`, converting it to a binary for assertion.

Removes `messages_test.exs` as its functionality is now covered and superseded by `PacketTest`.
Replaces the internal `msg_recv` and `next_tds_pkg` functions with `Packet.reassemble/1`.

This centralizes the logic for receiving and reassembling TDS packets into a dedicated module. It simplifies the `Tds.Protocol` module and improves error handling by leveraging the new `Packet` module's more robust error reporting.