
db-openai: Add AsyncDatabricksSession class to support Session Protocol for Stateful Conversation Management via SQLAlchemy engine#316

Merged
jennsun merged 17 commits into main from openai-lakebasesqlalchemysession
Feb 10, 2026
Conversation

jennsun (Contributor) commented Feb 3, 2026

Adds AsyncDatabricksSession, a session storage implementation for the OpenAI Agents SDK that persists conversation history to Databricks Lakebase.

This class subclasses OpenAI's SQLAlchemySession to inherit all of its SQL logic while adding Lakebase-specific features:

  • Automatic OAuth token rotation via SQLAlchemy's do_connect event
  • Instance name resolution and username inference from _LakebasePoolBase

More on Session protocol:
https://openai.github.io/openai-agents-python/ref/memory/session/#agents.memory.session.Session

Connection pooling uses SQLAlchemy's default QueuePool (pool_size=5, max_overflow=10).

Usage:

from databricks_openai.agents import AsyncDatabricksSession

session = AsyncDatabricksSession(
    session_id=session_id,
    instance_name=LAKEBASE_INSTANCE_NAME,
)
result = Runner.run_streamed(agent, input=messages, session=session)

Example queries:

curl -X POST http://localhost:8000/invocations \
    -H "Content-Type: application/json" \
    -d '{"input": [{"role": "user", "content": "Hello I live in SF!"}]}'

This returns a response that includes the session id:

{"object":"response","output":[{"type":"message","id":"__fake_id__","content":[{"annotations":[],"text":"Hi! What part of San Francisco are you in, and what are you looking for—recommendations (food/coffee, parks, things to do), help planning a day, or something else?","type":"output_text","logprobs":[]}],"role":"assistant","status":"completed","provider_data":{"model":"databricks-gpt-5-2","response_id":"chatcmpl-D5Ki6f7TKBNrVVfuxDL0Lu16YCQjz"}}],"custom_outputs":{"session_id":"fd57ff2c-1d66-4da3-ba28-3216d4e6d86e"}}

follow-up stateful question:

curl -X POST http://localhost:8000/invocations \
    -H "Content-Type: application/json" \
    -d '{
        "input": [{"role": "user", "content": "What city did I say I live in?"}],
        "custom_inputs": {"session_id": "fd57ff2c-1d66-4da3-ba28-3216d4e6d86e"}
    }'

gives us:

{"object":"response","output":[{"type":"message","id":"__fake_id__","content":[{"annotations":[],"text":"You said you live in SF (San Francisco).","type":"output_text","logprobs":[]}],"role":"assistant","status":"completed","provider_data":{"model":"databricks-gpt-5-2","response_id":"chatcmpl-D5KiakiUvMW2weIeLXUuPesxyNpwv"}}],"custom_outputs":{"session_id":"fd57ff2c-1d66-4da3-ba28-3216d4e6d86e"}}

Testing: unit and integration tests

sample agent: OpenAI MemorySession Stateful Agent Example
sample app: https://eng-ml-agent-platform.staging.cloud.databricks.com/apps/j-openai-stateful?o=2850744067564480

@jennsun changed the title from "memorysession subclassing SQLAlchemySession" to "OpenAI: Add MemorySession class to support Session Protocol for Stateful Conversation Management" Feb 4, 2026
@jennsun jennsun marked this pull request as ready for review February 4, 2026 00:13
@jennsun jennsun requested a review from bbqiu February 4, 2026 01:49
# ensuring fresh tokens are injected via do_connect event.
engine = create_async_engine(
url,
pool_recycle=DEFAULT_POOL_RECYCLE_SECONDS,
Contributor Author:
Connection pooling happens here via SQLAlchemy's default QueuePool, and pool_recycle=2700 ensures connections are recycled every 45 minutes (before the 50-minute token cache expires), at which point the do_connect event injects a fresh token.
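The timing invariant described here can be sketched in isolation. This is a simplified, hypothetical stand-in for the token cache (the real logic lives in _LakebasePoolBase); the constants mirror the 45-minute recycle and 50-minute cache mentioned above:

```python
import time

DEFAULT_POOL_RECYCLE_SECONDS = 2700   # 45 minutes: pool recycles connections
TOKEN_CACHE_DURATION_SECONDS = 3000   # 50 minutes: cached token lifetime


class TokenCache:
    """Illustrative TTL cache: returns the cached token until it goes stale."""

    def __init__(self, fetch, ttl=TOKEN_CACHE_DURATION_SECONDS, clock=time.monotonic):
        self._fetch, self._ttl, self._clock = fetch, ttl, clock
        self._token, self._fetched_at = None, float("-inf")

    def get(self) -> str:
        now = self._clock()
        if now - self._fetched_at >= self._ttl:
            self._token, self._fetched_at = self._fetch(), now
        return self._token


# Simulated clock: a connection recycled at t=2700 still sees a valid
# cached token, because recycle happens before the t=3000 expiry.
t = [0.0]
fetches = []

def fetch():
    fetches.append(1)
    return f"token-{len(fetches)}"

cache = TokenCache(fetch, clock=lambda: t[0])
first = cache.get()                       # t=0: initial fetch
t[0] = DEFAULT_POOL_RECYCLE_SECONDS
second = cache.get()                      # t=2700: within TTL, token reused
assert DEFAULT_POOL_RECYCLE_SECONDS < TOKEN_CACHE_DURATION_SECONDS
print(first == second)  # → True
```

The key relationship is simply recycle interval < token TTL, so every recycled connection re-enters do_connect while a still-fresh (or newly fetched) token is available.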

@@ -0,0 +1,234 @@
"""
Collaborator:
do we want this to be importable from databricks_openai.agents?

Contributor Author:
sure, I'll make this importable so instead of:

from databricks_openai.agents.session import MemorySession

import path will look like:

from databricks_openai.agents import MemorySession

DEFAULT_DATABASE = "databricks_postgres"


class _LakebaseCredentials(_LakebasePoolBase):
Collaborator:
does it make sense to rename _LakebasePoolBase?

also could you remind me why we didn't have the cache lock as an attribute of the _LakebasePoolBase instance?

Contributor Author:
we had separate locks (sync vs. async) for the sync and async LakebasePools we implemented, so each subclass adds its own cache lock

Contributor Author:
I'll rename _LakebasePoolBase to _LakebaseBase since it's a more generic class for resolving the Lakebase host/username and caching tokens - the actual pooling logic is implemented in the subclasses LakebasePool/AsyncLakebasePool

return token


class MemorySession(SQLAlchemySession):
Collaborator:
pls correct me if i'm wrong, but i think this is async only?

can we leave some clarification about this in the docstrings / throw a helpful error if someone tries to run this synchronously

Contributor Author:
yes, it's async only - looking at the source code, all of these classes implement the async interface (since they follow the Session protocol):
https://github.com/openai/openai-agents-python/blob/main/src/agents/memory/session.py

I'll cover this in unit tests and rename it to AsyncDatabricksSession to make it clearer

)

# Attach event to inject Lakebase token before each connection
# Note: do_connect fires on sync_engine even for async operations
Collaborator:
we are creating an async engine right? is this an old comment

Contributor Author:
yes, but the async engine is a wrapper around a sync engine:

there is not yet an “async” version of a SQLAlchemy event handler

"Events can be registered at the instance level (e.g. a specific AsyncEngine instance) by associating the event with the sync attribute that refers to the proxied object. For example to register the PoolEvents.connect() event against an AsyncEngine instance, use its AsyncEngine.sync_engine attribute as target."
link: https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html#using-events-with-the-asyncio-extension
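A minimal sketch of the pattern being described: registering a do_connect listener that could inject credentials before each new DBAPI connection. An in-memory SQLite engine stands in for Lakebase here; with an AsyncEngine you would register the same listener against engine.sync_engine instead, per the linked docs:

```python
from sqlalchemy import create_engine, event, text

engine = create_engine("sqlite://")  # stand-in for the Lakebase engine
calls = []


@event.listens_for(engine, "do_connect")
def provide_token(dialect, conn_rec, cargs, cparams):
    # In the Lakebase case, a freshly cached OAuth token would be placed
    # into cparams["password"] here; this sketch only records the call.
    calls.append(True)


with engine.connect() as conn:
    result = conn.execute(text("select 1")).scalar()
```

Because do_connect fires on every new DBAPI connection, pairing it with pool_recycle guarantees recycled connections pick up a fresh token without any changes to query-time code.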

return self._credentials.username

@property
def connection_url(self) -> str:
Collaborator:
nit: do we wanna reuse this in _create_engine func above?

token_cache_duration_seconds=token_cache_duration_seconds,
)

engine = self._create_engine(**engine_kwargs)
Collaborator:
iiuc, this can potentially be run on every new conversation right?

is there any way to reuse an engine across sessions? if not, should we try to make these operations async via event loops

@jennsun jennsun requested a review from bbqiu February 5, 2026 19:49
return token


class AsyncDatabricksSession(SQLAlchemySession):
Collaborator:
from talking to the research team working on DBRA, they actually have a very similar snippet as us to manage a SQLAlchemy connection to lakebase: https://sourcegraph.prod.databricks-corp.com/databricks-eng/universe/-/blob/research/aroll/app/aroll_app/db/connection.py?L162-182

would it make sense for us to further abstract this by providing a similar AsyncLakebaseSQLAlchemy / LakebaseSQLAlchemy class?

jennsun (Contributor Author) commented Feb 6, 2026:
discussed offline - I'll refactor accordingly; this will create much cleaner separation of concerns so that future frameworks can reuse any SQLAlchemy engines, etc.!

@jennsun jennsun requested a review from bbqiu February 6, 2026 02:04
# Class-level cache for AsyncLakebaseSQLAlchemy instances, keyed by instance_name.
# This allows multiple AsyncDatabricksSession instances to share a single engine/pool.
_lakebase_sql_alchemy_cache: dict[str, AsyncLakebaseSQLAlchemy] = {}
_lakebase_sql_alchemy_cache_lock = Lock()
jennsun (Contributor Author) commented Feb 6, 2026:
thoughts on the class-level cache for AsyncLakebaseSQLAlchemy engines keyed by instance_name?

this is so we reuse a single SQLAlchemy engine / pool per Lakebase instance, avoiding repeated pool creation, TCP handshakes, and auth setup.

sessions are still created per Runner.run(), but engines are shared

Collaborator:
this approach looks good to me to minimize IO. two comments:

  • we may want to include a param for a func for customers to customize the cache key. currently, diff engine kwargs for the same instance name will be ignored
  • let's also call this out in the docstring and add a param to optionally disable this engine caching

Collaborator:
the best case would be include engine kwargs + instance name in the cache key

Contributor Author:
sounds good - going to create a cache key that takes into account both instance name + engine kwargs, plus the ability to not cache the engines (defaults to caching)
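A sketch of what this keying scheme could look like. Names here are illustrative, not the merged code, and object() stands in for real engine construction:

```python
import json

_engine_cache: dict[str, object] = {}


def _cache_key(instance_name: str, engine_kwargs: dict) -> str:
    # sort_keys makes logically-equal kwarg dicts produce the same key,
    # so differing engine kwargs for the same instance get distinct engines.
    return f"{instance_name}:{json.dumps(engine_kwargs, sort_keys=True, default=str)}"


def get_engine(instance_name: str, engine_kwargs: dict, *, cache: bool = True):
    if not cache:
        return object()  # opt-out path: always build a fresh engine
    key = _cache_key(instance_name, engine_kwargs)
    if key not in _engine_cache:
        _engine_cache[key] = object()  # stand-in for engine creation
    return _engine_cache[key]


a = get_engine("my-lakebase", {"pool_recycle": 2700})
b = get_engine("my-lakebase", {"pool_recycle": 2700})
c = get_engine("my-lakebase", {"pool_recycle": 600})
print(a is b, a is c)  # → True False
```

Same instance + same kwargs share one engine/pool; changing either yields a separate entry, which addresses the "diff engine kwargs are ignored" concern above.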

@@ -6,11 +6,15 @@
import uuid
Collaborator:
(ok for followup PR) we should probably think about separating this file into a few separate ones since it's getting quite long

model="databricks-claude-3-7-sonnet",
messages=[{"role": "user", "content": "hi"}],
tools=tools,
tools=cast(Any, tools),
bbqiu (Collaborator) commented Feb 6, 2026:
did we delete this change from the diff? i think we still need it cc @fanzeyi who ran into a bug that this was fixing earlier

Contributor Author:
context: #274 (comment)

Contributor Author:
added this back here and included unit tests to make sure the non-list inputs are handled gracefully!

]

[project.optional-dependencies]
memory = [
Collaborator:
can we update the CI job for this memory extra too

Contributor Author:
good catch!

bbqiu (Collaborator) left a comment:
overall LGTM, please address all comments and this looks ready to merge!

@jennsun jennsun requested review from bbqiu and fanzeyi February 9, 2026 21:11
bbqiu (Collaborator) left a comment:
lgtm, feel free to merge after addressing comments!

@jennsun changed the title from "OpenAI: Add MemorySession class to support Session Protocol for Stateful Conversation Management" to "db-openai: Add AsyncDatabricksSession class to support Session Protocol for Stateful Conversation Management via SQLAlchemy engine" Feb 10, 2026
@jennsun jennsun merged commit 4fbc9ea into main Feb 10, 2026
38 checks passed
@jennsun jennsun deleted the openai-lakebasesqlalchemysession branch February 10, 2026 00:48
jennsun added a commit to databricks/app-templates that referenced this pull request Feb 13, 2026
OpenAI AsyncDatabricksSession Stateful Agent Example
using session protocol class implemented in databricks/databricks-ai-bridge#316

* openai agents stateful example

* add session id to outputs

* update example w/ asyncdatabrickssession

* package release agent updates

* use uuid7 for example

* pr review updates

* add openai agent memory skill

* add to openai templates sync script

* run python sync skills

* databricks yml and use chatcontext convo id

* sanitize mcp tool output items https://github.com/databricks/app-templates/pull/119/changes

* deduplicate input logic

* update sanitize mcp handler to be more defensive

* rename from agent-openai-agents-sdk-stateful-memory to agent-openai-agents-sdk-short-term-memory