Upgrade to psycopg3#194
Conversation
IamJeffG
left a comment
There was a problem hiding this comment.
I agree this is a successful upgrade from v2 to v3 without introducing new problems.
That said, there are a couple things i want to clean up in future work. (I can make issues for these, if you agree):
- closing connections. There are many places I believe our scripts are not closing pg connections when done, and this can lead to connection scarcity on the server.
According to https://www.psycopg.org/psycopg3/docs/basic/from_pg2.html#with-connection:
- psycopg2:
with connection: manages the transaction only but does not close - psycopg 3:
with connection: manages the transaction and closes the connection
I say this to highlight that leaking connections is not new, but has always been there. psycopg 3 gives us a new tool, which is to lean into using context managers and it will clean up connections for us!
Specifically I suggest that we should set a psycopg 3 "style guide" for this repo that dictates to always use context managers with connection, unless an egregious circumstance.
- transaction management & autocommit. Every case our code addresses a transaction (by calling
conn.commit()or.rollback()) I think the code would have been clearer if we had just opened the connection withautocommit=Truein the first place. I do not notice any case where we are rolling back more than one query at a time.
It is not reasonable to expect the code maintainers to keep in mind which connections have autocommit enabled vs disabled (i.e. already in a transaction). I claim we should pick one setting or the other, and my preference, for this repo, would be autocommit=True.
Before moving forward, it's probably worth a look to see if we are leveraging "larger" transactions (around multiple commands at a time) and if so then where.
Combining 1 and 2 means this repo would never call connect() except as follows:
with psycopg.connect(autocommit=True):- I could be wrong about this (I acknowledge I didn't look all that closely) but I think StructuredDBWriter is opening and closing connections all the time. That's expensive. Can this class be refactored to open one connection, or better yet, to accept an already-open connection from the caller?
Again, no change needed on this PR. I will open new Issues if you agree.
…into upgrade-psycopg3
Co-authored-by: Jeffrey Gerard <IamJeffG@users.noreply.github.com>
Co-authored-by: Jeffrey Gerard <IamJeffG@users.noreply.github.com>
…scripts-hub into upgrade-psycopg3
|
@IamJeffG thank you for pointing out these additional improvements! I agree with your suggestions. I went ahead and audited the repository for every instance where we call I also checked and did not find any multi-statement transactions. Finally, I also added a note about database connection standards to I felt that these improvements were pretty well in scope with the overall motivation of this PR, nor a heavy lift to implement. That leaves a refactor of StructuredDBWriter. I think that's a different batch of work. If you're willing to write the ticket, that would be great 👍 |
|
The changes to StructuredDBWriter will happen in #210 |
Goal
Finishes what #99 started.
What I changed and why
db_operationstodsnas keyword argumentfinally: conn.close()block as the context manager handles it (https://access.crunchydata.com/documentation/psycopg3/3.1.9/basic/from_pg2.html#with-connection)conn.commit()after successful operations andconn.rollback()in exception handlers as psycopg3 requires explicit transaction control (https://www.psycopg.org/psycopg3/docs/basic/transactions.html)postgres_to_csv,alerts_gcs_test) to align with new behavior from psycopg3.What I'm not doing here
LLM use disclosure
Claude Sonnet 4.5 in Cursor to help implement changes to imports and adjusting some tests across the repo.