Skip to content

New Agent: tauHK#961

Open
merny93 wants to merge 24 commits intomainfrom
tauhk
Open

New Agent: tauHK#961
merny93 wants to merge 24 commits intomainfrom
tauhk

Conversation

@merny93
Copy link
Copy Markdown

@merny93 merny93 commented Dec 15, 2025

This PR aims to integrate the tauHK cryogenic housekeeping instrumentation into the ocs framework for use on the LAT.

Description

The tauHK instrument interfaces with a host computer over ethernet and communicates with a daemon process. The code in this PR is a thin wrapper on top of the daemon.

Data and commands are sent over a UDP socket serialized using protobuf. A one time code-gen step is required to generate all the python protobuf definitions such that the agent can communicate with the daemon. Resulting protobuf definitions files are kept in the pb2/ directory but not tracked in git.

This agent spawns the daemon process using the subprocess module and pushes logs to an ocs feed. The compiled binary is supplied but not tracked in git.

Motivation and Context

tauHK will be installed on site during the ASO upgrades. This code will be deployed on amun .

How Has This Been Tested?

This agent code has been tested on a development machine at Princeton. The hardware and software were tested end-to-end along with ocs-web integration.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@BrianJKoopman BrianJKoopman self-requested a review January 6, 2026 19:50
Copy link
Copy Markdown
Member

@BrianJKoopman BrianJKoopman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this took a while to get to, but here's a first pass at a review. It'd be great to have the requested documentation for the dependencies.

Comment thread socs/agents/tauhk/agent.py Outdated
Comment on lines +37 to +41
# The linter may tell you that these are unused but they are needed for the protobuf validation
# The linter may tell you that these are unused but they are needed for the protobuf validation
from pb2 import meta_pb2, validate_pb2
from pb2.system_pb2 import HKdata, HKsystem
from protoc_gen_validate.validator import ValidationFailed, validate_all
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the duplicate comment, and add # noqa: 401 to the end of the lines that giving import unused errors in flake8 to ignore them, which are:

socs/agents/tauhk/agent.py:29:1: F401 'copy.deepcopy' imported but unused
socs/agents/tauhk/agent.py:30:1: F401 'datetime.datetime' imported but unused
socs/agents/tauhk/agent.py:36:1: F401 'ocs.ocs_feed' imported but unused
socs/agents/tauhk/agent.py:39:1: F401 'pb2.meta_pb2' imported but unused
socs/agents/tauhk/agent.py:39:1: F401 'pb2.validate_pb2' imported but unused
socs/agents/tauhk/agent.py:41:1: F401 'protoc_gen_validate.validator.ValidationFailed' imported but unused

Comment thread docs/agents/tauhk.rst
Copy link
Copy Markdown
Member

@BrianJKoopman BrianJKoopman Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is largely incomplete. There's a complete example in the ocs docs. You can also see any other agent's reference page for an example. Important things missing:

  • The argparse block, which automatically displays arguments for the agent
  • A description of dependencies for running this agent (I know we've discussed this offline, but just for the record here). This should describe, or otherwise link to, instructions for installing the required dependencies.
  • Configuration file examples
  • "Agent API" section

Could you also add it to the index (in alphabetical order)?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the first link may be incorrect. Was it supposed to point to: https://flake8.pycqa.org/en/latest/user/violations.html#in-line-ignoring-errors

Can you provide the link you had in mind please!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry about that, updated in the original post. It's here: https://ocs.readthedocs.io/en/main/developer/agent_references/documentation.html

Comment on lines +1 to +18
'''
protoc --python_out=pb2/ -I../protos -I../protos/include ../protos/system.proto ../protos/include/hk.proto ../protos/include/validate.proto ../protos/include/meta.proto

the import in the generated tauhk file is wrong
to fix it, change the import in the generated file from:
import blah as blah
to:
from . import blah as blah

pb2/include/hk_pb2.py will need to be adjusted as follows:
from .. import meta_pb2 as meta__pb2
from google.protobuf import empty_pb2 as google_dot_protobuf_dot_empty__pb2
from .. import validate_pb2 as validate__pb2
from google.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2

from ..meta_pb2 import *

'''
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better in the tauhk.rst file when describing how to install the dependencies.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, can you wrap your lines to ~80-100 characters? Your editor might have a way to automatically do this. (I still use 80, but recognize that monitors are typically wider these days.)

(I thought this was somewhere in the SO Developer Guide, which we link to from our contributing guide, but I actually can't seem to find it...however that is the convention throughout the repo.)

self.log = agent.log
self._take_data = False

self.command_port = ("127.0.0.1", 3006)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where self.command_port is passed it typically refers to "send command to the tauHK system". Why is the address the localhost though? Is this just sending the command to another process running locally on this machine?

This is a leading question towards suggesting that this be configurable (a 3006 default is fine) -- since the hardcoded port might not be open on the host.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is to communicate with the daemon process that is spawned by the agent to communicate with the hardware. It runs on the same computer as the agent. These ports are hardcoded on the daemon side so we can not make this configurable unless the daemon code is also modified.

Comment on lines +306 to +308
rtdChannel.logdac sets the excitation range for RTDs from 0(min) to 15(max)
rtdChannel.uvolts sets the excitation voltage in microvolts (e.g., 56 for 56uV)
diodeChannel.excitation sets the excitation mode for diodes: 0=DC, 1=AC, 2=None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the values exceed the ranges here? Is that caught on the tauhk side?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple places where it is confirmed. The set_param function will check the ranges and may error out returning the status to the user.


return True, f"Sent {channel_name}.{option_name}={val} successfully."

def advertise(self, session, params):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thought here is that a script would check the advertise processes' session.data, find a valid command, then use generic_send to send that command and the desired value?

Is this communicating with the tauHK hardware at all? Where do the available commands come from?

Also, in the running agent on site, this seems very static. When does the available command list change?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advertise process was added to show to the webocs plugin that we wrote what channels are commandable. The command list lives inside the system_pb2 definitions. It is generated by the protoc compiler from a config file that describes what hardware is connected and provides mappings for channel names, along with some other meta data.


@ocs_agent.param('config_file', type=str)
def load_config(self, session, params):
"""load_config()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Show the argument in the method signature:

Suggested change
"""load_config()
"""load_config(config_file)

Comment on lines +496 to +497
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
sock.sendto(message.SerializeToString(), command_port)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Protobuf do something to ensure delivery of the command? Else, you might run into instances where your command doesn't make it to the device.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is tricky to handle since all communication is UDP. That being said, the hardware streams "state" information back and a user can look at grafana to see if their desired command took effect. For example when adjusting an RTD excitation the user can look at the logdac field for that channel to see if it was changed.


@lru_cache(maxsize=256)
def get_spf(channel_name, channel_quantity):
'''get_spf(channel_name, channel_quantity)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need really to override the method signatures in these simple functions. That's only important in the ocs tasks/processes because they contain misleading arguments when rendered in Sphinx otherwise.

@BrianJKoopman
Copy link
Copy Markdown
Member

Pushed the changes from yesterday that we made to get things working. Slightly messy with the commits, sorry about that. We'll end up squashing in the end anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants