Feature/admx dl3 #195
Conversation
Sorry this is taking so long. I suggest splitting the review into two parts: New Features and Fixes.
Regarding Fix 1: this is definitely a result of switching the underlying dl implementation to dl-cpp, even though calibrations are solely implemented in dl-py. Everything specified in a configuration file is first processed by the C++ code, and the specifications for how that's interpreted are taken, essentially, from the JSON specs. In a JSON object, the keys are always strings. I haven't given much thought yet to a solution, but I think we want to bound it by (a) satisfying the need for arbitrary type mappings in a calibration, matching the dl2 behavior, and (b) not making the config-file notation more (or "too"?) complicated.
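To illustrate the string-keys point, here is a small standalone sketch (the `calibration` map and the normalization step are invented for illustration, not the actual dripline code):

```python
import json

# In a JSON object the keys are always strings, so a calibration map that was
# written with boolean keys comes back with the string keys "true"/"false".
calibration = json.loads('{"true": "enabled", "false": "disabled"}')

value_raw = True              # the raw value arrives as a Python bool
key = str(value_raw).lower()  # hypothetical normalization: True -> "true"
print(calibration[key])       # prints "enabled"
```

Any fix along these lines has to decide where that normalization lives and how non-string key types other than bools should be spelled in the config file.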
@dzhang0305 Regarding new feature 2 (adding the authentication object to the Interface init parameters), can you please describe a little about the use case for this?
@dzhang0305 Regarding new feature 3 (new heartbeat), let me see if I understand the issue and fix correctly, and please correct anything that's wrong:
Assuming that's more-or-less correct, I have some thoughts and questions:
I was thinking dl-serve could run with args like `--auth-file /root/authentications.json` instead of relying only on the environment variables DRIPLINE_USER and DRIPLINE_PASSWORD. I don't think it's very crucial.
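A minimal sketch of what that fallback behavior could look like (the `--auth-file` flag name comes from the comment above; the `load_auth` helper and the JSON layout are hypothetical, not the dripline-cpp CLI):

```python
import argparse
import json
import os

# Hypothetical sketch: a --auth-file flag that falls back to the existing
# DRIPLINE_USER / DRIPLINE_PASSWORD environment variables when absent.
parser = argparse.ArgumentParser(prog="dl-serve")
parser.add_argument("--auth-file", help="JSON file with broker credentials")

def load_auth(path):
    """Prefer the auth file if it exists; otherwise use the env vars."""
    if path and os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {
        "username": os.environ.get("DRIPLINE_USER"),
        "password": os.environ.get("DRIPLINE_PASSWORD"),
    }

args = parser.parse_args([])  # no flag given -> env-var fallback
credentials = load_auth(args.auth_file)
```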
Yes, this is to solve the 'ghost queue and connection' problem. I don't have solid proof, but it feels like the problem showed up when the rabbitmq log related to this service got too long: the main connection of the queue gets buried in the history log (many connections are created and destroyed within a couple of ms). It becomes problematic when it takes more than 10 s (the timeout setting) to find the connection of the queue. I haven't tested how long it takes the queue/connection to fail; I will do it this week. My impression was that 2 hours without talking to the service is 100% problematic, while talking to it every 5 min is totally fine. I haven't tried only talking to the service. I agree that if it's implemented in dripline-cpp, it would be more universal. Or we could implement the aliveness check in k8s or docker swarm instead of dripline. For
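The periodic check being described might be sketched as follows (names like `rk_aliveness` and `heartbeat_broker_s` come from the PR description; the scheduler and the `send_request` callable are hypothetical stand-ins, not the dripline API):

```python
import threading

def start_aliveness_check(send_request, rk_aliveness, heartbeat_broker_s):
    """Hypothetical stand-in: periodically send a lightweight request to
    rk_aliveness so the broker connection never idles long enough to drop."""
    def _ping():
        send_request(rk_aliveness)
        timer = threading.Timer(heartbeat_broker_s, _ping)
        timer.daemon = True  # don't keep the process alive just for the timer
        timer.start()
    _ping()  # first ping fires immediately, then every heartbeat_broker_s seconds

# usage sketch: record pings instead of talking to a real broker
pings = []
start_aliveness_check(pings.append, "my_service", heartbeat_broker_s=300)
```

Implementing this in k8s or docker swarm instead would amount to the same idea: some external actor talks to the service often enough that the connection never goes stale.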
This is already possible. For all CL options, including those related to authentication, you can see the docs or use
New Features
- The `scheduled_log` functionality is copied from the develop branch but is not in the main branch.
- Add the old `authentication_obj` way to set the Authentication.
- Schedule a message to check that a specific routing key (`rk_aliveness`) is alive every `heartbeat_broker_s` seconds. This fixes the ghost queue and connection that appear after long idle time. This requires that the service yaml file has two more inputs, e.g.
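A hypothetical sketch of those two extra inputs in a service yaml file (the key names come from the description above; the values and surrounding structure are invented):

```yaml
# hypothetical service config fragment
rk_aliveness: my_service    # routing key whose liveness is checked
heartbeat_broker_s: 300     # seconds between scheduled aliveness checks
```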
Fixes
- Convert `value_raw` to a string before looking up a calibration dictionary, because the latter is by default interpreted as string-to-string if it's a dictionary (at least for bools).
- Change `this_select = sqlalchemy.select(return_cols)` to `this_select = sqlalchemy.select(*return_cols)`.

Prior to merging for releases:

- `CMakeLists.txt` file
- `appVersion` to be the new container image tag version in `chart/Chart.yaml`
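Regarding the second fix, a tiny stand-in (not the real SQLAlchemy API) showing what the added `*` changes: newer SQLAlchemy-style `select()` takes each column as a separate positional argument rather than a single list.

```python
# Stand-in sketch of why the star matters: each column must arrive as its
# own positional argument, so a list of columns has to be unpacked.
def select(*columns):
    return list(columns)

return_cols = ["timestamp", "value_cal"]

print(select(return_cols))   # [['timestamp', 'value_cal']] -- one list-valued argument
print(select(*return_cols))  # ['timestamp', 'value_cal']   -- two column arguments
```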