OCTRL-1078 Make Task Controller able to control OCC tasks#804
OCTRL-1078 Make Task Controller able to control OCC tasks#804justonedev1 wants to merge 4 commits intomasterfrom
Conversation
knopers8
left a comment
There was a problem hiding this comment.
thanks! I did not try to run it yet, but I already have some questions/suggestions.
| * === This file is part of ALICE O² === | ||
| * | ||
| * Copyright 2024 CERN and copyright holders of ALICE O². | ||
| * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch> |
There was a problem hiding this comment.
| * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch> |
applies to the rest as well.
| /* | ||
| Copyright 2024. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ |
There was a problem hiding this comment.
just a thought, we might want to know whether we should keep the original licences of auto-generated files or move to the one we use in O2.
| kind: Kustomization | ||
| images: | ||
| - name: controller | ||
| newName: docker.io/teom/aliecs |
There was a problem hiding this comment.
that's valid? see also the Makefile which refers to teom
There was a problem hiding this comment.
yeah, I haven't tried to build and deploy docker image by then yet, I was running everything by make run. I changed the names. However, in my upcoming PR with environment controller, I changed and split kustomization for each controller and fixed this naming.
| - O2_PARTITION={{ environment_id }} | ||
| user: "{{ user }}" | ||
| arguments: | ||
| - "/root/readout.yaml" |
There was a problem hiding this comment.
why is the manifest file added as an argument to readout?
There was a problem hiding this comment.
it is described in the documentation, basically it is a scaffold to let the executor know which manifest to load. You can see it inside the KubectlTask in other PR
| user: "{{ user }}" | ||
| value: "{{ len(modulepath)>0 ? _module_cmdline : _plain_cmdline }}" | ||
| arguments: | ||
| - "/root/stfbuilder-senderoutput.yaml" |
| ) | ||
|
|
||
| // fmqToOCCState maps FairMQ internal states to lowercase OCC state names | ||
| var fmqToOCCState = map[string]string{ |
There was a problem hiding this comment.
FYI, we do it a bit differently in the current system: https://github.com/AliceO2Group/Control/blob/master/core/integration/odc/fairmq/fairmq.go
It probably doesn't matter much in here though.
There was a problem hiding this comment.
I don't understand how is propagation of states from ODC to ecs core relevant in a context of sending and receiveing changes from/to fairmq devices
There was a problem hiding this comment.
The states from ODC are the states of FairMQ devices.
There was a problem hiding this comment.
sorry, I misunderstood what you are objecting to. You are right, I fixed it according to ODC rules, so we can at least see if there is some problem during multi-step transitions. But over-all it shouldn't matter too match as you mentioned in the original comment
| @@ -0,0 +1,69 @@ | |||
| apiVersion: aliecs.alice.cern/v1alpha1 | |||
There was a problem hiding this comment.
Perhaps this and the other manifest in this directory could be moved to ecs-manifests.
There was a problem hiding this comment.
this is leftover
| @@ -0,0 +1,306 @@ | |||
| # VERSION defines the project version for the bundle. | |||
There was a problem hiding this comment.
for my curiousity, this file is auto-generated by the operator SDK or AI-generated?
There was a problem hiding this comment.
this is all, first version is generated, than I made some changes manually and also used llm
| log.V(1).Info("new reconcile request on existing Task Kind", "request", req) | ||
|
|
||
| // Handle finalizers for clean deletion | ||
| if res, stop, err := r.handleFinalizer(ctx, t, log); err != nil || stop { |
There was a problem hiding this comment.
i am struggling to understand what it is needed for, could you explain please?
There was a problem hiding this comment.
finalizer is used for cleaning up tasks... it is basically just a tag to say "wait don't delete yet"
https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/
| // As far as I understand this is necessary because even though we have gRPC streams, we cannot rely | ||
| // on them being implemented |
There was a problem hiding this comment.
What do you mean that they might not be implemented? Why do we recognize it by seeing an empty t.Status.State?
There was a problem hiding this comment.
occlite doesn't implement StateStream, so we cannot rely on the consuming it. and get the status via automatic update, we need to query it.
introduction of controller for kubernetes inside the controller-operator folder.
This version allows us to control all OCC tasks (tested on readout, stfsender and stfbuilder). Currently there is not image and controller was run directly via
make runon the node where the kubernetes cluster is running. If you run it locally and cluster is on different computer, you would not be able to communicate with OCC process as these require opened portsManifests required to run these tasks are inside the
ecs-manifestssubfolder.In order to apply these manifests use
kubectl. There is an explanation which file is which inside thekubernetes-ecs.md.