Merged
28 changes: 26 additions & 2 deletions .github/workflows/test_integration_epic.yml
@@ -134,7 +134,7 @@ jobs:
echo "ldd bin/eicrecon"
ldd bin/eicrecon

- name: Generate EICrecon input data
- name: Generate EICrecon input data for craterlake
uses: eic/run-cvmfs-osg-eic-shell@main
with:
platform-release: "eic_xl:nightly"
@@ -146,7 +146,7 @@ jobs:
--gun.momentumMax "20*GeV" --gun.distribution "uniform" -N 20 \
--outputFile sim_e_1GeV_20GeV_craterlake.edm4hep.root -v WARNING

- name: Run EICrecon
- name: Run EICrecon for craterlake
uses: eic/run-cvmfs-osg-eic-shell@main
with:
platform-release: "eic_xl:nightly"
@@ -157,3 +157,27 @@ jobs:
export LD_LIBRARY_PATH=${GITHUB_WORKSPACE}/../EICrecon/lib:${GITHUB_WORKSPACE}/../EICrecon/lib/EICrecon/plugins:${JANA_HOME}/lib:${JANA_HOME}/lib/JANA/plugins:$LD_LIBRARY_PATH
../EICrecon/bin/eicrecon sim_e_1GeV_20GeV_craterlake.edm4hep.root

- name: Generate EICrecon input data for inner detector
uses: eic/run-cvmfs-osg-eic-shell@main
with:
platform-release: "eic_xl:nightly"
setup: "/opt/detector/epic-main/bin/thisepic.sh"
run: |
echo "--- Generating EICrecon input data ---"
npsim --compactFile ${DETECTOR_PATH}/${DETECTOR}_inner_detector.xml \
-G --random.seed 1 --gun.particle "e-" --gun.momentumMin "1*GeV" \
--gun.momentumMax "20*GeV" --gun.distribution "uniform" -N 20 \
--outputFile sim_e_1GeV_20GeV_inner_detector.edm4hep.root -v WARNING

- name: Run EICrecon for inner detector
uses: eic/run-cvmfs-osg-eic-shell@main
with:
platform-release: "eic_xl:nightly"
setup: "/opt/detector/epic-main/bin/thisepic.sh"
run: |
export JANA_HOME=$GITHUB_WORKSPACE
export JANA_PLUGIN_PATH=$GITHUB_WORKSPACE/../EICrecon/lib/EICrecon/plugins
export LD_LIBRARY_PATH=${GITHUB_WORKSPACE}/../EICrecon/lib:${GITHUB_WORKSPACE}/../EICrecon/lib/EICrecon/plugins:${JANA_HOME}/lib:${JANA_HOME}/lib/JANA/plugins:$LD_LIBRARY_PATH
export DETECTOR_CONFIG=epic_inner_detector # not important
../EICrecon/bin/eicrecon sim_e_1GeV_20GeV_inner_detector.edm4hep.root

112 changes: 112 additions & 0 deletions docs/behavior.md
@@ -3,6 +3,118 @@

## Introduction

## Components

### JFactories

#### Callbacks

The user-defined callbacks are `Init`, `Process`, `BeginRun`, `EndRun`, and `Finish`.

- `Init` is run at most once, and is used for loading and caching constant data. It is guaranteed to run before any call to `Process`.
If `Process` is never called, e.g. because the data is `Insert`ed instead, then `Init` will not be called.

- `Finish` only runs if `Init` has run, and is responsible for cleaning up and closing any state opened by `Init`.

- `BeginRun` runs after `Init` and before `Process`. It is only called if the run number has been set, and it runs before the first call to `Process` for that run number.
`BeginRun` is responsible for loading and caching data keyed off of the run number, e.g. conditions, calibrations, lookup tables, machine learning models.

- `EndRun` runs after a previously set run number gets changed. It runs after the last call to `Process` with that run number, and before the `BeginRun` for the new run number.
`EndRun` is responsible for cleaning up and closing any state opened by the previous call to `BeginRun`. `EndRun` is guaranteed to be called exactly once for each `BeginRun`, as long
as JANA2 is shut down cleanly.

- `Process` is called exactly once for every JEvent. By the time `Process` is called, JANA2 guarantees that `Init` will have been called, followed by `BeginRun` if the run number has been set.
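
The ordering guarantees in the list above can be sketched with a small toy model. This is not JANA2 code: `ToyFactory`, `Activate`, `Shutdown`, and the `kNoRun` sentinel are hypothetical names chosen for illustration, and the model assumes a run number of `-1` means "not set".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the callback ordering described above. NOT the JANA2 API:
// ToyFactory, Activate, Shutdown, and kNoRun are hypothetical names.
constexpr int kNoRun = -1;  // assumed sentinel for "run number not set"

class ToyFactory {
public:
    std::vector<std::string> log;  // records the order in which callbacks fire

    // Simulates activating the factory for one event with the given run number.
    void Activate(int run_number) {
        if (!init_ran_) {
            log.push_back("Init");  // at most once, before any Process
            init_ran_ = true;
        }
        if (run_number != previous_run_) {
            if (previous_run_ != kNoRun) log.push_back("EndRun");
            log.push_back("BeginRun");
            previous_run_ = run_number;
        }
        assert(init_ran_);  // Process never runs before Init
        log.push_back("Process");
    }

    // Simulates a clean shutdown: the last BeginRun gets its EndRun,
    // and Finish runs only if Init ran.
    void Shutdown() {
        if (!init_ran_) return;
        if (previous_run_ != kNoRun) log.push_back("EndRun");
        log.push_back("Finish");
    }

private:
    bool init_ran_ = false;
    int previous_run_ = kNoRun;
};
```

Calling `Activate` for two events in run 1 and one event in run 2, followed by `Shutdown`, records `Init`, then one `BeginRun`/`EndRun` pair per run wrapped around the `Process` calls, then `Finish`. A factory that is never activated records nothing at all.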

#### Activation
JFactories are lazy by design, which means that they won't be activated unless requested by another component. Activation is defined as calling `JFactory::Create` with a given JEvent, which
triggers the running of zero or more factory callbacks. Users are discouraged from calling `JFactory::Create` directly; instead, factories are usually activated via any of the following mechanisms:

- Declaring an `Input<T>` helper member variable on another JComponent.
- Calling `JEvent::Get*<T>()` or `JEvent::GetCollection<T>()`.
- Setting the parameter `jana:autoactivate=$DATABUNDLE_NAME`. This will activate the factory even though its results are never used downstream, which is mainly useful for debugging.

#### Exception handling
JFactory's user-defined methods are allowed to throw exceptions. Unlike other JANA2 components, throwing an exception here does not immediately terminate processing -- the
user has the opportunity to catch the exception in the caller. For instance, calls to `JEvent::Get*` may be wrapped in a try-catch block. The exception itself is wrapped in a `JException`, which preserves
the stack trace and component information. If a factory callback excepts, the exception is stored so that it can be re-thrown on future calls, and the excepting callback is not called again. If any
callback excepts, JANA2 will still store the contents of each `Output` helper's transient output buffer. This means that if `Process` excepts, all data inserted prior to the exception will be preserved.
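
The "store once, re-throw on every later call" behavior can be sketched with the standard `std::exception_ptr` machinery. This is a minimal toy under stated assumptions, not the JANA2 implementation: `CachingProducer`, `Create`, `PartialOutput`, and `calls` are hypothetical names, and a plain `std::vector<int>` stands in for the output buffer.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy sketch of exception caching. NOT JANA2 code: CachingProducer and
// all of its members are hypothetical names used for illustration.
class CachingProducer {
public:
    explicit CachingProducer(std::function<void(std::vector<int>&)> process)
        : process_(std::move(process)) {}

    // Returns the cached output, running the callback at most once.
    const std::vector<int>& Create() {
        if (exception_) std::rethrow_exception(exception_);  // same failure for every caller
        if (!done_) {
            done_ = true;
            ++calls;
            try {
                process_(output_);
            } catch (...) {
                exception_ = std::current_exception();  // output_ keeps any partial data
                throw;
            }
        }
        assert(!exception_);  // only reached when the callback succeeded
        return output_;
    }

    // Whatever the callback wrote before it threw (or the full output on success).
    const std::vector<int>& PartialOutput() const { return output_; }

    int calls = 0;  // how many times the callback actually ran

private:
    std::function<void(std::vector<int>&)> process_;
    std::vector<int> output_;
    std::exception_ptr exception_;
    bool done_ = false;
};
```

Every caller that wraps `Create` in a try-catch block sees the same exception, so one fault-tolerant consumer cannot hide the failure from the others, while the callback itself runs only once and its partial output stays available.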

#### State machine

The JFactory state machine is defined as follows. The state has two components, `InitStatus` and `Status`. `InitStatus <- {InitNotRun, InitRun, InitExcepted}`, which allows `JFactory::Create` to guarantee that
`Init` gets called at most once, even when some events have the factory data `Insert`ed and others let it be `Processed`. `Status <- {Empty, Processed, Inserted, Excepted}` captures what exactly
is in cache. `Empty` is the only state where no data has been cached/stored; `Excepted` means that an empty or partial collection was stored. This storage operation is guaranteed to happen exactly once
per activated factory per event, a requirement imposed by Podio's write-exactly-once semantics. The following transition diagram shows the relationship between valid states and transitions corresponding
to factory callbacks.

```mermaid

stateDiagram-v2
NE : (InitNotRun, Empty)
XE : (InitExcepted, Empty)
XX : (InitExcepted, Excepted)
RE : (InitRun, Empty)
RI : (InitRun, Inserted)
RP : (InitRun, Processed)
RX : (InitRun, Excepted)
NI : (InitNotRun, Inserted)
XI : (InitExcepted, Inserted)

[*] --> NE
NE --> RE: Init
RE --> RP: Process
RP --> RE: ClearData
RE --> RX: Process
RX --> RE: ClearData
RE --> RI: Insert
RI --> RE: ClearData
NE --> NI: Insert
NI --> NE: ClearData
NE --> XE: Init
XE --> XX: Process
XX --> XE: ClearData
XE --> XI: Insert
XI --> XE: ClearData
```

Note that many of these transitions are encapsulated behind `JFactory::Create`. The only operations available to the user are `Create` (i.e. activate), `Insert`, and `ClearData`.
Redrawing the state diagram in terms of these transitions gives us:

```mermaid
stateDiagram-v2
NE : (InitNotRun, Empty)
XE : (InitExcepted, Empty)
XX : (InitExcepted, Excepted)
RE : (InitRun, Empty)
RI : (InitRun, Inserted)
RP : (InitRun, Processed)
RX : (InitRun, Excepted)
NI : (InitNotRun, Inserted)
XI : (InitExcepted, Inserted)

[*] --> NE
NE --> RP: Create
RE --> RP: Create
RP --> RE: ClearData
RE --> RX: Create
RX --> RE: ClearData
RE --> RI: Insert
RI --> RE: ClearData
NE --> NI: Insert
NI --> NE: ClearData
NE --> XX: Create
XE --> XX: Create
XX --> XE: ClearData
XE --> XI: Insert
XI --> XE: ClearData
NE --> RX: Create
```

Although not shown for the sake of visual clarity, it is important to note that `Create` operations are idempotent, so all of the `Status:Inserted`, `Status:Processed`, and `Status:Excepted` states
have implicit `Create` transitions pointing back to themselves. Correspondingly, the `Status:Empty` states have `ClearData` transitions pointing back to themselves. However, the `Status:Inserted` states
do _not_ have `Insert` transitions pointing back to themselves. Multiple `Insert` operations are disallowed due to the write-once constraint.
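
The user-facing transitions described above can be modeled in a few lines. This is a toy under stated assumptions, not the real `JFactory`: `ToyState` is a hypothetical name, the boolean flags stand in for callbacks that throw, the toy `Create` records the resulting state without re-throwing, and rejecting `Insert` from any non-`Empty` state is an assumption drawn from the diagram, which only shows `Insert` leaving `Empty` states.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model of the (InitStatus, Status) state machine. NOT the JANA2 class:
// ToyState and the boolean "fails" flags are hypothetical.
enum class InitStatus { InitNotRun, InitRun, InitExcepted };
enum class Status { Empty, Processed, Inserted, Excepted };

struct ToyState {
    InitStatus init = InitStatus::InitNotRun;
    Status status = Status::Empty;

    // Create is idempotent: it only does work when the cache is Empty.
    void Create(bool init_fails, bool process_fails) {
        if (status != Status::Empty) return;
        if (init == InitStatus::InitNotRun) {
            init = init_fails ? InitStatus::InitExcepted : InitStatus::InitRun;
        }
        if (init == InitStatus::InitExcepted || process_fails) {
            status = Status::Excepted;  // an empty or partial collection is stored
        } else {
            status = Status::Processed;
        }
        assert(status != Status::Empty);  // Create always leaves something stored
    }

    // Insert is write-once and does not run Init.
    void Insert() {
        if (status != Status::Empty) throw std::logic_error("data already stored");
        status = Status::Inserted;
    }

    // ClearData empties the cache but never resets InitStatus.
    void ClearData() { status = Status::Empty; }
};
```

Because `ClearData` leaves `InitStatus` untouched, a factory whose `Init` excepted keeps landing in `(InitExcepted, Excepted)` on every later `Create`, which matches the `XE --> XX` edge in the second diagram.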


## Execution Engine

### Engine initialization
2 changes: 0 additions & 2 deletions src/libraries/JANA/Components/JHasOutputs.cc
@@ -6,10 +6,8 @@ void jana::components::UpdateFactoryStatusOnEulerianStore(JFactory* fac) {
// that the factory doesn't accidentally get re-run.
// We do this inside a weird little free function because we need to avoid creating
// a circular definition of JFactory in our templates.
// Eventually we will need to refactor JFactory::Status and CreationStatus.

fac->SetStatus(JFactory::Status::Inserted);
fac->SetCreationStatus(JFactory::CreationStatus::Inserted);
}


4 changes: 1 addition & 3 deletions src/libraries/JANA/JEvent.h
@@ -449,7 +449,6 @@ inline JFactoryT<T>* JEvent::Insert(T* item, const std::string& tag) const {
auto* factory = databundle->GetFactory();
if (factory != nullptr) {
factory->SetStatus(JFactory::Status::Inserted); // for when items is empty
factory->SetCreationStatus(JFactory::CreationStatus::Inserted); // for when items is empty
factory->SetInsertOrigin( mCallGraph.GetInsertDataOrigin() ); // (see note at top of JCallGraphRecorder.h)
return dynamic_cast<JFactoryT<T>*>(factory);
}
@@ -476,7 +475,6 @@ inline JFactoryT<T>* JEvent::Insert(const std::vector<T*>& items, const std::str
auto* factory = databundle->GetFactory();
if (factory != nullptr) {
factory->SetStatus(JFactory::Status::Inserted); // for when items is empty
factory->SetCreationStatus(JFactory::CreationStatus::Inserted); // for when items is empty
factory->SetInsertOrigin( mCallGraph.GetInsertDataOrigin() ); // (see note at top of JCallGraphRecorder.h)
return dynamic_cast<JFactoryT<T>*>(factory);
}
@@ -515,7 +513,7 @@ inline const podio::CollectionBase* JEvent::GetCollectionBase(std::string unique
return nullptr;
}

if (typed_bundle->GetStatus() == JDatabundle::Status::Empty) {
if (typed_bundle->GetStatus() == JDatabundle::Status::Empty || typed_bundle->GetStatus() == JDatabundle::Status::Excepted) {
auto* fac = typed_bundle->GetFactory();
if (fac != nullptr) {
JCallGraphEntryMaker cg_entry(mCallGraph, fac); // times execution until this goes out of scope
93 changes: 63 additions & 30 deletions src/libraries/JANA/JFactory.cc
@@ -41,10 +41,6 @@ void JFactory::Create(const JEvent& event) {
m_logger = m_app->GetJParameterManager()->GetLogger(GetLoggerName());
}

if (mStatus == Status::Uninitialized) {
DoInit();
}

// How do we obtain our data? The priority is as follows:
// 1. JFactory::Process() if REGENERATE flag is set
// 2. JEvent::Insert()
@@ -88,7 +84,6 @@ void JFactory::Create(const JEvent& event) {

if (found_data) {
mStatus = Status::Inserted;
mCreationStatus = CreationStatus::InsertedViaGetObjects;
return;
}
}
@@ -99,26 +94,57 @@ void JFactory::Create(const JEvent& event) {
// 4. JFactory::Process()
// ---------------------------------------------------------------------

// If the data was Processed (instead of Inserted), it will be in cache, and we can just exit.
// Otherwise we call Process() to create the data in the first place.
// If we already ran Process() but it excepted, we re-run Process() to trigger the same exception, so that every consumer
// is forced to handle it. Otherwise one "fault-tolerant" consumer will swallow the exception for everybody else.
if (mStatus == Status::Unprocessed || mStatus == Status::Excepted) {
auto run_number = event.GetRunNumber();
if (mPreviousRunNumber != run_number) {
if (m_callback_style == CallbackStyle::LegacyMode) {
if (mPreviousRunNumber != -1) {
CallWithJExceptionWrapper("JFactory::EndRun", [&](){ EndRun(); });
}
CallWithJExceptionWrapper("JFactory::ChangeRun", [&](){ ChangeRun(event.shared_from_this()); });
CallWithJExceptionWrapper("JFactory::BeginRun", [&](){ BeginRun(event.shared_from_this()); });
}
else if (m_callback_style == CallbackStyle::ExpertMode) {
CallWithJExceptionWrapper("JFactory::ChangeRun", [&](){ ChangeRun(event); });
}
mPreviousRunNumber = run_number;
// Check if init had _previously_ excepted but the cache was since cleared
if (mInitStatus == InitStatus::InitExcepted && mStatus == Status::Empty) {
for (auto* output : GetOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Excepted);
}
for (auto* output : GetVariadicOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Excepted);
}
mStatus = Status::Excepted;
std::rethrow_exception(mException);
}

// Make sure Init() ran, which might except...
try {
DoInit(); // This checks mInitStatus internally before calling Init()
}
catch(...) {
// If Init() excepts, we still need to store an empty collection
mStatus = Status::Excepted;
for (auto* output : GetOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Excepted);
}
for (auto* output : GetVariadicOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Excepted);
}
std::rethrow_exception(mException);
}

// At this point, Init() has run and has _not_ excepted

if (mStatus == Status::Excepted) {
// But Process() might have already excepted!
std::rethrow_exception(mException);
}
else if (mStatus == Status::Empty) {
// Now we know that we need to run Process() to create the data in the first place
try {
auto run_number = event.GetRunNumber();
if (mPreviousRunNumber != run_number) {
if (m_callback_style == CallbackStyle::LegacyMode) {
if (mPreviousRunNumber != -1) {
CallWithJExceptionWrapper("JFactory::EndRun", [&](){ EndRun(); });
}
CallWithJExceptionWrapper("JFactory::ChangeRun", [&](){ ChangeRun(event.shared_from_this()); });
CallWithJExceptionWrapper("JFactory::BeginRun", [&](){ BeginRun(event.shared_from_this()); });
}
else if (m_callback_style == CallbackStyle::ExpertMode) {
CallWithJExceptionWrapper("JFactory::ChangeRun", [&](){ ChangeRun(event); });
}
mPreviousRunNumber = run_number;
}
for (auto* input : GetInputs()) {
input->Populate(event);
}
@@ -143,7 +169,7 @@ void JFactory::Create(const JEvent& event) {

LOG << "Exception in JFactory::Create, prefix=" << GetPrefix();
mStatus = Status::Excepted;
mCreationStatus = CreationStatus::Created;
mException = std::current_exception();
for (auto* output : GetOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Excepted);
}
@@ -152,8 +178,9 @@ void JFactory::Create(const JEvent& event) {
}
throw;
}

// Save the (successfully processed) data
mStatus = Status::Processed;
mCreationStatus = CreationStatus::Created;
for (auto* output : GetOutputs()) {
output->LagrangianStore(*event.GetFactorySet(), JDatabundle::Status::Created);
}
@@ -164,7 +191,7 @@ void JFactory::Create(const JEvent& event) {
}

void JFactory::DoInit() {
if (mStatus != Status::Uninitialized) {
if (mInitStatus != InitStatus::InitNotRun) {
return;
}
for (auto* parameter : m_parameters) {
@@ -173,17 +200,23 @@ void JFactory::DoInit() {
for (auto* service : m_services) {
service->Fetch(m_app);
}
CallWithJExceptionWrapper("JFactory::Init", [&](){ Init(); });
mStatus = Status::Unprocessed;
try {
CallWithJExceptionWrapper("JFactory::Init", [&](){ Init(); });
mInitStatus = InitStatus::InitRun;
}
catch (...) {
mInitStatus = InitStatus::InitExcepted;
mException = std::current_exception();
throw;
}
}

void JFactory::DoFinish() {
if (mStatus == Status::Unprocessed || mStatus == Status::Processed) {
if (mInitStatus == InitStatus::InitRun) {
if (mPreviousRunNumber != -1) {
CallWithJExceptionWrapper("JFactory::EndRun", [&](){ EndRun(); });
}
CallWithJExceptionWrapper("JFactory::Finish", [&](){ Finish(); });
mStatus = Status::Finished;
}
}
