
Write ALARA output to SQLite #75

Open
anu1217 wants to merge 12 commits into svalinn:main from anu1217:make_adf

Conversation

anu1217 (Contributor) commented Jan 20, 2026

This PR introduces a structure for the SQLite activation results database and writes some preliminary results to it.

The run_lbl column is used to keep track of the ALARA runs from which results were produced (using the number of pulses, duty cycle %, and active burn time).

The block_name column has the effect of tracking the parent nuclide. The (flattened) irradiation time in seconds, flux spectrum shape, and number density all have separate columns.

This database should be expanded to include the average flux magnitude (pending #14).

fixes #35

gonuke (Member) left a comment


I started reviewing this a while ago and never got around to finishing it.

Here are some specific suggestions.

Overall, I think this PR is mixing things based on what your task was at the time, and should maybe be broken down somewhat (separation of concerns).

Fundamentally, if you are going to rely on the ADF structure that we have developed for holding ALARA output data, then I think this PR should focus on methods that modify the ADF and then write it to SQL. Everything else is for a specific use case that can come in a future PR that focuses on how to process a variety of data that you have already generated using these new capabilities.

Comment thread tools/script_template.py Outdated
lib = aop.DataLibrary()
adf = lib.make_entries(run_dicts[run_dict])
adf_data.append(adf)
adf = pd.concat(adf_data)

Not sure why you need this?


Or maybe it's indented too much and should be outside the loop?
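The pattern this comment suggests can be sketched with plain pandas; the small stand-in DataFrames below take the place of the project-specific aop.DataLibrary().make_entries() results, which are not reproduced here:

```python
import pandas as pd

# Hypothetical stand-ins for the per-run ADF entries that
# aop.DataLibrary().make_entries(...) would produce.
run_dicts = {
    "run_a": pd.DataFrame({"nuclide": ["h-3"], "number_density": [1.0e20]}),
    "run_b": pd.DataFrame({"nuclide": ["he-4"], "number_density": [2.0e20]}),
}

adf_data = []
for run_dict in run_dicts:
    adf = run_dicts[run_dict]  # one ADF per run
    adf_data.append(adf)

# The concat belongs outside the loop, run once after all entries are collected.
adf = pd.concat(adf_data, ignore_index=True)
```

With the concat inside the loop, intermediate partial concatenations would be appended repeatedly; hoisting it out keeps one concatenation over the full list.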

Comment thread tools/script_template.py

def modify_adf(adf, norm_flux_arr, t_irr_arr, inputs):
#Remove some columns:
adf.drop(columns=['time', 'time_unit', 'variable', 'var_unit', 'block', 'block_num'], inplace=True)

Before you do this, would it be safer to first filter to get only the number density info? I guess you are running these to only generate number density, but just in case...


Maybe the same for time?
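A minimal sketch of that defensive filtering, with toy data (the column names are taken from the drop() call above; the values and the "shutdown" time label are invented for illustration):

```python
import pandas as pd

# Toy ADF with a mix of variables and time points.
adf = pd.DataFrame({
    "variable": ["number_density", "specific_activity", "number_density"],
    "time": ["shutdown", "shutdown", "1 y"],
    "value": [1.0e20, 3.7e10, 9.0e19],
})

# Filter to the rows of interest *before* dropping the identifying columns,
# so stray variables or extra time points cannot slip into the final table.
adf = adf[(adf["variable"] == "number_density") & (adf["time"] == "shutdown")]
adf = adf.drop(columns=["variable", "time"])
```

Once the identifying columns are gone there is no way to tell the rows apart, so the filter has to come first.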

Comment thread tools/script_template.py Outdated
anu1217 (Contributor, Author) commented Apr 14, 2026

What if we limit modify_adf() to include only the correct columns (i.e. drop and rename some, maybe leave flux spectrum shape in), but get rid of the parts that use num pulses, duty cycles, and irradiation time in this PR? The write_to_adf() and write_to_sqlite() methods can stay with some modifications. I think we've deviated somewhat from our original approach since we started focusing on schedule_transforms.py to handle the irradiation time approximations, so I just want to make sure we're on the same page regarding what the scope of this PR is.

gonuke (Member) commented Apr 15, 2026

I think this should focus on #35 and less on #33. I've left some comments on each of those issues, but for now I think they can largely be decoupled.

gonuke (Member) commented Apr 15, 2026

To repeat the comments from #35:
I think this PR should assume that a sufficiently complete ADF already exists and not include any functions (write_to_adf()) to generate such a thing. It should focus on:

  • what columns should exist in your final SQL table
  • what columns need to be added to/dropped from the ADF
  • how to write the modified ADF as SQL

Notably, this particular approach for generating the ADF makes sense for the current "campaign" of problems, but may not always make sense as we consider different variations on how we set up problems.
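The last bullet, writing the modified ADF as SQL, is essentially one pandas call; a sketch, with placeholder column values and an assumed table name:

```python
import sqlite3
import pandas as pd

# Placeholder ADF shaped like the columns described in the PR text
# (run_lbl, block_name, t_irr_flat, number density); values invented.
adf = pd.DataFrame({
    "run_lbl": ["iter_4p_50_x"],
    "block_name": ["fe-56"],
    "t_irr_flat": [3.15e7],
    "number_density": [1.0e20],
})

with sqlite3.connect(":memory:") as conn:
    # 'activation_results' is an assumed table name.
    adf.to_sql("activation_results", conn, if_exists="replace", index=False)
    back = pd.read_sql("SELECT * FROM activation_results", conn)
```

Keeping this step this thin is what makes the "assume a complete ADF already exists" framing workable: all schema decisions live in the column add/drop step, not in the SQL writer.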

gonuke (Member) left a comment

I think it's important to separate the method for determining the values of t_irr_flat from the modification of the ADF. More generally, parsing a run_label may become the least common way to obtain this information.

Comment thread data/vit-j-175-bins.txt

It looks like you only need this file to get the number of groups. Could that just be an input/command-line parameter? That is, instead of providing a filename for the group bounds, you could just provide an integer number of groups.

Comment thread tools/iter_dt_out.yaml
- 64
flux_file : /filespace/a/asrajendra/research/activationDB/ref_flux_files/iter_dt_flux_2.0986E14
flux_file : ../calc_dwell_dir/ref_flux_files/iter_dt_flux_2.0986E14
vit_j_file : ../data/vit-j-175-bins.txt

See previous comment - since this file is only used to determine the number of groups, maybe this entry should just be the number of groups.

Suggested change
- vit_j_file : ../data/vit-j-175-bins.txt
+ n_groups : 175

Comment thread tools/script_template.py
inputs = yaml.safe_load(yaml_file)
return inputs

def modify_adf(adf, norm_flux_arr, t_irr_arr, inputs):

Adding docstrings early can help the reviewer know your intent for the data structures.

Comment thread tools/script_template.py
Comment on lines +87 to +99
num_pulses = inputs['num_pulses']
duty_cycles = inputs['duty_cycles']

# Extract the number of pulses and duty cycles from the run label
pulse_num_dc = adf['run_lbl'].str.extract(r'_(\d+)p_(\d+)_').astype(int)

# Map num_pulses and duty_cycles to an index
pulse_idx = pulse_num_dc[0].map({pulse_num: i for i, pulse_num in enumerate(num_pulses)})
duty_cycle_idx = (pulse_num_dc[1]/100).map({duty_cycle: i for i, duty_cycle in enumerate(duty_cycles)})

# Add flattened irradiation time:
adf['t_irr_flat'] = t_irr_arr.T[pulse_idx.to_numpy(), duty_cycle_idx.to_numpy()]
adf['t_irr_flat'] = aop.convert_times(adf['t_irr_flat'], from_unit='y', to_unit='s')

Separation of concerns: all of this seems like it's in the wrong place. All the data for the new columns in the ADF should be computed before you make the call to modify the ADF. Remember, there may be many ways to compute the data for the new columns depending on the original source of the data, and you want the flexibility to mix and match.

Specifically, I thought we discussed that you shouldn't rely on decoding information from the run_label to get this information anyway. If you still want to have that as one pathway to determine this data, it should be its own method with only that purpose.
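One way to factor that out, as a sketch: a standalone helper whose only job is decoding the run label. The function name is hypothetical, and the nested-list layout of t_irr_arr mirrors the t_irr_arr.T indexing in the snippet above:

```python
import re

def t_irr_from_run_label(run_lbl, num_pulses, duty_cycles, t_irr_arr):
    """Decode pulses and duty cycle from a label like '..._4p_50_...' and
    look up the flattened irradiation time. Hypothetical helper; t_irr_arr
    is indexed [duty_cycle][pulse], matching t_irr_arr.T[pulse, duty_cycle]."""
    pulses, dc_pct = (int(g) for g in re.search(r"_(\d+)p_(\d+)_", run_lbl).groups())
    pulse_idx = num_pulses.index(pulses)
    dc_idx = duty_cycles.index(dc_pct / 100)
    return t_irr_arr[dc_idx][pulse_idx]
```

modify_adf() would then receive the t_irr values as an argument, and label parsing becomes just one interchangeable pathway among several for producing them.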



Development

Successfully merging this pull request may close these issues.

Design structure for database of activation results
