Skip to content

Binary io#3

Merged
DanViolette merged 60 commits intomainfrom
binary_io
Jun 27, 2025
Merged

Binary io#3
DanViolette merged 60 commits intomainfrom
binary_io

Conversation

@lucabaldini
Copy link
Collaborator

@lucabaldini lucabaldini commented May 28, 2025

This is a massive pull request in which I am including the bulk of the work at AstroPix/astropix-python#22

In a nutshell this includes all the facilities and data structures to write and read back from file binary astropix data, with the associated unit tests and documentation. More specifically:

  • Initial import of the binary I/O machinery, with classes and data structures
    to save astropix readouts to (and read them back from ) persistent storage.
  • Initial setup of the unit tests and continuos integration on github.
  • Initial setup of the documentation.
  • Top-level Makefile added to facilitate some operations.
  • pyproject.toml file added.

@grant-sommer
Copy link
Contributor

Here is the message I sent last night:

" I am having some trouble converting the .apx file into a .csv. Let me know if I am missing something in the follow steps:

In a temporary python file in the astropix_analysis directory, I’m importing the apx_to_csv function and the AstroPix4Readout class
Run the apx_to_csv with the path set to one of the output .apx files and the readout class set to AstroPix4Readout

When I run the function, I am getting a ValueError which tracks all the way back to line 65 in astropix_analysis/fmt.py: return int(super().getitem(index),2 )
ValueError: invalid literal for int() with base 2:’

Running does generate the .csv file where I would expect it to be, but it only writes the header and the column names, no data appears after that."

Attached here is a zip file with .apx file I was trying to convert.

20250612_160623.zip

@lucabaldini
Copy link
Collaborator Author

Thanks Grant! I was able to download the files and reproduce the error. I'll get a fix in place tomorrow.

@lucabaldini
Copy link
Collaborator Author

Allright---I got at the end of the file. The problem was that I was not handling correctly incomplete hit data at the end of the readout.

Now, before we discuss whether the change that I put in makes sense, I'd like to make sure that I understand the input file that you linked. Please correct me if any of the following is incorrect:

  • it contains exactly one readout, with readout_id 32.
  • the readout contains 445 hits, plus 4 extra bytes (i.e, an incomplete hit) at the end, which reads b'\dH\x18'

(I am asking because I am not quite sure how you get one single readout in the file, with readout_id 32, and an an overflown buffer. In our setup, running simple charge injection tests, the binary buffers are always for the vast majority padding bytes. Are we doing something inherently different?)

Now, in order to get to the end of the file, I put in a check on the slicing indices to interrupt the unpacking whenever I have an incomplete hit, see 756eb1b

This means that I am effectively throwing away any incomplete hit, which I guess makes sense in this particular case, but it might not be what we want to do in general if we are getting the remaining part of the hit in the subsequent readout.

Now: can you try re-running the conversion after the last commit

./bin/apx_convert.py path/to/20250612_160623/20250612_160623_data.apx

and see if what comes out makes sense?

Thanks!
Luca

@lucabaldini
Copy link
Collaborator Author

Also, if we need to keep track of the incomplete data and re-assemble hits across adjacent readouts, it would be awesome if you could provide a small file with multiple readouts that I could assemble a small unit test from. (In which case I would appreciate having the csv file produced with your decoder, if at all possible, to make a row-by-row comparison.)

@grant-sommer
Copy link
Contributor

Hi Luca, I pulled the most recent commit and now I am getting a .csv file out. But, the values in the csv don't make much sense to me. I am injecting into row 0, column 10 but that is not what's in the .csv file. Attached here is the full output of the run I just did, and the output .csv:

20250618_103557.zip

Let me know if I'm doing anything wrong in the analysis or running the injection.

@lucabaldini
Copy link
Collaborator Author

All right, thank Grant!

I downloaded the new archive, but before I start digging into it, can you give me some specifics, i.e.:

  • how many events in how many readouts is the .apx file supposed to contain?
  • is the .csv file produced with the new code, and therefore is wrong?
  • if this is the case, do you have any way to produce a correct .csv file that I can test against? (Probably not because you would need to tweak your decode routine to deal with the extra bits that the .apx format puts in, right?)
  • even if you cannot produce a fully-fledged, correct .csv file, is the anything else beside the row and column identifier that I can look for? e.g., how do you fill the readout id?

This reminds me that I meant to ask you what you are using for acquiring the data. Did you tweak an existing script of yours or you are using the apx4_read.py script in astropix-python? In the latter case we have to make sure that you pulled all the latest commits in the binary_io branch, because there was lots of back and forth there, and we have to make sure that the I and the O are in synch. (Which was the main reason why I was initially pushed to have the binary file infrastructure in the astropix-python repository.)

In any event: the thing roundtrips on our setup, so it must be just a matter of getting to the point where we are doing exactly the same thing. From that point it will be easy.

@grant-sommer
Copy link
Contributor

  • I don't have the exact number number of readouts, but it is around 175
  • The csv is using the new code, I pulled the most recent version of astropix_analysis before I ran everything
  • right now I don't have a way to get the correct csv, I can try to get something later today but nothing now
  • I also noticed that the ChipIDs are all wrong, and the final ToT values don't make sense either
  • I am also using apx4_read.py to get my data, I just checked and it wasn't the most recent version, I am going to try getting data again with the newest version and get back to you

let me know if there is any other information you need

@lucabaldini
Copy link
Collaborator Author

Thanks Grant---let's defer any discussion after you have acquired some data with the latest version apx4_read.py. I suspect (and hope) that's the problem. (Well, I know for a fact that is a problem, but it might not be the only one.)

And, by the way, don't feel pressured to do this in a rush. I realize you're juggling many different things and we'll get to it when we'll get to it :-)

@grant-sommer
Copy link
Contributor

Hi Luca, I tried using the newest version of the binary_io branch in astropix_python and astropix_analysis, but now I can't get apx4_read.py to run. It is returning "ModuleNotFoundError: No module named 'astropix_analysis'"

I think this has something to do with the setup script to set the python path, is there any way to check those variables? Or is this maybe something else?

@lucabaldini
Copy link
Collaborator Author

Strange---nothing should have changed in this respect. We should fix this once and forever, and document exactly how things are supposed to work across different operating systems. The right place for that is pull request #9

That all said, the thing should be straightforward. I am trying to stick to the standard structure of Python repos, as described, e.g., here
https://docs.python-guide.org/writing/structure/#structure-of-the-repository
where you have all the python code into a folder with the same name of the package. In my intentions getting things to work should be as easy as sourcing the setup file in the astropix-analysis folder

[lbaldini@pcpi0188 ~]$ echo $PYTHONPATH
[lbaldini@pcpi0188 ~]$ source work/astropix-analysis/setup.sh 
[lbaldini@pcpi0188 ~]$ echo $PYTHONPATH
/home/users/lbaldini/work/astropix-analysis:
[lbaldini@pcpi0188 ~]$

The key is: you should have the path to the local working copy of the astropix-analysis repo into your $PYTHONPATH. If that is not the case, then we should figure out why. If that is the case and still you can't get things to run, then I am not understanding.

So: can you start with a fresh terminal, do the setup (i.e., source the proper setup files) and post here all the commands that you typed, along with the value of the $PYTHONPATH environmental variable right before you fire up apx4_read.py?

(I have a rough day today, but maybe we can find 5 minutes to get together on Teams, tomorrow, and sort things out in person.)

@grant-sommer
Copy link
Contributor

I retried this and I got the setup file working in a command prompt window, but not a powershell window. Now I am getting an output .csv that makes sense, all the rows and columns make sense and the ToT values are reasonable. Is there a way to get setup.bat to work in powershell?

@DanViolette
Copy link
Contributor

DanViolette commented Jun 25, 2025 via email

@lucabaldini
Copy link
Collaborator Author

Hi Grant, Dan,
I realize it is getting difficult to keep track of all the things going on, but there is a separate pull request with updated setup scripts for Linux and Windows, including PS
#9

I'd like to add some documentation to that, but improving on the docs is hostage to having github-pages up and running, which in turn requires merging the first PR on the main branch to make sure everything is in order. I welcome tests on the PS scripts and helps for the docs.

If you want to merge PR #9 on the main first, and then synch all the development branches with the main before moving on with the other PR I am happy with that :-)

@DanViolette
Copy link
Contributor

DanViolette commented Jun 26, 2025 via email

@DanViolette DanViolette merged commit fcee9c7 into main Jun 27, 2025
2 checks passed
@DanViolette DanViolette deleted the binary_io branch July 1, 2025 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants