
@loeens

@loeens loeens commented May 14, 2025

What
This PR fixes a bug in the ADC reader’s read() function that occurs when the frame size isn’t an integer multiple of the UDP payload size BYTES_IN_PACKET=1456. The previous implementation assumed that frames always start and end at packet boundaries, which causes:

  1. Mis-aligned frames
    • The read function doesn't correctly detect the beginning of a new frame → dropped packets and shifted data (corrupts frame data)
  2. Dropped samples at frame boundaries
    • The head of frame n+1 (in the tail of frame n’s last packet) is dropped

The combination of the two results in (slightly) corrupted data and a perceived halving of the frame rate.
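
To make the boundary problem concrete, here is a minimal sketch of how a single UDP payload can belong to two frames. It assumes the position of each payload in the stream is known as an absolute byte offset; `split_payload` and the 4000-byte frame size are hypothetical and chosen only for illustration, they are not taken from the PR.

```python
BYTES_IN_PACKET = 1456  # UDP payload size named in the PR description


def split_payload(byte_offset, bytes_in_frame, payload_len=BYTES_IN_PACKET):
    """Return (frame_index, offset_in_frame, length) pieces for one payload.

    byte_offset is the absolute position of the payload's first byte in the
    stream. A payload that crosses a frame border yields more than one piece.
    """
    pieces = []
    remaining = payload_len
    while remaining > 0:
        frame_index, offset_in_frame = divmod(byte_offset, bytes_in_frame)
        # Never copy past the end of the current frame.
        length = min(remaining, bytes_in_frame - offset_in_frame)
        pieces.append((frame_index, offset_in_frame, length))
        byte_offset += length
        remaining -= length
    return pieces


# Hypothetical 4000-byte frame: the third packet straddles frames 0 and 1.
for packet_index in range(3):
    print(packet_index, split_payload(packet_index * BYTES_IN_PACKET, 4000))
```

In this toy setup the third payload contributes 1088 bytes to frame 0 and 368 bytes to frame 1. Those trailing 368 bytes correspond to the "head of frame n+1" that the description says was being dropped.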

How
Essentially refactor the whole read() function:

  • Introduce a persistent frame buffer across read() calls, with timeout‐based purging
  • Process remainder samples in the last (partial) packet instead of discarding them
  • Correct assembly logic for frames spanning packet borders
  • Fix frame-start/end detection
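
The steps above can be sketched roughly as follows. This is not the code from the PR: `FrameAssembler` and `add_payload` are hypothetical names, the sketch works on raw bytes rather than uint16 samples, and it assumes every payload arrives exactly once with a known absolute byte offset.

```python
class FrameAssembler:
    """Collects payload bytes into per-frame buffers that persist across calls."""

    def __init__(self, bytes_in_frame):
        self.bytes_in_frame = bytes_in_frame
        self.buffers = {}  # frame_index -> bytearray holding the whole frame
        self.filled = {}   # frame_index -> number of bytes received so far

    def add_payload(self, byte_offset, payload):
        """Place one payload by absolute byte offset; return finished frames."""
        finished = []
        pos = 0
        while pos < len(payload):
            frame_index, start = divmod(byte_offset + pos, self.bytes_in_frame)
            # Only copy up to the end of the current frame; the rest of the
            # payload is handled in the next loop iteration (next frame).
            length = min(len(payload) - pos, self.bytes_in_frame - start)
            buf = self.buffers.setdefault(frame_index, bytearray(self.bytes_in_frame))
            buf[start:start + length] = payload[pos:pos + length]
            self.filled[frame_index] = self.filled.get(frame_index, 0) + length
            if self.filled[frame_index] == self.bytes_in_frame:
                finished.append((frame_index, bytes(self.buffers.pop(frame_index))))
                del self.filled[frame_index]
            pos += length
        return finished


# Toy stream: 10-byte frames delivered in 4-byte payloads.
assembler = FrameAssembler(bytes_in_frame=10)
stream = bytes(range(20))
for offset in range(0, len(stream), 4):
    for frame_index, data in assembler.add_payload(offset, stream[offset:offset + 4]):
        print(frame_index, list(data))
```

Because the buffers live on the object, a frame whose packets arrive across several calls is still completed, and a missing packet simply leaves that frame's entry incomplete until something purges it.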

References
For background, see my comment on issue #56: #56 (comment)

Closes

loeens added 4 commits May 14, 2025 10:36
… to place received UDP payload in the correct position of the frame buffer
…is exceeded, for example when one or more packets of a frame are missing
…ETS_IN_FRAME, PACKETS_IN_FRAME_CLIPPED, UINT16_IN_PACKET
@loeens loeens changed the title from "fix(read): handle lost frames and corrupted data when using non-integer multiples of 1456 Bytes (#56, #65)" to "fix(read): handle lost frames and corrupted data when using frame sizes which are non-integer multiples of 1456 Bytes (#56, #65)" on May 14, 2025
@LadiAdeoluwa

LadiAdeoluwa commented May 17, 2025

Thank you for your commit. I have noticed this issue for a while and I'm glad someone else is noticing it too.

Upon trying your update, I keep getting this message on repeat:

"WARNING: Dropped Frame(s) 3100 since they weren't complete.
WARNING: Dropped Frame(s) 3101 since they weren't complete."

and the size of the file where I save the data never increases.

@loeens
Author

loeens commented May 17, 2025

Hello @LadiAdeoluwa,
thank you for testing it and for your feedback! Can you please share your configuration (.cfg file)? I am especially interested in the parameters that influence the frame size, and in the framePeriodicity.

Because I suspect the issue could be related to you having a frame period > 200 ms, I did two things in the last commit:

  • Increased the default timeout after which incomplete frames get deleted to 1 s and made it configurable at the top of the file. This should cover most commonly used framePeriodicity values.
  • Incomplete frames are now only checked for when a full frame is returned, instead of on every packet. This reduces unnecessary load in the packet-reading loop and fixes the newly introduced issue that all frames are dropped when framePeriodicity > timeout.
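
A rough sketch of the timeout-based purge described in these two points. The function name, the `first_seen` bookkeeping and the `now` parameter are assumptions made for illustration; only the 1 s default and the wording of the warning come from this thread.

```python
import time


def purge_stale_frames(first_seen, timeout_s=1.0, now=None):
    """Remove and return frame ids whose first packet is older than timeout_s.

    first_seen maps frame id -> time.monotonic() timestamp of its first packet.
    """
    now = time.monotonic() if now is None else now
    stale = sorted(fid for fid, t in first_seen.items() if now - t > timeout_s)
    for fid in stale:
        del first_seen[fid]
    if stale:
        ids = ", ".join(str(fid) for fid in stale)
        print(f"WARNING: Dropped Frame(s) {ids} since they weren't complete.")
    return stale


# Hypothetical state: frames 3100 and 3101 started long ago, 3102 just started.
pending = {3100: 0.0, 3101: 0.5, 3102: 1.8}
purge_stale_frames(pending, timeout_s=1.0, now=2.0)
print(sorted(pending))
```

Calling a function like this only after a complete frame has been returned keeps it out of the per-packet loop. It also means a frame period longer than the timeout no longer causes every partially filled frame to be discarded before it can complete.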

Please also let me know if you are successfully capturing frames with the adapted code.

Thanks,

Leon

@LadiAdeoluwa

LadiAdeoluwa commented May 18, 2025

Hello Leon. Great Work!

It's trending in the right direction. The only issue is that the new PR breaks my implementation on the Jetson (Linux).

Context: I am running an AWR2243 with the DCA1000.

- To benchmark the PR, I first captured data using mmWave Studio for 15 s; the file size was ~380 MB. This is a spectrogram of my hand moving toward and away from the radar:
[image: mmwave]
This is the best-case scenario.

  • Using the old ADC.py with mmWave Studio to trigger the radar, the data is halved as expected and is ~190 MB:
    [image: old_adc_mmwave]

You will notice that the spectrogram is not as crisp as the previous one.

- Using the new PR's ADC.py with mmWave Studio to trigger the radar, the data is ~380 MB and by all indications appears complete:
[image: new_adc_mmwave]

So we can see that this PR works fine when using mmWave Studio.

However, on the Jetson, using an mmwavelink example to trigger the radar, only the original ADC.py runs and collects data. The commit before your last one (e78d7ab) worked on the Jetson (by "worked" I mean it executed and data was logged, albeit wrongly), but the new one (22b4743) doesn't. When recording on the Jetson using the original ADC.py, the data is also halved and is about 190 MB. Here's the spectrogram:
[image: old_adc_jetson]
The spectrogram looks weird because I suspect the capture on Linux might have a lane/format issue (not sure what it is yet), but it captures nonetheless, at half the expected length. I know you use the WRL6432, but I assume you are not using mmWave Studio. Are there specific settings that make the data look dissimilar when not using mmWave Studio?

When I try your new PR on the Jetson, nothing is logged and no data is saved for some reason. I am not doing anything special and my code is pretty basic, see below, so I don't know why the new PR does not work with it. It doesn't time out either; the board just keeps blinking until I do a keyboard interrupt.

Screenshot 2025-05-18 at 6 40 48 AM

My cf.json file is attached, but I don't think the Linux implementation uses it; only mmWave Studio does, if I'm not mistaken.

cf.json

Thanks,
Looking forward to your response

Ladi

@loeens
Author

loeens commented May 18, 2025

Thank you for your detailed reply! Glad that it's working for you, at least with mmWave Studio.
Correct, I don't use mmWave Studio. As it works for you when you use mmWave Studio for triggering, it could be a configuration or a timing issue. The CONFIG_FPGA_GEN parameter value you use, '01 01 01 02 03 1e', looks OK to me.

In what order do you execute the script you posted a screenshot of? Do you first start the script and then send the (configuration and the) sensorStart command via the mmWave CLI to the radar sensor, or do you do it the other way around? As far as I know, starting the sensor first and then starting the DCA1000 recording can cause misalignment issues.
This would explain why the original version is able to capture data, while the new version isn't able to capture a single complete frame.

@LadiAdeoluwa

LadiAdeoluwa commented May 18, 2025

Right now the radar is triggered first, using a different script, before the DCA script runs and attempts to start capturing the packets. The code that runs the radar is a Python script (not mine), but I have never seen the ‘sensorStart’ command. My understanding was that you trigger the radar first and then the DCA, because the DCA times out when no packets are received, but I will try it out. Maybe there’s a way of creating a delay? Could you share how you trigger your radar and DCA1000? I know it’s a different sensor, but I believe it can provide insight into why this is not working with your new pull request.

Congratulations on your thesis

Ladi

@loeens
Author

loeens commented May 19, 2025

You want to do it the other way around: first start recording with the DCA1000, then start chirping on the sensor. Doing it in this order ensures that the data flow is aligned and the DCA1000 doesn't start reading somewhere in the middle of a frame.

Concerning the sensorStart command: by this I mean triggering chirping in the radar sensor.
I am not familiar with the AWR2243. I assumed it would be similar on all mmWave radar chips from TI, but apparently it isn't. On the xWRL6432 you run the out-of-the-box demo application, which has a CLI called the mmWave CLI that you can access via UART. The chirp configuration is sent to this CLI, and chirping is started with the sensorStart command and stopped with the sensorStop command. For my sensor (xWRL6432) I built a small Python module that does all the configuration, starting, stopping, data acquisition, etc. You can find it in my profile.

@LadiAdeoluwa

LadiAdeoluwa commented May 20, 2025

Hey, I tried it. I triggered the radar after the DCA, but it didn't work. I have debugged the code, and adc.py is in fact reading the packets: when I print packet_data in the function _read_data_packet(self), the packets are printed.

What I noticed is that, if you look at my "main" code, it never gets out of the read() function to print the second 'here'. It just keeps running indefinitely if I let it, or until the socket is gone. It just gets stuck in the read() function.

Do you have any idea why this might be the case?

This is my main code:
###############################################################

import subprocess
import time

# DCA1000, CMD, radar_loc and kill() are defined elsewhere in my project.
dca = DCA1000()

# Configure FPGA
dca._send_command(CMD.CONFIG_FPGA_GEN_CMD_CODE, '0600', '01010102031e')

# Start record
dca._send_command(CMD.RECORD_START_CMD_CODE)

# Run radar sensor (the sudo password is piped in via stdin)
pwd = subprocess.Popen(['echo', 'radar123'], cwd=radar_loc, stdout=subprocess.PIPE)
pwd.wait()

cmd = subprocess.Popen(['sudo', '-S', './setup_radar'], cwd=radar_loc, stdin=pwd.stdout,
                       stdout=subprocess.PIPE, stderr=subprocess.PIPE)
cmd.wait()  # was `cmd.wait` without parentheses, which never actually waited

print('setup_radar stderr: ', cmd.stderr.read())
print('setup_radar return code: ', cmd.returncode)

t_end = time.time() + 5  # Run for 5 seconds

print('Record starting')

while time.time() < t_end:
    print('here')
    frame = dca.read()
    print('here')  # <############################# (I NEVER GET HERE)

cmd = kill(radar_loc)
cmd.wait()
print('kill error return code: ', cmd.returncode)
dca._send_command(CMD.RECORD_STOP_CMD_CODE)
######################################################################

It's interesting/funny that it runs with the old adc.py. My thought is that the frame never gets full and the buffer is never filled.
My initial assessment is that, because (frame_id, frame_data) is always returned as (None, None), the frame is never filled and the code is stuck within read()’s "while True" loop.

My ADC params are:

ADC_PARAMS = {'chirps': 128,  # 32
              'rx': 4,
              'tx': 2,
              'samples': 256,
              'IQ': 2,
              'bytes': 2}
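
For reference, a quick calculation of what these parameters imply, assuming the frame size in bytes is simply the product of all six values (that formula is an assumption here, not something confirmed in this thread):

```python
import math

ADC_PARAMS = {'chirps': 128, 'rx': 4, 'tx': 2, 'samples': 256, 'IQ': 2, 'bytes': 2}
BYTES_IN_PACKET = 1456  # UDP payload size named in the PR description

# Assumed frame size: product of all parameter values.
bytes_in_frame = math.prod(ADC_PARAMS.values())
full_packets, leftover_bytes = divmod(bytes_in_frame, BYTES_IN_PACKET)

print(bytes_in_frame, full_packets, leftover_bytes)  # 1048576 720 256
```

Under that assumption a frame is 720 full payloads plus 256 bytes, so this configuration falls into exactly the non-integer-multiple case the PR targets.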

I am leaving the vicinity of the radar right now, but I created a "mock" socket that reads the packets from a file I recorded after binding the address of the FPGA, so I can work on this at home.
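
A mock socket of this kind could look roughly like the sketch below. The recording format (each datagram prefixed with a 2-byte little-endian length), the class name and the helper are all hypothetical; the actual mock used in this comment is not shown in the thread.

```python
import io
import socket
import struct


class MockSocket:
    """Replays recorded datagrams through a recvfrom()-like method."""

    def __init__(self, stream):
        self._stream = stream  # binary file object with length-prefixed datagrams

    def recvfrom(self, bufsize):
        prefix = self._stream.read(2)
        if len(prefix) < 2:
            # Mimic a real socket that stops receiving data.
            raise socket.timeout("recording exhausted")
        (length,) = struct.unpack("<H", prefix)
        datagram = self._stream.read(length)
        return datagram[:bufsize], ("mock", 0)


def write_recording(stream, datagrams):
    """Store datagrams in the assumed length-prefixed format."""
    for datagram in datagrams:
        stream.write(struct.pack("<H", len(datagram)))
        stream.write(datagram)


# Round trip through an in-memory file instead of a real capture.
recording = io.BytesIO()
write_recording(recording, [b"first packet", b"second packet"])
recording.seek(0)
sock = MockSocket(recording)
print(sock.recvfrom(4096))
print(sock.recvfrom(4096))
```

One caveat with any replay of this sort: if the reader purges incomplete frames based on wall-clock time, the pace at which the mock delivers packets may influence which frames get dropped.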

EDIT: When I print(packet_data.nbytes) I get 1456.
So I went on to add
print(f"Received packet: {packet_data.size} samples, byte count = {byte_count}") after packet_num, byte_count, packet_data = self._read_data_packet(), and I get the following:

packet no.: 2903929, byte count = 4228119168
packet no.: 2903930, byte count = 4228120624
packet no.: 2903931, byte count = 4228122080
packet no.: 2903932, byte count = 4228123536 and so on

WTF! For some reason it starts at 4 billion! You will notice the byte count is unfathomably high but still increments by 1456. This gave me the idea to normalize the byte count before it goes into _place_data_packet_in_frame_buffer(). I create a self.base_byte_count in the DCA1000 __init__ and set it to None, then I add this:



if not hasattr(self, 'base_byte_count') or self.base_byte_count is None:
    self.base_byte_count = byte_count  # treat the first byte_count seen as zero
relative_byte_count = byte_count - self.base_byte_count
offset = relative_byte_count // 2  # byte offset -> uint16 sample offset


This way it always starts from zero, regardless of whether I start capturing packets mid-stream.

I tried this and was finally able to get out of the read loop. The recorded file I used was pretty small, so I don't know if this is indicative of what will happen when I try it with the actual data tomorrow.



EDIT 2: HELP: I haven't gotten to a radar yet, but when I print (frame_id, frame_data) after my fix, I get (None, None) while the data is filling up, then

1 [ 904 65378 630 ... 218 64823 64591]

then immediately:
WARNING: Dropped Frame(s) 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27 since they weren't complete.

Is this normal?

Do you have any idea what is going on?



EDIT 3: Is it normal to receive packets for other frames while one is still being completed? That is the only thing that explains it, because I'm guessing the delete function is getting rid of them after the timeout. Could it be that, since I'm emulating a socket using a mock socket file, it's not fast enough?
Regards,

Ladi

@loeens
Author

loeens commented May 20, 2025

Okay, just to be clear, for the sake of this pull request's topic: the solution works for you together with mmWave Studio, which is what this repository and this PR are aimed at.

As I previously mentioned, it is very likely one of the following two issues, and not related to the code of this PR:

As it works for you when you use mmWave Studio for triggering, it could be a configuration or a timing issue.

You have now probably eliminated the timing part, assuming that all the commands you send with subprocess.Popen() actually start the sensor. This leaves the second part: the configuration you set in your radar sensor doesn't match what is set in ADC_PARAMS in the reader.

What firmware image is flashed to the AWR2243, and how are the chirp parameters set? Are they hardcoded, or is there a CLI you can access via UART? If there is a CLI: the chirp configuration is something mmWave Studio usually handles, so you need to implement it on your side.
I am completely unfamiliar with the AWR2243, as it seems quite different in handling compared to the IWRL6432 or even the IWR1843.

byte_count is the number of bytes sent since the DCA1000 was last reset, up to the packet that carries this number, and packet_num is the number of packets sent since resetting the DCA1000. One packet can contain data from multiple frames, for example the remaining bytes of frame n and the first bytes of frame n+1.
This means that once you start working with relative byte counts, it becomes impossible to tell where the frame borders actually are. In my understanding, the only way to correctly extract the frame data is for the DCA1000 to start at byte_count=0 and packet_num=1 and for the radar sensor to be started afterwards, so that the first packet contains data from the first frame captured by the radar sensor. It is also why working with absolute packet counts won't give you correct data, similar to the old version of adc.py.
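
As a sanity check on the values logged earlier in this thread, the snippet below tests whether they fit the reading that byte_count counts the payload bytes sent before a packet. The frame size is an assumption (the product of the posted ADC_PARAMS), so the frame index and offset it prints are only meaningful if that configuration matches what the radar actually sends.

```python
BYTES_IN_PACKET = 1456
# Assumed frame size: product of the ADC_PARAMS posted earlier in the thread.
BYTES_IN_FRAME = 128 * 4 * 2 * 256 * 2 * 2

# (packet_num, byte_count) pairs copied from the log output above.
logged = [
    (2903929, 4228119168),
    (2903930, 4228120624),
    (2903931, 4228122080),
    (2903932, 4228123536),
]

for packet_num, byte_count in logged:
    # Holds if byte_count is the number of payload bytes sent before this packet.
    assert byte_count == (packet_num - 1) * BYTES_IN_PACKET
    frames_before, offset_in_frame = divmod(byte_count, BYTES_IN_FRAME)
    print(packet_num, frames_before, offset_in_frame)
```

The assertion holds for all four logged pairs, which fits a DCA1000 that had already sent about 2.9 million packets since its last reset when the capture started. Under the assumed frame size, the first logged packet would sit 260736 bytes into a frame rather than at a frame border, and subtracting a base byte count would hide exactly that offset.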

All I can say is that something prevents any frame from being read completely: either a configuration mismatch between what is actually configured in the radar and what is configured in your ADC_PARAMS in the reader, or a timing issue that causes you not to start reading at the right place. But I cannot help you with that any further, as I have no idea how to work with the AWR2243 and couldn't quickly find documentation for it.

Good luck!
Leon

@LadiAdeoluwa

Thank you.

I figured out the issue. I have it running on a pipeline that isn't triggered by mmWave Studio!

Regards,
Ladi



Development

Successfully merging this pull request may close these issues.

DCA1000EVM: + [AWR1843]: Not receiving raw datas as expected
Half of ADC data is missing
