-
Notifications
You must be signed in to change notification settings - Fork 31
boot-cypress: Fix lingering rejected control transfer #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
boot-cypress: Fix lingering rejected control transfer #25
Conversation
This is equivalent to the following change in glasgow firmware: GlasgowEmbedded/glasgow#660 In boot-cypress this was triggered if I did the following: - `python3 -m fx2.fx2tool read_eeprom -f filename.bin 0 32768` - result: `Command timeout (bootloader not loaded?)` - `python3 -m fx2.fx2tool load boot-cypress.ihex` - `python3 -m fx2.fx2tool read_eeprom -f filename.bin 0 32768` - result: `Command not acknowledged (wrong address width?)` This sometimes even resulted in total corruption of the flash memory, and should be fixed since it's an easy mistake to make. This happened, because while there was no firmware loaded, the SUDAV interrupt is sticky. When the bootloader firmware is loaded It immediately tries to process a setup packet. But in the endpoint buffer it actually sees the control transfer that took the CPU out of reset, and it doesn't know what to do with it, so it falls through to the EP0_STALL. From this point on the same thing happens as in glasgow firmware: The rejected control transfer is retried again and again. And it is retired again and again while the EP0 buffer is getting gradually filled with a new control transfer that has not arrived yet. So the CPU will see a mix of the old control transfer and the new control transfer, which will result in data corruption.
|
Arguably an additional change (in a different PR) could be made to clear the SUDAV interrupt at the beginning of main, but I'm not 100% sure of the correctness of that for all situations. |
whitequark
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great catch once again!
|
I am quite impressed with the skill you show in the firmware-related PRs. (I was aware that there were race conditions of this nature, but I never committed to the effort of tracking them all down; not the least because I find this type of issue particularly difficult to diagnose.) Would you like to join the Glasgow maintainer team? We definitely need further firmware work (e.g.: current limit/measurement support, refactoring in preparation for revD) and I trust your judgement there. |
Alright, I have a bit more time on my hands now, I can definitely help out for a while. |
|
Invited! So aside from the current measurement (where I feel there isn't much to discuss), I had this idea of writing a small "bytecode VM" for I2C operations. Essentially, convert the i2c_* function calls that often prevent sdcc from optimizing the function because it isn't a leaf any more, have something like: static struct i2c_command cmds[] = {
{ CMD_START },
{ CMD_ADDR, 0x12 },
{ CMD_READ, 1, &data },
{ CMD_STOP },
};
// ... later
i2c_execute(cmds, ARRAYSIZE(cmds));There's going to be a fair bit of refactoring necessary to support revD, and it inevitably will introduce many new I2C devices, so I think this approach could free up quite a bit of XRAM. (Also it can probably be made more readable, the loose functions with explicit status checks are really not particularly great there.) What do you think? |
|
I have seen i2c implementations like this, however I do remember, that the number of command arrays that ended up being used was quite low in the end in those projects. Not sure if it matters for glasgow, but it felt like useless flexibility back then. What's the status of revD? I have found issue 169. Is there another place where I can read up on revD plans? Is a full design already on the way? I could also try doing the full design if nobody is working on it yet. If I understand right, in your minispec you're proposing to use PCA9545A, and then PCA6408 would have to be topologically under PCA9545A, and the rest of the I2c slaves would be next to PCA9545A, right? If that's right then the command arrays could be used to compose a full sequence to always address one of the sub-buses when talking to the PCA6408, but keep the other accesses simpler, something like: static struct i2c_command cmds[] = {
{ CMD_START },
{ CMD_ADDR, address_of_PCA9545A},
{ CMD_WRITE, 1, &subbus_selector }
{ CMD_STOP },
{ CMD_START },
{ CMD_ADDR, address_of_PCA6408 },
{ CMD_READ, 1, &data },
{ CMD_STOP },
};Although having a different array for each each i2c device address and a hardcoded data buffer seems a bit limiting, perhaps I would pass a 16-bit address, and a data pointer to i2c_execute. static uint8_t cmds[] = {
CMD_START_ADDR_IMMEDIATE(address_of_PCA9545A),
CMD_WRITE_I2C_ADDR_BITS_7_8_DECODED,
CMD_STOP,
CMD_START_ADDR_VALUE_I2C_ADDR_BITS_0_6,
CMD_WRITE_REG_BITS_0_7,
CMD_START_ADDR_VALUE_I2C_ADDR_BITS_0_6,
CMD_READ,
CMD_STOP,
};
// ... later
i2c_execute(cmds, ARRAYSIZE(cmds), i2c_address_16bit, register_addess_8bit, &data, data_length);Normal i2c addresses are 7 bits + 1 bit of direction. So a CMD_START_ADDR_IMMEDIATE() macro could encode a 7-bit address , leaving space for 128 other commands, and the direction could be inferred by looking 1 step ahead in the command list. Now I think some of those arguments won't be used all the time, like the top half of i2c address, and the register address, so it might add to code size having to perform the full calling convention every time even if not used. Perhaps we could have some i2c_execute_simpler() functions that call into i2c_execute(), but again, I'm not sure of the effect this would all have on optimization. Oh, we could also have a CMD_END_LIST_OF_COMMANDS, so we don't have to pass ARRAYSIZE into i2c_execute. I know you're not a fan of building different firmware binaries for the different glasgow revisions, but an alternative could be |
Especially to include fixes from GlasgowEmbedded#18 and GlasgowEmbedded#25 This time boot-cypress is built with 4.5.0 #15242. For future builds please run from the root of the repository: `./software/deploy-bootloader.sh` The deploy script was adapted from the one used in glasgow firmware.
That would be great! I think it's worth it making a very simple implementation just to gauge how much space savings we're looking at, plus maybe playing with SDCC settings (configuring the I2C command processor to use callee-saved registers, etc) to further help it optimize I2C functions.
RevD is honestly in a very messy state for reasons I don't want to go into publicly; I will do my best to disentangle this mess later this year but for now I'll just say "it will definitely involve quite a bit more I2C stuff". Sorry I can't say more this very moment; I'm not any happier about it than you are, especially given how open the project's development normally is. I have noted your offer of assistance with revD design; can you tell me a bit more about your experience doing hardware development? You're obviously quite good with firmware and gateware but I haven't seen your schematic or layout, and I'd like to understand what to expect before making a commitment.
I think realistically a revD topology would involve an I2C switch and then each IO port (3-4 I2C targets) will be underneath the switch. I think I will probably not expose I2C for addons after all. So all of the I2C comms will involve a prefix of "configure the I2C switch", and the specific targets will be conditional on revision more so than they currently are.
The tail end of the flash is currently used to store a programmed firmware on revC, because the bitstream is ever so slightly too large to fit into the ICE_MEM. We could add RLE compression of the bitstream, but that's an entirely new can of worms. I dislike the idea of overlays much less than I dislike the idea of shipping multiple firmwares. It's still a significant chunk of complexity but I'm tentatively OK with shipping it. I do think it's more complex than the I2C command interpreter though. |
Yep, we're on the same page here.
While I don't have the artefacts to prove it, I think having a single entry point will be the best optimization-wise. |
This looks good to me too! |
I have some small public projects you can look at online, however they definitely aren't as dense as glasgow is:
And these are all open source projects other people actually use. Professionally I have only done schematic review, so my design skills are technically currently at the hobby level, but I am not completely clueless, and I do have some ideas of how to handle higher speed designs, for example I do think about current return paths on ground/(supply) planes. I have only ever used kicad (and some geda back in the day), but I'm not a kicad expert either, I just learned the minimum I need to be productive. I do realize glasgow is more complex than work I had done until now, but it seems approachable to me. I do consider myself an expert on debugging issues, and bringing up newly designed boards, I've done a lot of that professionally too.
I can understand not all aspects of development being public. I remain interested as long as the final design is intended to be open hardware. |
I actually consider the current situation (where a bunch of stuff, esp. around revD, happens "behind the scenes") a failure on my part; it was never supposed to happen and I should see that it doesn't happen again. While not all of my projects are radically transparent, most of them are, and Glasgow certainly should be.
Gotcha. I think I can work with this; Hector Martin mentioned he'd be up for doing the layout (about which I'm fairly particular, and I know I'm very happy with his work so it's a no-brainer for me), which leaves the schematic. There is some unfinished work on the IO ports that I need to check if it can actually be used or not; I'll let you know once I make that happen. |
This is equivalent to the following change in glasgow firmware: GlasgowEmbedded/glasgow#660
In boot-cypress this was triggered if I did the following:
python3 -m fx2.fx2tool read_eeprom -f filename.bin 0 32768Command timeout (bootloader not loaded?)python3 -m fx2.fx2tool load boot-cypress.ihexpython3 -m fx2.fx2tool read_eeprom -f filename.bin 0 32768Command not acknowledged (wrong address width?)This sometimes even resulted in total corruption of the flash memory, and should be fixed since it's an easy mistake to make.
This happened, because while there was no firmware loaded, the SUDAV interrupt is sticky. When the bootloader firmware is loaded It immediately tries to process a setup packet. But in the endpoint buffer it actually sees the control transfer that took the CPU out of reset, and it doesn't know what to do with it, so it falls through to the EP0_STALL. From this point on the same thing happens as in glasgow firmware: The rejected control transfer is retried again and again. And it is retired again and again while the EP0 buffer is getting gradually filled with a new control transfer that has not arrived yet. So the CPU will see a mix of the old control transfer and the new control transfer, which will result in data corruption.