-
Notifications
You must be signed in to change notification settings - Fork 746
boot: bootutil: swap-move: Allow trailer area not be a multiple of the sector size #2247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
boot: bootutil: swap-move: Allow trailer area not be a multiple of the sector size #2247
Conversation
The problem with this is that it introduces the inability to clear flags e.g. if a firmware update is marked for confirmation or test, once a flag is set it cannot be cleared without potentially bricking the device if power is lost during erase/re-write of that sector |
@nordicjm Have you a specific scenario in mind where there would be an issue?
Also note that the swap-scratch strategy already supports having firmware and trailer data in the same sector and this doesn't seem to cause any issue, so I don't think this should be a problem for swap-move. |
User uploads image, they mark it for test/swap (which writes the flag to the primary image, not the secondary), then they delete the secondary image without rebooting, the update image is gone, but the flag is permanently set to test/confirm |
@nordicjm Perhaps you're talking about a specific MCUboot feature I'm not aware of, but the usual flow is:
|
This routine will now also be useful for the swap-move strategy. The trailer size is removed from the list of parameters as it can be easily obtained using the 'state' parameter and it will be more convenient, for the swap-move strategy, not to have to compute it manually before calling 'boot_get_first_trailer_sector'. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
The swap-move strategy is currently assuming the trailer is stored in dedicated sector, meaning the size of the area allocated to the trailer must be a multiple of the sector size. This commit relaxes this assumption when moving sectors up in the primary slot by allowing the last sector containing firmware data to also hold part of the trailer (or the whole trailer if it is small enough). Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
When using the swap-move strategy, the secondary trailer was erased just after the primary trailer, when moving the sectors up in the primary slot. This was working because it was assumed the trailer is stored in dedicated sectors, so no sector could contain both trailer and firmware data. However, if this assumption is relaxed, it means it is no more possible to erase the secondary trailer before the last sector holding firmware data has been copied to the primary slot, since that sector might also contain trailer data. So the erasure of the secondary trailer has to occur at the time the last sector containing firmware data is swapped. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
When using the swap-move strategy, at the very beginning of the revert process, the secondary trailer was rewritten to make the revert look like a permanent upgrade in case an unfortunate reset occurs when rewriting the primary trailer. This was possible because the assumption was that no sector contained both part of the firmware and part of the trailer. To relax this assumption, it is necessary to avoid having to rewrite the secondary trailer at the start of the revert process, since that could also erase firmware data. The solution chosen is to write, during the upgrade process, a "fallback trailer" at the end of the last sector that is not containing part of the slot's trailer. The fallback trailer only contains the magic field. Depending on the size of the trailer, the maximum firmware image size has to be reduced a bit (by at most 8 bytes) to ensure the fallback trailer can be written. During a revert, if a reboot occurs while the primary trailer is rewritten, the presence of the fallback trailer will make possible to determine that a revert was in progress and has to be resumed. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
This check is not useful because it is already done during the validation of the image and was also making sure there was no firmware data in the first trailer sector, which is no more needed. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
This commit updates the computation of the maximum firmware image size for the swap-move strategy to allow the first sector containing part of the trailer to also hold firmware data. This means it is no more necessary to allocate a sector-aligned area for the trailer when using swap-move, the area to allocate is now simply equal to the maximum trailer size. The space required for writing the fallback trailer is also taken into account. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
For swap-move and swap-offset strategies, the size of the images used for running the tests was hardcoded and was smaller than the maximum possible size. This was done this way because the logic used to compute the maximum image size was not taking into account the padding that is needed when using those strategies. This commit fixes the issue, making now possible to run the tests with images having the maximum possible size. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
When using the swap-move strategy, the simulator now doesn't compute anymore the maximum image size so the area allocated to the trailer has a size that is a multiple of the sector size. Indeed, the swap-move strategy now supports having part of the firmware in the first sector holding trailer data, allowing larger images. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
When using the swap-move strategy, the area allocated to the trailer no more need to have a size that is a multiple of the sector size, since the area not allocated to the trailer in the first sector holding trailer data can now be used to store firmware data. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
The swap-move strategy now support having firmware data in the first sector holding trailer data. This commit adds a release note snippet regarding this change. Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
5b4c3e6
to
2bc56dc
Compare
The solution that was initially implemented to avoid the need of rewriting the secondary slot's trailer when starting a revert process was simple but had the drawback of requiring the trailer in the secondary slot to be valid after the completion of an upgrade. This meant a few changes in the swap tables were required to handle this swap-move-specific state, but also that it was no longer possible for a user to set again the secondary image as pending after an upgrade or revert process, since the secondary trailer wasn't erased anymore and cannot be rewritten without risking to lose part of the firmware. I'm not sure this is a very typical use case, but that might still be useful to some users, e.g. to perform a rollback after an update has been confirmed. So, I wasn't fully satisfied with that solution. I updated this MR with another solution, which is bit more complex but doesn't require anymore the secondary trailer to be valid after an upgrade process. Instead, a "fallback trailer" is written in the primary slot, in a sector that doesn't contain any bit of the primary slot's trailer to ensure it is not erased when the primary trailer is erased. This fallback trailer is used to determine a revert process was in progress in case a reboot occurs while rewriting the primary trailer. I updated the PR description accordingly and the details can be found in the commit messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like what is going on here, and really hope we can get this to a point where this will allow swap-move and swap-offset to be used on these 3 large sector devices.
However, I have removed the constraint in the simulator that disables the simulator tests on the first STM device, and even a basic upgrade fails. I haven't investigated further.
So, first thing to emphasize: work on mcuboot needs to always be made to work in the simulator. I would highly recommend that it be done first in the simulator, and only after that to test it on targets. The simulator will run through hundreds of thousands of scenarios that even if you had time to run them on a real device, would wear the flash out before a single run even finished.
I'll post a simple patch to test.
Also note that the mbedTLS update needs to be brought in for the simulator tests to pass on some compiler toolchains. I think this might be a clang warning issue, as it doesn't seem to affect CI.
This will enable the sim to run these tests on the stm32f4 device: diff --git a/sim/src/image.rs b/sim/src/image.rs
index 09214c50..ed72d2e5 100644
--- a/sim/src/image.rs
+++ b/sim/src/image.rs
@@ -451,7 +451,7 @@ impl ImagesBuilder {
let mut flash = SimMultiFlash::new();
flash.insert(dev_id, dev);
- (flash, Rc::new(areadesc), &[Caps::SwapUsingMove, Caps::SwapUsingOffset])
+ (flash, Rc::new(areadesc), &[])
}
DeviceName::Stm32f4SpiFlash => {
// STM style internal flash and external SPI flash. |
To run this on the simulator, cargo test \
--features "swap-move" \
-- --nocapture --test-threads 1 The features sets what will be tested, in this case just swap move, "sig-ecdsa-mbedtls swap-move" for example will test with ecdsa signatures, and swap move. You can also add arguments at the end to restrict the particular test you want to run. |
@d3zd3z Thanks for your feedback! I'm doing nearly all my testing with the simulator, which is a lot easier and faster than setting up a real device, in addition to be more comprehensive as you said. Your patch isn't sufficient to run swap-move with the STM32F4 device defined in the simulator, as the slots contain sectors with different sizes (which is of course not supported by swap-move, even with my changes). Also, when using swap-move or swap-offset, the simulator currently assumes all of the sectors of the flash memory have the same size, e.g. here: Lines 1810 to 1812 in 6cbea0a
So to be able to run swap-move on the STM32F4 device, you need: diff --git a/sim/src/image.rs b/sim/src/image.rs
index 09214c50..b00c7e96 100644
--- a/sim/src/image.rs
+++ b/sim/src/image.rs
@@ -437,9 +437,9 @@ impl ImagesBuilder {
// The flash layout as described is not present in any real STM32F4 device, but it
// serves to exercise support for sectors of varying sizes inside a single slot,
// as long as they are compatible in both slots and all fit in the scratch.
- let dev = SimFlash::new(vec![16 * 1024, 16 * 1024, 16 * 1024, 16 * 1024, 64 * 1024,
- 32 * 1024, 32 * 1024, 64 * 1024,
- 32 * 1024, 32 * 1024, 64 * 1024,
+ let dev = SimFlash::new(vec![64 * 1024, 64 * 1024,
+ 64 * 1024, 64 * 1024,
+ 64 * 1024, 64 * 1024,
128 * 1024],
align as usize, erased_val);
let dev_id = 0;
@@ -451,7 +451,7 @@ impl ImagesBuilder {
let mut flash = SimMultiFlash::new();
flash.insert(dev_id, dev);
- (flash, Rc::new(areadesc), &[Caps::SwapUsingMove, Caps::SwapUsingOffset])
+ (flash, Rc::new(areadesc), &[])
}
DeviceName::Stm32f4SpiFlash => {
// STM style internal flash and external SPI flash. With those changes, nearly all the tests are OK and in particular the
Not sure exactly what's going on here, I don't know the purpose of |
So I've be able to look a bit further. If my understanding is correct,
The
At next boot, MCUboot tries to resume the upgrade but since the boot status has not been properly written, MCUboot thinks the upgrade hasn't started yet so tries to compute the size of the primary image: mcuboot/boot/bootutil/src/loader.c Lines 1818 to 1822 in 6cbea0a
Of course, this fails because the TLV area of the (new) primary image hasn't been written yet, so From my point of view, this is not an MCUboot issue, but more a simulator-related issue. Indeed, from what I understand, when Lines 1212 to 1220 in 6cbea0a
So it is expected that either:
But in fact, there is another valid case that isn't handled by this test case: if no status word has been properly written, MCUboot thinks the upgrade hasn't started. In that case, there can be zero, one or multiple asserts depending on the current state of the slots (the primary and secondary headers might be valid or not, the size of the TLV area might be found or not, the crypto keys might be found in the TLV area or not, ...). @d3zd3z Is my understanding correct? If so, does it really make sense not to expect any assert, except one coming from the primary slot validation, when trying to resume the upgrade in |
When using swap-move, the area allocated to the trailer has currently to be a multiple of the sector size. This is not a big deal on devices having small sectors, but on devices having a large sector size, such most STM32 MCUs, this typically means losing a significant amount of flash memory. For example, on an STM32F4 device with 128-KiB sectors, the trailer is usually only a few hundreds of bytes long but a whole 128 KiB has still to be reserved for storing the trailer and can't therefore be used for firmware data.
The idea of this PR is to relax the need of having a sector-aligned trailer area, enabling therefore to use all the bytes not needed by the trailer, in the first sector holding trailer data, to store firmware data.
To illustrate the solution, let's imagine a device having 4096-byte sectors and a primary slot composed of
N
sectors. Let's assume the trailer needs 4224 bytes. Before this MR, two entire sectors would have to be allocated to the trailer, so the layout would have been something like:With this MR, only the exact amount of bytes actually needed by the trailer have to be allocated to the trailer, so we have 3968 additional bytes available to store the firmware image:
At the beginning of an upgrade or revert process, all the sectors in the primary slot are "moved up", ending up with:
To main ideas of the solution implemented by this PR is:
The latter is the most difficult part. Indeed, at the beginning of the revert process, the secondary trailer was rewritten to make the revert look like a permanent upgrade in case an unfortunate reset occurs during the rewriting of the primary trailer. This was a clever trick, but is unfortunately no longer possible if the trailers are not stored in dedicated sectors. The solution proposed by this PR is to write a "fallback trailer" composed only of a magic number in the primary slot, during the upgrade process, at the end of the sector that is just before the first sector containing part of the slot's trailer. This area is available at the end of the upgrade process since it is part of the padding area, used to "move the sectors up". The maximum image size is adjusted, by adding at most 8 bytes of padding, to ensure the area is always large enough to write the magic number. In case a reset occurs while rewriting the primary trailer during a revert process, the presence of the fallback trailer can be used to determine a revert was in progress.
If the improvement proposed by this MR is merged, I will try to port those changes to the swap-offset strategy, which should be quite easy, at least in theory.
Resolves #2052