OTA Failing (CoAP Option Length Too Long)

I went to do a monthly release and found OTA broken: the firmware upgrade message now contains an option that is too large for the default Zephyr option size of 12 bytes:

[00:00:08.460,000] <inf> ota: DFU - Starting firmware upgrade from 0.1.2+2024051300 to 0.1.2+2024060600
[00:00:09.370,000] <err> net_coap: 21 is > sizeof(coap_option->value)(12)!

Decoding the CoAP message, it appears that the artifact version is being sent as an option. The software versions in our case would be “[email protected]+YYYYMMDD##”, which is where the 21 characters come from.

This appears to be a server-side change made sometime after the 3rd of May, since OTA worked fine before then with the same firmware naming.

Our version number is long because the build number is the full ISO date plus an iteration count. But even users who follow the example semantic naming shown for the artifact, “1.0.0-beta.1” (see screenshot below), will hit this problem once they release more than 10 patch revisions.
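To make the arithmetic concrete, here is a minimal sketch of the length check that fails in the error log above. The helper name `option_value_fits` is hypothetical and not part of Zephyr or the Golioth SDK. It assumes the whole option value is the string passed in, and that the check rejects only values strictly longer than the buffer, as the log message suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Capacity reported in the error log: sizeof(coap_option->value) == 12. */
#define DEFAULT_OPTION_VALUE_CAPACITY 12u

/*
 * Hypothetical helper (not a Zephyr API): returns true when `value`
 * fits in an option buffer of `capacity` bytes. CoAP option values
 * carry an explicit length, so no NUL terminator is counted here.
 */
static bool option_value_fits(const char *value, size_t capacity)
{
    return strlen(value) <= capacity;
}
```

With the default capacity, “1.0.0-beta.1” (12 characters) still fits, while “1.0.0-beta.10” (13 characters) and “0.1.2+2024060600” (16 characters) do not.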

I have temporarily changed my artifact naming to “0.1.2+#” so I can roll out a release with an increased option size. The workaround is based on Mike Szczys’s recent discovery of the same issue for LightDB (Debugging Archives - Golioth):

CONFIG_COAP_EXTENDED_OPTIONS_LEN=y
CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE=32
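One way to catch a regression of this setting at build time is a compile-time check in application code. The sketch below is an assumption on my part, not something from the blog post: `EXPECTED_MAX_OPTION_VALUE_LEN` is a hypothetical application-defined constant (set to the 21 bytes seen in the log), and the `#ifndef` fallback exists only so the snippet compiles outside a Zephyr build, where Kconfig would normally generate the macro. Inside Zephyr, the `BUILD_ASSERT()` macro could be used in place of `_Static_assert`.

```c
/* Fallback for building this sketch outside Zephyr; in a real build
 * the value comes from CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE in prj.conf. */
#ifndef CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE
#define CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE 32
#endif

/* Hypothetical: longest option value the application expects to receive.
 * 21 is the length reported in the net_coap error above. */
#define EXPECTED_MAX_OPTION_VALUE_LEN 21

/* Fail the build if the option buffer is configured too small. */
_Static_assert(CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE >= EXPECTED_MAX_OPTION_VALUE_LEN,
               "CoAP option buffer is smaller than the longest expected option value");
```

This only guards against the local configuration shrinking; it cannot detect the server starting to send longer options.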

Regards,
Eric

Hey @EricNRS,

I’m glad you found Mike’s blog post about working around this issue and that it was not a blocker for you.

We have identified the issue on the Cloud side and will fix it shortly!


Hey @EricNRS,

The fix for the issue has been rolled out, and I’ve tested it with the artifact version 1.0.1-alfabeta-12345678987654321.
Note that I had to increase CONFIG_GOLIOTH_OTA_MAX_VERSION_LEN so that decoding and parsing of the OTA manifest wouldn’t fail.

Thanks for the fast fix. I will confirm the fix sometime this week.