OTA firmware update removed after device restart

I have modified the Golioth Basics function to only update firmware after receiving a RPC call. When I make this call the firmware successfully updates and restarts to the new firmware version however, after the device restarts again it boots into the old firmware version. Here is the code for the rpc call and the check to start updating:

static enum golioth_rpc_status update_firmware(
        zcbor_state_t* request_params_array,
        zcbor_state_t* response_detail_map,
        void* callback_arg) {
    
    nvs_write_str("g_upd","1");
    nvs_write_str("g_com","00000.RES.");
    return GOLIOTH_RPC_OK;
}

This also restarts the device thus activating the next part of the code:

    // For OTA, we will spawn a background thread that will listen for firmware
    // updates from Golioth and automatically update firmware on the device
    if(nvs_read_golioth_upd()[0] == '1'){
        golioth_fw_update_init(client, _current_version);
        nvs_write_str("g_upd","0");
    }

Will you please share the following information:

  1. What device is running this firmware
  2. What version of the Golioth Firmware SDK you are using
  3. The log output of the firmware successfully updating
  4. The log output of the device rebooting into the old version of the firmware
  1. ESP32
  2. 0.12.2

Log output showing the final part of the update, rebooting into new version (1.3.4) then restarting and booting into old firmware version (1.3.3):

Timestamp	Level	Module	Message	Metadata
Today at 2:40:38 PM	INFO	example_wifi	Connected to AP SSID: sparta	
Today at 2:40:38 PM	INFO	golioth_example	The current command is: 00000.CON.1	
Today at 2:35:36 PM	INFO	golioth_rpc	RPC observation established	
Today at 2:35:36 PM	INFO	example_wifi	Connected to AP SSID: sparta	
Today at 2:35:36 PM	INFO	golioth_example	The current command is: 00000.CON.1	
Today at 2:35:36 PM	INFO	golioth_example	The device Wifi SSID: sparta	
Today at 2:35:35 PM	INFO	golioth_example	The device serial number is: LBSCon_556004	
Today at 2:35:35 PM	INFO	golioth_example	The current payment info is: 1	
Today at 2:35:35 PM	INFO	golioth_basics	Entering endless loop	
Today at 2:35:35 PM	INFO	golioth_basics	Curent Firmware is: 1.3.3	
Today at 2:35:35 PM	INFO	golioth_basics	Hello, Golioth!	
Today at 2:35:35 PM	INFO	golioth_basics	Golioth client connected	
Today at 2:34:54 PM	INFO	golioth_rpc	RPC observation established	
Today at 2:34:54 PM	INFO	golioth_example	The current command is: 00000.CON.1	
Today at 2:34:54 PM	INFO	golioth_example	The device Wifi SSID: sparta	
Today at 2:34:54 PM	INFO	golioth_example	The device serial number is: LBSCon_556004	
Today at 2:34:53 PM	INFO	golioth_example	The current payment info is: 1	
Today at 2:34:53 PM	INFO	golioth_basics	Entering endless loop	
Today at 2:34:53 PM	INFO	golioth_basics	Curent Firmware is: 1.3.4	
Today at 2:34:53 PM	INFO	golioth_basics	Hello, Golioth!	
Today at 2:34:53 PM	INFO	golioth_basics	Golioth client connected	
Today at 2:34:53 PM	INFO	golioth_coap_client_libcoap	Golioth CoAP client connected	
Today at 2:34:53 PM	INFO	golioth_coap_client_libcoap	Entering CoAP I/O loop	
Today at 2:34:52 PM	INFO	libcoap	Setting PSK key	
Today at 2:34:52 PM	INFO	golioth_coap_client_libcoap	Session PSK-ID: 20240516224725-lbscon_556004@esp32testproject	
Today at 2:34:52 PM	INFO	golioth_coap_client_libcoap	Start CoAP session with host: coaps://coap.golioth.io	
Today at 2:34:52 PM	INFO	golioth_basics	Waiting for connection to Golioth...	
Today at 2:34:45 PM	INFO	golioth_fw_update	Rebooting into new image in 1 seconds	
Today at 2:34:44 PM	INFO	golioth_fw_update	Rebooting into new image in 2 seconds	
Today at 2:34:43 PM	INFO	golioth_fw_update	Rebooting into new image in 3 seconds	
Today at 2:34:43 PM	INFO	golioth_fw_update	Rebooting into new image in 4 seconds	
Today at 2:34:42 PM	INFO	golioth_fw_update	Rebooting into new image in 5 seconds	
Today at 2:34:42 PM	INFO	esp_image	segment 4: paddr=002abea8 vaddr=40088208 size=1617ch ( 90492)	
Today at 2:34:41 PM	INFO	esp_image	segment 3: paddr=001e0020 vaddr=400d0020 size=cbe80h (835200) map	
Today at 2:34:41 PM	INFO	esp_image	segment 2: paddr=001d7e10 vaddr=40080000 size=08208h ( 33288)	
Today at 2:34:41 PM	INFO	esp_image	segment 1: paddr=001d2224 vaddr=3ffbdb60 size=05be4h ( 23524)	
Today at 2:34:41 PM	INFO	esp_image	segment 0: paddr=001a0020 vaddr=3f400020 size=321fch (205308) map	
Today at 2:34:41 PM	INFO	fw_update_esp_idf	Setting boot partition	
Today at 2:34:41 PM	INFO	golioth_dfu	main(UPDATING): ready state	
Today at 2:34:41 PM	INFO	golioth_fw_update	State = Updating	
Today at 2:34:41 PM	INFO	golioth_dfu	main(DOWNLOADED): ready state	
Today at 2:34:40 PM	INFO	golioth_fw_update	State = Downloaded	
Today at 2:34:40 PM	INFO	esp_image	segment 4: paddr=002abea8 vaddr=40088208 size=1617ch ( 90492)	
Today at 2:34:40 PM	INFO	esp_image	segment 3: paddr=001e0020 vaddr=400d0020 size=cbe80h (835200) map	
Today at 2:34:40 PM	INFO	esp_image	segment 2: paddr=001d7e10 vaddr=40080000 size=08208h ( 33288)	
Today at 2:34:40 PM	INFO	esp_image	segment 1: paddr=001d2224 vaddr=3ffbdb60 size=05be4h ( 23524)	
Today at 2:34:40 PM	INFO	esp_image	segment 0: paddr=001a0020 vaddr=3f400020 size=321fch (205308) map	
Today at 2:34:40 PM	INFO	fw_block_processor	Total bytes written: 1187920	
Today at 2:34:40 PM	INFO	fw_block_processor	Max: 5120 ms	
Today at 2:34:39 PM	INFO	fw_block_processor	Min: 150 ms	
Today at 2:34:39 PM	INFO	fw_block_processor	Block Latency Stats:	
Today at 2:34:39 PM	INFO	golioth_fw_update	Download took 314570 ms	
Today at 2:34:39 PM	INFO	fw_block_processor	Downloading block index 1160 (1161/1161)	
Today at 2:34:39 PM	WARN	golioth_coap_client_libcoap	CoAP message retransmitted	
Today at 2:34:36 PM	INFO	fw_block_processor	Downloading block index 1159 (1160/1161)	
Today at 2:34:36 PM	INFO	fw_block_processor	Downloading block index 1158 (1159/1161)	
Today at 2:34:35 PM	INFO	fw_block_processor	Downloading block index 1157 (1158/1161)

I have some idea of what is happening but I need to sync up with my colleagues before I have a definitive solution for you.

After rebooting into a new image, MCUboot waits for confirmation that the new image is running correctly. The golioth_fw_update_init() usually takes care of this, but in your example this is not being run immediately after reboot. I suspect that MCUboot is triggered to reboot into the know-working image still stored in flash.

You should be able to verify that this is the case by calling fw_update_is_pending_verify() which will return true if the newly received firmware needs to be verified.

From there, perform the steps that you deem necessary to validate your new firmware (our SDK waits for a connection to Golioth to ensure that still works).

Once you’ve validated the new firmware, call fw_update_cancel_rollback() to prevent MCUboot from rolling back to the old.

At this point, the device should update the server regarding the currently running firmware version. But right now I don’t see how to do that outside of golioth_fw_update_init() so I need to do some research and get back to you.

Alright. Thanks for the response. I will try changing the firmware include the fw_update_is_pending_verify() and fw_update_cancel_rollback() function and see if that resolves the issue.

Changing the code to this prevents the rollback:

    // For OTA, we will spawn a background thread that will listen for firmware
    // updates from Golioth and automatically update firmware on the device
    if(nvs_read_golioth_upd()[0] == '1'){
        golioth_fw_update_init(client, _current_version);
        golioth_sys_msleep(250);
    }
    if(fw_update_is_pending_verify()){
        GLTH_LOGI(TAG, "Update Pending Verification");
        fw_update_cancel_rollback();
    } else {
        GLTH_LOGI(TAG, "Update Verified");
        nvs_write_str("g_upd","0");
    }

However there is still an issue where the value on the Golioth console does not update, still showing the previous firmware version until the “update_firmware” rpc is called a second time. Is there a way to have the golioth console update the firmware change without running the update initialization at startup. The whole point of this is to allow users to control when the firmware is updated.

Great news on getting the rollback issue solved!

The device is responsible for notifying the Golioth server of its firmware version. In my last message I mentioned I didn’t know of how to do that manually, but there is indeed an API you can call with the version and a status message:

golioth_ota_report_state_sync()

So after you have validated the image and cancelled the rollback, you can update the firmware status info shown on the Golioth console by calling something like:

status = golioth_ota_report_state_sync(client,
                                       GOLIOTH_OTA_STATE_IDLE,
                                       GOLIOTH_OTA_REASON_READY,
                                       "main",
                                       _current_version,
                                       NULL,
                                       SOME_TIMEOUT_VALUE_S);

It is good to call this function at every reboot, even when we’re not coming off of an update, just to make sure what’s shown on the cloud is current.

All of this is coming from the doxygen page for OTA which is worth a look: Golioth Firmware SDK: golioth_ota

Using that function resolves the issue of the console not updating on first restart. Thanks.

Looks like it is included in the Golioth_basics example for the latest SDK version as well. I’ve replacing Golioth basics with my own code when upgrading to new SDKs so it looks like I’ll need to pay attention for new functions in there going forward…