Session not created after devices offline for 24 hours

Noticed some odd behavior today with multiple devices.

On first boot up after being off overnight the devices would not establish a session. The device was otherwise functional but was not connecting to Golioth. After booting up a second time a session would be established. When I first saw this I thought it was a fluke with a single device, but later I saw the exact same behavior from a second device, so this may be systemic. I will try to leave both devices off overnight tonight and see if the same behavior occurs tomorrow.

I’ve confirmed that this behavior is still occurring. If the devices are left off overnight, then the first time they are turned on they will not establish a session on the Golioth console. If turned off and on again a session is established. Ideally they should be able to establish a session the first time they are turned on.

I can send you the device IDs of the two devices I have been using to test if that will help determine why this behavior is occurring.

We split this out to a new thread, @npschwab. We will take a look at this.

@npschwab thank you for reporting this. Would you be able to post some log output from the devices? I’m wondering if there is a failed handshake log message or something similar that would help us to troubleshoot this.

Also, please confirm the platform (esp-idf if I remember correctly?) and Golioth Firmware SDK version that you are using for these tests.

Looks like there is a timeout when turning on the device for the first time after a long time. I set the timeout to 5 but the message seemed to appear instantly…

I (1619) wifi_init: rx ba win: 6
I (1629) wifi_init: tcpip mbox: 32
I (1629) wifi_init: udp mbox: 6
I (1629) wifi_init: tcp mbox: 6
I (1639) wifi_init: tcp tx win: 5744
I (1639) wifi_init: tcp rx win: 5744
I (1639) wifi_init: tcp mss: 1440
I (1649) wifi_init: WiFi IRAM OP enabled
I (1649) wifi_init: WiFi RX IRAM OP enabled
E (6659) example_wifi: Timeout waiting for wifi to connect

The device is supposed to be periodically trying to reconnect so I will show the output when that occurs.

Also, this is using ESP-IDF with Golioth Firmware SDK version 0.12.1.

On periodic reconnection, the device is printing out “I (908899) example_wifi: Connected to AP SSID: [SSID]” but is still not showing up on the golioth console.

It sounds like you are working on a couple of different problems here.

I set the timeout to 5 but the message seemed to appear instantly…

Can you share a code example? The delay value for the WiFi connection timeout is in ticks, not seconds. The wifi_wait_for_connected() function should already use portMAX_DELAY.
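To make the ticks-versus-seconds point concrete, here is a minimal sketch of the conversion. It uses the same arithmetic as FreeRTOS's pdMS_TO_TICKS() macro, but the names sketch_ms_to_ticks, sketch_ticks_to_ms and SKETCH_TICK_RATE_HZ are made up for illustration, and the 100 Hz tick rate is an assumption (check CONFIG_FREERTOS_HZ in your project). Whether a given helper expects ticks or seconds is worth confirming against its source.

```c
#include <stdint.h>

/* Stand-in for FreeRTOS's configTICK_RATE_HZ; 100 Hz is an assumption. */
#define SKETCH_TICK_RATE_HZ 100u

/* Same arithmetic as pdMS_TO_TICKS(): milliseconds -> ticks. */
static uint32_t sketch_ms_to_ticks(uint32_t ms)
{
    return (uint32_t)(((uint64_t)ms * SKETCH_TICK_RATE_HZ) / 1000u);
}

/* Inverse: how long a raw tick count lasts, in milliseconds. */
static uint32_t sketch_ticks_to_ms(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000u) / SKETCH_TICK_RATE_HZ);
}
```

Under that assumed tick rate, a raw value of 5 interpreted as ticks lasts only 50 ms, while a 5 second wait would need 500 ticks. In real firmware you would use pdMS_TO_TICKS(5000) rather than a hand-rolled helper.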

On periodic reconnection, the device is printing out “I (908899) example_wifi: Connected to AP SSID: [SSID]” but is still not showing up on the golioth console.

The connection to Golioth is a separate step from the connection to WiFi. Can you share more of the log output so we can see the behavior during and after the device connects to the access point?

There is an example of log output from a successful connection in the examples/esp_idf/rpc directory of the golioth/golioth-firmware-sdk repository on GitHub (main branch).

I’ve made some adjustments that seem to have resolved the issue. Originally I was initializing wifi with:

    // Initialize WiFi and wait for it to connect
    wifi_init(nvs_read_wifi_ssid(), nvs_read_wifi_password());
    //wifi_wait_for_connected();
    bool connected = wifi_wait_for_connected_with_timeout(5);

And then establishing the Golioth session with:

    const char* psk_id = nvs_read_golioth_psk_id();
    const char* psk = nvs_read_golioth_psk();
    if (connected) {
        struct golioth_client_config config = {
            .credentials = {
                .auth_type = GOLIOTH_TLS_AUTH_TYPE_PSK,
                .psk = {
                    .psk_id = psk_id,
                    .psk_id_len = strlen(psk_id),
                    .psk = psk,
                    .psk_len = strlen(psk),
                },
            },
        };
        struct golioth_client* client = golioth_client_create(&config);
        assert(client);

        // golioth_basics will interact with each Golioth service
        golioth_basics(client);
    }

So it seems the devices had started failing their first attempt to connect to WiFi within the timeout (previously they were succeeding, which is why I didn’t see this behavior before). There were also periodic attempts to reconnect if the ‘connected’ bool was false; however, if the first attempt failed, there would never be another attempt to establish a session. I changed this by using two bools, one for the WiFi connection and one for the Golioth session, to track both and periodically run the necessary functions if either is false.
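For anyone hitting the same thing, the two-flag approach could look roughly like the sketch below. This is not my actual firmware: conn_state, conn_poll and the stub functions are hypothetical names, and in a real app the two hooks would wrap the WiFi wait and the golioth_client_create() call shown above.

```c
#include <stdbool.h>

/* Hypothetical connection state: one flag per stage. */
struct conn_state {
    bool wifi_connected;
    bool session_started;
};

/* Hypothetical hook type: returns true when the stage succeeds. */
typedef bool (*stage_fn)(void *ctx);

/* Call once per retry period: re-run whichever stage is still pending.
 * Returns true once both stages have succeeded. */
static bool conn_poll(struct conn_state *st, stage_fn try_wifi,
                      stage_fn try_session, void *ctx)
{
    if (!st->wifi_connected) {
        st->wifi_connected = try_wifi(ctx);
    }
    /* The session is attempted only once WiFi is up, and it is still
     * attempted on a later poll if WiFi failed on the first one. */
    if (st->wifi_connected && !st->session_started) {
        st->session_started = try_session(ctx);
    }
    return st->wifi_connected && st->session_started;
}

/* Demo stub: fails until the counter behind ctx reaches zero. */
static bool fail_n_times(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}

/* Demo stub: always succeeds. */
static bool always_ok(void *ctx)
{
    (void)ctx;
    return true;
}
```

With the stubs, a first poll where WiFi times out leaves both flags false, and the next poll connects WiFi and then starts the session. The point is that the session step is no longer tied to the result of the very first WiFi attempt.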
