System hangs calling golioth_client_stop(client)

Hey @john.stuewe, I’m observing the same behavior and we are investigating the issue.

Can you post the log output prior to device reset?

Here is the reset sequence. The reset doesn’t trigger any kind of exception in the debugger.

[00:02:32.684,997] <inf> golioth_rd_template: Golioth client: disconnected
[00:02:33.775,848] <wrn> golioth_rd_template:  2102ms golioth client stopped.
[00:02:33.775,878] <dbg> golioth_rd_template: main: End of loop 2

[00:02:48.775,970] <dbg> golioth_rd_template: main: New Loop
[00:02:48.776,031] <dbg> golioth_rd_template: main: Starting client
uart:~$

*** Booting nRF Connect SDK v2.5.2 ***
[00:00:00.508,178] <inf> battery: Initializing battery measurement

And here is a passing loop:

[00:00:11.319,549] <inf> golioth_rd_template: Golioth client: disconnected
[00:00:12.410,308] <wrn> golioth_rd_template:  2102ms golioth client stopped.
[00:00:12.410,339] <dbg> golioth_rd_template: main: End of loop 1

[00:00:27.410,430] <dbg> golioth_rd_template: main: New Loop
[00:00:27.410,491] <dbg> golioth_rd_template: main: Starting client
[00:00:29.026,947] <inf> golioth_coap_client_zephyr: Golioth CoAP client connected
[00:00:29.027,008] <inf> golioth_rd_template: Golioth client: connected

Any updates or suggestions to fix this?

Hey @john.stuewe, a PR has been opened but still needs to be merged. In the meantime, you can test with the PR branch.

Thanks. I confirmed that this fixes the reboot I was seeing. It still takes 9 seconds to stop every time, which is annoying but isn’t a failure.
One thing to note: if you call golioth_client_stop(client) when the client is already stopped, it hangs until the watchdog timer reboots the system instead of returning immediately. That should be an easy fix.
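
Until the SDK handles this itself, one possible application-side workaround is to track whether the client was started and skip the stop call otherwise. The sketch below is only an illustration of that guard: struct fake_client, fake_client_start(), fake_client_stop() and app_client_stop_once() are hypothetical stand-ins invented for this example, not part of the Golioth SDK.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SDK client handle (assumption, not the real struct). */
struct fake_client {
    bool running;   /* application-side view of the client state */
    int stop_calls; /* counts how often the underlying stop is invoked */
};

/* Stand-in for golioth_client_start(): marks the client as running. */
static void fake_client_start(struct fake_client *c)
{
    c->running = true;
}

/* Stand-in for golioth_client_stop(): records the call and clears the flag. */
static void fake_client_stop(struct fake_client *c)
{
    c->stop_calls++;
    c->running = false;
}

/*
 * Guarded stop: calls through only when the application believes the
 * client is running. Returns true if a stop was actually issued.
 */
static bool app_client_stop_once(struct fake_client *c)
{
    if (!c->running) {
        return false; /* already stopped: return immediately */
    }
    fake_client_stop(c);
    return true;
}
```

In a real application the two stand-ins would be replaced by golioth_client_start() and golioth_client_stop(), with the running flag kept next to the client handle.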

Hey @john.stuewe, with the release of Firmware SDK v0.12.2, we addressed the reported issues and fixed the following:

  • Zephyr: Fix crash when repeatedly stopping and starting the client
  • Zephyr: Fix up to 10 second delay when stopping client
  • Zephyr: Fixed hang when attempting to stop an already stopped client

We also made our testing more robust by adding more test scenarios for the golioth_client_stop() function.

I can confirm that 0.12.2 fixes the extra delay on stopping and the start/stop crash. I’ve got another spurious reboot I’m working on, but it doesn’t appear to be associated with these fixes.
Thanks!!!