we are using the Golioth Firmware SDK v0.10.0 on a nRF9160 device, along with the nRF Connect SDK v2.5.1.
The device is able to connect to the Network (what takes ~75 seconds) and seems to resolve DNS without a problem, but than fails to connect to the Golioth Cloud with error -22.
The log repeatedly prints the following messages after connecting:
[00:01:14.061,462] <err> golioth_coap_client_zephyr: Failed to send empty CoAP message: -22
[00:01:14.061,828] <err> golioth_coap_client_zephyr: Failed to connect: -22
[00:01:14.061,828] <wrn> golioth_coap_client_zephyr: Failed to connect: -22
This happens with the hello sample and the logging sample (from golioth-firmware-sdk/examples/zephyr).
With a firmware based on the Golioth Zephyr SDK, I already connected successfully to the Golioth Cloud, using the same hardware.
Hi @SenseJo, thank you for this detailed report! While I am not certain this is the issue, if you are using a pre-shared key (PSK) for authentication, it is possible that the PSK ID has exceeded the default max length in the Golioth Firmware SDK. Could you check and see if your PSK ID is longer than 32 characters, and, if so, try to connect again using a shorter PSK ID?
Hi @hasheddan, thank you for your response!
Yes, I am using PSK auth, and actually, my PSK ID was about 40 characters long! I now changed it to a 30 character PSK ID.
But unfortunately, this did not make any difference about the error, I’m still getting the same log output.
Maybe it helps if I describe how I build the firmware: I have a west manifest repo with the following west.yml:
cd modules/lib/golioth-firmware-sdk/examples/zephyr/hello
west build -b nrf9160dk_nrf9160_ns -p -- -DEXTRA_CONF_FILE="golioth-auth.conf;rtt.conf" -DCONFIG_LOG=y
The board I’m using actually isn’t a nRF9160DK, but a custom board. However, the board connects to Golioth with no issues when I use a firmware, e.g. the hello sample again, that is based on the Golioth Zephyr SDK. I just tried it again successfully with Golioth Zephyr SDK v0.8.0, using the same compile command as for the Golioth Firmware SDK hello sample.
@SenseJo thank you for that information! Could you also provide your config you are using with the Zephyr SDK? That will help me compare and reproduce.
@hasheddan Sure.
With the Golioth Zephyr SDK, I’m using the hello sample under modules/lib/golioth/samples/hello, and building using the same command as above:
The rtt.conf is the same as well, golioth-auth.conf differs slightly, and west.yml is different, of course:
Thanks @SenseJo! I doubt this is the issue, but it may be worth specifying pristine in your build to ensure that your new PSK ID / PSK is being used. The v0.8.0 Zephyr SDK release does use a different version of the nRF Connect SDK (v2.5.0), but I am also doubtful that would be causing this behavior. Could you try increasing the logging verbosity, particularly in the networking stack to provide additional informatin?
Thank you @hasheddan, and sorry for the delay, I had a few days off.
I checked my .config file to ensure that the credentials with the short PSK_ID are used.
Then I activated more logging by building as follows:
west build -b nrf9160dk_nrf9160_ns (read) -- -DEXTRA_CONF_FILE="golioth-auth.conf;rtt.conf" -DCONFIG_LOG=y -DCONFIG_NET_LOG=y -DCONFIG_NET_SOCKETS_LOG_LEVEL_DBG=y
@73jn thank you for adding your reproduction, and @SenseJo thank you for the additional information! We are looking into this and I will follow up here shortly.
@SenseJo@73jn are you both able to verify that your cellular data plan is active? Also, if supported by your provider, you may be able to observe packet captures that would help give a better indication of what data is actually being sent.
@hasheddan Yes, I am sure my cellular data plan is active, since I just swapped the firmware on the same board with the same SIM card, and things worked with the Golioth Zephyr SDK based firmware.
I don’t know whether our operator supports packet capturing, a quick search yielded no result.
I just tried the the hello-sample again with different revision of the Golioth Firmware SDK. Of course, I always updated the NCS accordingly.
With the current main branch, it’s the same behavior.
Interestingly, with revision v0.9.0, the nRF91 does not output error -22. However, it seems like no connection was established anyways, since the device does not print the hello counter logs, and is show as offline in the Golioth web console.
@SenseJo we have been unable to reproduce your observed behavior thus far, but it makes sense that you would see a difference between v0.9.0 and v0.10.0 as the latter moved to using the Zephyr CoAP library.
To help us continue investigation, would you mind sharing your build/zephyr/.config file with any sensitive information redacted?
I looked closer at your project configuration file, and one line caught my attention. On line 140 of your config file, CONFIG_MBEDTLS_ENABLE_HEAP is not set . This explains the errors you are experiencing, and I was able to reproduce them.
At this moment, I’m unsure why this has happened, but I’ll keep investigating.
Can you add CONFIG_MBEDTLS_ENABLE_HEAP=y to your prj.conf file (and check that it is actually set in the .config file) to confirm that this was the issue?
I took your advice and added CONFIG_MBEDTLS_ENABLE_HEAP=y, but that alone did not do the trick. That changed something, however, now I’m getting the following errors:
[00:01:13.725,982] <err> golioth_coap_client_zephyr: Failed to send empty CoAP message: -12
[00:01:13.726,348] <err> golioth_coap_client_zephyr: Failed to connect: -12
[00:01:13.726,379] <wrn> golioth_coap_client_zephyr: Failed to connect: -12
So now it’s error code -12 instead of -22.
I think the CONFIG_MBEDTLS_ENABLE_HEAP options should normally be set by golioth-firmware-sdk/examples/zephyr/common/Kconfig.defconfig, which includes the following:
Kconfig.defconfig
if GOLIOTH_SAMPLE_COMMON
[...]
if MBEDTLS
config MBEDTLS_ENABLE_HEAP
default y
if MBEDTLS_ENABLE_HEAP
config MBEDTLS_HEAP_SIZE
default 49152 if GOLIOTH_AUTH_METHOD_CERT
default 10240
endif # MBEDTLS_ENABLE_HEAP
if MBEDTLS_BUILTIN
config MBEDTLS_SSL_MAX_CONTENT_LEN
default 4096 if GOLIOTH_AUTH_METHOD_CERT
default 2048
endif # MBEDTLS_BUILTIN
if NRF_SECURITY
config MBEDTLS_SSL_IN_CONTENT_LEN
int
default 4096 if GOLIOTH_AUTH_METHOD_CERT
default 2048
config MBEDTLS_SSL_OUT_CONTENT_LEN
int
default 4096 if GOLIOTH_AUTH_METHOD_CERT
default 2048
endif # NRF_SECURITY
endif # MBEDTLS
[...]
endif # GOLIOTH_SAMPLE_COMMON
However, it looks like in my build of the hello-sample, this .defconfig file is not picked up for some reason:
Build output
-- west build: making build dir /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns pristine
-- west build: generating a build system
Loading Zephyr default modules (Zephyr base).
-- Application: /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello
-- CMake version: 3.25.1
-- Found Python3: /usr/bin/python3 (found suitable version "3.10.12", minimum required is "3.8") found components: Interpreter
-- Cache files will be written to: /home/jo/.cache/zephyr
-- Zephyr version: 3.4.99 (/home/jo/Documents/Project/project_ncs/zephyr)
-- Found west (found suitable version "1.0.0", minimum required is "0.14.0")
-- Board: nrf9160dk_nrf9160_ns, Revision: 0.14.0
-- ZEPHYR_TOOLCHAIN_VARIANT not set, trying to locate Zephyr SDK
-- Found host-tools: zephyr 0.16.5 (/home/jo/zephyr-sdk-0.16.5-1)
-- Found toolchain: zephyr 0.16.5 (/home/jo/zephyr-sdk-0.16.5-1)
-- Found Dtc: /home/jo/zephyr-sdk-0.16.5-1/sysroots/x86_64-pokysdk-linux/usr/bin/dtc (found suitable version "1.6.0", minimum required is "1.4.6")
-- Found BOARD.dts: /home/jo/Documents/Project/project_ncs/zephyr/boards/arm/nrf9160dk_nrf9160/nrf9160dk_nrf9160_ns.dts
-- Found devicetree overlay: /home/jo/Documents/Project/project_ncs/zephyr/boards/arm/nrf9160dk_nrf9160/nrf9160dk_nrf9160_ns_0_14_0.overlay
-- Generated zephyr.dts: /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns/zephyr/zephyr.dts
-- Generated devicetree_generated.h: /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns/zephyr/include/generated/devicetree_generated.h
-- Including generated dts.cmake file: /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns/zephyr/dts.cmake
warning: Experimental symbol NET_SOCKETS_SOCKOPT_TLS is enabled.
warning: Experimental symbol NET_SOCKETS_ENABLE_DTLS is enabled.
warning: Experimental symbol NET_CONNECTION_MANAGER is enabled.
Parsing /home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/Kconfig
Loaded configuration '/home/jo/Documents/Project/project_ncs/zephyr/boards/arm/nrf9160dk_nrf9160/nrf9160dk_nrf9160_ns_defconfig'
Merged configuration '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/prj.conf'
Merged configuration '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/boards/nrf9160dk_nrf9160_ns.conf'
Merged configuration '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/golioth-auth.conf'
Merged configuration '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/rtt.conf'
Configuration saved to '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns/zephyr/.config'
Kconfig header saved to '/home/jo/Documents/Project/project_ncs/modules/lib/golioth-firmware-sdk/examples/zephyr/hello/build/nrf9160dk_nrf9160_ns/zephyr/include/generated/autoconf.h'
-- Found GnuLd: /home/jo/zephyr-sdk-0.16.5-1/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld.bfd (found version "2.38")
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- The ASM compiler identification is GNU
-- Found assembler: /home/jo/zephyr-sdk-0.16.5-1/arm-zephyr-eabi/bin/arm-zephyr-eabi-gcc
-- Found Python3: /usr/bin/python3 (found version "3.10.12") found components: Interpreter
Changed board to secure nrf9160dk_nrf9160 (NOT NS)
=== child image mcuboot - begin ===
[...]
However, I currently have no idea why the file is not been picked up. CONFIG_GOLIOTH_SAMPLE_COMMON=y is set in the build/zehphyr/.config file.
Thank you, and sorry for the late reply.
You are right, the order of the projects in the west.yml file makes the difference! After moving Golioth to the top, the hello-sample would connect successfully.