All good suggestions. I already have a diagnostic version that increases the timeout on retries. It also treats a single successful retry as business as usual and doesn’t log it. Those account for 99% of the log entries, apart from cases like Brian’s, where it just seems to be locked up. The new version will restart after an hour of failed communication.
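For anyone following along, the retry behavior described above could be sketched roughly like this. It is not the actual IotaWatt code: the `RetryPolicy` name, the doubling of the timeout, and the 2 s / 16 s limits are all assumptions made for illustration. Only the one-hour restart threshold and the "don't log a single successful retry" rule come from the description.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical retry bookkeeping; names and constants are illustrative only.
struct RetryPolicy {
    uint32_t baseTimeoutMs  = 2000;     // assumed timeout for the first attempt
    uint32_t maxTimeoutMs   = 16000;    // assumed upper bound on the timeout
    uint32_t restartAfterMs = 3600000;  // one hour of failed communication
    uint32_t failures       = 0;        // consecutive failed attempts
    uint32_t firstFailureMs = 0;        // time of the first failure in this run

    // Timeout for the next attempt: doubles per failure, capped at maxTimeoutMs.
    uint32_t nextTimeoutMs() const {
        uint32_t shift = std::min<uint32_t>(failures, 3);
        return std::min<uint32_t>(baseTimeoutMs << shift, maxTimeoutMs);
    }

    // Record a failed attempt at time nowMs (milliseconds since boot).
    void onFailure(uint32_t nowMs) {
        if (failures == 0) firstFailureMs = nowMs;
        ++failures;
    }

    // Record a success. Returns true only if the outage is worth logging,
    // i.e. more than one attempt failed before this success.
    bool onSuccess() {
        bool worthLogging = failures > 1;
        failures = 0;
        return worthLogging;
    }

    // True once failures have continued for restartAfterMs or longer.
    // Unsigned subtraction keeps this correct across timer wraparound.
    bool shouldRestart(uint32_t nowMs) const {
        return failures > 0 && (nowMs - firstFailureMs) >= restartAfterMs;
    }
};
```

The caller would use `nextTimeoutMs()` for each attempt, write a log entry only when `onSuccess()` returns true, and trigger a restart when `shouldRestart()` becomes true.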
I’m looking into Async TCP, but sometimes the cure is worse than the disease. We’ll see. On the one hand, long blocking I/O operations suppress sampling. On the other, interrupts during sampling cannot be tolerated. I’ve tried disabling interrupts during sampling but am plagued with WDT (watchdog timer) events after re-enabling them. The IotaWatt is sampling 66% of the time, so the odds of an asynchronous event interrupting sampling are high. The best I can do right now is detect when sampling was interrupted and discard the sample.
That’s one of the reasons I’d eventually like to move to the ESP32 - dual core.
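One possible way to do the detect-and-discard step mentioned above is to timestamp each ADC reading and reject the whole sample if any gap between consecutive readings is longer than expected. This is only a sketch under that assumption, not necessarily how the firmware detects interruptions. The `Reading` struct, `sampleIsClean`, and the `maxGapUs` threshold are hypothetical names, and the timestamps are assumed to come from a microsecond counter such as Arduino's `micros()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: one ADC reading plus the microsecond time it was taken.
struct Reading {
    uint32_t timeUs;
    int16_t  value;
};

// Returns true if no gap between consecutive readings exceeds maxGapUs.
// A longer gap suggests something (e.g. an interrupt) stalled the sampling
// loop, so the caller should discard the whole sample.
bool sampleIsClean(const std::vector<Reading>& readings, uint32_t maxGapUs) {
    for (std::size_t i = 1; i < readings.size(); ++i) {
        // Unsigned subtraction stays correct when the counter wraps around.
        uint32_t gap = readings[i].timeUs - readings[i - 1].timeUs;
        if (gap > maxGapUs) {
            return false;
        }
    }
    return true;
}
```

A caller would collect the readings for one sample, then keep them only if `sampleIsClean(readings, maxGapUs)` returns true. The threshold would need to be tuned to the real loop timing.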
So far, these communication problems are not epidemic. I’m going to work through the details and hopefully iron them out quickly.