Suggestion: adding a small pseudo-random delay to the emonTx transmit function to minimize collisions

Feature suggestion: it would be interesting to add a small random (or pseudo-random) delay of, say, 0-2 seconds (varying it every time) before engaging the wireless transmit in the emonTx. Doing this would minimize the chance of collision with other devices sending data at the same cadence.

Back story, or how I came upon this:

I just observed something interesting today: I have three emonTx V3.4 units pushing data to the same emonPi. We had a very brief power interruption (less than a second), and I didn’t pay much attention to it. However, when I looked at the feeds about 1 h later, none had been updated since the power interruption.

Looking at the 3 emonTx units, they were working and flashing the red transmit LED. However, all three were flashing the LED at exactly the same time, stepping on each other’s transmissions.

I unplugged two of the three and, sure enough, data from the one still plugged started updating again. Plugged the other two back, one at a time, and all was good.

Cheers!
Claudio

Edit - changed sudo to pseudo to avoid any possible confusion with the command sudo. BT, Moderator

I agree it’s an issue when multiple RFM devices (e.g. emonTx) come back online at the same time. My own solution is to have a delay at start-up equal to the node ID in seconds, so that each unique node ID causes the devices to start sending their data on a staggered rota. This method avoids having to edit each sketch with a unique delay, and avoids introducing a random element, which can play havoc with your input processing. For example, if you are summing the power1 inputs of 3 emonTxs (e.g. 3-phase monitoring), you would want the totalling to happen in the last of the 3 to post its data, so the total is based on all 3 latest values.

Is a 1 second gap a little too generous?

With the default DS sketch sending a 26-byte payload, or the CM sketch a 40-byte payload, plus 9 bytes of header and checksum, transmitted at a little over 49 kBaud, those take under 5.8 ms and under 8 ms respectively to transmit. There’s a clear advantage in starting off with a respectable gap between transmissions, but that needs to be weighed against the possible need to keep the sampling reasonably synchronous, as your (@pb66) example implies.
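
As a rough check of those figures, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the 49,230 bit/s used in the note below is an assumption standing in for "a little over 49 kBaud".

```cpp
// Time on air, in milliseconds, for one packet.
// payloadBytes:  data bytes from the sketch (26 for DS, 40 for CM in the post above)
// overheadBytes: header + checksum (9 in the post above)
// bitRate:       radio bit rate in bits per second (assumed value, see note)
double airtimeMs(int payloadBytes, int overheadBytes, double bitRate) {
    const int totalBits = (payloadBytes + overheadBytes) * 8;
    return 1000.0 * totalBits / bitRate;
}
```

With an assumed 49,230 bit/s, `airtimeMs(26, 9, 49230.0)` and `airtimeMs(40, 9, 49230.0)` both come out inside the "under 5.8 ms" and "under 8 ms" bounds quoted above.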

I can also see the appeal of the idea of adding some randomness, as the intention is clearly to avoid a long period with two or more transmitters blocking each other until their slightly different clock rates allow them to drift apart.

I don’t see a viable “standard” solution; either would be the “best” given appropriate (and different) conditions.

Yes, the delay could be much less. I tend to use low node IDs, and my aim was to spread the first sends evenly(ish) across the 5-10 second interval. So with 3 emonTxs posting at 5 s intervals, nodes 1, 2 and 3 left a 2 second gap before the cycle restarted.

I see the appeal of adding a random element to avoid long periods of clashing after the devices have been running a while, due to the “10 s intervals” not always being exactly 10 s on every device. But that would currently cause a potential issue with the way fixed-interval feeds work, as the timestamp is the received time, not the pre-adjusted send time.

If there was a mechanism to recognise received packets as belonging to a particular timestamp, the random adjustment would be of great value. For example, if a packet counter was included in the payload and was translatable to a particular timestamp (e.g. (counter x 10 s) + start time = timestamp), then the actual time sent and time received would be of less interest, and a staggered and/or random rota would work well. It would also allow “synchronised” data: all emonTxs could be powered up at the same time and sample data for the same 10 s interval but report at irregular or staggered intervals, whilst the receiver allocates the same timestamp because of the same packet count.
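
The counter-to-timestamp idea can be sketched as below. This is only an illustration of the formula in the paragraph above: the function name is hypothetical, no such counter exists in the current payload, and how the receiver would learn the start time is left open.

```cpp
#include <cstdint>

// Map a packet counter to the timestamp of the interval it belongs to.
// startTime:   Unix time (seconds) assigned to counter 0 (assumed known to the receiver)
// counter:     packet counter carried in the payload (hypothetical field)
// intervalSec: reporting interval in seconds, e.g. 10
uint32_t timestampForPacket(uint32_t startTime, uint32_t counter, uint32_t intervalSec) {
    return startTime + counter * intervalSec;
}
```

Two nodes that started together and both send counter 3 would get the same timestamp from `timestampForPacket(start, 3, 10)`, regardless of when each packet actually arrived.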

In my own sketches, although I described it above as a “delay”, the way it actually works is that the sketch continues to loop and only sends once the node ID x 1 s interval has passed. This has the advantage of allowing the sampling bias offset removal to settle whilst there are no transmissions, so the first transmission is valid data. It eliminates “zero first values” for temperature etc., and power and voltage readings are accurate from the very first report.
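
A minimal sketch of that non-blocking hold-off, written as plain C++ so it can be tried off-device. It is not the actual sketch code: the function name is invented, and `nowMs` and `bootMs` are assumed to come from something like Arduino's `millis()`.

```cpp
#include <cstdint>

// True once (nodeId x 1 s) has passed since boot. The main loop keeps sampling
// while this is false and only starts transmitting once it returns true.
// Unsigned subtraction keeps the comparison valid across a millis() rollover.
bool startupHoldoffElapsed(uint32_t nowMs, uint32_t bootMs, uint8_t nodeId) {
    const uint32_t holdoffMs = static_cast<uint32_t>(nodeId) * 1000u;
    return (nowMs - bootMs) >= holdoffMs;
}
```

In a loop this would be used as a gate, for example "sample every pass, but only send if `startupHoldoffElapsed(millis(), bootMs, nodeId)`", so nodes 1, 2 and 3 make their first sends one second apart.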

The ultimate answer is probably a ‘polled’ solution, but again that can’t be a universal standard because of the needs of battery-only users (though how many of those exist these days is a moot point).

However, since emonLibCM takes its transmission interval from mains time, it gets around the problem of RF collisions (provided, as you suggest, the sketches don’t all start at exactly the same instant) but does nothing for, and might even worsen, the long-term drift of mains time against emonCMS’s clock. Unless of course emonCMS can get its time from the same mains clock…

I’m not sure I understand how that is fundamentally different from clock drift between the sketch and emonCMS. You still have clock drift underlying the random jitter, so (considering just one transmitter) each sample gets its timestamp on arrival as before. It’s just that, rather than following a regular pattern of one sample missed or overwritten (depending on the direction of drift) every n minutes (or hours), you’d have a scatter of correct, missed or overwritten samples spread randomly either side of the n-minute event. For a single transmitter, that’s an obvious degradation of the data for no benefit. With many transmitters, it would make each individual channel’s data less consistent, but the gain would be that long periods of missing data across pairs of channels should disappear, as suggested by @heckler.

That addresses the ‘scatter’ problem but sadly, it still leaves the issue of clock drift.

The other point about the length of the transmission is that with (say) 4 transmitters, the maximum time change needs to be only a few tens of milliseconds. If it was ±5 transmission periods (±2 mains cycles with the longest data packet), there’s something like a 90% chance of no collision between any pair of transmitters. (Not being a statistician, I need to think about that for a long time, but that’s my best guess.)
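
One way to test a guess like that without the statistics is a small Monte Carlo simulation. The sketch below is only a simplified model, and all of its assumptions are mine rather than anything from the sketches: every transmitter is nominally due at the same instant, each start is shifted by a uniform random offset, all packets are one "transmission period" long, and two packets collide if their starts are less than one period apart.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Estimate the probability that no two of `nodes` transmissions overlap when each
// start time is offset uniformly within +/- spreadPeriods transmission periods.
// Simplified model: equal packet lengths, a single reporting interval, no drift.
double noCollisionProbability(int nodes, double spreadPeriods, int trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> offset(-spreadPeriods, spreadPeriods);
    std::vector<double> starts(nodes);
    int clean = 0;
    for (int t = 0; t < trials; ++t) {
        for (double &s : starts) s = offset(rng);
        bool collided = false;
        for (int i = 0; i < nodes && !collided; ++i) {
            for (int j = i + 1; j < nodes; ++j) {
                // Starts closer than one period apart mean the packets overlap.
                if (std::fabs(starts[i] - starts[j]) < 1.0) { collided = true; break; }
            }
        }
        if (!collided) ++clean;
    }
    return static_cast<double>(clean) / trials;
}
```

Calling `noCollisionProbability(4, 5.0, 100000, 1)` models the ±5 period case for 4 transmitters; varying `spreadPeriods` shows how quickly a wider window helps.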

I’m afraid I still think there’s no easy solution. I’m also reasonably certain that there isn’t a solution.

I’m using a different approach to this, but it will break compatibility.
By using the LowPowerLab RFM69 library (GitHub - LowPowerLab/RFM69: RFM69 library for RFM69W, RFM69HW, RFM69CW, RFM69HCW (semtech SX1231, SX1231H)), you get some nice features:

  • digital RSSI
  • configurable transmit power
  • packet acknowledgement

Each emonTH listens (measures RSSI) and only starts a transmission if there is no chatter. A random delay is used to avoid collisions with other modules listening at the same time.
Each emonTH’s TX power is lowered until no acknowledgement is received, then raised a notch, so that each module only transmits at the power level required to reach the base.
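
The two ideas above can be sketched as plain logic, independent of the radio. This is not Nuno's code: the function names, the RSSI threshold and the `ackAtLevel` callback are all hypothetical, and wiring them to the actual library calls is left out.

```cpp
#include <functional>

// Listen-before-talk check: the channel counts as clear when the measured RSSI
// is below a chosen "busy" threshold (both in dBm; the threshold is an assumption).
bool channelIsClear(int rssiDbm, int busyThresholdDbm) {
    return rssiDbm < busyThresholdDbm;
}

// Step the power down from maxLevel while the next lower level is still
// acknowledged, and stop at the lowest level that got an ACK. This is the same as
// "lower until no acknowledgement, then up a notch".
// ackAtLevel is a hypothetical callback: send one packet at that level and
// return true if the base acknowledged it.
int settlePowerLevel(int maxLevel, int minLevel,
                     const std::function<bool(int)> &ackAtLevel) {
    int level = maxLevel;
    while (level > minLevel && ackAtLevel(level - 1)) {
        --level;
    }
    return level;
}
```

If the base only hears levels 12 and above, `settlePowerLevel(31, 0, ...)` settles on 12. A node that is never acknowledged stays at `maxLevel`.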

With this I was able to make the 2 AA cells last for more than 2 years on a network of 5 emonTHs posting every minute.
I can share my code for the emonTH and base if you want to take a look.

Thank you for that, Nuno.

I have looked at LPL and I did indeed see some of those features. I think @pb66 has looked too. As you say, adopting it would break compatibility, but now that the sketch in the emonPi is no longer automatically overwritten when the Pi’s emonCMS software is updated, there’s nothing to stop anyone changing to the LowPowerLab library. But of course, the two libraries will not operate together, as the message headers have some fundamental differences in their structure.

I think if Claudio @heckler wanted to go down that route, your code would be a very good start for him. However, it still doesn’t solve the possible timing problem with emonCMS that @pb66 pointed out.