Negative values for missed messages (and missing percentage)

I’m getting consistent negative values for missed and missedprc from my EmonTX3 (CM firmware) and EmonTX5 devices. The code for this is pretty simple:

This persists across device restarts. Can anyone assist with figuring out how this is occurring?

Reading back through a few lines above, I think this can only be caused by repeated MSG == 1 being received which makes self.missed negative. Python treats “truthy” values as 0 == False, else True. I’m not running this software, but you can probably fix it by ensuring missed is >= 0.

I’ve been digging into this a bit more. It looks like this may be a problem in the EmonHubRFM69LPL interfacer. In the debug logs below, you can see two duplicate packets getting processed. The duplicate is supposed to be discarded – and a message to that effect is logged – but a message is still published to MQTT, which results in the missed count getting decremented (since MSG - MSG -1 == -1).

So now I’m puzzling over why discards aren’t being…well…discarded. :stuck_out_tongue:

2024-12-31 23:41:47,471 DEBUG    SPI        290623 NEW FRAME : 
2024-12-31 23:41:47,472 DEBUG    SPI        290623 Timestamp : 1735688507.471400
2024-12-31 23:41:47,472 DEBUG    SPI        290623 From Node : 29
2024-12-31 23:41:47,472 DEBUG    SPI        290623    Values : [70643, 0, 0, -43, -195, -107, -2, 118, 18, -8928, -28085, -13622, -1209, 26140, 11095, 0, 0, -49958, -92.8397539536526]
2024-12-31 23:41:47,472 DEBUG    SPI        290623      RSSI : -63
2024-12-31 23:41:47,472 DEBUG    SPI        290623 Sent to channel(start)' : ToEmonCMS
2024-12-31 23:41:47,473 DEBUG    SPI        290623 Sent to channel(end)' : ToEmonCMS
2024-12-31 23:41:47,573 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:47,573 INFO     SPI        Discarding duplicate packet
2024-12-31 23:41:47,667 DEBUG    MQTT       Publishing: emon/emontx5_2/MSG 70643
2024-12-31 23:41:47,667 DEBUG    MQTT       Publishing: emon/emontx5_2/Vrms2 0
2024-12-31 23:41:47,667 DEBUG    MQTT       Publishing: emon/emontx5_2/Vrms3 0
2024-12-31 23:41:47,668 DEBUG    MQTT       Publishing: emon/emontx5_2/P7 -43
2024-12-31 23:41:47,668 DEBUG    MQTT       Publishing: emon/emontx5_2/P8 -195
2024-12-31 23:41:47,668 DEBUG    MQTT       Publishing: emon/emontx5_2/P9 -107
2024-12-31 23:41:47,668 DEBUG    MQTT       Publishing: emon/emontx5_2/P10 -2
2024-12-31 23:41:47,668 DEBUG    MQTT       Publishing: emon/emontx5_2/P11 118
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/P12 18
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/E7 -8928
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/E8 -28085
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/E9 -13622
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/E10 -1209
2024-12-31 23:41:47,669 DEBUG    MQTT       Publishing: emon/emontx5_2/E11 26140
2024-12-31 23:41:47,670 DEBUG    MQTT       Publishing: emon/emontx5_2/E12 11095
2024-12-31 23:41:47,670 DEBUG    MQTT       Publishing: emon/emontx5_2/digPulse 0
2024-12-31 23:41:47,670 DEBUG    MQTT       Publishing: emon/emontx5_2/anaPulse 0
2024-12-31 23:41:47,670 DEBUG    MQTT       Publishing: emon/emontx5_2/missed -49958
2024-12-31 23:41:47,671 DEBUG    MQTT       Publishing: emon/emontx5_2/missedprc -92.8397539536526
2024-12-31 23:41:47,671 DEBUG    MQTT       Publishing: emon/emontx5_2/rssi -63
2024-12-31 23:41:47,673 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:47,674 INFO     SPI        Discarding duplicate packet
2024-12-31 23:41:47,774 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:47,774 INFO     SPI        Discarding duplicate packet
2024-12-31 23:41:47,874 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:47,875 INFO     SPI        Discarding duplicate packet
2024-12-31 23:41:47,976 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:47,976 DEBUG    SPI        290624 NEW FRAME : 
2024-12-31 23:41:47,976 DEBUG    SPI        290624 Timestamp : 1735688507.976361
2024-12-31 23:41:47,977 DEBUG    SPI        290624 From Node : 29
2024-12-31 23:41:47,977 DEBUG    SPI        290624    Values : [70643, 0, 0, -43, -195, -107, -2, 118, 18, -8928, -28085, -13622, -1209, 26140, 11095, 0, 0, -49959, -92.8416123097508]
2024-12-31 23:41:47,977 DEBUG    SPI        290624      RSSI : -64
2024-12-31 23:41:47,977 DEBUG    SPI        290624 Sent to channel(start)' : ToEmonCMS
2024-12-31 23:41:47,977 DEBUG    SPI        290624 Sent to channel(end)' : ToEmonCMS
2024-12-31 23:41:48,074 DEBUG    MQTT       Publishing: emon/emontx5_2/MSG 70643
2024-12-31 23:41:48,075 DEBUG    MQTT       Publishing: emon/emontx5_2/Vrms2 0
2024-12-31 23:41:48,075 DEBUG    MQTT       Publishing: emon/emontx5_2/Vrms3 0
2024-12-31 23:41:48,075 DEBUG    MQTT       Publishing: emon/emontx5_2/P7 -43
2024-12-31 23:41:48,076 DEBUG    MQTT       Publishing: emon/emontx5_2/P8 -195
2024-12-31 23:41:48,076 DEBUG    MQTT       Publishing: emon/emontx5_2/P9 -107
2024-12-31 23:41:48,076 DEBUG    MQTT       Publishing: emon/emontx5_2/P10 -2
2024-12-31 23:41:48,076 DEBUG    MQTT       Publishing: emon/emontx5_2/P11 118
2024-12-31 23:41:48,077 DEBUG    MQTT       Publishing: emon/emontx5_2/P12 18
2024-12-31 23:41:48,077 DEBUG    MQTT       Publishing: emon/emontx5_2/E7 -8928
2024-12-31 23:41:48,077 DEBUG    MQTT       Publishing: emon/emontx5_2/E8 -28085
2024-12-31 23:41:48,077 DEBUG    MQTT       Publishing: emon/emontx5_2/E9 -13622
2024-12-31 23:41:48,077 INFO     SPI        Packet received 52 bytes
2024-12-31 23:41:48,078 DEBUG    MQTT       Publishing: emon/emontx5_2/E10 -1209
2024-12-31 23:41:48,078 INFO     SPI        Discarding duplicate packet
2024-12-31 23:41:48,078 DEBUG    MQTT       Publishing: emon/emontx5_2/E11 26140
2024-12-31 23:41:48,079 DEBUG    MQTT       Publishing: emon/emontx5_2/E12 11095
2024-12-31 23:41:48,079 DEBUG    MQTT       Publishing: emon/emontx5_2/digPulse 0
2024-12-31 23:41:48,079 DEBUG    MQTT       Publishing: emon/emontx5_2/anaPulse 0
2024-12-31 23:41:48,079 DEBUG    MQTT       Publishing: emon/emontx5_2/missed -49959
2024-12-31 23:41:48,079 DEBUG    MQTT       Publishing: emon/emontx5_2/missedprc -92.8416123097508
2024-12-31 23:41:48,080 DEBUG    MQTT       Publishing: emon/emontx5_2/rssi -64

If I crank up the maximum allowed delay on the duplicate detection up to <1.0, then my negative missed packet problem seems to go away (still monitoring). I’m shocked to see this many duplicates though, with some packets getting picked up eight times (1 logged packet and 7 discards). Is this some sort of radio problem?

The LPL library that the Tx5 uses has a retry when it doesn’t get an acknowledgement in return from the base. I wonder if this is not being sent by the base or received by the Tx5 so it keeps retrying? Obviously the duplicates should be discarded when received, but this may explain the repeat transmission. Depending on how comfortable you are with the Tx5 firmware, you could add a debug message there?

(I think I might add a verbose debug option to the Pi3 for this sort of in field scenario…)

Fascinating! I always assumed the packet sending was a spray-and-forget/UDP model, but I see the sendWithRetry code you’re talking about. I may try getting some logging from my RPi side first, since I can make code changes there over the network vs. having to compile/plugin/upload on my EmonTX. More projects! :grin:

1 Like

Surely LPL should take care of this? :exploding_head:

You should be able to ask for a value and it should return a (one) value - is there no check within the library that it’s the same question being asked and the same data returned multiple times?

If there isn’t, it’s a major shortcoming.

There are two attempts when sending data. If there is no acknowledgement from the base for both attempts, the function returns false so it doesn’t just keep on sending until it gets the ack.

This obviously doesn’t explain why a packet is being picked up 8 times, but certainly worth considering.

This is the number of retries set in emon_DB_12CT.ino

One suggestion for @davidski: turn up the transmit power in your emonPi but beware, you can destroy the RFM69CW by transmitting at full power without a matched antenna fitted. LPL does have the facility for Automatic Transmission Control with #include <RFM69_ATC.h> which appears to invoke a handshake to set the minimum necessary transmitter power level. It’s not been included in the OEM cut-down version of the LPL library.
A second suggestion: Use a combination of NodeID and message number to block the repeated messages (You must use the Node ID too because for the emonTx4 with the expansion board, both ‘halves’ of the data are given the same message number.)

IMHO, I disagree - retries are transparent at the message originator’s end, it should be the same at the destination end, i.e. wholly contained within the library at both ends of the link. It should not be for the user to invent their own mechanism for handling repeated messages.

This is only set in the 12 CT version - the 6 CT versions use the default 2 retries, and that’s where I read it from. Using the 12 CT version, it’s set by ACK1_RETRIES and ACK2_RETRIES both of which are 8 so that makes sense :slight_smile:

The ATC version is fine, it just rachets up the power on the Tx side upon each retry. On the Rx side, it can package the RSSI in the response. For a fixed setup running off the wall, I wouldn’t worry too much about it from a power perspective at least.

I think I agree with @davidski here - the receiver has got repeat valid packets that are identical (matching node ID, message number, and data fields) - it should not log these as separate messages. On the Tx side, it’s functioning properly - it hasn’t received an acknowledgement, so it attempts to resend as designed as it assumes that the packet has not been received.

It depends a bit where you draw the line with the library and it’s consumer/producer. To my mind, the wireless driver is working fine - in the Tx direction, it sends a packet with a request to ack, and no ack is received, so it resends up to some number of times. In the Rx direction, it receives a packet, sends an ack in response, and passes the data on to the consumer - it has no concept of what the message contains and its meaning. Therefore, the consumer (the interfacer) should handle the filtering (in my opinion, at least :bowing_man:).

My argument is the radio packet should contain in the header sufficient information so that a duplicate packet is detected and discarded at the destination - after all, it’s aware that an ACK has been requested and sent, so there is some expectancy of a duplicate arriving very shortly. This would contain the retry mechanism completely within the confines of the library and invisible to the application, which would receive either a single copy of a valid message or an error.

In the generic case, though, this isn’t true. For a given application, an immediate identical packet could be valid data and the RFM (LPL or whatever) driver does not have the information to handle this - it has done its job acking the packet and passing it on to the application.

In our case, an identical packet is not valid but that’s only because we know the context of the system.

This is not the case with LPL (or JeeLib) at present, but it should be the case. Let’s carefully distinguish between the user-supplied content of the message and the complete radio packet as launched into the air. The transport mechanism within the library should be able to accommodate repeated content whilst distinguishing retried radio packets – in a way that’s transparent to the user (the sketch) and in much the same way as large chunks of the Internet can handle damaged, delayed or missing packets of data.

So agreed, identical content is always valid because the packets that contain it are never identical, an identical packet is a retry and is never valid.

Ideally, I agree. Two issues, not insurmountable :slight_smile: but would require changes at either end so it would no longer be vanilla LPL (which does not have any packet metadata). Very easy for a larger system, but some constraints on a microcontroller.

  • The RFM’s FIFO isn’t very deep, so any header (TCP’s header is 24 bytes) eats into the available data space. 4 bytes as a fingerprint wouldn’t be unreasonable, could just take a hash of, say, (data + time) to allow for the case of identical payloads.
  • On the Rx end, there would need to be a lookup table to account for interleaving of different nodes. The header would be saved, and if a received header matches it would be rejected. The deeper the table, the lower the chance of missing a duplicate.

I can add this to the Pi3 (the SAMD has a hardware CRC unit), but would also need to change the Rx end as well. Calculating a CRC is quite heavy for the AVR, but could add it there too, but we’d have to be careful to avoid fragmentation (e.g. user updates the Rx end but doesn’t update the Tx end).

Note, a CRC is not a good hash function but it would be very cheap computationally, at least for the Pi3.

If you’re looking at an OEM home-made solution, you have all you need already in the radio packet as far as the emonTx is concerned: the Source Node ID (in the packet header) and the Message Number (in the content/payload). Together, those are unique to every message content and identical for a retry packet. So all you need is a one-deep history of message number per sending node: if the message number is equal to the last, it’s a retry, otherwise, it’s good.

This would need a message number adding to the EV charger, heatpump and emonTH softwares, which is a nuisance and for the TH particularly, a small cost in battery life.

Issue there is you have to allocate (and know at compile time) how many nodes you are using. You could statically allocate, say, 8 slots and then fill the node ID dynamically but it needs to be conservatively oversized.

It would be better to alter the driver, rather than “user” software having to append its own data. That way you update one object, rather than having to change all the sources and the data structures. The user just passes their data, and doesn’t need to worry about the underlying transport. The driver then takes any data and adds the tag to it without the user doing anything. The only affect is the reduced payload size, easily caught by a range check.

A bit of a think on this, it’s not a good model for the TH, that should remain single shot. It would greatly increase the duty cycle waiting in receive and sending any retries. Would reduce the battery life (I can do a spreadsheet model if you’d like proof :slight_smile:) to “unacceptable” levels.