A question over whether emonCMS is compatible with MQTT

Why would they implement that when all you need to do is send the timestamp within the payload?

Putting the timestamp in the payload would be a proper implementation. Currently emoncms is set up to receive MQTT data as a single raw value in a dedicated topic; there is no place for a timestamp.
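
For illustration, the difference looks something like this with the Mosquitto-PHP client (the topic names and JSON field names here are just examples, not the exact emoncms topic scheme):

    <?php
    // Rough illustration only; topic names and JSON field names are examples,
    // not the exact emoncms topic scheme.
    $client = new Mosquitto\Client();
    $client->connect('localhost', 1883, 60);

    // Current style: one raw value per dedicated topic, nowhere to put a timestamp.
    $client->publish('emon/emontx/power1', '230');

    // Timestamp-in-payload style: the value and its capture time travel together,
    // so a late delivery can still be recorded against the right moment.
    $client->publish('emon/emontx', json_encode(array('power1' => 230, 'time' => time())));

    // Service the network so the messages actually go out (a real script keeps looping).
    $client->loop();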

There have already been discussions on this (in the EmonHub Development thread) and hopefully there will soon be a more resilient method of posting MQTT data to emoncms. It has been recognized that there is a significant difference between “broadcasting” the current value/status of everything as a separate topic for other systems to link into, and passing time-specific data for the purpose of keeping records and accurately “monitoring” energy and environmental data.

The MAC id idea is fairly good, but for me it wouldn’t work, as I keep a “test” emoncms running to pull changes and updates into before applying them to the live instance.

Most energy data is collected at 10 s intervals and every datapoint is valuable; I do not subscribe to the idea that because we have an abundance of data we can afford to be wasteful. I would like to get as complete a picture as possible without any missing data, and that is easily done with buffering and confirmed delivery. If I were sending so much data that some was getting lost, that would tell me I need to slow things down; if the data isn’t important enough to want every datapoint, why send it?

The reason QoS 2 should be used is, for example, updating and upgrading: when Apache or emoncms or MySQL needs updating, the MQTT broker should queue the data and pass it on when the emoncms instance comes back up; likewise, when the MQTT broker needs updating, emonhub will (one day) buffer the data until mosquitto comes back online. emonhub can be updated whilst it is running, it only needs restarting once the update is done; if the broker is online there will be no buffered data to lose and only the few seconds it takes emonhub to restart will be missing. This also applies to network outages etc. Following a power cut everything powers up, but the router can be somewhat slower, so buffering data is a good solution if all your servers are not on one device.
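
Not emonhub’s actual code, but a minimal sketch of the buffer-until-delivered idea, assuming timestamped payloads (the topic, field names and intervals are made up):

    <?php
    // Minimal buffer-and-flush sketch, not emonhub's implementation.
    // Readings are stamped when taken, so late delivery still lands correctly.
    $client = new Mosquitto\Client('buffering-sender');
    $connected = false;
    $buffer = array();   // queued readings awaiting delivery

    $client->onConnect(function ($rc) use (&$connected) { $connected = ($rc === 0); });
    $client->onDisconnect(function () use (&$connected) { $connected = false; });

    $client->connect('localhost', 1883, 60);

    while (true) {
        try {
            $client->loop();               // service the network and callbacks
        } catch (Mosquitto\Exception $e) {
            $connected = false;            // broker away; keep buffering
        }

        // Take a (pretend) reading and stamp it now, not at delivery time.
        $buffer[] = json_encode(array('power1' => rand(200, 260), 'time' => time()));

        // Flush everything that has accumulated once the broker is reachable.
        if ($connected) {
            while ($payload = array_shift($buffer)) {
                $client->publish('emon/emontx', $payload, 2);   // QoS 2
            }
        }
        sleep(10);                         // pretend 10 s sampling interval
    }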


Okay, I did not know that, as I do not use emoncms - then maybe think about InfluxDB as your back end in emoncms at some point.

Okay, do you send as a batch or as an average of 10 seconds? Most of mine is ~1/2 - 1 second sends, so I have a lot of data coming in, then I might average it at the end - say every 3 seconds, divide by the number of successful sends (usually it is 5)… I can see your point: if you send a single data point every 10 seconds, that’s a lot of missing info if you lose a few data points in a row… if I lose a few data points every so often it has very little effect on my overall data profile.

If you used InfluxDB as the backend you could have used my method of data preservation, which I use with an OpenWrt router as my MQTT server… all the important incoming MQTT data is stored in a CSV, timestamped as it arrives; the CSV is deleted once it has been sent to InfluxDB and recreated when the next MQTT message is received, and at the end of each day a new CSV is created with the day’s timestamp… then I batch-send the data every 30 seconds (user defined) to InfluxDB… if the network goes out for days it sends the data when it finally connects: it goes through all the CSVs, deleting them as they are uploaded and moving on to the next CSV… but I guess one could apply it to how you currently send your data to emoncms as well.
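
Roughly the same store-then-batch pattern sketched in PHP, just to show the idea (the InfluxDB URL, database name and file paths are placeholders, not my actual scripts):

    <?php
    // Sketch of the CSV-buffer / batch-upload pattern described above.
    // The InfluxDB URL, database name and CSV layout are assumptions.

    // Called for every incoming MQTT reading: append it to today's CSV.
    function bufferReading($topic, $value)
    {
        $file = '/tmp/mqtt-' . date('Y-m-d') . '.csv';
        file_put_contents($file, time() . ",$topic,$value\n", FILE_APPEND);
    }

    // Called every 30 s (or whatever the user defines): push every buffered CSV
    // to InfluxDB in one batch, deleting each file only after a successful upload.
    function flushToInflux($influxUrl = 'http://influx.local:8086/write?db=energy')
    {
        foreach (glob('/tmp/mqtt-*.csv') as $file) {
            $lines = array();
            foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $row) {
                list($ts, $topic, $value) = explode(',', $row, 3);
                // InfluxDB 1.x line protocol, nanosecond timestamps
                $lines[] = sprintf('mqtt,topic=%s value=%s %d000000000', $topic, $value, $ts);
            }
            $ok = @file_get_contents($influxUrl, false, stream_context_create(array(
                'http' => array('method' => 'POST', 'content' => implode("\n", $lines)),
            )));
            if ($ok !== false) {
                unlink($file);   // only drop the buffer once the batch has been accepted
            }
        }
    }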

Yes there is, as part of a full JSON object.

There is an outstanding PR though, to improve the date time handling.

The QoS issue is an interesting one. I don’t think we do use it correctly. As you say, QoS 2 can only really be used if there is a timestamp; QoS 1 is probably the best as it does actually retry, whereas QoS 0 I think just fires and forgets.
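
Roughly, for reference, a sketch against the Mosquitto-PHP extension (the topic is illustrative):

    <?php
    // QoS levels as they apply to a subscription (the topic is illustrative):
    //   QoS 0: at most once  - fire and forget, no retry, no duplicates.
    //   QoS 1: at least once - retried until acknowledged, so duplicates are possible.
    //   QoS 2: exactly once  - a four-packet handshake avoids both loss and duplicates.
    // The delivered QoS is the lower of the publisher's QoS and the subscription's.
    $client = new Mosquitto\Client();
    $client->onConnect(function ($rc) use ($client) {
        if ($rc === 0) {
            $client->subscribe('emon/#', 1);   // second argument is the requested QoS
        }
    });
    $client->onMessage(function ($message) {
        echo $message->topic . ' => ' . $message->payload . PHP_EOL;
    });
    $client->connect('localhost', 1883, 60);
    $client->loopForever();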

I have also added a PR that should prevent the MQTT client raising an exception.

@pb66, I think you might have missed my reply.

In addition,

This needs a bit more work. If it is a persistent client, you do not need (nor probably want) to resubscribe to the topic. Currently every reconnect includes a subscribe (which can throw an exception if the connect request was in fact unsuccessful).
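
For the sake of argument, a minimal persistent-session sketch with this extension (the client id and topic are illustrative): with a fixed client id and cleanSession set to false, the broker keeps the subscription and any queued QoS 1/2 messages across reconnects, so the subscribe only has to succeed once.

    <?php
    // Minimal persistent-session sketch; client id and topic are illustrative.
    // cleanSession = false asks the broker to keep the subscription and any
    // queued QoS 1/2 messages for this client id across reconnects.
    $client = new Mosquitto\Client('emoncms_input', false);

    $subscribed = false;
    $client->onConnect(function ($rc) use ($client, &$subscribed) {
        if ($rc === 0 && !$subscribed) {
            $client->subscribe('emon/#', 2);   // only needs to happen once
            $subscribed = true;
        }
    });

    $client->onMessage(function ($message) {
        // handle the payload here
    });

    $client->connect('localhost', 1883, 60);
    $client->loopForever();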

Does every implementation of an MQTT broker support persistent connections?

Also, looking at that reference you supplied:

each MQTT client must store a persistent session too. So when a client requests the server to hold session data, it also has the responsibility to hold some information by itself:

Not sure how to achieve that.

For now, I’d actually suggest a QOS of 1 until we can get the session stuff sorted out.

IMO QoS 1 is actually the last thing the current implementation needs. With an uncontrolled “at least once” this could cause all sorts of problems with duplicate posts, not only with “zero time changes” when the payload gets duplicated within the same second, but also when it gets duplicated with a different timestamp.

QoS 0 would mean only the data that’s missed during “outages” is missed, but at least all the timestamps applied at arrival in emoncms will be almost correct, and duplication will be avoided regardless of all the disconnections and reconnections. QoS 2 should work OK between disconnections (i.e. no duplicates) and should result in fewer duplicated inputs than QoS 1, because only the unconfirmed messages are redelivered (exactly once). But currently any QoS 2 attempt to ensure delivery, even if late, results in skewed timestamps.

Originally emoncms was QoS 0 and at some point it was deemed that QoS 2 would be better. I can’t argue with that sentiment, but changing the MQTT implementation to a QoS 2 level implementation is more than just changing the QoS setting at connection time.

I think QoS 0 must be used when data is just “fired off” with no timestamps: the data is time-specific, and if it’s not delivered immediately so that emoncms can give it a meaningful timestamp, it is probably better “forgotten”. Ideally QoS 2 should be used with timestamped data, if a persistent connection can be made. I really do not like the idea of encouraging duplicate inputs with QoS 1. Besides, QoS 1 requires a client id too.

Since there is only one connection in emoncms, the same QoS setting is used across all the subscriptions and APIs, so we cannot have QoS 2 for the timestamped data and QoS 0 for non-timestamped. Therefore IMO it has to be QoS 0 all round until either ALL inputs are timestamped, or the publishing source(s) dial back the QoS on non-timestamped data to 0, or emoncms gets multiple topics to subscribe to, allowing different QoS settings to be set.

Perhaps the thing to do for now is to change emonhub to QoS 0, since it’s the source of the un-timestamped data, which is what deems it “short life” and therefore better suited to “fire and forget”. If/when the MQTT implementation gets improved in emoncms, emonhub can then change to use the new APIs.

The interaction between emonhub and emoncms is discussed at length in the EmonHub Development thread and will possibly result in multiple APIs (perhaps mirroring the HTTP APIs). Perhaps multiple topics, to allow different QoS levels, might need to be considered.

However, even if emoncms has a single QoS 2 level implementation, it will be able to get QoS 0, 1 and 2 level data with the source QoS levels intact (the lowest prevails), so it still needs to include some way of filtering duplicate posts in case a 3rd party source publishes data with QoS 1, which could still get duplicated even with a steady QoS 2 level “persistent” connection between emoncms and the broker.
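
Something along these lines perhaps, though the rule used here (drop a payload identical to the previous one on the same topic) is only a guess at what should count as a duplicate:

    <?php
    // Hypothetical duplicate filter; the rule "same topic, same payload as the
    // previous message" is only a guess at what should count as a QoS 1 duplicate,
    // and would also drop a genuinely unchanged reading.
    $lastSeen = array();
    $client = new Mosquitto\Client();
    $client->onMessage(function ($message) use (&$lastSeen) {
        $topic = $message->topic;
        if (isset($lastSeen[$topic]) && $lastSeen[$topic] === $message->payload) {
            return;                        // looks like a re-delivery, skip it
        }
        $lastSeen[$topic] = $message->payload;
        // ...normal input processing here...
    });
    $client->connect('localhost', 1883, 60);
    $client->subscribe('emon/#', 2);
    $client->loopForever();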

Had not realised that emonhub was using the MQTT interface by default on EmonPis.

Yes, I think we need the ability to subscribe to different topics at different QoS levels.
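
MQTT itself allows this on a single connection; it’s just that the current emoncms script applies one QoS across its subscriptions. Roughly (the topic names are illustrative):

    <?php
    // Per-subscription QoS on one connection; topic names are illustrative.
    $client = new Mosquitto\Client('emoncms_input', false);
    $client->onConnect(function ($rc) use ($client) {
        if ($rc !== 0) {
            return;
        }
        $client->subscribe('emon/live/#', 0);   // un-timestamped "fire and forget" data
        $client->subscribe('emon/bulk/#', 2);   // timestamped data, exactly-once delivery
    });
    $client->connect('localhost', 1883, 60);
    $client->loopForever();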

I have updated the PR for my attempts to avoid the exceptions and at the same time added in @pb66 code for a persistent connection.

With the persistent connection made, the same topic ID is returned on subscribing after either the daemon or the MQTT broker is stopped and restarted. This was tested with data from node-red (QoS 2, retain=true) rather than emonhub.

Without a persistent connection, the ID increases on each re-subscription.

Stumbled across something else today.

I happened to see this error message and got curious

2018-05-13 15:05:05.803|ERROR|phpmqtt_input.php|exception 'Mosquitto\Exception' in /var/www/emoncms/scripts/phpmqtt_input.php:125
Stack trace:
#0 /var/www/emoncms/scripts/phpmqtt_input.php(125): Mosquitto\Client->connect('localhost', 1883, 5)
#1 {main}
2018-05-13 15:05:05.848|WARN|phpmqtt_input.php|Not connected, retrying connection
2018-05-13 15:05:11.001|WARN|phpmqtt_input.php|Not connected, retrying connection
2018-05-13 15:05:11.003|WARN|phpmqtt_input.php|Connecting to MQTT server: Connection Accepted.: code: 0
2018-05-13 15:05:21.288|WARN|phpmqtt_input.php|Not connected, retrying connection
2018-05-13 15:05:21.298|WARN|phpmqtt_input.php|Connecting to MQTT server: Connection Accepted.: code: 0

Line 125 of phpmqtt_input.php is $mqtt_client->connect($mqtt_server['host'], $mqtt_server['port'], 5);. It looks pretty straightforward, but I didn’t know what the “5” was, so I looked into it and found it is the $keepalive setting.

Having read a little about it, I now believe it is being used incorrectly, or at least the usage might be incomplete. The usual disclaimers apply here: I’m not an MQTT expert and I’m learning “on the job”, so to speak.

My understanding is that a client (in this case emoncms) can define a time interval that the broker will use to assess whether the client is still “alive” and communicating with the broker.

Bearing in mind that the emoncms MQTT input script is for subscribing only and doesn’t publish any data, that means the only time the client contacts the broker is in reply to a received topic, i.e. a confirmation etc.

So I wondered to myself: why is it setting such a low $keepalive value of 5 secs?

Apparently the broker will assume there is a problem if 5 secs pass between messages from the client, and after a further 2.5 secs (a total of 150% of the defined interval) it will drop the connection to initiate a reconnection.

This would then suggest that emoncms is volunteering to be cut-off from the broker after a maximum of 7.5 secs following any received data.

From what I’ve read, it is the client’s responsibility to issue/publish a PINGREQ packet to tell the broker it is still connected if 5 secs pass since IT last communicated with the broker, to ensure the broker never kills a good but quiet connection. It seems to be like a watchdog, where the broker will reset/restart the connection if the client appears absent for longer than the “keepalive” interval the client has defined.

I see no evidence of emoncms doing that, and since it only responds to published data, it appears to be unnecessarily leaving itself open to the broker disconnecting it if a 5 s datapoint is just 2.5 s late.

In the case of an emonPi this might mean a single missed 5 s payload results in a disconnection. But an emonBase that has no regular 5 s data to keep it alive would be forever resetting, unless it has several devices well spread out time-wise so that 7.5 secs never pass between MQTT payloads.

Apparently, by omitting the “5” the default is used (although I cannot find what that is definitively), and if a value of “0” is defined the function is disabled. Just extending the keepalive to a minute or two might be a big improvement, but I suspect the proper solution for emoncms would be to set a “keepalive” variable in emoncms and use that both to inform the broker and to drive a function that times the gap between service messages, triggering a PINGREQ if the interval passes. When emoncms sends a PINGREQ the broker will respond with a PINGRESP; if emoncms doesn’t get a PINGRESP within a reasonable time, it (emoncms) should then initiate a reconnection from the client end.

I believe the documentation is wrong for the php-mqtt extension we use

$keepalive (int) – Optional. Number of seconds after which the broker should PING the client if no messages have been received.

and have found this issue on GitHub (API docs describe keepalive incorrectly · Issue #621 · eclipse/mosquitto · GitHub) for the Mosquitto broker that shows it was documented the same (incorrect?) way until late last year.

Although not for Mosquitto specifically, this page (What Is MQTT Keep Alive and Client Take-Over? – MQTT Essentials Part 10) describes it quite well and the official MQTT spec is here (MQTT Version 3.1.1).

[edit] I have created another PR so this might attract some attention for testing or further consideration.

Perhaps users with frequent MQTT reconnection issues could try relaxing the setting to see what effect it has.
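
e.g. changing the connect line to something like this (60 s is only a suggested value):

    // 60 s is only a suggested value; anything comfortably above the data interval should do.
    $mqtt_client->connect($mqtt_server['host'], $mqtt_server['port'], 60);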


Remembering that Mosquitto-PHP is a wrapper for libmosquitto, the default is set within the Mosquitto-PHP code. Of course, this is simply sending a setting to libmosquitto. (BTW is there markdown that will reduce the number of lines displayed below?)

Click here to show code

However, the Eclipse Mosquitto documentation for the ‘mosquitto’ broker also states that the default is 60 seconds and the minimum is 5.

It does seem rather odd to have it set to the minimum, and I’d suggest simply reverting to the default.

Looking through my mosquitto logs, I had noticed the occasional disconnection (this is a non-emonhub setup that is getting data every 10 s from node-red; both machines are VMs). I’ll go to the default and see if the disconnections cease.

Yes, I had read that the default was 60 s for php-mosquitto, but I hadn’t seen any official docs, and I have also read there is no default setting in Mosquitto itself. I had seen the link you provide, but it crops up in several discussions; that setting is (as indicated on that page) exclusively for broker-to-broker “bridges”, not for clients as we are discussing here.

There are several unsupported claims in various discussions, ranging from “there is no default” (as in the default is “off”), to someone who conducted some tests and concluded it was 15 s, to others that quote/debate the 60 s “bridge” keepalive, so I didn’t want to draw any conclusions.

I guess the Mosquitto default is a moot point if we are using php-mosquitto anyway, since that default will kick in if emoncms doesn’t define an interval, so the broker default will never get used.

Indeed, I think someone chose that hoping that more frequent checks would keep the connection alive (the faster you check it, the faster it will be reconnected), when in actual fact that approach ensures you get a very sensitive connection and the maximum chance of causing dis/re-connections.

Cool, let’s see what you discover.

Not that I know of, unfortunately. That’s why I’ve started to do things like

Line 388 of mosquitto.c is

    zend_long keepalive = 60;

but it’s a PITA

@borpin,

I edited your post to include a line that toggles showing / hiding the text when that line is clicked on.
Is that what you were asking about? Here’s how that’s implemented:

<details>
<summary>Click here to show code</summary>

https://github.com/mgdm/Mosquitto-PHP/blob/35d77cdc01bfb0db46df0cb57165487cc0a75001/mosquitto.c#L388

</details>

Thanks, useful, but I am sure I have seen somewhere (possibly on a different Discourse-based forum) a code quote that was just a couple of lines long, but was definitely a direct link.

[Edit] Ahha (said piglet). If you amend the URL to specify the lines, you will just get those lines :smiley:. Edited post above.

Ah missed that :frowning_face:

The other thing that has just occurred to me is that if the client is sending a ping, that should be visible using tshark. I’ll have a look tonight.

Ahh ok, I thought you were trying to pinpoint the one exact line as per the first URL, to specifically identify as well as show the code and its location. But as it turns out, you can still use the same “code block” URL format but set the from and to line numbers to the same line, e.g.

That would be good to know, although looking at the emoncms code, I doubt it will be there unless it is handled by the mosquitto-php extension, and I would have thought the extension was just various functions wrapped specifically for use in PHP, as in a lib; I didn’t expect the extension to play any directly active role.

Nice.
Thanks for the tip!

It was a bit of both really. The standard number of lines just takes up too much space.

I think it should be the broker that is sending out the Ping to confirm the connection is still alive rather than the client (I think if I have read correctly). However, the broker/client protocol for handling that seems a little fuzzy.

No, as I interpret it, the client sends the ping if it knows it hasn’t sent any service messages in the required interval to prevent the broker disconnecting at 150% of the interval.

Or in the words used in the actual MQTT spec

It is the responsibility of the Client to ensure that the interval between Control Packets being sent does not exceed the Keep Alive value. In the absence of sending any other Control Packets, the Client MUST send a PINGREQ Packet

and then

If the Keep Alive value is non-zero and the Server does not receive a Control Packet from the Client within one and a half times the Keep Alive time period, it MUST disconnect the Network Connection to the Client as if the network had failed

Those are the 2 highlighted lines in the “3.1.2.10 Keep Alive” section of this spec page (MQTT Version 3.1.1).

Well I got completely the wrong end of that stick! Yes you are completely right, unfortunately I don’t remember seeing any mechanism to do that in these libraries. Perhaps subscribing to one of the $SYS topics would suffice or else create a publish ‘heartbeat’ topic?

Aha! Found it!

I think it is in fact part of libmosquitto.

mosquitto_loop_misc() gets called at the very end of the mosquitto_loop() function

(Both these functions lived in lib/mosquitto.c until 11th of April, now they are in lib/loop.c)

It is documented as part of the library’s mosquitto_loop_misc() function but not as part of the lib’s most commonly used mosquitto_loop() function, nor does it state that the former is included via a call from the latter.

So I now believe you should be seeing both PINGREQ (from the client) and PINGRESP (from the broker) messages; if you are, I assume all is well and they are doing their thing - back to the drawing board :slight_smile:
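
In other words, as long as the script keeps calling the loop regularly, libmosquitto should look after the keepalive itself. Roughly (a sketch, not the actual phpmqtt_input.php):

    <?php
    // Sketch only, not the actual phpmqtt_input.php. While loop() is called
    // regularly, libmosquitto's mosquitto_loop() ends by calling
    // mosquitto_loop_misc(), which sends the PINGREQ when the keepalive is due
    // and watches for the broker's PINGRESP.
    $client = new Mosquitto\Client();
    $client->connect('localhost', 1883, 60);   // 60 s keepalive
    $client->subscribe('emon/#', 2);
    while (true) {
        $client->loop();                        // also services the keepalive ping
    }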
