EmonCMS API data.json



My setup consists of an EmonPI + EmonTx. I’ve added an Arduino with Ethernet that measures temperature (+barometric pressure) and sends it to the EmonCMS API over HTTP every 10 seconds.

It’s been running for about 24 hours now and the temperature graph looks like this:

The screenshot was taken at 21:31, so there should have been data up till then. As you can see, there are several gaps in the graph.

What is kinda strange is that, at the time the screenshot was taken, the feed was last updated 9s ago. So there should be data in the feed. For some reason, it’s not showing in the graph.

So I dug a little further and, using the data.json API call, I polled for the data in feed 25 for the last 5 minutes. The result I got was even stranger. Sometimes the result is empty, sometimes it contains data. Whether or not there is data in the result seems to depend on the start parameter of the call.
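For anyone wanting to repeat this, here is a minimal sketch of how the start and end values for such a "last 5 minutes" query can be computed. The helper name query_window_ms is made up for illustration; the only thing taken from the calls in this post is that start and end are passed in milliseconds.

```shell
# Print "start_ms end_ms" for a window of $2 seconds ending at Unix time $1.
# query_window_ms is a hypothetical helper, not part of emoncms.
query_window_ms() (
  end_s=$1
  span_s=$2
  start_s=$(( end_s - span_s ))
  # The data.json calls in this thread pass start/end in milliseconds.
  echo "$(( start_s * 1000 )) $(( end_s * 1000 ))"
)

# Usage: a 300 s window ending now.
query_window_ms "$(date +%s)" 300
```

The two printed numbers can then be dropped into the start and end parameters of the query.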

First a successful call:

1526709337 = Sat May 19 05:55:37 UTC 2018
1526709637 = Sat May 19 06:00:37 UTC 2018

Just one second later, an empty array is returned

1526709338 = Sat May 19 05:55:38 UTC 2018
1526709638 = Sat May 19 06:00:38 UTC 2018

A few seconds later, still no data:

1526709341 = Sat May 19 05:55:41 UTC 2018
1526709641 = Sat May 19 06:00:41 UTC 2018

Again no data…

1526709342 = Sat May 19 05:55:42 UTC 2018
1526709642 = Sat May 19 06:00:42 UTC 2018

One second later: data!

1526709343 = Sat May 19 05:55:43 UTC 2018
1526709643 = Sat May 19 06:00:43 UTC 2018

It is perfectly reproducible. Take the second-to-last and the last call, for example:

wget -qO- "${APIREADKEY}&id=25&start=1526709342000&end=1526709642000&interval=10"; echo
wget -qO- "${APIREADKEY}&id=25&start=1526709343000&end=1526709643000&interval=10"; echo
[[1526709343000,20.10000038147],[1526709353000,20.10000038147],[1526709363000,20.10000038147],[1526709373000,20.10000038147],[1526709383000,20.10000038147],[1526709393000,20],[1526709403000,20.10000038147],[1526709413000,20.10000038147],[1526709423000,20.10000038147],[1526709433000,20.10000038147],[1526709443000,20.10000038147],[1526709453000,20.10000038147],[1526709463000,20.10000038147],[1526709473000,20.10000038147],[1526709483000,20.10000038147],[1526709493000,20.10000038147],[1526709503000,20.10000038147],[1526709513000,20.10000038147],[1526709523000,20.10000038147],[1526709533000,20.10000038147],[1526709543000,20.10000038147],[1526709553000,20.10000038147],[1526709563000,20.10000038147],[1526709573000,20.10000038147],[1526709583000,20.10000038147],[1526709593000,20.10000038147],[1526709603000,20.10000038147],[1526709613000,20.10000038147],[1526709623000,20.10000038147],[1526709633000,20.10000038147],[1526709643000,20.10000038147],[1526709653000,20.10000038147]]

Let’s zoom in on the call that does not return data.
Changing the end parameter doesn’t seem to have any effect.

wget -qO- "${APIREADKEY}&id=25&start=1526709342000&end=1526709643000&interval=10"; echo
wget -qO- "${APIREADKEY}&id=25&start=1526709342000&end=1526709644000&interval=10"; echo
wget -qO- "${APIREADKEY}&id=25&start=1526709342000&end=1526709645000&interval=10"; echo

Changing the start parameter by one second does have an effect, though.

wget -qO- "${APIREADKEY}&id=25&start=1526709342000&end=1526709642000&interval=10"; echo
wget -qO- "${APIREADKEY}&id=25&start=1526709343000&end=1526709642000&interval=10"; echo

Is there something wrong with my parameters?
Is it a bug?
Could this somehow be related to the gaps in the graph?

Tell us more about the source of the data. What api calls do you use? Can you show us an example request? Is the data timestamped at source or when it arrives at emoncms?

How is the 10 seconds timed? Is there a 10s sleep? Is there a clock timer?

Is it possible the data can get delayed with the network connection so the 10s intervals get skewed a little?

What size fixed interval is your feed? This api call should return the meta

If I had to take a stab in the dark, I might be tempted to say you are posting 10s data to a 5s feed, and 50% of the time, when you change the start time of your query by 1s, you are landing on empty datapoints, as the feed will only have data in every other datapoint.

However, there are many reasons you could end up with regular and/or irregular “holes” in your fixed interval feed data, so it is too soon for me to draw any conclusions. Personally, I would always look at the source, and at the relationship between the source and the feed’s fixed interval, before looking too far into the querying when questioning why the returned data isn’t what I expected.

The source is the Arduino posting the temperature with the insert.json call. From a tcpdump:

GET /emoncms/feed/insert.json?apikey=APIKEYREMOVED&id=25&value=21.60 HTTP/1.0

HTTP/1.1 200 OK
Date: Fri, 18 May 2018 16:47:47 GMT
Server: Apache/2.4.10 (Raspbian)
Content-Length: 4
Connection: close
Content-Type: application/json


The 10 seconds is not timed with a 10s sleep because otherwise it would drift over time.
It’s not a (hardware interrupted) clock timer either.
I made a small software sleep. How it works would probably get us too far off-topic, but here’s the result of tcpdump on the EmonPI:

Well, I already learned something. I did not specify any options when creating the feed using the API call:


I assumed it was 10 (as in the example on
Apparently, the default is 5. Thanks to @pb66 for pointing that out!

wget -qO- "${APIREADKEY}&id=25"

So I guess your stab in the dark is right. Since I assume there is no (straightforward) way to change the interval of a feed, I will change the Arduino code to post every 5s.

I still struggle with how data is stored in the feed internally though. Posting a value every 10s to a 5s feed would result in an array like [ 0:value, 5:null, 10:value, 15:null, 20:value, … ] when calling the data.json, would it not?

If you look at the feed api page, the create feed api example specifies {"interval":10}

Although there isn’t actually a default, there is an absolute minimum of 5s. Buried in the code is “if (interval < 5s) interval = 5s”, which effectively creates a default, because a “missing” or “null” interval counts as less than 5. The documented default (and minimum on is 10s, but you are circumventing the GUI recommendations and restrictions by using the feed api.
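To make that clamp concrete, here is a rough sketch of the behaviour described above. It is not the actual emoncms code; effective_interval is a made-up name, and the only thing taken from this thread is the rule “if (interval < 5s) interval = 5s”, with a missing value treated as 0.

```shell
# Sketch of the minimum-interval clamp described above (not emoncms source).
# A missing or empty interval is treated as 0, so it ends up as 5.
effective_interval() (
  requested=${1:-0}
  if [ "$requested" -lt 5 ]; then
    echo 5
  else
    echo "$requested"
  fi
)

effective_interval ""    # no interval given -> prints 5
effective_interval 10    # explicit 10s      -> prints 10
```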

That is a big decision. It may be easier to change the source to 5s, and it may even seem appealing to have the more granular 5s updates, BUT it does mean double the workload and double the disc space. Likewise, in a year or so (when you have much more valuable data), the transition will be just as tough once you realise 10s might have been a better option, so it might be better to bite the bullet and replace the feed sooner rather than later.

Also if at a later date you choose to sync with, the 5s interval might be an issue.

No, because your query states a 10s interval, so you have a 50/50 chance of landing on the good datapoints [ 0:value, 10:value, 20:value, … ] or the empty datapoints [ 5:null, 15:null, … ], changing your query to 5s should yield what you expect.
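The 50/50 effect can be reproduced with a small model. This is only a sketch under assumptions that happen to fit the results posted earlier in this thread: feed slots sit on multiples of 5s, each queried time is mapped to the nearest slot, and only the slots ending in 5 hold a value (10s data in a 5s feed). How emoncms really maps query times to slots is not confirmed here, and the function names are invented.

```shell
FEED_INTERVAL=5   # feed interval in seconds, as reported for feed 25

# Assumption: a queried time is mapped to the nearest 5s slot.
nearest_slot() (
  echo $(( ($1 + FEED_INTERVAL / 2) / FEED_INTERVAL * FEED_INTERVAL ))
)

# Assumption: with 10s posts, only every other slot (ending in 5) has data.
slot_has_data() {
  [ $(( $1 % 10 )) -eq 5 ]
}

# With interval=10 in the query, every point shares the parity of the start,
# so checking the start time alone tells us whether the query hits data.
query_result() (
  slot=$(nearest_slot "$1")
  if slot_has_data "$slot"; then echo "data"; else echo "empty"; fi
)

for start in 1526709337 1526709338 1526709341 1526709342 1526709343; do
  echo "start=$start slot=$(nearest_slot "$start") $(query_result "$start")"
done
```

Under these assumptions the five start times from the first post come out as data, empty, empty, empty, data, which is the pattern that was observed. Querying with interval=5 would visit both kinds of slot.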

Why have you chosen to post via the feed api not the input api?

At the absolute minimum you should be specifying the timestamp before sending, so that you are in control of the interval. Otherwise network and processing delays, plus clock differences, can cause a value to be saved with the wrong timestamp and end up in the wrong time slot of the fixed interval feed.
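As a rough illustration of sending an explicit timestamp, the sketch below builds an insert.json URL with the time rounded down to the start of a 10s slot. The path and the apikey, id and value parameters are the ones from the tcpdump above; the time parameter, the rounding-down choice and the helper names are assumptions on my part, so check them against the feed api page before relying on this.

```shell
# Round Unix time $1 down to a multiple of interval $2.
aligned_time() (
  echo $(( $1 / $2 * $2 ))
)

# Build an insert.json URL with an explicit timestamp.
# Arguments: host apikey feed-id value now-unixtime interval
# NOTE: the "time" parameter name is an assumption, not confirmed in this thread.
build_insert_url() (
  host=$1; key=$2; id=$3; value=$4; now=$5; interval=$6
  t=$(aligned_time "$now" "$interval")
  echo "http://${host}/emoncms/feed/insert.json?apikey=${key}&id=${id}&time=${t}&value=${value}"
)

# Usage with placeholder host and key:
build_insert_url emonpi APIKEYREMOVED 25 21.60 "$(date +%s)" 10
```

This only helps for a sender that knows the real time, which the Arduino in this thread currently does not.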

Because I was already used to the feed api to read data. I just continued using it. So no particular reason. Could as well use the input api.
I can easily see why the input api is more efficient for bulk updates. Since I’m updating one value at a time, is there a difference between doing it one way (feed api) or another (input api)? Is one more future proof than the other perhaps?

You’re absolutely right about that. I will bite the bullet and replace the feed with another one with 10s or more interval.

The Arduino is not aware of the real time (i.e. the Unix timestamp). It is only aware of the microseconds since boot. Making the Arduino aware of the current time would mean implementing some NTP-like mechanism so that it can pass the UNIXTIME as a parameter.
I assume (I know I made a lot of assumptions slapping this together) that not specifying a time would default to “now”. The Arduino timing is less than 1 second off, as shown in the tcpdump output in one of my previous posts. So I’m fine with the default “now”.
That being said, there is an issue when the Arduino is reset. This will certainly introduce a “long” period, and the following api calls will not be in sync with the api calls before the reset. Taking a 10s interval, timings could be for example:

0, 10, 20, 30, 40, reset, 67, 77, 87, 97, 107, reset, 123, 133, 143, 153, …

What would the data stream look like?
The 50 and 60 intervals were missed, and the 110 interval was also missed.
So the data would look like… this?

20.0, 20.1, 20.2, 20.3, 20.4, null, null, 20.7, 20.8, 20.9, 21.0, 21.1, 21.2, null, 21.4, 21.5, 21.6, 21.7
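One way to reason about this is to map each post time onto a 10s slot and see which slots stay empty. The sketch below assumes a post is stored in the slot obtained by rounding its time down to a multiple of the interval; that rule is an assumption, not something confirmed in this thread, and a different rule (rounding to nearest, say) would move some of the gaps.

```shell
INTERVAL=10
# Post times from the example above, with the two resets left out.
POST_TIMES="0 10 20 30 40 67 77 87 97 107 123 133 143 153"

# Assumption: a post at time t is stored in slot floor(t / INTERVAL) * INTERVAL.
slot_of() (
  echo $(( $1 / INTERVAL * INTERVAL ))
)

# Print every slot from 0 to $1 as "slot:value" or "slot:null".
feed_row() (
  last=$1
  filled=" "
  for t in $POST_TIMES; do
    filled="$filled$(slot_of "$t") "
  done
  row=""
  s=0
  while [ "$s" -le "$last" ]; do
    case "$filled" in
      *" $s "*) row="$row$s:value " ;;
      *)        row="$row$s:null " ;;
    esac
    s=$(( s + INTERVAL ))
  done
  echo "${row% }"
)

feed_row 150
```

With rounding down, the post at 67 lands in slot 60 and the post at 107 in slot 100, so only slots 50 and 110 come out as null in this model.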

Done! Someone may find the exact bash commands useful.

~ $ url="http://$IPEMONPI/emoncms/feed/create.json?apikey=$APIWRITEKEY&tag=Node%20arduino1&Name=t1&datatype=1&engine=5&options={\"interval\":10}"
~ $ echo $url
{"interval":10}
~ $ wget -qO- "${url}"
~ $ url="http://$IPEMONPI/emoncms/feed/getmeta.json?apikey=$APIREADKEY&id=26"
~ $ echo $url
~ $ wget -qO- "${url}"