emonTx V3 not updating to emonBase

Hi Paul

Thanks for your attention and help.

  1. yes it did work before

  2. here is node 8 from the emonhub.conf file, which I have never altered

[[8]]
nodename = emontx3
[[[rx]]]
names = power1, power2, power3, power4, vrms, temp1, temp2, temp3, temp4, temp5, temp6, pulse
datacodes = h,h,h,h,h,h,h,h,h,h,h,L
scales = 1,1,1,1,0.01,0.1,0.1, 0.1,0.1,0.1,0.1,1
units =W,W,W,W,V,C,C,C,C,C,C,p

  3. I have already posted what I get using “tail …”, and reported that it is the same as in the “emonhub.log” view. There is something wrong here, as you can see from the time: it is not reporting more recent events. If I navigate to /var/log/emonhub I see that the emonhub.log and emonhub.log.1 files are more recent (up to date). I am afraid it is a long time since I did any Unix and I can’t recall how to access these files from Windows. If you let me know how to get them to you, that might be a good thing to do.

The emonhub.log files can be up to 5 MB in size, way too big to be uploading here or even to sift through.

If you start a tail -f /var/log/emonhub/emonhub.log in a terminal session window and then restart the emonTx, you should see that initial post being processed. Wait for 30 secs or so after any node 8 entries have passed and then use Ctrl-C to end the log tail. Select from a couple of lines before the “initial packet” to the end of the valid section and copy and paste from the terminal window into a text file to upload or, if it’s not too big, paste it straight into a forum post.

Hi Paul

Well, I did manage to edit the log file. Despite several restarts of the emonTx there was only one entry referring to node 8:

2016-05-06 20:28:24,574 DEBUG RFM2Pi 421 Timestamp : 1462566504.57
2016-05-06 20:28:24,592 DEBUG RFM2Pi 421 From Node : 8
2016-05-06 20:28:24,593 DEBUG RFM2Pi 421 Values : [10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
2016-05-06 20:28:24,594 DEBUG RFM2Pi 421 RSSI : -86
2016-05-06 20:28:24,594 INFO RFM2Pi Publishing: emonhub/rx/8/values 10,0,0,0,0,0,0,0,0,0,0,0
2016-05-06 20:28:24,596 INFO RFM2Pi Publishing: emonhub/rx/8/rssi -86
2016-05-06 20:28:24,597 DEBUG RFM2Pi 421 adding frame to buffer => [1462566504, 8, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -86]
2016-05-06 20:28:24,598 DEBUG RFM2Pi 421 Sent to channel’ : ToEmonCMS
2016-05-06 20:28:27,830 DEBUG RFM2Pi 422 NEW FRAME : OK 5 0 0 22 1 22 1 171 95 151 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (-0)
2016-05-06 20:28:27,833 DEBUG RFM2Pi 422 Timestamp : 1462566507.83
2016-05-06 20:28:27,834 DEBUG RFM2Pi 422 From Node : 5
2016-05-06 20:28:27,834 DEBUG RFM2Pi 422 Values : [0, 278, 278, 244.91, 15.100000000000001, 0, 0, 0, 0, 0, 0]
2016-05-06 20:28:27,835 INFO RFM2Pi Publishing: emonhub/rx/5/values 0,278,278,244.91,15.1,0,0,0,0,0,0
2016-05-06 20:28:27,836 INFO RFM2Pi Publishing: emonhub/rx/5/rssi 0
and that is it, between 2016-05-06 19:46:36 and 2016-05-07 02:19:22. Does this help?
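For anyone following the log, here is a minimal sketch of how the raw bytes in a NEW FRAME line become the Values line. It assumes datacode h means a little-endian signed 16-bit integer, as in Python's struct codes. The function names are my own, not emonhub's, and the scales used for node 5 in the note below are inferred from the log lines above rather than taken from a config file.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Combine pairs of bytes (low byte first) into signed 16-bit integers.
// Assumption: this is what datacode "h" means in emonhub.conf.
std::vector<int16_t> decode_int16_le(const std::vector<uint8_t>& payload) {
    std::vector<int16_t> raw;
    for (std::size_t i = 0; i + 1 < payload.size(); i += 2) {
        uint16_t word = static_cast<uint16_t>(payload[i] | (payload[i + 1] << 8));
        raw.push_back(static_cast<int16_t>(word));
    }
    return raw;
}

// Multiply each raw integer by its scale; values without a scale pass through unchanged.
std::vector<double> apply_scales(const std::vector<int16_t>& raw,
                                 const std::vector<double>& scales) {
    std::vector<double> out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        double scale = i < scales.size() ? scales[i] : 1.0;
        out.push_back(raw[i] * scale);
    }
    return out;
}
```

Feeding in the first ten data bytes of frame 422 (0 0 22 1 22 1 171 95 151 0) gives raw integers 0, 278, 278, 24491, 151. With scales 1, 1, 1, 0.01, 0.1 that becomes 0, 278, 278, 244.91, 15.1, which matches the Values line for node 5.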

I restarted both by pressing the reset button, and powering off and then on.

This is what I get:

pi@emonpi ~ $ tail -f /var/log/emonhub/emonhub.log -n 20
2016-05-07 02:19:12,631 DEBUG RFM2Pi 4970 Values : [0, 197, 197, 243.56, 13.4, 0, 0, 0, 0, 0, 0]
2016-05-07 02:19:12,632 INFO RFM2Pi Publishing: emonhub/rx/5/values 0,197,197,243.56,13.4,0,0,0,0,0,0
2016-05-07 02:19:12,634 INFO RFM2Pi Publishing: emonhub/rx/5/rssi 0
2016-05-07 02:19:12,636 DEBUG RFM2Pi 4970 adding frame to buffer => [1462587552, 5, 0, 197, 197, 243.56, 13.4, 0, 0, 0, 0, 0, 0]
2016-05-07 02:19:12,636 DEBUG RFM2Pi 4970 Sent to channel’ : ToEmonCMS
2016-05-07 02:19:17,684 DEBUG RFM2Pi 4971 NEW FRAME : OK 5 0 0 195 0 195 0 57 95 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (-0)
2016-05-07 02:19:17,687 DEBUG RFM2Pi 4971 Timestamp : 1462587557.68
2016-05-07 02:19:17,687 DEBUG RFM2Pi 4971 From Node : 5
2016-05-07 02:19:17,688 DEBUG RFM2Pi 4971 Values : [0, 195, 195, 243.77, 13.4, 0, 0, 0, 0, 0, 0]
2016-05-07 02:19:17,689 INFO RFM2Pi Publishing: emonhub/rx/5/values 0,195,195,243.77,13.4,0,0,0,0,0,0
2016-05-07 02:19:17,690 INFO RFM2Pi Publishing: emonhub/rx/5/rssi 0
2016-05-07 02:19:17,692 DEBUG RFM2Pi 4971 adding frame to buffer => [1462587557, 5, 0, 195, 195, 243.77, 13.4, 0, 0, 0, 0, 0, 0]
2016-05-07 02:19:17,693 DEBUG RFM2Pi 4971 Sent to channel’ : ToEmonCMS
2016-05-07 02:19:22,625 DEBUG RFM2Pi 4972 NEW FRAME : OK 5 0 0 193 0 193 0 245 94 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (-0)
2016-05-07 02:19:22,628 DEBUG RFM2Pi 4972 Timestamp : 1462587562.63
2016-05-07 02:19:22,628 DEBUG RFM2Pi 4972 From Node : 5
2016-05-07 02:19:22,629 DEBUG RFM2Pi 4972 Values : [0, 193, 193, 243.09, 13.4, 0, 0, 0, 0, 0, 0]
2016-05-07 02:19:22,630 INFO RFM2Pi Publishing: emonhub/rx/5/values 0,193,193,243.09,13.4,0,0,0,0,0,0
2016-05-07 02:19:22,631 INFO RFM2Pi Publishing: emonhub/rx/5/rssi 0
2016-05-07 02:19:22,633 DEBUG RFM2Pi 4972 adding frame

Richard

These 2 lines tell us a lot. Unfortunately what they tell us doesn’t make a lot of sense right now, but it explains why the packets are failing to make it all the way to emonhub, where we could at least see them to debug. All those zeros will be causing “bit-slip”, so the packet can be misread and assumed corrupt, at which point it is discarded by the emonPi.

This isn’t going to be easy to debug, as the emonPi will not run in non-quiet mode to let those discarded packets be seen, so hopefully the problem will get resolved in one hit and not need much debugging.

So the root cause will be whatever is producing all the zeros, and that is the strange part. The first 4 values are power values, and those alone are not usually enough zeros to trigger the issue. The next value should be the AC voltage; why that is missing I have no idea, since it is both recognized and reported as ~246 V in the serial prints. The next 6 values after that should all be 300 if the serial print is to be believed and you have no temp sensors attached to the emonTx (300 = absent temp sensor).

So that line should have been something like
2016-05-06 20:28:24,593 DEBUG RFM2Pi 421 Values : [10, 0, 0, 0, 246, 300, 300, 300, 300, 300, 300, 0]
This clearly points at an issue within the emonTx. Have you made any changes to the emonTx sketch?

“Bit slippage” sounds nasty!

I have made no changes to any sketch. Over my pay grade. The problem occurred suddenly. As far as I know never reverted.

Should I try to reload the sketch? Presumably this is a standard one I can get from GitHub?

By the way, I have a parallel problem: the emonPi will not run a firmware update because the file system is read only. Glyn was attending to this but we don’t seem to have reached a conclusion. I am reluctant to reset to rw, in part because it is not clear to which folder(s) to apply rw.

At this point it may be better to focus on just the temp sensor zeros rather than the VAC. The fact that the VAC was zero on one packet only is not conclusive: the VAC is measured and updated during every interval, so even if there was a “glitch” it should be overwritten with the next reading. It reports 0 rather than “230”, so it could be that at that particular point in time the AC wasn’t connected, as the programmer was connected or something. I assume the TX is normally AC powered and there are no batteries, 5 V DC mini USB or programmer attached?

With regard to the temp sensors, the value of 300 is set at start-up for each of the 6 variables and then the sketch goes on its merry way. If anything happens to any of those variables they will never return to 300 unless the emonTx is restarted, so I guess it could be a memory overflow or a glitch in the 1-wire library or implementation.

My original request to add the 300 fault condition was not to just add it at setup() but to change the way the value is returned, so rather than

for(byte j=0;j<MaxOnewire;j++) 
      emontx.temp[j] = 3000;                             // If no temp sensors connected default to status code 3000 
                                                         // will appear as 300 once multiplied by 0.1 in emonhub

in the setup() setting the values to “300” once only, the “300” should be added by the get_temperature() function so that it applies to every read of the temp sensors

int get_temperature(byte sensor)                
{
  float temp=(sensors.getTempC(allAddress[sensor]));
  if ((temp<125.0) && (temp>-55.0)) return(temp*10); else return 3000;            //if reading is within range for the sensor convert float to int ready to send via RF or value = 300 if not
}

in fact, I recommended the additional use of

for(byte i=0;i<MaxOnewire;i++) emontx.temp[i]=3010; // set all temp values to 301°C (301 = never found & 300 = lost or out of scope)

in setup(), so that we could distinguish between “never seen” (301) and “gone faulty” (300) sensors. But the main “fix” was the modified function.

If this method was used, the temp sensor value would only ever be zero if the temperature was actually 0, and if there was a glitch at any time the situation would be reassessed on every read, so that whatever value was presented is in fact current and not the result of a past incident.

This would avoid the “bitslip”, which is its intention, but it will not cure the root issue here: “something” is causing those 6 original values of 300 to vanish, reset or be set to 0.

See “Re: Data loss due to RF packets getting corrupted” for the original discussion about the “bitslip” issue.
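To illustrate the difference between setting the code once and setting it on every read, here is a minimal host-side C++ sketch of the per-read approach. It is not the emonTx firmware: encode_temperature and mark_all_never_seen are hypothetical names, and the status codes are simply the ones proposed in this thread.

```cpp
// Status codes in tenths of a degree, as proposed in the posts above.
const int TEMP_FAULT = 3000;       // shows as 300 after the 0.1 scale: lost or out of range
const int TEMP_NEVER_SEEN = 3010;  // shows as 301: sensor not found at start-up

// Run once at start-up so that slots with no sensor never report zero.
void mark_all_never_seen(int temps[], int count) {
    for (int i = 0; i < count; ++i) {
        temps[i] = TEMP_NEVER_SEEN;
    }
}

// Convert one reading in degrees C to the integer sent over RF.
// Called on every read, so the fault code is re-evaluated each interval
// instead of relying on a value written once in setup().
int encode_temperature(float tempC) {
    if (tempC < 125.0f && tempC > -55.0f) {
        return static_cast<int>(tempC * 10.0f);
    }
    return TEMP_FAULT;
}
```

With this shape, a reading of 21.5 is sent as 215, a genuine 0.0 is sent as 0, and an out-of-range reading is sent as 3000. As far as I know the DallasTemperature library reports -127 for a disconnected device, which would fall into that last case.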

Hi Richard
I’m not sure what to recommend. You could try reloading the sketch, but I see no reason why it should change anything as you are running the latest FW.
I have posted some thoughts on some changes that should be considered for the sketch, but the root cause is still a bit of a mystery.
You could try reloading and, if that doesn’t work, we could try editing a sketch for you with the changes I mentioned. That may then allow the emonTx to at least recover from whatever the root cause is and perhaps reveal some further detail?

Can you link to the update issue thread? Write mode for editing is a global setting: the rpi-rw command puts the whole system temporarily into write mode to allow controlled edits etc.

@pb66:
This doesn’t cater for the “85.0” reading you get from a broken wire to the sensor. I’m putting

   if (temp<125.0 && temp>-55.0 && temp!=85.0) …
      else return 3000

into the 3-phase sketch. Hopefully, others will follow :wink:

Well - I have now relocated the emonPi, and am trying to update the emonTx. I don’t have a Pi. How do I load the latest emonTx 3.4 firmware from Windows - presumably through the Arduino IDE? Or through Xloader (which the blog says is not tested)?

I will need to download and place the latest firmware in a folder to which the “sketch” is pointed? I’m not sure what to download from “emonTxFirmware/emonTxV3 at master · openenergymonitor/emonTxFirmware · GitHub” which is the supposed source.

Or do I need to get a Pi and connect it to the emonTx - by wires I presume as shown here- Connect an EmonTx v3 to RaspberryPI via serial | Archived Forum! Not into this yet, but I’m willing to learn…

All help gratefully …

Do you have a programmer to connect from your computer’s USB port to the emonTx’s serial port? If so, the easiest way is via the Arduino IDE, full instructions for W8/W10 are here: Setting up the Arduino Environment for Windows (7, 8.1 & 10) | Archived Forum
You’ll need to get and install some libraries - there’s a link at the bottom of the IDE installation page - and that more-or-less tells you where to put the sketch. It insists on being in a directory of the same name. The one you want is
emonTxFirmware/emonTxV3/RFM/emonTxV3.4/emonTxV3_4_DiscreteSampling at master · openenergymonitor/emonTxFirmware · GitHub
so it has to be in a folder called “emonTxV3_4_DiscreteSampling”

I’m not sure about the !=85, here’s my thinking

I do not know exactly how the “85” is decided by 1-wire. I had heard of it, but never experienced it or read up on it, so I couldn’t agree or disagree with how it’s done, except to ask “why have a valid value as a fault value?” I do not know how accurate or reliable it is. And for that matter, on the surface I don’t know how it functions with 3 wires! If either of the power wires is broken it’s game over, and if the signal wire is broken it cannot tell you. So I have my doubts about its ability, but that is an uninformed position; I would hope that all these points are covered in its design.

However, the element of uncertainty would make me want to err on the side of caution and not specify the “!=85” part, to avoid spikes of 300°C during any transition between 84°C and 86°C and back. But given we rarely use the temp sensing in that range it probably doesn’t matter, although I think it would still be clearer if “300” meant a reading outside the normal range and didn’t include whatever 1-wire brings to the party in the form of a perceived error.

I’ll leave it to you, it’s a small point that possibly won’t matter either way, but if I were forced to decide I would say let the “85” work as intended. If it was a non-valid value I might include a 302 = “1-wire reported” error code just for "85"s but I still see 84.8 → 84.9 → 302 → 85.1 → 85.2 as a potential issue

I agree with the point about a slow transition from 84.9 to 85.1, but from the data sheet - the footnote to Table 1 on page 4: “The power-on reset value of the temperature register is +85°C” (this is also the upper limit of the device’s ±0.5°C accuracy band - maybe not a coincidence?) so I read that as meaning that 85 is returned if there isn’t a problem reading the sensor’s register but there was a problem filling it.

It seems amazing that they chose a value within the operational range as the default value, thereby prohibiting the sort of scheme we want to adopt, but we’ve got to live with it.

If the sensor is on a water pipe or tank, I can see a potential problem. If it’s monitoring room temperatures, I can’t believe that there’s a problem.

[EDIT]
Here’s what I’ve now got. In practical terms, it appears to give me:
300 = sensor known, data or GND has become disconnected, or reading out-of-range
301 = sensor absent (=not detected at power-on)
302 = sensor known, power has become disconnected

int get_temperature(byte sensor)                
{
  float temp=(sensors.getTempC(allAddress[sensor]));
  if (temp==85.0)                                        // if reading is '85.00' = power-on value, return value = 302 - device error
    return 3020;            
  else if (temp<125.0 && temp>-55.0 && temp!=85.0)       // else if reading is within range for the sensor convert float to int ready to send via RF 
    return(temp*10); 
  else                                                   // else return value = 300 ('Faulty sensor')
    return 3000;            
}

Mmmmhh!

The more I think about it the less I like it. Your code will no doubt give you what I understand you are aiming for, but the cost of possibly providing slightly better error reporting is an implementation that definitely cannot report a temp of 85°C even when it is accurate, thus reducing the upper usable “spikeless” operating limit from +125°C to +84.9°C.

I was sorta hoping you were going to reveal some clever library code that ensured normal operation through 85°C was maintained somehow. But forcing all “85”s to be “302” isn’t something I’m likely to find myself recommending, I’m afraid.

I think we are constrained by Maxim’s decision to use “85” as the power-on reset value of the temperature register. I can’t see an easy or reliable way out of it. 85 is either a genuine value or it isn’t, but there’s no foolproof means that I can think of that will tell us which of the two it is. You can probably affirm that a run of values that are exactly 85 means a fault, and that 85 surrounded by values that are near to it is genuine, or that if you interpolate between the values either side and that comes to 85 or nearly, then it’s genuine, and you can come up with any number of similar schemes. But none offer certainty. At least with 300, it’s obvious that it cannot be a real temperature, even though it might mask a genuine 85°C, so it offers the possibility of screening it out further up the chain. As I implied earlier, context is extremely important in deciding how to handle this.
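As a rough illustration of the “85 surrounded by values that are near to it” idea, here is a minimal C++ sketch. The function names and the 2-degree window are my own assumptions, not anything from a library, and as said above a check like this is only a heuristic and offers no certainty.

```cpp
#include <cmath>

// Hypothetical helper: a neighbouring reading supports a genuine 85 if it is
// close to 85 but not itself exactly 85 (a run of exact 85s suggests a fault).
bool near_but_not_85(float tempC, float window) {
    float distance = std::fabs(tempC - 85.0f);
    return distance > 0.0f && distance <= window;
}

// Decide whether a reading of exactly 85.0 looks genuine, given the readings
// taken immediately before and after it.
bool looks_like_genuine_85(float previous, float next, float window = 2.0f) {
    return near_but_not_85(previous, window) && near_but_not_85(next, window);
}
```

So 84.9 → 85.0 → 85.1 would be accepted, while 21.0 → 85.0 → 21.5 or 85.0 → 85.0 → 85.0 would be flagged. Note that it needs the following reading before it can decide, so whatever runs it has to wait one interval.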

I had also suggested that, with the adoption of the positive fault indication, we should also create an emoncms process specifically for temperature sensors. The idea at the time was a simple “if >= 300 do not persist to feed”, so that the error values didn’t skew any min, max or average etc.; it would just be visible on the inputs page and/or could be used to trigger an event.

Such a process could also ask “is it really 85°C” by one of the methods you mention, without having to delay the current reading by an interval or 2 in order to confirm it’s a valid temp as a sketch would, and it would also have the unique advantage of being able to edit/correct a previous datapoint too.

As you say, we are stuck with this as it is the way Maxim have done it. Perhaps different applications can include or exclude it depending on the expected scope of readings, likelihood of a failure, the receiving software and “realtime” necessity.

For now I would be much happier living with the position imposed by Maxim, which means a hopefully rare fault is assumed to be a valid temp, than being responsible for developing software that cannot report a valid temp of 85°C and could trigger false fault indications when correctly sensing a temp of 85°C.

In or out, it doesn’t matter right now; if it causes an issue either way down the line it can be resolved. The important thing right now is that completely missing sensors do not result in a zero value, which is what my original edit to the get_temperature() function would have ensured had it been implemented. It wasn’t easy getting a partial implementation. A full implementation may now be possible, as this fault has identified the need, but trying for an extended implementation may well prove too much and prevent any progress.
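The “if >= 300 do not persist to feed” rule could be sketched as below. This is C++ for illustration only: emoncms processes are not written like this, and is_fault_code and values_to_persist are hypothetical names.

```cpp
#include <vector>

// A scaled temperature of 300 or more is a status code, not a real reading.
bool is_fault_code(double scaledTempC) {
    return scaledTempC >= 300.0;
}

// Keep only genuine readings, so fault codes cannot skew min, max or average.
std::vector<double> values_to_persist(const std::vector<double>& readings) {
    std::vector<double> kept;
    for (double reading : readings) {
        if (!is_fault_code(reading)) {
            kept.push_back(reading);
        }
    }
    return kept;
}
```

Given the readings 21.5, 300, 301, 22.0, only 21.5 and 22.0 would reach the feed, while the 300 and 301 would still be available on the input side to trigger an event.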

If it were a library function (or when it becomes a library function?), it seems reasonable to re-evaluate this question. My guess is the best method would be, on a per-sensor basis, either to pass on the 85°C reading or to block it, as determined by the user. Given the restricted capability of the sketch (in relation to emoncms, that is) I think it’s much more reasonable to pass the problem upwards where it can be handled more intelligently.

Robert

OK, I’ve installed libraries and downloaded the .ino; I’ve got the Arduino IDE running and opened the discrete_sampling .ino.

I’m connected to the serial port of the emonTx, which is reporting readings as I would expect.

I compile - it reports

In file included from C:\Users\Richard Palmer-Jones\Documents\Arduino\oem\emontx\emonTxV3_4_DiscreteSampling_rpj\emonTxV3_4_DiscreteSampling_rpj.ino:37:0:

C:\Users\Richard Palmer-Jones\Documents\Arduino\libraries\jeelib/RF69_compat.h:9:2: error: #error This file must be included BEFORE the "RF12.h" header file!

#error This file must be included BEFORE the "RF12.h" header file!

Multiple libraries were found for "Ports.h"
Used: C:\Users\Richard Palmer-Jones\Documents\Arduino\libraries\jeelib
Not used: C:\Users\Richard Palmer-Jones\Documents\Arduino\libraries\RFu_jeelib
exit status 1
Error compiling for board Arduino/Genuino Uno.
I moved

#include <RF69.h>
#include <RF69_avr.h>
#include <RF69_compat.h>

before

#include <RF12.h>
#include <RF12sio.h>

but I get the same error. Help please.
When successfully compiled, do I then “Upload Using Programmer”?

Edit - fixed formatting, deleted duplicate post - Moderator, BT

You should not need to edit the libraries.

Just to confirm, are you using this sketch? And is this line still set to “#define RF69_COMPAT 1”?

What version of Arduino IDE are you using?

I suspect you’ve got some libraries in the wrong place, but you shouldn’t need to move anything in the sketch, as published it is known to compile correctly. Which Windoze version are you using, and have you got the structure of your libraries correct?

Sorry to come in very late, but could this be related to the RFM firmware / Jeelib changes that I had a problem with in the emonSD forum? V2.6 is “bad” and reverting to V2.5 restores the connection - I also continued to see the emonTH input fine.

See emonSD-03May16 Release - #35 by peter

Actually, re-reading the original post this is exactly the same issue. Downgrade the emonbase/RFM firmware from 2.6 and you will be back.