Community
OpenEnergyMonitor

Recompiling RFM69Pi firmware

firmware
rfm69pi

(Andrew Peace) #1

I have an RFM2Pi module with an RF69 module (unmodified, as purchased around November last year). I believe it is the RFM69PI V3 but has a blue PCB not red.

It’s been working great, but I wanted to experiment with my own firmware on it. I figured before I did that, I should make sure that I can compile and get the stock firmware working.

If I upload the compiled firmware from the RFM2Pi repository (using the update-RFM69.sh script) then everything works fine. Now, I want to compile it myself. I have checked out jeelib at revision 7fc95a72ec3202f79ac25df0e250a15df48f2f6c and have tried creating compiled hex files both using the Arduino IDE (version 1.6.12) and the Arduino-mk package (my preferred method since I can compile and upload from the Pi that way). Both compile fine and produce a hex file. I can upload the hex file to the device using avrdude and the autoreset script as documented, and once I’ve done that both can be interacted with over serial (e.g. ‘v’ shows the expected output).

However, neither show any packets being received. If I re-upload the pre-compiled hex, everything works fine again.

I suspect it’s possible we’re compiling with different versions of JeeLib, and there is some incompatibility? Please could you state which revision of jeelib is being used for the factory image?

Thanks,
Andy.


(stuart) #2

Did you check the frequency settings for the RFM module in the code? If they’re wrong or different, it won’t receive anything over the air.


(Robert Wall) #3

Stuart:
Unless it is VERY close - maybe under 100 mm.

Andy:
I’m not aware that any changes to JeeLib have been incompatible with previous versions. The only change I recall that broke anything, was because the library (Ethercard) became larger and the sketch subsequently failed because it ran out of memory.


(Paul) #4

What is the “current configuration” reported with the “v” command? Is it as expected?

Have you tried changing the group, or unsetting “quiet mode” ?

As a test try compiling using an Arduino IDE and transferring the hex to the Pi and upload to the RFM69Pi using avrdude.

I recall having a difficult time using arduino-mk when I experimented with it a couple of years ago, but a lot can change in 2 years. Although I haven’t embraced it yet myself, the project’s current favourite method of upload is PlatformIO, which is similar in that it also uses Makefiles and compiles a hex file from the command line.


(Andrew Peace) #5

Thanks for the responses so far.

@stuart, what RFM module settings are you referring to? The frequency etc, as configurable through the serial interface, are the same (433MHz, network group 210, node 15, no frequency offset, quiet mode off). Part of the problem is I don’t know how to tell what code the pre-compiled hex file in that repo was produced from, so I can’t tell what’s different (if anything) about the CPP files I’m using (which are the ones at the head of master).

@pb66: yes I did also try compiling using the Arduino IDE and then copying over, this had the same behaviour. (Works in as far as interacting over serial, but doesn’t print any received data.) Normally I get about a packet per second that doesn’t match the CRC (with quiet mode off) and a packet every five seconds or so from an EmonTX that is in the house - it’s fairly close (1m) to the receiver and the pre-compiled sketch reports -20dBm signal strength for those messages.

I agree my problem might not be JeeLib, that was just one idea and I thought it would be worthwhile trying to determine what the exact version of code used to produce the pre-compiled firmware so I could try to reproduce that to begin with.


(Andrew Peace) #6

Thanks again everyone for your help and suggestions.

I finally made progress on this: it does appear to be a JeeLib issue, or at least a JeeLib compatibility issue. I hadn’t realised the OEM project has its own JeeLib fork, so I was using master from the upstream repository. Whilst I had already tried an older version, it wasn’t old enough: going back further seems to have resulted in a working firmware, as does using the version on OEM’s github.

So something between then and the current JeeLib has broken things. I can use the older JeeLib for now, although I would suggest that the OEM project put a note about compiling this sketch on the hardware page (https://wiki.openenergymonitor.org/index.php/RFM69Pi_V3) indicating which version of JeeLib should be used/has been tested.

If it’s useful I could potentially figure out what revision causes it to break, which might yield a patch to either the RFM69Pi sketch or JeeLib.

Thanks,
Andy.


(Glyn Hudson) #7

RFM69Pi firmware can be compiled by platformIO, currently in dev branch. Will be merged to master soon.

https://github.com/openenergymonitor/RFM2Pi/tree/dev/firmware/RFM69CW_RF_Demo_ATmega328/RFM69CW_RF12_Demo_ATmega328

See compiling guide: https://guide.openenergymonitor.org/technical/compiling

Firmware can be updated to latest by running rfm69piupdate.sh from ~/emonpi

This will pull latest compiled .hex from git hub release and flash directly on the Pi


(Andrew Peace) #8

Thanks Glyn.

It works pretty easily with arduino-mk for anyone wanting to do it that way. Note that I don’t have the emonPi, but just a pi with the RFM69Pi board attached. The compiling guide seems to talk mostly about the EmonTX, and doesn’t mention which version of jeelib is required, which seemed to be the issue in my case. The script also doesn’t check out jeelib to the correct version, so it might be useful to add that to either the script or the documentation? (I guess for users of the stock SD image there is already a copy on there? But I was doing it on a standard Raspbian image.)

Thanks,
Andy.


(Andrew Peace) #9

I debugged this further and found the problem with the newer JeeLib:

commit 6f1af25695a51910d2bb8ca0e796a7edda028848
Author: Jean-Claude Wippler
Date:   Sat Apr 9 00:39:04 2016 +0200

    add John O's trick to sync on an extra AA from the preamble
    
    this change also sends out one more 0xAA preamble byte

I’m guessing that in order for this to work correctly, all nodes on the network need to be updated with this change, which of course my EmonTX was not and hence I didn’t see any packets from it. Might be worth making this change optional, or at least bearing it in mind if the precompiled firmware is updated with a newer JeeLib version for any of the OEM hardware.


(Paul) #10

Hi Andrew, that sounds logical that both the sender and the receiver should be using the same packet format so it raises questions over why or how the current range of emonTx and emonTH, that are compiled with the latest JeeLib, function with any of the RFM2Pi’s (even the latest firmware is compiled with a JeeLib dating back to at least June 2015).

Earlier today, after reading your comments, I tried compiling and installing the latest RFM2Pi firmware using the latest JeeLib (updated direct from the jcw/jeelib repo today). I used Arduino IDE 1.8.0 to install it to a JeeLink, which is basically an RFM2Pi on a USB stick; the only noticeable difference is that it runs at 16MHz, not 8MHz as the RFM2Pi’s do. This installed and worked as expected, immediately receiving data from a 3-year-old emonTx v2 (rfm12) and a couple of emonTH’s (rfm69’s) that have not been updated since they were commissioned in mid 2015.

Whilst writing this I also recall I have a live RFM2Pi that was updated on the 29th of April 2016 with a newly compiled RFM2Pi firmware and the very latest JeeLib at that time for debugging an RFM issue on the old forum (see https://openenergymonitor.org/emon/node/12004#comment-41491), so that would have included those pre-amble changes made on 8th April too and it also receives the devices mentioned.

So whilst what you say makes perfect sense, I’m guessing there must be some sort of backwards compatibility in the code somewhere.


(Robert Wall) #11

This isn’t our ‘long string of zeros’ clock tracking problem in another guise, by any chance?

I know there was a problem in JeeLib in the transmit department, where in effect it relied on the time taken by the function return to successfully complete transmission of the message. The transmit function actually returned before the transmit buffer had completely emptied, and if you did the wrong thing immediately the code came back, the transmission was cut short and the checksum failed. JeeLib were looking at this and I don’t know what the outcome was. (This was May, 2015)


(emjay) #12

The April 9th JeeLib commit will have a strong influence on this. In brief, in a relatively high background noise environment (e.g. close to a motherboard busy clocking away) the RF module’s Rx section can trigger on random noise, since sooner or later the noise will match the desired bit pattern after the preamble. Since it is junk, the “packet” will eventually fail with a bad CRC, but the receiver is blind during this variable rejection time. A valid packet arriving during the blind period has no chance: the decode process is busy with the junk after the preamble/group id and cannot recognise the real packet header until the reject occurs and the Rx section is put back into scan mode. With the change, the chance of mis-decoding junk is much smaller (~250x lower); the net result is few if any ‘?’ packet reports. The real benefit is a substantial reduction in the “blind” issue, so fewer valid packets are missed.
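
The “~250x” figure is close to what one extra matched byte would give if the noise were random bits: every additional byte that has to match cuts the false-trigger chance by 2^8 = 256. A minimal sketch of that arithmetic, assuming independent, uniformly random noise bits, and assuming the matched pattern grows from two bytes (0x2D plus the group byte) to three (0xAA, 0x2D, group). Those byte counts are an assumption for illustration, not something confirmed from the JeeLib source here.

```python
def false_match_probability(pattern_bytes):
    """Chance that a window of random bits equals a fixed pattern of this length."""
    return 2.0 ** (-8 * pattern_bytes)


# Assumed pattern lengths (hypothetical, see the note above).
before = false_match_probability(2)  # 0x2D + group byte
after = false_match_probability(3)   # 0xAA + 0x2D + group byte

improvement = before / after
print(f"false matches reduced by a factor of {improvement:.0f}")
```

Real EMI is rarely uniform random bits, so the measured reduction could differ from this idealised factor.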

The additional validation of the preamble should work fine with earlier driver code. It was tested thoroughly, especially interoperating with RFM12B, where it might well be a chore to dig out years-old source code and rebuild. The other change referenced, sending out an extra preamble byte on Tx, is for a couple of edge cases (e.g. where the environment is really noisy and bits in the preamble are getting squished). Apart from these edge cases, there is no requirement to rebuild old node images; with thousands deployed, backwards compatibility is a clear design goal.

What RSSI value is reported on the ‘?’ packets you get with whatever combination of versions is working for you?


(Paul) #13

Quite possibly, if there are some discarded packets reported with acceptable RSSI’s.

Thanks for clarifying, that fits with my own experience. I have mentioned elsewhere that the number of “?” packets I am seeing has dropped dramatically recently.

So this would explain a massive drop in reported traffic and if the remainder consists of little or no valid packets, then it could be that there is another issue at play, quite possibly the “zero runs” depending on the emonTx’s firmware vintage.

Andrew, could you confirm what traffic you saw (if any) with the “later JeeLib” firmware? Were the RSSI’s low or high, and were there any “?” packets? And what emonTx firmware version are you running, or more specifically, what is the usual packet content? Does it have the 6 temperatures? If so, are they used? If not, are the unused temperatures usually reported as 0 or 300.0?


(Andrew Peace) #14

Thanks for all the replies again, the help is much appreciated.

With the version installed that doesn’t work, I don’t receive any ‘OK’ packets at all. I don’t receive many ? packets as discussed, but I did capture one here for reference:

? 17 171 24 180 139 62 61 209 108 168 236 121 130 121 243 175 10 120 183 156 219 (-78) 

Without that version of the firmware installed (one with the same RFM2Pi version and one revision earlier of jeelib) I see packets like this:

 ? 1 46 1 0 0 0 0 0 0 140 93 184 11 184 11 184 11 184 11 184 11 (-78) 
 ? 6 46 1 0 0 0 0 0 0 140 93 184 11 184 11 184 11 184 11 184 11 (-78) 
 ? 9 10 132 136 121 40 80 101 189 251 231 226 47 53 96 0 5 67 145 214 237 (-79) 
OK 10 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 184 11 1 0 (-23) 
 ? 25 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 (-78) 
 ? 24 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 (-78) 

Note that there is only one EmonTx in the house (with just one CT sensor and no temperature sensors). I’m not sure what firmware it’s running - it was purchased in Nov 2015 and hasn’t been updated (though I did actually get a programming cable, so I potentially could update it if that might be useful).


(Paul) #15

Looking at the group of 6 where only the 4th is “OK”, the 1st, 2nd, 5th and 6th lines are all almost there. The actual payloads for those 4 packets appear correct: only the 1st power and the voltage vary very slightly. The 6th temperature (184 + (11*256) = 3000) is missing, as is the pulse count, because packets that fail CRC only print the first 20 bytes. Only the 3rd line looks like noise.
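
The byte arithmetic above (184 + 11*256 = 3000) generalises to the whole payload if every field is read as a little-endian 16-bit value. A rough sketch of that decoding, applied to the “OK” line from the earlier post. The function names and the regular expression are made up for this example, and treating every field as a signed 16-bit integer is an assumption based on the reading above rather than on the emonTx source.

```python
import re
import struct

# Matches lines such as "OK 10 55 1 ... 1 0 (-23)" or " ? 25 55 1 ... (-78)".
LINE_RE = re.compile(r"^\s*(OK|\?)\s+([\d\s]+?)\s*\((-?\d+)\)\s*$")


def parse_line(line):
    """Split a receiver output line into status, first byte, payload and RSSI."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised line: %r" % line)
    status, body, rssi = match.groups()
    numbers = [int(token) for token in body.split()]
    return {
        "ok": status == "OK",
        "header": numbers[0],            # the node id on "OK" lines
        "payload": bytes(numbers[1:]),
        "rssi": int(rssi),
    }


def decode_int16_le(payload):
    """Read the payload as consecutive little-endian signed 16-bit values."""
    count = len(payload) // 2
    return list(struct.unpack("<%dh" % count, payload[: count * 2]))


sample = ("OK 10 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 "
          "184 11 184 11 184 11 1 0 (-23)")
packet = parse_line(sample)
values = decode_int16_le(packet["payload"])
print(packet["header"], packet["rssi"], values)
```

With this reading, the six “184 11” pairs each decode to 3000, and the first pair “55 1” decodes to 311.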

But the node ids for those 4 failed packets are all over the place and the RSSI is -78dB, a long way off -23dB but very consistent and not reason enough alone for those packets to be dropped.

All the “184 11” values are unused temp sensors and tell us the firmware version dates from after the “string of zeros” issues from unused temp sensors were fixed.

I think you might have more than one issue here.

Could you install the firmware hex I compiled in April this year (rfm2pi_rfm69.hex_0.txt) and post the output here for comparison?


(Andrew Peace) #16

> 0v
[RFM2Pi_v1.2_(rfm69)] E i5 g210 @ 433 MHz

Doesn’t seem to produce any OK packets.

I wondered if the values in the ‘?’ packets in my earlier output that look like EmonTX packets were just parts of the buffer that hadn’t been updated, i.e. it was just printing part of the OK packet that was received prior. Would have to check that code to see if that’s possible.

I’m thinking about trying to get a dump of the packet headers to see if I can tell why they’re being filtered.
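
One way to test the stale-buffer idea against a captured log, before reading the JeeLib code, is to check whether each ‘?’ payload is simply a prefix of the most recent OK payload. A small sketch of that check using four lines from my earlier output; the helper names are invented for this example, and a match only shows the data is consistent with a stale buffer, not that it is the cause.

```python
def payload_of(line):
    """Return (is_ok, payload) for an 'OK'/'?' line.

    The first number (header byte) and the trailing "(rssi)" field are dropped.
    """
    fields = line.split()
    is_ok = fields[0] == "OK"
    numbers = [int(field) for field in fields[1:-1]]
    return is_ok, numbers[1:]


def flag_stale(lines):
    """For each '?' line, report whether its payload is a prefix of the last OK payload.

    OK lines get None, since the question does not apply to them.
    """
    last_ok = None
    flags = []
    for line in lines:
        is_ok, payload = payload_of(line)
        if is_ok:
            last_ok = payload
            flags.append(None)
        else:
            flags.append(last_ok is not None and payload == last_ok[: len(payload)])
    return flags


log = [
    " ? 9 10 132 136 121 40 80 101 189 251 231 226 47 53 96 0 5 67 145 214 237 (-79)",
    "OK 10 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 184 11 1 0 (-23)",
    " ? 25 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 (-78)",
    " ? 24 55 1 0 0 0 0 0 0 123 93 184 11 184 11 184 11 184 11 184 11 (-78)",
]
print(flag_stale(log))
```

On these four lines the two ‘?’ packets after the OK packet are flagged, which is at least consistent with the theory. The ‘?’ lines before the OK packet would need the previous OK line, which is not in this excerpt.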


(Paul) #17

But does it produce any “?” packets if you disable quiet-mode? If so can you post some, since this firmware uses a later JeeLib and is a tried and tested compile, I was looking to see if the expected “OK” packets got through or not. It would be very useful to see what’s being received but not passing CRC and to see if the node and RSSI issues are still apparent.

I would be surprised if that was the case, although I can see why that looks possible from that small batch of packets. Since the rfm2pi sketch prints the data direct from the rf12_data array you would need to delve into JeeLib to confirm if it is possible. But a larger sample set might confirm that theory one way or tuther.

Edit - Are you able to get a feel for the origin of some of the “?” packets by watching the LED of the emonTx? how frequently are the “?” packets landing and are they in sync with the emonTx LED?


(Glyn Hudson) #18

I included a link to the compiling guide to illustrate how to use platformIO. The same can be applied to compiling rfm69pi.

The beauty of using platformIO to compile is that the version of the libraries including jeelib is specified and all libs are downloaded and installed automatically. Take a look in platformio.ini to see the jeelib version:

https://github.com/openenergymonitor/RFM2Pi/tree/dev/firmware/RFM69CW_RF_Demo_ATmega328/RFM69CW_RF12_Demo_ATmega328


(emjay) #19

Looking at the limited trace info, it’s clear that the local noise level is rather high. The reported RSSI of ~ -78 dB is sampled early in the preamble scan, whether real or noise triggered.
The real traffic is very strong at ~ -23 dB, which you would expect to “punch through” the high noise floor. However, I suspect the noise is not uniformly spread in time, but comes in short bursts.
I suggest taking advantage of the good signal’s strength by raising the RSSI detection floor. The driver has an init block where several RF module setup parameters are stuffed in: scan for 0x29 and set it to, say, 0x8C (threshold at -70 dB).
If the background EMI has structure (e.g. a clock harmonic falling inside or close to the Rx passband), then another effect can come into play. The AFC may drag the listening frequency well away from the target frequency, creating a different “blind” time waiting for the listening frequency to be brought back to the target as part of re-enabling the Rx scan. A quick check for this is to comment out this line in RF69.cpp:
0x1E, 0x2C, // FeiStart, AfcAutoclearOn, AfcAutoOn
The RegAfcFei register then defaults to the POR setting, which has AFC disabled.
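
For anyone wanting a different threshold from the -70 dB suggested above, the register value can be worked out rather than guessed. A minimal sketch, assuming the encoding where the threshold in dBm equals minus half the register value, which agrees with 0x8C corresponding to -70 dB; the function names are invented for this example.

```python
REG_RSSI_THRESH = 0x29  # register address mentioned in the post above


def rssi_thresh_value(threshold_dbm):
    """Register value for a threshold in dBm, assuming dBm = -value / 2."""
    value = round(-2 * threshold_dbm)
    if not 0 <= value <= 0xFF:
        raise ValueError("threshold out of range for an 8-bit register")
    return value


def rssi_thresh_dbm(value):
    """Inverse mapping: threshold in dBm for a given register value."""
    return -value / 2


# Format the pair the way it would appear in a register init table.
print("0x%02X, 0x%02X" % (REG_RSSI_THRESH, rssi_thresh_value(-70)))
```

A higher threshold (a smaller register value) ignores more of the noise but also ignores weaker nodes, so it only suits setups where the wanted signal is strong, as it is at around -23 dB here.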


(Andrew Peace) #20

I have spent a bit more time investigating this, here is my conclusion:

I bought a JeeLink to have an additional transceiver to test with. I started creating a spreadsheet of the combinations of the JeeLink and RFM69Pi as sender and receiver with and without the patch, as well as the EmonTX (presumably without the patch) only as a sender. I initially found that when the receiver has the patch but the sender does not, the packet is not received. That is UNTIL I got to the JeeLink as receiver with the patch, receiving from an RFM69Pi without the patch, which worked.

Then I decided to move the transceivers closer together and ensure that the direction of the antennas was the same on all the devices (by propping up the wire antenna on the JeeLink and RFM69Pi). With that, I could get the unpatched receivers to receive from a patched sender. This suggests that when you have a mix of nodes like this, reception becomes more problematic even with a strong signal (noting that background noise for me in this location is on the order of -87dBm according to the JeeLink with the ‘y’ option).

I guess the reason for this is that in my environment, more bytes of preamble are required to lock the receiver than are available when the sync value is consuming the last 0xAA byte of preamble and the sender isn’t transmitting an extra byte of preamble (via the first sync byte) to compensate for this.
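
That guess can be written down as a simple byte budget. This is only a sketch of my reasoning, not a model of the radio: the base count of 3 preamble bytes is a hypothetical figure chosen for illustration, and the function name is made up.

```python
from itertools import product

BASE_PREAMBLE_BYTES = 3  # hypothetical count sent by an unpatched sender


def preamble_left_for_lock(sender_patched, receiver_patched, base=BASE_PREAMBLE_BYTES):
    """Preamble bytes the receiver can use to lock before its sync pattern begins."""
    sent = base + (1 if sender_patched else 0)   # patched senders add one 0xAA
    consumed = 1 if receiver_patched else 0      # patched receivers sync on the last 0xAA
    return sent - consumed


table = {
    (sender, receiver): preamble_left_for_lock(sender, receiver)
    for sender, receiver in product([False, True], repeat=2)
}
for (sender, receiver), left in table.items():
    print(f"sender patched={sender!s:5} receiver patched={receiver!s:5} -> {left} bytes")
```

Under this budget the unpatched sender with a patched receiver is the only combination that ends up with fewer usable preamble bytes than a matched pair.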

For my environment this isn’t an issue, I can ensure all my hardware is running with or without that particular patch. As noted elsewhere, it does actually seem to be backwards compatible, although at least in some cases seems to cause reception issues in a mixed environment.

 

This didn’t seem to make a difference…

…although this, as expected, eliminated many of the ‘?’ packets when the sync-value patch was not applied.