emonSD next steps: filesystem & logrotate

Trystan has documented his previous findings and is currently working through the tests again. I think the basic idea is that smaller block sizes mean less data written, because a whole block has to be read and rewritten even to change a single byte. So by increasing the amount of data you save per write (feedwriter buffering) and lowering your block size, you reduce the partial block writes, i.e. multiple rewrites.

Those days pre-date even the very first emon images, I think. I agree lower cost means we maybe do not need to be quite so careful, but as I said it's the inconvenience, not the cost, that is the main factor here. Not to mention the environmental impact of "disposable sdcards". Plus whilst the cost may have gone down, the tech has gotten better: cards do not fail as often these days, so while they are cheaper, they have a longer expected life too.

Not sure I understand this

No, not at all! You cannot back up every other minute of every day; even swapping out a sdcard for another pre-formatted and imaged card results in lost data, even a reboot results in lost data!

If your card fails every few months, are you going to watch it every day? I guarantee it will fail when you are on the other side of the globe on your hols! Even with a daily backup you could lose a day or so by the time you notice, create a new image, set it up, import the old data, etc. Most of us have better things to do and want an almost fit-and-forget setup.

Indeed, even humans fail eventually, but that doesn't stop us taking reasonable steps to delay the inevitable.

I've got some sdcard-based RPi devices that have been logging for 5 years on the same card. Just as well, as some of them are over 4 hours' drive away; that's a long way to go to change a sdcard!

I'm all for delaying the inevitable, I just want to be sure it is not a case of "we've always done it that way", which blinds one to other options. Equally, "if it ain't broke, don't fix it" is a great mantra.

[edit] Your other thread reminds me that one of the reasons for the separate partition was that the root partition was RO and the data partition RW. Again, things have changed.

Wow!!! You surprise me: why are we changing all the init.d and logs then? The emonhub init.d service not only "ain't broke", it has been the single most consistently dependable part of the project for years. It was the only service on the whole image that created its own log directory, avoiding any need to be listed in the rc.local shenanigans, and the log files have been the main source of debugging info for everything from rfm firmware issues, mqtt, data loss, even emoncms.org rejecting data, etc. Those logs were also self-rotated and were the only logs that couldn't stuff up the ram partition prior to logrotate being implemented, and even when other logs filled the ram, emonhub kept running. Now, since it's been "fixed", it is not only less dependable, it is causing issues for other services and stopping when the log is full.

And not always for the better. The fact that part of the sdcard is RW undermines the RO part to a degree, because it's the same card: if any partition wears out, the whole card is no good. And now that the rootfs is no longer RO, it is prone to other errors too.

IMO RO should still be an option at least! Notice I say "option"! I know many users do not want RO emonPis and I get that, and TBH there's not a great deal of gain in having a RO FS and RW data on the same sdcard. But there is still a good argument for a RO data-forwarding-only image option. One that is locked down with no disk writes at all; the "rock-solid gateway" was a great image.

Not me Guv. However, we are where we are and a sensible way forward needs to be identified.

As I never used a standard emonhub, I never really knew how it worked (in terms of logging etc).

And I donā€™t disagree there either.

As a matter of interest, what was the retention period for the emonhub logs previously?

Since it predominantly writes its logs to a space that is capped in size, it didn't have a time-based retention, just a size cap: it rotated the log at 5M and retained just one old file, so it would (eventually) maintain between 5M and 10M of logs. All this was undermined by the new logrotate regime that rotated hourly at 1M and originally retained 2 files, effectively reducing the emonhub logs to between 2M and 3M after it had been running a while.
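For comparison, that old self-rotation behaviour could be approximated with a logrotate stanza along these lines. This is a hypothetical sketch, not the actual emonSD config; the log path and whether copytruncate is appropriate would need checking against the real service:

```shell
# hypothetical /etc/logrotate.d/emonhub stanza approximating the old
# behaviour: rotate at 5M, keep one old file => 5M-10M of logs total
# /var/log/emonhub/emonhub.log {
#     size 5M          # rotate on size, not on a schedule
#     rotate 1         # keep a single old file
#     missingok
#     notifempty
#     copytruncate     # avoid needing the service to reopen its log
# }
```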

Mmmmmmmmmh!

I'm all for consistency, but I would not consider that "broke". For me, if consistency was the goal, I would like to have made all the other services as reliable, robust and dependable as the emonhub init.d service.

Then we could try to get a single systemd service functioning well (by well I mean a prolonged test under real and stressed conditions, not a quick spin-up "it works fine for me" test) and then migrate the known-good init.d services to a known-good systemd implementation, fixing any issues en route.

@TrystanLea

Your results from diskstats and iotop are very different. But if the objective is low-write, then both suggest that your System 2 is better.

I'm running emonSD images updated to 9.9.8. I tried iotop but found it difficult/unfriendly …

  • It takes over the ssh screen
  • Data did not persist on the 2 top lines
  • There is no indication of the start time
  • When you quit …
    • The data is gone (a screen shot before quitting is clunky)
    • There is no indication of the finish time
  • I ran it for over an hour on a node …
    • No disk reads recorded
    • 32K written to jbd2 mmcblk0p2 (the OS?)
    • 5.39M written to /var/www/emoncms/scripts/feedwriter.php … is that to the OS partition or to the Data partition? – it was not clear.
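For what it's worth, most of those friction points can be worked around with iotop's non-interactive batch mode (flags as per the iotop man page; intervals here are just an example):

```shell
# -b batch mode (plain output instead of taking over the terminal)
# -o only show processes actually doing I/O
# -t timestamp each line (gives you start and finish times)
# -q suppress repeated column headers
# -a show accumulated totals rather than instantaneous bandwidth
# -d 60 -n 60 = one sample per minute for an hour
# Redirecting to a file means the data survives after the run ends.
sudo iotop -botqa -d 60 -n 60 > iotop-results.txt
```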

My diskstats sectors-written results are very different to yours (I assume you are only talking about writes?).

Also, my results are very different between my different nodes – number of sensors, amount of input processing and whether the node is just doing the emoncms task all clearly have an impact. And this could also explain the difference between our results.

I guess it would be disappointing if it turns out that the choice of best low-write solution is dependent on the number of sensors, amount of input processing, etc.

Using diskstats involves a bit of manual/Excel calculation. I made a very simple script that helps with getting the basic snapshot data …

#!/bin/bash
echo -e " starting diskstats-snapshot.sh"
echo " "
echo -e "date ... gives the time & date now"
date
echo " "
echo -e "uptime -p ... gives the time since last boot"
uptime -p
echo " "
echo -e "sudo cat /proc/diskstats ... gives the diskstats since last boot up until now"
sudo cat /proc/diskstats
echo " "
echo -e "it's now time for manual calculations ... :("
echo -e "the 6th number in diskstats is the SECTORS READ since last boot up until now"
echo -e "the 10th number in diskstats is the SECTORS WRITTEN since last boot up until now - the key stat?" 
echo " "
echo -e "that's it ..."
exit
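As a follow-up to that script (my own addition, not part of the original), the "manual calculations" step can be automated with awk. The field positions assume the standard /proc/diskstats layout: $3 = device name, $6 = sectors read, $10 = sectors written:

```shell
#!/bin/bash
# Print sectors read/written since boot for each mmcblk0 partition,
# pulling fields 6 and 10 straight out of /proc/diskstats.
awk '$3 ~ /^mmcblk0p[0-9]+$/ {
  printf "%s: sectors read=%s written=%s\n", $3, $6, $10
}' /proc/diskstats
```

Run it twice with a known gap between runs and the difference divided by the elapsed minutes gives the sectors-written-per-minute figures discussed below.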

Ah, but I was referring to the install instructions, not the move to the systemd method :grinning:.

[edit] i.e. not copying the service file but linking it to the /lib folder.

Ah ok!

Thanks again for sharing your test results @johnbanks, it looks very similar to what I'm seeing.

Yes, it doesn't say in the iotop output as far as I'm aware; it should be the data partition.

That's true, you're seeing 122.4 sectors/min for 13 feeds while I was seeing 16.4 sectors/minute for 23 feeds… I have extended the interval at which I save to disk to 5 mins rather than the default 1 min period… it might be that… I will try again with a 60s write period.

@TrystanLea

Thx for the additional insights.

I use phptimeseries almost exclusively.

From the emonhub.conf logs, I can see that 2 of my nodes are updating every 10 secs.

A third node was updating every 5 secs, which threw me a bit until I remembered that its emonTx was running 3-phase firmware that I updated with a new release last June. Apparently it has a user-defined reporting interval. After a quick look through the sketch, I didn't see any ref – so I must be blindly using the default, and it's 5 secs.

Similarly, I don't recall selecting a write-to-disk frequency and so again I'm blindly using the 60 sec default.

Now I appreciate that sensor reporting and writing to disk are different aspects. However, I would have thought the number of sector writes was more dependent on the sensor data 'batch' size rather than the frequency with which 'batches' are written to disk. Pls point my simplistic thinking in the right direction.

Thx

Thanks @johnbanks

I think it should be a combination of both: if the file system block size is 1024 bytes and you can batch the data in such a way that utilises a greater proportion of each block write (by buffering and writing at a lower frequency), that should reduce the overall sector writes per minute. There's a limit to how much this can be done: if you buffer one feed for 5 mins and it's posting 4 bytes every 10s, that's 120 bytes but a write load of 1024 bytes.

PHPTIMESERIES writes 9 bytes per data point, so 270 bytes. In theory the write load should be the same, but I'm not sure that it is… it would be good to test and double check.
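To put numbers on the block-utilisation point, here is a quick sketch using the same assumptions as above (1024-byte blocks, one data point per 10 s, 5 minutes of buffering per write):

```shell
# How much of each 1024-byte block write is actually new data,
# for 4-byte (PHPFina) and 9-byte (PHPTimeSeries) data points.
block=1024; interval=10; buffer=300           # bytes, seconds, seconds
points=$((buffer / interval))                 # 30 data points per flush
for bytes_per_point in 4 9; do
  payload=$((points * bytes_per_point))
  echo "$bytes_per_point bytes/point: $payload of $block block bytes used"
done
```

This reproduces the 120-byte and 270-byte figures quoted above; either way the write load is one block per flush, which is why the two engines could in theory cost the same.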

I've been running some longer tests looking at the difference between different logging configurations, as I'm interested to understand the difference between /var/log writing straight to the OS partition and log2ram with and without RSYNC.

To start with I reduced the logging level of emonhub and emoncms to 'error', converted the remaining 'echo' and 'print' statements in the feedwriter to use the emoncms logger, and the same for the demandshaper background process. I cleared all the logs and rebooted the pi prior to starting the test.

Here are the results for OS partition write load with logs going straight to the OS partition, log2ram removed:

Mean sector writes per minute: 64.4 = 0.54 kB/s, over almost 9 hours.

iotop recorded 6.4 MB over this period, ~0.2 kB/s.

The /var/log folder itself shows an increase of 456 kB (not including subdirectories).

Here's the same but with log2ram installed and RSYNC enabled, over ~16h, starting with mostly reset/empty logs:

Mean sector writes per minute: 29 = 0.24 kB/s.

An earlier test but without paying attention to reducing amount of logging, gave the following:

Mean sector writes per minute: 321 = 2.7 kB/s. This included the feedwriter logging that I disabled for the above two tests, emonhub at 'warning' level with quite a few decoder match errors filling the log, and I had not cleared the logs. So it's not really a relevant comparison, but it is useful to understand what kind of magnitude the write load can get to.
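The unit conversions in these results can be sanity-checked mechanically, assuming 512-byte sectors and 1 kB taken as 1024 bytes (which matches the figures quoted):

```shell
# mean sector-writes-per-minute -> kB/s: sectors * 512 bytes / 60 s / 1024
sectors_to_kbs() { awk -v s="$1" 'BEGIN { printf "%.2f\n", s * 512 / 60 / 1024 }'; }

sectors_to_kbs 64.4   # -> 0.54
sectors_to_kbs 29     # -> 0.24
sectors_to_kbs 321    # the ~2.7 kB/s case from the earlier test
```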

It seems that even with rsync enabled, the log2ram sync write load keeps escalating, whereas I would have thought rsync should reduce each sync to just the relevant difference in the log file…

I will try log2ram again next but with rsync disabled.

@TrystanLea
Ah! - if I've understood correctly …
The 4 bytes vs 9 bytes / phpfina vs phptimeseries difference is irrelevant - it just affects the amount of 'waste' in the 1024-byte block.
And that would be true up to 113 (1024/9) datapoints per block write.
So if posting every 10 secs, that would mean buffering and only writing every 18 mins (113*10/60) or thereabouts in order to eliminate 'waste'.
Thx
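That break-even arithmetic, as a quick sketch (assuming 1024-byte blocks, 9-byte PHPTimeSeries points and a 10 s posting interval, as above):

```shell
# How long one feed must buffer before a single write fills a whole block.
block=1024; point=9; interval=10
points_per_block=$((block / point))                       # 113 points/block
echo "buffer ~$((points_per_block * interval / 60)) min to fill one block"
```

With integer arithmetic this gives roughly the 18-minute figure above; in practice you would trade some 'waste' for a shorter, more tolerable buffering window.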

Have you fixed the emonhub log entries going to both daemon.log and syslog?

Yes, that's applied as part of the image build script :slight_smile:


Ok. The issue then is that it is not just emonhub log entries saved in daemon.log - all services default to writing log entries to that file.

I've developed a mechanism to isolate the emonhub log entries here: emonSD next steps: emonhub logging, which might be worth investigating.

Can you explain what you mean here?

One of the problems is that the rate of logging fills the logs. For emonhub at least, I don't think there is an appetite to reduce the granularity, just a need for a means of managing it. I also think we need to cater for the worst case, i.e. maximum logging.

Are you still running hourly logrotates?


Do you mean the write speed, i.e. writes per second, or the number of actual writes in a second? Probably the latter, but just to be sure, as it is (of course) the number of writes that happen that matters, not how quickly they happen.