emonSD next steps: filesystem & logrotate

TrystanLea · 20 April 2019 17:43

Here’s the same but with log2ram installed and RSYNC enabled, over ~16h, starting with mostly reset/ empty logs:

Mean sector writer per minute 29 = 0.24 kb/s

An earlier test but without paying attention to reducing amount of logging, gave the following:

Mean sector writer per minute 321 = 2.7 kb/s, this included the feedwriter logging that I disabled for the above two tests, emonhub at ‘warning’ level with quite a few decoder match errors filling the log and I had not cleared the logs. So its not really a relevant comparison but useful to understand what kind of magnitude the write load can get too.

TrystanLea · 20 April 2019 17:46

It seems that even with rsync enabled the log2ram sync write load keeps escalating, while I would have thought it should have reduced the size of the write to the relevant difference in the log file…

I will try log2ram again next but with rsync disabled.

johnbanks · 20 April 2019 18:15

@TrystanLea
Ah! - if I’ve understood correctly …
The 4 bytes vs 9 bytes / phpfina vs phptimeseries is irrelevant - it just affects the amount of ‘waste’ in the 1024 block.
And that would be true up to 113 (1024/9) datapoints per block write.
So if posting every 10 secs, that would mean buffering and only writing every 18 mins (113*10/60) or thereabouts in order to eliminate ‘waste’.
Thx

borpin · 20 April 2019 18:51

Have you fixed the emonhub log entries going to both daemon.log and syslog?

TrystanLea · 20 April 2019 19:48

Yes that’s applied as part of the image build script

borpin · 20 April 2019 20:05

Ok. The issue then is that it is not just emonhub log entries saved in daemon.log - all services default to writing log entries to that file.

I’ve developed a mechanism to isolate the emonhub log entries here emonSD next steps: emonhub logging which might be worth investigating.

Can you explain what you mean here?

One of the problems is the rate of logging fills the logs. For emonhub at least, I don’t think there is an appetite to reduce the granularity, just a means of managing it. I also think we need to cater for worst case i.e. maximum logging.

Are you still running hourly logrotates?

borpin · 20 April 2019 20:14

Do you mean the write speed i.e. writes per second or the number of actual writes in a second? Probably the latter but just to be sure, as it is (of course) the number of writes that happen, not how quickly they happen.

TrystanLea · 20 April 2019 21:51

Thanks @borpin

Here are the feedwriter changes: move all feedwriter logging to emoncms log · emoncms/emoncms@b561ac1 · GitHub all logging now goes via the logger, Im not sure why there was still print statements in there.

no, its just log2ram with rsync setting on/off.

the measurement is the output of proc/diskstats read every 60s, so that ends up being sectors written over a 60s period.

borpin · 20 April 2019 21:55

Ah Ok into the emoncms.log. Thanks.

TrystanLea · 20 April 2019 21:56

Whats not quite clear to me yet is how much less write load log2ram with an hourly sync is resulting in vs logging straight to the SD card in the first place and what the difference is between rsync enabled or disabled. That’s what Im trying to test at the moment. Im testing a reduced logging configuration but the results should maintain similar proportions if logging was more granular… although that would also be good to confirm with testing

pb66 · 20 April 2019 22:17

I would be surprised if there is any real advantage using rsync over cp. In general rsync only offers an advantage if there are large files (>GB) and/or remote targets. The small files in play here on a local device I expect to be at a disadvantage using rsync due to the additional processing to calculate.

That may be true, however whilst you are working at such a macro level, but still with the same size sectors and blocks the differences are going to be that much harder to detect and/or measure. Ideally you should be over logging so that the differences are obvious. If it were me, I would write a small program to log dummy data to a handful of log files at such a rapid rate as to make any other un-controlled activity negligible and therefore all tests are comparable. It would speed up the tests, make the tests more consistent and the results more obvious. Doing it this way also stress tests things at the same time.

[edit] the results of your tests above seem to show L2R at least halves the write activity if I’m reading correctly. Half the wear means twice the life span, what are your aims?

The fact you see so much reduction when generating so few log messages is outstanding IMO, that effectiveness can only get better with more traffic, there is only so far you can reduce “barely writing” down to. With larger writes there would be more to save.

TrystanLea · 21 April 2019 08:15

My understanding was that rsync transfered only the content of the file that had changed rather than copying the file anew. rsync - Wikipedia I guess it depends what the chunk size is which isnt entirely clear?

The test showing half the writes is for logs starting from a cleared state. As the log file size increases it appears the size of the hourly sync spike increase as well, the question is does it start becoming a larger write load than direct writes after x number of days, or with a lower logging resolution, that is what is not entirely clear to me…

TrystanLea · 21 April 2019 08:21

Here are the overnight results for log2ram with rsync turned off

again showing increasing spike magnitudes over time.

The mean sector writes per minute is 38.4 = 0.32 kb/s
The size of the /var/log folder using ‘du -sh /var/log’ increased from 384k to 1016k over 12h15 mins
iotop reported 1784 k of jbd2/mmcblk0p2 writes over the same period.

TrystanLea · 21 April 2019 17:01

The write rate from 9pm to 6am not including the large spike just before 7am was 20.8 or 0.17kb/s, from 7am until 5:30pm this evening it was 36.4 or 0.3 kb/s, the spike magnitude stepped down a little after midday today before continuing an upwards trend. Still not really enough data to be sure what the longer term implications are. It would be good to have 3-4 days at least.

yes that would be good as well, i was keen to test a reduced logging level to see what kind of write rate having the emoncms and emonhub logging at ‘error’ entails. I’d be grateful for any help replicating the above logs on different configurations. Do you see your sector writes per minute continue to increase over time?

Ideally the sync step should only copy the new data and not overwrite the entire file every time?..

borpin · 21 April 2019 18:43

Can you do some setup instructions?

TrystanLea · 21 April 2019 18:53

Sure, here’s the script Im using to record /proc/diskstats:

<?php
$apikey = "";
$host = "http://localhost/emoncms";

$redis = new Redis();
$redis->connect("localhost");

// Retrieve last value from redis
$last_sectors_written2 = $redis->get("diskstats:sectorswritten:2");
// Fetch current value
$total_sectors_written2 = exec("sudo awk '/mmcblk0p2/ {print $10}' /proc/diskstats");
// Save current value to redis
$redis->set("diskstats:sectorswritten:2",$total_sectors_written2);
// Calculate difference
$sectors_written2 = $total_sectors_written2 - $last_sectors_written2;

// Retrieve last value from redis
$last_sectors_written3 = $redis->get("diskstats:sectorswritten:3");
// Fetch current value
$total_sectors_written3 = exec("sudo awk '/mmcblk0p3/ {print $10}' /proc/diskstats");
// Save current value to redis
$redis->set("diskstats:sectorswritten:3",$total_sectors_written3);
// Calculate difference
$sectors_written3 = $total_sectors_written3 - $last_sectors_written3;

// Send to emoncms
file_get_contents("$host/input/post?node=diskstats&data=totalsectorswritten2:$total_sectors_written2,sectorswritten2:$sectors_written2,totalsectorswritten3:$total_sectors_written3,sectorswritten3:$sectors_written3&apikey=$apikey");

I run this from cron with:

* * * * * php /opt/emon/emonpi/track_diskstats.php > /dev/null 2>&1

This then creates 4 inputs, all of which I log to feed using PHPFINA at 60s interval. The screenshots above show examples of the resulting graphs.

TrystanLea · 22 April 2019 07:05

The trend is still increasing, average from the start is now 47.2 sectors/minute 0.4kb/s, the last 6 hours 72.7 sectors/minute 0.6 kb/s (higher than the write rate logging direct to disk of 0.54 kb/s). The /var/log folder size itself is now 2.1mb (increased at 0.014 kb/s for last 15 hours, the same rate as in the 12h15 period posted above).

borpin · 22 April 2019 07:23

I ran these stats overnight. Running a standard EmonSD on an EmonPi with just a CT but picking up RF from my other setup. Using the modified emonhub logging as mentioned elsewhere. emonSD next steps: emonhub logging - #13 by borpin

Overnight.

What might be interesting is plotting where feedwriter is writing and what volume. I cannot explain why for some parts of the period the minimum was not zero then from 05:00 it was (not me guv!)! The spike is probably a logrotate of emonhub log. Could it be feedwriter is not exactly 60s?

A close up of the writes as a result of the logrotation at midnight UTC. It was 3 stage rotation, concatination of latest log, a 1.gz file rotated to 2.gz and a new 1.gz file created. [edit]Log compression is about 10:1 - first compressed log was not 24Hrs

-rw-r--r-- 1 root root  19M Apr 22 06:58 /home/pi/data/emonhub-log/emonhub.log
-rw-r--r-- 1 root root 6.3M Apr 22 00:21 /home/pi/data/emonhub-log/emonhub.log.1.gz
-rw-r--r-- 1 root root 792K Apr 21 00:14 /home/pi/data/emonhub-log/emonhub.log.2.gz

TrystanLea · 22 April 2019 08:53

Thanks @borpin

did you record sectorswritten3 as well? that should include all your emoncms data write load.

other than the emonhub log modification are your logs otherwise on tmpfs without any persistance to disk? as in standard emonSD configuration?

What logging level is emonhub set? and can you post the mean writes/minute given in the graph stats below the graph?

borpin · 22 April 2019 09:44

Yes

Standard setup so DEBUG I think.

Yes. The peaks coincide with the emonhub log rotations.

I’ll repost the graph as well so the 2 match

BUT those peaks are unexplained they might have been something I did but are very strange. The data before the first peak is:

Not sure how useful this is…