Bulk API - overwrite existing data

Continuing the discussion from Octopus Agile emoncms app:

@TrystanLea, I have had a number of gaps in my Octopus data and am trying to ‘fill in’ the gaps with data I have recovered using the agile script.

I have made a number of changes to the script that allow me to specify different configuration files (so the data can be uploaded to different servers) and also to recover a certain number of days’ data - this can be used to fill in gaps.

My issue now is that it seems the bulk upload API will not overwrite existing data.

This is probably by design, but would it be possible to force a data overwrite?

[edit]
It also looks like this API call is undocumented.

You and I have discussed this in great depth before: when your emoncms instance is “low-write”, i.e. uses the feedwriter service, it won’t overwrite; a “basic” or “standard” emoncms install will overwrite. Do you have a test emoncms instance? It might be as easy as switching “low-write” off temporarily.

But I’d strongly recommend trying it out on a test server before trying it on an instance with valuable data. (Unless you take a full virtual server image backup immediately beforehand, perhaps?)

I’ll blame it on old age, but I don’t remember that at all (I’m sure you will point it out to me).

I really do not see why it should make any difference. I’m writing the data to a feed (albeit in bulk); how that data then gets written to disk should be irrelevant.

I’m just about to head out so can’t reply in full just now; see this thread, which should help.

Blimey, that is some memory. I have absolutely no recollection of that discussion.

This post is probably the more relevant one: Data Viewer questions - #14 by pb66.

Bugger.

I suspect the remaining points in that thread are still relevant. :frowning:

Stuff like that just sticks with me (forever, it feels like sometimes). Whilst I can vividly remember tech info, debugging and tech discussions for decades, I used to feel a sense of achievement if I could manage to remember a new acquaintance’s name for longer than the handshake lasted (back when that was still allowed).

Try switching off “redisbuffer” as I suggested (test first), or: stop feedwriter, delete the feed data back far enough that it doesn’t overlap, clear redis, restart feedwriter and repost the data. Not something I’d want to do too frequently, but if you need to fix a feed with lots of gaps it could be done. Or is there a “merge feeds” post-process?

I could quite easily add an option to skip the buffer to the bulk feed data upload endpoint; it’s already an option on the feed/update endpoint:

emoncms/feed_controller.php at master · emoncms/emoncms · GitHub

emoncms/feed_model.php at master · emoncms/emoncms · GitHub

I’m not 100% sure what the implications would be of a deeper change to the redis buffer implementation. It might be worth investigating, but I don’t want to break something that works. Adding a skip-buffer option on the feed/insert endpoint itself would not affect the main data path via emoncms inputs etc., so it is unlikely to cause an SD wear issue.

@TrystanLea That sounds good. Could you update the API docs page at the same time, please, to show this call and the JSON data format to use?

Ok, here it is, currently in a feature branch for testing:

branch:
https://github.com/emoncms/emoncms/tree/feed_insert_skipbuffer
commit:
skipbuffer option on feed insert · emoncms/emoncms@48d4063 · GitHub

I’ve also added the parameter to the usefulscripts agile script (currently only in the master branch). It still works even if you don’t have the right emoncms branch - it will just ignore the parameter in that case.
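For anyone wanting to try this, here is a minimal sketch of calling the endpoint from Python. The server URL, feed id and API key are placeholders; the payload format - a JSON array of `[unix_timestamp, value]` pairs - matches what the agile script posts:

```python
import json

def insert_feed_data(server, feedid, apikey, points, skipbuffer=True):
    """Post a batch of [unix_timestamp, value] datapoints to an emoncms feed.

    With skipbuffer=1 the write bypasses the redis buffer, so on the
    feed_insert_skipbuffer branch existing datapoints can be overwritten.
    """
    import requests  # imported here so the helper can be defined without requests installed
    params = {"id": feedid, "apikey": apikey}
    if skipbuffer:
        params["skipbuffer"] = 1
    return requests.post(server + "/feed/insert.json",
                         params=params,
                         data={"data": json.dumps(points)})

# Example payload: two half-hourly kWh readings, timestamps in unix seconds
points = [[1609459200, 0.123], [1609461000, 0.456]]
# insert_feed_data("https://emoncms.example.org", 123, "MY_APIKEY", points)
```

Note that without skipbuffer=1, a low-write install will still refuse to overwrite existing datapoints.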


@TrystanLea that works for me :slight_smile:

My current (hacked) agile script looks like this. I can call it with a specified config file (I upload the data to different instances of emoncms) and I can specify the number of days to retrieve, to allow me to pick up older data. The parameter handling needs to be better :slight_smile: - yes, I know it needs both the config file and the number of days to work properly.

python3 /opt/emoncms/modules/usefulscripts/octopus/agile.py /home/pi/agile47.conf 30

You might consider implementing some of these ideas.

#!/usr/bin/env python3

import sys, os, requests, json
from datetime import datetime, timedelta
from configobj import ConfigObj

script_path = os.path.dirname(os.path.realpath(__file__))

if len(sys.argv) == 1:
    # No config file specified
    settings = ConfigObj(script_path+"/agile.conf", file_error=True)
else:
    try:
        settings = ConfigObj(sys.argv[1], file_error=True)
    except IOError:
        print('Config file not found - did you use the full path?')
        sys.exit(1)

if len(sys.argv) == 3:
    # days offset - number of days to get
    days_offset = int(sys.argv[2])
else:
    days_offset = 0

now = datetime.now()

print("Current date and time: " + now.strftime("%Y-%m-%d %H:%M:%S"))
# Step 1: Create feed via API call or use input interface in emoncms to create manually
result = requests.get(settings['emoncms']['server']+"/feed/getid.json",params={'tag':settings['emoncms']['tag'],'name':settings['emoncms']['name'],'apikey':settings['emoncms']['apikey']})
if not result.text:
    # Create feed
    params = {'tag':settings['emoncms']['tag'],'name':settings['emoncms']['name'],'datatype':1,'engine':5,'options':'{"interval":1800}','unit':'kWh','apikey':settings['emoncms']['apikey']}
    result = requests.get(settings['emoncms']['server']+"/feed/create.json",params)
    result = json.loads(result.text)
    if result['success']:
        feedid = int(result['feedid'])
        print("Emoncms feed created:\t"+str(feedid))
    else:
        print("Error creating feed")
        sys.exit(1)
else:
    feedid = int(result.text)
    print("Using emoncms feed:\t"+str(feedid))

# Agile request parameters
params = {'page':1,'order_by':'period','page_size':25000}

# Step 2: Fetch feed meta data to find last data point time and value
result = requests.get(settings['emoncms']['server']+"/feed/getmeta.json",params={'id':feedid,'apikey':settings['emoncms']['apikey']})
meta = json.loads(result.text)
print("Feed meta data:\t\t"+result.text +" -- "+datetime.fromtimestamp(meta['start_time']).astimezone().isoformat())

if meta['npoints']>0:
    end_time = datetime.fromtimestamp(meta['start_time'] + (meta['interval'] * meta['npoints']))
    end_time = end_time - timedelta(days=days_offset)
    params['period_from'] = end_time.astimezone().isoformat()
    print("Request from:\t\t"+params['period_from'])

# Step 3: Request history from Octopus
url = "https://api.octopus.energy/v1/electricity-meter-points/%s/meters/%s/consumption/" % (settings['octopus']['mpan'],settings['octopus']['serial_number'])
result = requests.get(url,params=params,auth=(settings['octopus']['agile_apikey'],''))
data = json.loads(result.text)

if not data: sys.exit(0)
if 'results' not in data: sys.exit(0)

dp_received = len(data['results'])
print("Number of data points:\t%s" % dp_received)

# Step 4: Process history into data array for emoncms
data_out = []
for dp in data['results']:
    time = int(datetime.timestamp(datetime.strptime(dp['interval_start'],"%Y-%m-%dT%H:%M:%S%z")))
    value = dp['consumption']
    print(dp['interval_start']+" "+str(value))
    data_out.append([time,value])

# Step 5: Send data to emoncms
if len(data_out):
    print("Posting data to emoncms")
    result = requests.post(settings['emoncms']['server']+"/feed/insert.json",params={'id':feedid,'apikey':settings['emoncms']['apikey'],'skipbuffer':1},data={'data':json.dumps(data_out)})
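The parameter handling mentioned above could be tightened up with argparse. A rough sketch - the flag names here are my own suggestion, not part of the existing script:

```python
import argparse

def parse_cli(argv=None):
    """Parse agile-script arguments; flag names are only a suggestion."""
    parser = argparse.ArgumentParser(
        description="Fetch Octopus Agile consumption data into an emoncms feed")
    parser.add_argument("config", nargs="?", default="agile.conf",
                        help="path to the config file (defaults to agile.conf)")
    parser.add_argument("--days", type=int, default=0,
                        help="re-fetch this many days before the feed's last datapoint")
    return parser.parse_args(argv)

args = parse_cli(["/home/pi/agile47.conf", "--days", "30"])
print(args.config, args.days)  # /home/pi/agile47.conf 30
```

This makes the config file optional again and lets the days offset be given independently, unlike the current positional handling which needs both.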

Thanks @borpin, that looks good. I’ll have a look at integrating these changes.


Thanks TrystanLea. I could really use this feature, but the branch seems to have disappeared and I don’t think the feature made it into master. Is there a supported way to bulk-load data, including overwriting “old” data points?

I made a small change to /var/www/emoncms/Modules/feed/engine/PHPFina.php to allow overwriting old data, which works fine on my local install but of course won’t help me with the public emoncms.org server:

        foreach ($data as $dp) {
            // Calculate the interval that this datapoint belongs to
            $timestamp = (int) $dp[0];
            $timestamp = floor($timestamp / $meta->interval) * $meta->interval;
            // Value is float or NAN
            $value = (float) $dp[1];
            if (is_nan($value)) $value = NAN;
            // commented out the following code
            /*
            // Append new
            if ($timestamp>$last_timestamp) {
                $last_timestamp = $timestamp;
                $valid[] = array($timestamp,$value);
                $index++;
            // Update last
            } else if ($timestamp==$last_timestamp) {
                if ($index>0) {
                    $valid[$index-1][1] = $value;
                }
            }
            */
            // and replaced it with:
            $valid[] = array($timestamp,$value);
            $last_timestamp = max($last_timestamp, $timestamp);
        }
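To illustrate why the stock code drops older data: PHPFina stores one value per fixed interval, and the insert path above only keeps datapoints newer than the last one written. A rough Python model of the two behaviours - just the validation logic from the snippet, not the actual engine:

```python
import math

def validate_append_only(data, interval, last_timestamp):
    """Mirror of the stock logic: keep only datapoints newer than the last
    written timestamp; datapoints older than it are silently dropped."""
    valid = []
    for ts, value in data:
        ts = math.floor(ts / interval) * interval   # snap to interval boundary
        if ts > last_timestamp:
            last_timestamp = ts
            valid.append([ts, value])
        elif ts == last_timestamp and valid:
            valid[-1][1] = value                    # only the last point can be updated
    return valid

def validate_overwrite(data, interval, last_timestamp):
    """The patched behaviour: accept every datapoint, allowing rewrites."""
    valid = []
    for ts, value in data:
        ts = math.floor(ts / interval) * interval
        valid.append([ts, value])
        last_timestamp = max(last_timestamp, ts)
    return valid

# A batch where the second datapoint is older than the first (a back-filled gap)
batch = [[1800, 1.0], [900, 2.0]]
print(validate_append_only(batch, 900, 0))  # [[1800, 1.0]] - the old point is dropped
print(validate_overwrite(batch, 900, 0))    # [[1800, 1.0], [900, 2.0]]
```

This is only the filtering step; the real engine then seeks to `(timestamp - start_time) / interval` in the data file to write each value, which is what makes overwriting safe for a fixed-interval engine.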