This might not be for everyone, but the question of how to get data out of the emonPi for analysis elsewhere has come up a few times on here.
Over Christmas I wrote some Python functions to query the API and return the data as a Pandas DataFrame. The functions are attached below as emon_api_functions.py, along with an example showing how they are used, get_data_2.py.
The emon API has a query limit of 8928 samples. When fetching historic data the query interval is set to 10 seconds, which means a day's worth of data (8640 samples) is just under this limit. The example file shows how you can loop through a number of days and simply concatenate the resulting queries into a single DataFrame, before saving it as a CSV file or a binary pickle.
Doing this, it takes about 2 minutes to download 12 months of data at 10-second resolution for 4 feeds. The data is then stored locally as a binary pickle file so you don't have to hammer the API each time.
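The loop-and-concatenate approach might look roughly like this. This is just a sketch, not the attached code: it assumes the standard emoncms feed/data.json endpoint, that start/end are given in milliseconds, and that the response is a list of [time_ms, value] pairs; the base URL, API key and feed id are placeholders for your own install.

```python
import requests
import pandas as pd

BASE_URL = "http://emonpi/emoncms"   # placeholder: adjust for your emonPi
API_KEY = "your_read_api_key"        # placeholder

def to_frame(pairs, name="value"):
    """Turn the API's [time_ms, value] pairs into a time-indexed DataFrame."""
    df = pd.DataFrame(pairs, columns=["time_ms", name])
    df.index = pd.to_datetime(df["time_ms"], unit="ms")
    return df[[name]]

def fetch_day(feed_id, day_start_s, interval=10):
    """One day at 10 s resolution is 8640 samples, under the 8928 limit."""
    start_ms = day_start_s * 1000            # the data API expects milliseconds
    end_ms = (day_start_s + 86_400) * 1000
    r = requests.get(f"{BASE_URL}/feed/data.json", params={
        "id": feed_id, "start": start_ms, "end": end_ms,
        "interval": interval, "apikey": API_KEY})
    return to_frame(r.json())

def fetch_range(feed_id, first_day_s, n_days):
    """Loop over days and concatenate the queries into one DataFrame."""
    frames = [fetch_day(feed_id, first_day_s + d * 86_400) for d in range(n_days)]
    return pd.concat(frames)

# e.g. fetch_range(1, first_day_s, 365).to_pickle("feed_data.pkl")
# and later: main_df = pd.read_pickle("feed_data.pkl")
```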
At the moment the feed names need to be unique across the API, so if you have, for example, two feeds called “temperature” under different tags (“boiler”, “cylinder”, etc.) it won’t work. I only have a single emonPi and my feed names are unique, so this is not a problem for me.
I intend to develop this further when I have time, but if it’s useful to you, or you have feedback, do post below.
redacted_api_for_emon_py.zip (2.8 KB)
Looks really useful
Had you found the UsefulScripts repo? GitHub - emoncms/usefulscripts: Some useful scripts for administering Emoncms accounts
I like that!
I’m going to have to investigate what a Binary Pickle file is!
I’m interested in a static file/source of average data to thin down my data Feeds. This might be the way to do it! Do you use the metadata to determine feed start for instance? I’d then be inclined to feed the data back into emoncms as a new aggregate feed.
Why not use the FeedID?
Thanks - I did not know of the usefulscripts archive. I see there’s one for getting the carbon intensity from the NationalgridESO API, which I was considering writing from scratch! I see it’s on GitHub, which is on my “2023 to learn” list. I’m not a software engineer, but you see more and more data being made available on GitHub now, not just code, so I need to get my head round pull, fork, merge, etc.
Do you use the metadata to determine feed start for instance?
At the moment the script just takes the dates it’s given (and I know those from how long I have had the emon), but it’s easy to convert the feed start time from the metadata back to a datetime, e.g.
Out: datetime.datetime(2021, 10, 23, 11, 55, 20)
Although it’s important to note that the metadata from the API is in seconds from the epoch, while the download requests you make to the API are in milliseconds from the epoch.
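The two unit conventions can trip you up, so here is a small sketch of handling both. The metadata endpoint name (feed/getmeta) is assumed here, and the timestamp is the example start time from above.

```python
from datetime import datetime, timezone

# Feed metadata (e.g. from a feed/getmeta request -- endpoint name assumed)
# gives the feed start time in SECONDS from the epoch:
start_s = 1634990120
print(datetime.fromtimestamp(start_s, tz=timezone.utc))
# 2021-10-23 11:55:20+00:00

# Data download requests, by contrast, take start/end in MILLISECONDS,
# so multiply by 1000 before building the query:
start_ms = start_s * 1000
```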
Why not use the Feed ID?
The main download function does use the feed ID. I just wanted to be able to use a list of feed names I know, [‘use’, ‘solar’, ‘export’] etc., so I wrote a feed_id_from_name function.
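A name-to-ID lookup along those lines might be sketched as below. This is my own guess at the approach, not the attached code: it assumes the emoncms feed/list.json endpoint returns a list of feed dicts with "id" and "name" keys, and the base URL and API key are placeholders.

```python
import requests

def match_feed_id(feeds, name):
    """Pick the unique feed id for a given name from a feed list
    (each entry a dict with at least 'id' and 'name' keys)."""
    matches = [f["id"] for f in feeds if f["name"] == name]
    if len(matches) != 1:
        # Covers both a missing feed and the duplicate-name caveat above
        raise ValueError(f"expected one feed named {name!r}, found {len(matches)}")
    return matches[0]

def feed_id_from_name(name, base_url="http://emonpi/emoncms",
                      apikey="your_read_api_key"):
    """Query the feed list and resolve a feed name to its id."""
    feeds = requests.get(f"{base_url}/feed/list.json",
                         params={"apikey": apikey}).json()
    return match_feed_id(feeds, name)
```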
The resampling is all done in Pandas, which is brilliant for time series data; it’s simply one line: daily_df = main_df.resample('D').mean()
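To show the one-liner in context, here is a self-contained example with made-up data at the emonPi's 10-second sample rate (the column name "power" is just illustrative):

```python
import pandas as pd
import numpy as np

# Two days of 10-second samples (8640 per day), values made up
idx = pd.date_range("2023-01-01", periods=2 * 8640, freq="10s")
main_df = pd.DataFrame({"power": np.arange(len(idx), dtype=float)}, index=idx)

# The one-line daily resample: one mean value per calendar day
daily_df = main_df.resample("D").mean()
print(daily_df)
```

Other aggregations drop in the same way, e.g. .resample('D').max() or .resample('h').sum().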
I have used Pandas elsewhere with datasets of several million rows, with queries executed in seconds (although with a bigger computer than a Pi!). A better example is my weather station, which uses Pandas to take 1440 samples over a day and calculate the daily stats at midnight, with no issues. We’re getting a bit off topic, but there are loads of examples & “courses” on the web.
Have you tried ‘R’ at all for data analysis?
Nope! I think R is more for professional statisticians; I know several who use it to do analysis I can’t understand! If you want to do basic analysis / manipulation of data, Python and Pandas are great, and there are also NumPy and SciPy if you want to do more advanced stuff. E.g. for my immersion diverter I used NumPy to fit a polynomial to the calibration data curve, and then compute power from current. For graphing there is Matplotlib.
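The polynomial fit is a couple of lines with NumPy. The calibration points below are made up for illustration; the real diverter's data (and the degree of polynomial that fits it) will differ.

```python
import numpy as np

# Made-up calibration points: measured current (A) vs true power (W)
current = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
power = np.array([111.25, 225.0, 460.0, 960.0, 2080.0])

coeffs = np.polyfit(current, power, deg=2)   # least-squares quadratic fit
power_from_current = np.poly1d(coeffs)       # callable polynomial

print(power_from_current(3.0))               # estimated power at 3 A, ~705 W here
```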
There is an overlap between them, and different people probably have different views, but I’m just an enthusiastic amateur and went the Python/Pandas/NumPy route.