I’m new to your community. I was looking at some of the commercially available energy monitors and wanted access to the raw data, and realized I’d have to go with something open source to get it. I also don’t really feel comfortable sending information about when I am home to someone else’s server.
My goal is to have low-level access to information similar to the data Sense makes available in their app, so I can integrate it into something like Home Assistant (or openHAB if that’s your thing).
My relevant credentials: I’m a neuroscience PhD student with a background in Computer Science. I do electrophysiological recordings (continuous voltage recordings from the brain, at high frequency and resolution on many channels) and then spike sort them (identify the unique voltage-time shape templates of spikes from individual neurons). I also do unsupervised segmentation and analysis of birdsong, which probably gives me the useful parts of voice recognition without the biases caused by all the human-specific natural language models (a lot of the Sense team has voice recognition backgrounds). A lot of what I’ve done with birdsong has involved machine learning and Deep Neural Networks, but I’m not sure whether that will be completely necessary in this application.
- Unsupervised identification of template models for energy usage. IoTaWatt only provides 35-40 samples per second, which is quite a bit fewer than the millions recorded by Sense or the thousands I use in my neuroscience recordings. However, the important frequencies are much lower for this application, and I’m pretty sure Sense throws out most of that information even before uploading it to their servers; otherwise the data usage would be unreasonably high. The basic idea would be to predict the derivative of (changes in) voltage, power, and energy (VPE) given some time history of VPE. If the derivative is predictable, then the VPE is probably a result of some new process. I’m taking my inspiration from delay differential equations and some of the results I use to parse songbird songs, since I don’t have a good model of what a “songbird word” is. This will probably require some computational power, so it would probably be run on a desktop, laptop, or AWS, possibly overnight or whenever a bunch of data has accumulated.
- Use the template models to publish to an MQTT topic when different energy processes begin and end, for use in Home Assistant or something similar. This should have pretty low computational cost and could easily be done on a Raspberry Pi like the one I have running Home Assistant.
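To make the two steps above a bit more concrete, here is a minimal sketch in Python. It is only one possible reading of the idea, not a worked-out method: it looks at power alone, predicts each change in power as the running median of the previous few changes, and flags samples where the observed change is far from that prediction as candidate starts or ends of a process. The predictor, the `history` and `threshold` values, the synthetic trace, and the topic name `home/energy/events` are all assumptions made for illustration. The messages are only formatted as (topic, JSON payload) pairs; actually publishing them would be left to whatever MQTT client is in use.

```python
import json
import random
from statistics import median


def detect_changes(power, history=5, threshold=10.0):
    """Return (sample_index, delta) pairs where the change in power is far
    from what the recent history predicts (illustrative predictor only)."""
    if len(power) <= history + 1:
        return []
    diffs = [b - a for a, b in zip(power, power[1:])]
    # Predict each difference as the median of the previous `history` differences.
    residuals = [
        diffs[i] - median(diffs[i - history:i])
        for i in range(history, len(diffs))
    ]
    # Robust noise scale: median absolute deviation (MAD) of the residuals.
    centre = median(residuals)
    mad = median(abs(r - centre) for r in residuals) or 1e-9
    events = []
    for k, r in enumerate(residuals):
        if abs(r - centre) > threshold * mad:
            i = k + history  # index into diffs
            events.append((i + 1, diffs[i]))  # i + 1: first sample at the new level
    return events


def to_messages(events, topic="home/energy/events"):
    """Format events as (topic, JSON payload) pairs. The topic is hypothetical."""
    return [
        (topic, json.dumps({
            "sample": index,
            "delta_w": round(delta, 1),
            "state": "on" if delta > 0 else "off",
        }))
        for index, delta in events
    ]


# Synthetic trace: a 100 W baseline with a 1500 W load on from sample 50 to 119.
rng = random.Random(0)
trace = [
    (1600.0 if 50 <= i < 120 else 100.0) + rng.uniform(-1.0, 1.0)
    for i in range(200)
]

events = detect_changes(trace)
messages = to_messages(events)
for topic, payload in messages:
    print(topic, payload)
```

The median is used instead of a mean so that a single large step does not distort the predictions for the samples that follow it. Real appliances ramp, cycle, and overlap, so a learned template model would have to replace this simple predictor.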
Things I would need to do this project:
- Help! I have slightly more time than money but I don’t have much time either… I’m trying to wrap up my PhD in the next year so I’m pretty busy. If you’re an aspiring Data Scientist or Machine Learning expert, this would be a very nice project to put into your portfolio (one of the big reasons I’m interested in doing this).
- A literature search. I’m not sure what the state of the art (SotA) is for unsupervised non-intrusive load monitoring. Hopefully some experts can provide some pointers on where to start.
- Some datasets to try out algorithms on. I’d like to gather some data together and probably format it the way NILMTK does, just so that we can compare to other methods. It sounds like quite a few of you have an IoTaWatt installed, so hopefully I can get a couple of volunteers to share their data. I’m not sure how much I’ll need, but it seems to take Sense and the other energy monitors several months to detect some of the devices, so I think starting with 3 or 4 months would be a good idea. I’m not sure how big this kind of dataset would be; it probably depends a bit on how the data is stored. BitTorrent might be a good protocol for sharing this data.
- An IoTaWatt. $190 is a bit steep for me until I figure out whether I can do the analysis I hope to, especially since the government is deciding to raise my taxes by several thousand dollars a year on an already low grad student stipend. Once I show I can get the data I want I’ll probably spring for one, but I want to make sure that 40 Hz is a high enough sampling rate first.
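On the dataset size question in the list above, a rough back-of-the-envelope estimate is easy to write down. The sketch below assumes 40 samples per second (the upper end of the rate mentioned earlier), 90 days of recording (about 3 months), one stream of values, and uncompressed 4-byte numbers. The storage format and the number of streams per install are assumptions, so treat the result as an order of magnitude only.

```python
SECONDS_PER_DAY = 86_400


def dataset_bytes(sample_rate_hz, days, streams, bytes_per_sample):
    """Uncompressed size of a recording; every argument is an assumption."""
    samples_per_stream = sample_rate_hz * SECONDS_PER_DAY * days
    return samples_per_stream * streams * bytes_per_sample


samples = 40 * SECONDS_PER_DAY * 90
one_stream = dataset_bytes(40, 90, streams=1, bytes_per_sample=4)

print(f"{samples:,} samples per stream")   # 311,040,000 samples per stream
print(f"{one_stream / 1e9:.2f} GB")        # 1.24 GB
```

So one stream for three months is on the order of a gigabyte before compression, and the total scales linearly with the number of streams and the number of volunteers.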
If this post gathers enough attention I can create a Slack channel, a GitHub repo, and maybe some BitTorrent folders to transfer data. Reply if you’re interested in helping out in any way.