There are multiple issues to address here. Yes, the incoming data could perhaps be better, but since we encourage data to be sent from here, there and everywhere, the inputs must be robust enough to handle rogue data and/or occurrences.
This issue is as old as I can remember. I used to see it often with HTTP inputs, though not so much recently. It was a prolific problem when the emonpi variant of emonhub had issues, and it was just as bad when the emoncms MQTT input had issues.
Looking just at emoncms, there are 2 parts to the issue: firstly, it shouldn’t create the duplicate input(s), and secondly, when it does, it shouldn’t continue to update the wrong (duplicate) input. If the duplicate inputs never occur (due to a more robust input module) then obviously the latter part of the issue is moot, but the fact that duplicates can occur becomes a much bigger problem when emoncms then continues to update the wrong input. If it simply reverted to the original after creating the duplicate, there would be minimal interruption in the data and no user intervention required, except perhaps to delete any duplicate inputs every so often (annually or so?).
The issue with updating the wrong input is another manifestation of a common issue that affects other parts of emoncms too (see Apps feedlist issues), where emoncms defaults to using the last found item if there are multiple items of the same name. Since this is an issue for duplicated inputs, this behaviour could perhaps be modified to always use the first found, which would minimise the impact of the occasional duplicated input.
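To illustrate the difference between "last found" and "first found", here is a rough sketch in Python. It is not emoncms code: the row layout, the field names (`id`, `nodeid`, `name`) and both function names are made up for the example, and I am only assuming that the lookup is built by looping over the stored inputs.

```python
# Hypothetical stand-in for the stored inputs; not the real emoncms schema.
inputs = [
    {"id": 1, "nodeid": "emonpi", "name": "power1"},  # original input
    {"id": 2, "nodeid": "emonpi", "name": "power2"},
    {"id": 7, "nodeid": "emonpi", "name": "power1"},  # rogue duplicate
]


def index_last_found(rows):
    """Plain assignment: a later row overwrites an earlier one with the same key."""
    lookup = {}
    for row in rows:
        lookup[(row["nodeid"], row["name"])] = row["id"]
    return lookup


def index_first_found(rows):
    """setdefault only stores a key the first time it is seen."""
    lookup = {}
    for row in rows:
        lookup.setdefault((row["nodeid"], row["name"]), row["id"])
    return lookup


print(index_last_found(inputs)[("emonpi", "power1")])   # duplicate id wins
print(index_first_found(inputs)[("emonpi", "power1")])  # original id wins
```

With "first found", a duplicate that sneaks in later is simply ignored by the lookup, so the original input keeps receiving the data.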
A more aggressive approach would be to delete all but the first occurrence of any given input; after all, the keys are supposed to be unique within that node/device. Obviously I would prefer to see the duplicate inputs not happen at all rather than them getting automatically removed, but however it’s done, it must be fixed within emoncms for emoncms to be in control of its own reliability.
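The "delete all but the first occurrence" idea could look something like the sketch below. Again this is Python for illustration only, not emoncms code. The field names and `split_duplicates` are hypothetical, and it assumes that the lowest `id` within a node/name pair is the original input.

```python
# Hypothetical stand-in for the stored inputs; not the real emoncms schema.
inputs = [
    {"id": 7, "nodeid": "emonpi", "name": "power1"},  # rogue duplicate
    {"id": 1, "nodeid": "emonpi", "name": "power1"},  # original input
    {"id": 2, "nodeid": "emonpi", "name": "power2"},
    {"id": 9, "nodeid": "emontx", "name": "power1"},  # same name, different node
]


def split_duplicates(rows):
    """Return (keep, remove): the first row per (nodeid, name), and the rest.

    Assumption: the lowest id is the original, so rows are sorted by id first.
    """
    seen = set()
    keep, remove = [], []
    for row in sorted(rows, key=lambda r: r["id"]):
        key = (row["nodeid"], row["name"])
        if key in seen:
            remove.append(row)
        else:
            seen.add(key)
            keep.append(row)
    return keep, remove


keep, remove = split_duplicates(inputs)
print([r["id"] for r in keep])    # inputs that would be kept
print([r["id"] for r in remove])  # duplicates that would be deleted
```

Note the key is the node plus the name, so the same input name on a different node is not treated as a duplicate.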
This duplication of inputs may “not be a problem” currently, for some or even most users, but whenever an input issue does arise it tends to result in duplicate inputs, and therefore lost data and unhappy users.