It might be worth investigating why the delay helps, especially after EmonLibCM_TemperatureEnable() is called. That routine appears to do all the bus discovery, so it presumably succeeds at that with or without a later call to delay(), which would imply it’s not a power-settling issue.
It looks like the last thing that routine does is a global COPY_SCRATCHPAD command to store the conversion resolution, followed by its own call to delay()…
oneWire.write(COPY_SCRATCHPAD, true);
delay(20); // required by DS18B20
It’s not clear where that 20 comes from. According to the datasheet, that transaction typically takes 2 ms and worst case 10 ms, so you’d expect 20 to be sufficient.
In my ds18b20 code I poll the bus to see when the temperature conversion is finished (the datasheet’s tCONV), and that enabled me to write up this report on vastly varying conversion times amongst the clones. I don’t currently use the COPY_SCRATCHPAD command, but when I get time I’ll hack something up that does, and see how reliable the 2 to 10 ms datasheet claim is across my samples.
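For anyone wanting to try the same thing, here is a rough sketch of the polling idea, not the code from EmonLibCM or from my own library. It assumes an externally powered sensor that answers read slots with 0 while busy and 1 when done, which is how I read the datasheet flowchart; with parasite power the bus is held high instead and polling doesn’t apply. The names waitForBusIdle, PollResult, FakeClock and FakeSensor are made up for illustration, and the fake types are only there so the logic can be tried off-target.

```cpp
#include <cstdint>

// Outcome of waiting for the sensor to signal completion.
struct PollResult {
    bool done;           // true if a '1' was read before the timeout
    uint32_t elapsedMs;  // approximate time waited, in milliseconds
};

// Hypothetical helper: issue read slots until the device answers '1'
// or the timeout expires.
//   Bus   must provide readBit() -> 0 while busy, non-zero when done
//   Clock must provide nowMs() and sleepMs(ms)
template <typename Bus, typename Clock>
PollResult waitForBusIdle(Bus& bus, Clock& clock, uint32_t timeoutMs) {
    const uint32_t start = clock.nowMs();
    for (;;) {
        const uint32_t elapsed = clock.nowMs() - start;
        if (bus.readBit() != 0) {
            return {true, elapsed};   // device reports it has finished
        }
        if (elapsed >= timeoutMs) {
            return {false, elapsed};  // gave up waiting
        }
        clock.sleepMs(1);             // poll roughly once per millisecond
    }
}

// Stand-in clock so the sketch runs without hardware.
struct FakeClock {
    uint32_t t = 0;
    uint32_t nowMs() const { return t; }
    void sleepMs(uint32_t ms) { t += ms; }
};

// Stand-in sensor that stays busy until a chosen time.
struct FakeSensor {
    const FakeClock& clock;
    uint32_t busyUntilMs;
    uint8_t readBit() const { return clock.nowMs() >= busyUntilMs ? 1 : 0; }
};
```

On real hardware I’d expect the Bus adapter to wrap the OneWire library’s read_bit() and the Clock adapter to wrap millis() and delay(). Called straight after the COPY_SCRATCHPAD write, it would log the actual copy time per sensor rather than relying on a fixed delay(20).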
@NickT - do you happen to know the UID of your ds18b20s?