This is great to hear!
I see from the repo that you are using a NUCLEO-F401RE development board, which has an STM32F401RET6 fitted.
Although the F4 range are faster, more powerful MCUs, the board I chose for testing was the NUCLEO-F303RE, which runs slightly slower and has less SRAM, because its ADC specs were far more appealing.
The F401 has only one 12-bit ADC with up to 16 channels and a max speed of 2.4 Msps, whereas the F303 has 4 12-bit ADCs with the potential for up to 40 channels and a max speed of a whopping 18 Msps. That’s 7.5x as many samples per second from a slower MCU. The 64-pin package on the NUCLEO-F303RE dev board allows access to 22 ADC channels (you need the 128-pin package to access all 40).
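To put rough numbers on that comparison, here is a quick back-of-envelope sketch in Python. It uses only the figures quoted above. The even split across channels is a simplifying assumption (real scan rates depend on sampling time, clock setup and how the ADCs are interleaved), and the names are made up for illustration.

```python
# Headline ADC figures quoted above (samples per second, usable channels).
F401_MAX_SPS = 2_400_000    # single 12-bit ADC
F401_CHANNELS = 16
F303_MAX_SPS = 18_000_000   # four 12-bit ADCs combined
F303_CHANNELS_64PIN = 22    # channels reachable on the NUCLEO-F303RE


def per_channel_sps(total_sps, channels):
    """Upper bound per channel, ASSUMING the total rate is shared evenly."""
    return total_sps / channels


speedup = F303_MAX_SPS / F401_MAX_SPS
print(f"F303 vs F401 headline rate: {speedup}x")
print(f"F401 per channel: {per_channel_sps(F401_MAX_SPS, F401_CHANNELS):.0f} sps")
print(f"F303 per channel: {per_channel_sps(F303_MAX_SPS, F303_CHANNELS_64PIN):.0f} sps")
```

Even with every available channel in use, both parts leave a lot of headroom over mains frequency, so the extra speed mostly buys oversampling and tighter timing between voltage and current samples.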
The F303 also has 4 op amps that can be used as programmable gain amps, or perhaps one could provide the mid-rail voltage instead of voltage dividers.
Did you have to modify the shield at all, given that it was designed for an Arduino running at 5 V? E.g. the voltage dividers to the RFM would need bypassing and the burdens changing to suit 3.3 V?
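As a rough illustration of why the burdens need changing, here is a minimal sketch of the usual sizing sum: the CT signal sits on a mid-rail bias, so its peak can only swing half the ADC reference. The CT values (100 A primary, 2000 turns) are assumed example figures, not taken from the shield, and `burden_ohms` is a hypothetical helper name.

```python
import math


def burden_ohms(adc_ref_volts, primary_rms_amps=100.0, ct_turns=2000):
    """Largest burden that keeps the CT signal inside 0..adc_ref_volts.

    The signal is biased at mid-rail, so the peak swing available is
    adc_ref_volts / 2. The CT defaults are ASSUMED example values.
    """
    secondary_peak_amps = primary_rms_amps * math.sqrt(2) / ct_turns
    return (adc_ref_volts / 2) / secondary_peak_amps


for vref in (5.0, 3.3):
    print(f"{vref} V ADC: ideal burden = {burden_ohms(vref):.1f} ohms")
```

In practice you would pick the next standard resistor value below the ideal figure so the peaks do not clip.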
Maybe I should consider grabbing a shield for initial development on the STM? I was actually thinking about doing a short run of 10 or 20 boards to break out 18 ADC inputs via the ST Morpho connectors, with some headers for the mid-rail and voltage dividers, so we could try out and compare the onboard op amp options.
These devices have an internal voltage reference calibrated in production, and since we have the increased resolution (10-bit to 12-bit) we might be able to use a smaller input range with much lower value burdens, which could remove much of the phase shift from the CTs. Yes, there is a potential for more noise, but I wonder if the result would be better or worse?
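To sketch the idea in numbers: as far as I recall from ST's reference manuals, the real supply can be estimated as VDDA = 3.3 V x VREFINT_CAL / VREFINT_DATA, where VREFINT_CAL is the factory value and VREFINT_DATA is a live reading of the internal reference. The 3.3 V calibration voltage is an assumption to check against the datasheet for your part, and the readings below are hypothetical values, for illustration only.

```python
VREFINT_CAL_VOLTS = 3.3  # ASSUMED supply voltage used for the factory calibration


def vdda_from_vrefint(vrefint_cal, vrefint_raw):
    """Estimate the real ADC supply from the factory value and a live reading."""
    return VREFINT_CAL_VOLTS * vrefint_cal / vrefint_raw


def lsb_volts(full_scale_volts, bits):
    """Size of one ADC step in volts."""
    return full_scale_volts / (1 << bits)


# Hypothetical readings, for illustration only.
vdda = vdda_from_vrefint(vrefint_cal=1500, vrefint_raw=1520)

arduino_lsb = lsb_volts(5.0, 10)  # 10-bit ADC on a 5 V rail
stm32_lsb = lsb_volts(vdda, 12)   # 12-bit ADC on the measured rail

# Input span over which the 12-bit ADC still resolves 1024 distinct steps.
span_for_1024_steps = 1024 * stm32_lsb

print(f"Estimated VDDA: {vdda:.3f} V")
print(f"10-bit @ 5 V step: {arduino_lsb * 1000:.2f} mV")
print(f"12-bit @ VDDA step: {stm32_lsb * 1000:.3f} mV")
print(f"Span giving 1024 steps: {span_for_1024_steps:.3f} V")
```

In other words, a signal filling only a quarter of the 12-bit range still gets as many steps as the 10-bit ADC gives at full scale, which is what would make much smaller burdens possible. Whether the noise floor allows it is the open question.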