STM32 3CT Example

What can produce infinity ‘inf’ in C?

We have the 3CT example here:
put together by Trys, based on the txShield example.

The output of the serial terminal is here: (43.1 KB)

There are a number of these 0 outputs. Searching the file for ‘inf’ also yields these problem lines.


The zeros only ever occur in either the first or middle line, never the 3rd. There seems to be a pattern to the amount of error between I0, I1 and I2.
The negative counter is also strange.

I could spend my day getting to the bottom of this one, I think.

Just looking at the numbers, I’d have thought it was obvious. All this is very, very basic debugging:
Where is the “inf” coming from? Which line of code?
What are the values of all the variables in that line?
Do the maths in that line by hand.
Where does each of the variables used come from?

Using that pattern trace backwards through the code that produces the errant value or values until you find the problem.

It’s much the same in any computer language. And if my hunch is right, any desktop calculator will have the same problem. So would my brain, but YMMV.

Dividing by zero, indeed.
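For anyone following along, a minimal sketch of how this shows up in C, assuming IEEE 754 floats (as on the Cortex-M4 FPU and on a desktop). The function names are made up for illustration and are not from the 3CT code.

```c
#include <math.h>
#include <stdbool.h>

/* Plain float division. With IEEE 754 arithmetic a zero divisor does
 * not crash: non-zero / 0 gives inf, and 0 / 0 gives nan. */
float ratio(float numerator, float denominator)
{
    return numerator / denominator;
}

/* True when a result is an ordinary number (neither inf nor nan),
 * i.e. safe to print or feed into further calculations. */
bool is_usable(float x)
{
    return isfinite(x);
}
```

Printing `ratio(5.0f, 0.0f)` with `%f` gives `inf`, which matches what turns up in the serial output.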

I’ve started printing back the results of certain steps of the calculation… Good idea to go through the whole chain… There are a couple of other things not right; sussing out inf was an easy one.


The consistent error between I0, I1, and I2. I should check the hardware first, see if the error’s consistent with the channels or not:


Different channels, different pattern of error, it’s hardware. That’s something to go into with a scope.

And it pays to pause and think what the physical reality of the maths means. In other words, should the calculation even be attempted if a certain set of conditions exist?
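One way to express that kind of check in code, as a sketch only: refuse to compute a power factor when there is effectively no apparent power to divide by. The function name and the threshold value are assumptions for illustration; a real threshold would come from measuring the noise floor.

```c
#include <stdbool.h>

/* Assumed minimum apparent power (in VA) below which the channel is
 * treated as idle or disconnected. Illustrative value only. */
#define MIN_APPARENT_POWER 1.0f

/* Computes power factor = real power / (Vrms * Irms).
 * Returns false and leaves *pf untouched when the apparent power is
 * too small for the result to mean anything physically. */
bool power_factor_checked(float real_power, float vrms, float irms, float *pf)
{
    float apparent = vrms * irms;
    if (apparent < MIN_APPARENT_POWER) {
        return false; /* nothing connected or no load: skip the maths */
    }
    *pf = real_power / apparent;
    return true;
}
```

The caller can then print a placeholder for that channel instead of `inf` or `nan`.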

Noted, thanks.
This comes to mind also: watch for a variable’s proximity to its upper limit, e.g. a uint64_t has a limit of 2^64 - 1 = 18,446,744,073,709,551,615.
Because an overflowed variable could make everything doolally.

And this question is important also, is it worth performing a calculation on a disconnected channel?

But for testing the maths, an assumption needs making: everything is connected hardware-wise. What are the maximum input values expected and how can we ensure variables aren’t overflowed, or if they are, how is it handled?
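To put rough numbers on the overflow question, here is a small sketch that works out how many worst-case squared samples fit in an accumulator. It assumes a 12-bit ADC (full scale 4095) and a sum-of-squares accumulator; both the function name and those assumptions are mine, not taken from the 3CT code.

```c
#include <stdint.h>

/* Largest number of squared samples that can be summed into an
 * accumulator whose maximum value is acc_max without overflowing,
 * assuming every sample sits at sample_max (the worst case). */
uint64_t max_samples_before_overflow(uint64_t acc_max, uint32_t sample_max)
{
    uint64_t worst_square = (uint64_t)sample_max * sample_max;
    return acc_max / worst_square;
}
```

With `sample_max` = 4095, a uint32_t accumulator (`UINT32_MAX`) is only safe for 256 samples, whereas a uint64_t (`UINT64_MAX`) is good for roughly 1.1 x 10^12, so resetting the accumulator every reporting period leaves a very wide margin.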

As I wrote in another thread, it’s your job as project lead to decide what the conditions need to be so that “it” functions as you intend it to do. Define “it” as the hardware, or the software, or a part of either. Then you make it clear what those operating limits are, so that people know what it can do and what it can’t.

Appreciating the clarity.
It goes without saying that this defining process is done in tandem with Trys and Glyn.

I’ve had a stunningly unproductive day getting to the bottom of these intermittent zeros: learning about arm specific methods and memory alignment. Those days gotta be had though.
. @dBC I’m reminded today (my memory’s definitely getting worse) that we encountered these zeros before, and it could have something to do with a makefile option, the optimisation setting. I’ll check tomorrow.
. In debugging the accumulators I spotted something else going on there too, but I’ll double check tomorrow before posting any details, it could have been a sprintf formatting thing.
. Could also check whether any of these functions are happy with being interrupted by the DMA half/cplt interrupts… the notables are memcpy(), memset(), and sqrt().
. Noting also there’s an arm_sqrt_f32() function in arm_math.h which could be of use.

Going back to the txShield example, on which the 3CT example is based, came to mind the other day. @dBC it worked well for you right?
I’m going off now to read more of the stm32 development thread to see what I can find. I may not use Trys’ method, despite liking the simpler approach attempted there, and instead engineer something from the more tricky looking TxShield example, tricky of course, to my untrained eyes.

I saw and reported the zeroes running your binary on the old Nucleo/TxShield h/w combo.

That was your theory at the time, but I was unconvinced. In my experience, if changing the optimisation settings changes the behaviour then you’ve probably got a bug. Tuning the optimisation settings to make it go away is usually just hiding it rather than fixing it.

My recollection is that I initially posted some code to demonstrate how fast the ADCs could run (way faster than needed for your application). Trystan then took that code and turned it into an energy monitor. When my txShield turned up I started over and that evolved into the emonTXshield demo tar files. So the two forks are at best distant cousins. AFAIK there are no spurious zeroes in the emonTxshield demo images, but if you decide to go that route keep in mind that it really was just demo code. In particular it lacks phase error adjustment - apart from a h/w trick to start the two ADCs running with a hardcoded interval between them.

Phase error adjustment is ideally necessary for different CT/VT combos correct? I understand the filters of the inputs being matched, and continuous sampling, do away with much error.

Okay, I’ve read as much as I can find on the emonTxShield example, and other days’ reading has covered other things. I get the unitless approach. The main thing I’m unsure about at the moment is firing off the processing of new samples directly from the interrupt… Interrupt priority settings may help here; I saw a mention of this in relation to the ds18b20, albeit to stress test.
I’ll look to porting it to the larger chip tomorrow.

I think you should delete the word “ideally”.

That’s questionable. It gets rid of a component of the error. I think you have in mind the Atmel '328P as an example. In that, there is a phase/timing error that has two components - the difference in phase error between the two transformers, and the time difference between the readings. The second part can be removed by sampling both voltage and current channels at the same time¹. But that still leaves the phase errors, and the problem there is both transformers have a phase error that depends, to a lesser or greater degree, on the value of the quantity being measured. And of course that’s different for each model, and even varies between samples of the same model. You only have to look at the test reports to see that.

To a first approximation, phase differences can be regarded as time differences, so by introducing a controlled time difference between the voltage and current readings, the phase error can be compensated. To take this a stage further, the time difference needs to be altered according to the values of the v.t’s and c.t’s phase error that results at the measured voltage and current.

How practical that turns out to be will depend on how much processing power and memory you have available.
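The phase-to-time equivalence can be sketched in a few lines. This is only the first-approximation conversion described above; the function name is hypothetical, and the phase error fed into it would have to come from the transformer test reports.

```c
/* Time shift, in microseconds, equivalent to a phase angle at the given
 * mains frequency: one full cycle (360 degrees) lasts 1/f seconds. */
double phase_deg_to_usec(double phase_deg, double mains_hz)
{
    return (phase_deg / 360.0) * (1.0e6 / mains_hz);
}
```

For example, a net phase error of 1 degree corresponds to about 55.6 usecs at 50 Hz and about 46.3 usecs at 60 Hz.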

¹ Had emonLib been written with the voltage and current samples taken in the reverse order, the timing difference would largely compensate for the transformer’s phase errors.

Yep, which is why the one-size-fits-all ADC kick-off approach was deemed inadequate.

Matched filters and simultaneous V : I sampling ensure the phase errors are external to the box … i.e. just down to the sensors (VTs and CTs), but you’ll always need to adjust for them.
@Robert.Wall’s paper has an approach you might like to consider.

I’m pretty sure the emonTxShield demo did all of that, even with scope traces showing one ISR interrupt the other. By careful choice of interrupt priorities the NVIC effectively gives you “processes” with different priorities.

Not by coincidence I don’t think, I downloaded RW’s paper earlier today. I should take a look.

That’s a method that can only be used if you’re doing interpolation. Not knowing the granularity of the timing, it might or might not be possible to start the ADCs so as to not need to interpolate. Or if the sampling frequency is adequately high, choosing which pair of V & I samples to use together also provides a means of adjustment after the samples have been collected.

It might of course be necessary to use both - timing or choice of sample for the coarse adjustment and interpolation for the fine adjustment.
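A rough sketch of that coarse-plus-fine idea, assuming simple linear interpolation between adjacent voltage samples and a non-negative shift. The function names are hypothetical, and this is not necessarily the method in the paper mentioned above.

```c
/* Estimate the value at a point 'frac' of the way from sample v_prev
 * to sample v_next (0.0 = v_prev, 1.0 = v_next) by linear interpolation. */
double interpolate_sample(double v_prev, double v_next, double frac)
{
    return v_prev + frac * (v_next - v_prev);
}

/* Split a desired (non-negative) time shift into a whole number of
 * sample periods (coarse: choice of sample) and a remaining fraction
 * of one period (fine: interpolation). */
void split_shift(double shift_usec, double sample_period_usec,
                 int *whole_samples, double *frac)
{
    double periods = shift_usec / sample_period_usec;
    *whole_samples = (int)periods;
    *frac = periods - (double)*whole_samples;
}
```

So a 55 usec shift with a 20 usec sample period means stepping 2 whole samples and then interpolating 0.75 of the way to the next one.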

It seems phase errors can be dealt with using the usec offset from the TxShield example.
A function can be made to set the usec advance or delay upon receiving a command from the Pi.

I think I understand what you mean by interpolation: this relates to the staggered timing of samples taken by the ATMEGA. Interpolation might not be necessary with synchronous timing on the stm32, i.e. there is no PHASECAL constant; instead it’s the usec delay which does the job.
When getting into three phase monitoring, we can state a condition that the VTs are all the same.
The hardware as it stands is designed such that all VTs are on ADC1. The first 9 CTs are on ADC3. The 6 planned and yet to be designed expansion CTs are on ADC2. The choices here reflect the physical layout of the chip’s pins relative to the board layout, to minimise track lengths.

The VTs all on the same ADC1 mean the usec delay can apply equally to all transformers, and of course, helps with imagining the buffer indexes in the code.

I should make a table, Trys and I sketched out something illustrating a potential limitation the other day with indexing, buffer indexes…

i  VT  CT   Phase
0  1   CT1  A
1  2   CT2  B
2  3   CT3  C
3  1   CT4  A
4  2   CT5  B
5  3   CT6  C
6  1   CT7  A
7  2   CT8  B
8  3   CT9  C
9  1   CT1  A
n  .   .    .

In sync VT/CT pairs between CTs 1,4,7…2,5,8…3,6,9

Looking at the expansion CTs now, just exploring options…

i  VT  CT   Phase  CTex
0  1   CT1  A      CT1
1  2   CT2  B      CT2
2  3   CT3  C      CT3
3  1   CT4  A      CT4
4  2   CT5  B      CT5
5  3   CT6  C      CT6
6  1   CT7  A      .
7  2   CT8  B      .
8  3   CT9  C      .
9  1   CT1  A      CT1
n  .   .    .      .


i  VT  CT   Phase  CTex
0  1   CT1  A      CT1
1  2   CT2  B      CT1
2  3   CT3  C      CT1
3  1   CT4  A      CT2
4  2   CT5  B      CT2
5  3   CT6  C      CT2
6  1   CT7  A      CT3
7  2   CT8  B      CT3
8  3   CT9  C      CT3
9  1   CT1  A      CT4
n  .   .    .      .

The pattern is set upon initialisation. Food for thought.
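The first and last tables reduce to simple modulo arithmetic on the sequence position i, sketched here. All the function and macro names are made up for illustration, and the wrap-around of the expansion CTs after CT6 is an assumption, since the table stops before that point.

```c
#include <stdint.h>

#define NUM_VT   3  /* VT1..VT3, one per phase               */
#define NUM_CT   9  /* CT1..CT9 on ADC3                      */
#define NUM_CTEX 6  /* planned expansion CTs on ADC2         */

/* 1-based VT number at sequence position i (first table). */
uint8_t vt_at(uint32_t i) { return (uint8_t)(i % NUM_VT) + 1; }

/* 1-based CT number at sequence position i (first table). */
uint8_t ct_at(uint32_t i) { return (uint8_t)(i % NUM_CT) + 1; }

/* Phase letter at position i: 'A', 'B' or 'C'. */
char phase_at(uint32_t i) { return (char)('A' + i % NUM_VT); }

/* Expansion CT at position i in the last table, where each expansion
 * CT is held for three consecutive positions. Wrapping back to CT1
 * after CT6 is an assumption. */
uint8_t ctex_at(uint32_t i) { return (uint8_t)((i / NUM_VT) % NUM_CTEX) + 1; }
```

Under that wrap-around assumption the first table repeats every 9 positions, while the last one only repeats every 18.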

Edit: Darn. 16 is the limit to the pattern number, so sayeth cubeMx.

What about the CTs? If they’re all the same then the usec delay between starting the ADCs can work, but the primary objection to that approach when we first prototyped it was the need to support different model CTs on each input, each with quite different phase errors.

As I thought I’d implied, phase adjustment must be made on a per-channel basis at least, and for best accuracy, load and system voltage needs to be taken into account too.

It’s back to interpolation then.
In the for loop, iterating over the DMA buffer, defining which indexes to correlate. The crucial bit of knowledge in this case is the time per ADC conversion. I don’t know this value, I know it’s not as simple as 601.5 cycles as defined in CubeMX.
. Sampling time.
. Conversion time.
. and?

Anyway, 0.000008354166667 secs (about 8.35 usecs) for 601.5 cycles at 72MHz. Correct?..

RM0316 pages 322 and 325.

This is it. We can calculate the time represented between buffer indexes.
601.5 cycles with an ADC prescaler of 2 is used in the 3CT example.
A better calculation example then is:

(601.5 + 12.5) x (1/(72,000,000/2))

16.708333usecs + 0.347222usecs = 17.055556usecs.

58.6kHz sampling rate.

About 977 samples per complete waveform at 60 Hz.

Can someone verify?
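The arithmetic can be checked with a short sketch. It assumes, as in the posts above, a sampling time of 601.5 cycles, a fixed 12.5 cycle conversion time, a 72 MHz system clock and an ADC prescaler of 2; the function names are hypothetical.

```c
/* Time for one ADC conversion in microseconds: the sampling time plus
 * the fixed conversion time (both in ADC clock cycles), divided by the
 * ADC clock, which is the system clock divided by the prescaler. */
double adc_conversion_usec(double sampling_cycles, double conversion_cycles,
                           double sysclk_hz, double prescaler)
{
    double adc_clk_hz = sysclk_hz / prescaler;
    return (sampling_cycles + conversion_cycles) / adc_clk_hz * 1.0e6;
}

/* Conversions per second when conversions run back to back. */
double adc_conversion_rate_hz(double conversion_usec)
{
    return 1.0e6 / conversion_usec;
}
```

`adc_conversion_usec(601.5, 12.5, 72.0e6, 2.0)` comes out at about 17.06 usecs, and the corresponding rate at about 58.63 kHz, or roughly 977 conversions per 60 Hz cycle.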

How does this translate into phase correction?
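One way to relate the two, as a sketch: work out how many degrees of the mains cycle one step between buffer indexes represents. The function name is made up, and the sample period passed in is whatever the time between consecutive samples of the same signal turns out to be.

```c
/* Mains phase angle, in degrees, covered by one sample period. */
double degrees_per_sample(double sample_period_usec, double mains_hz)
{
    return 360.0 * mains_hz * sample_period_usec * 1.0e-6;
}
```

At 50 Hz a single conversion of about 17.06 usecs is roughly 0.31 degrees, so shifting by one buffer index moves the phase by about that much; finer correction than that would need interpolation.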

I derive 5kHz as sufficient based on 100 samples per 1/50Hz waveform. So 58kHz is great. Slowing this down significantly can be done if we need more CPU/memory resources.

Spotted this: ADS1115 and sampling speed - #2 by Robert.Wall

But do we need to define a strict sampling rate based on 50/60Hz if we have zero-crossing detection? Probably not.

A technique I use to verify the ADC is sampling at the rate I think it is, is to drive a spare GPIO signal low at the start of the handler (both half and full complete) and drive it high at the end. If you probe the resultant square wave with a scope, the frequency can be used to verify your sampling rate and the duty cycle will reveal how much cpu is being used processing the data.
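Turning the scope readings back into numbers could look something like this. It is a sketch under the assumption that the handler runs once per half buffer, so one period of the square wave spans one half buffer of conversions; the function names and the half-buffer length are hypothetical.

```c
/* Conversion rate implied by the square wave seen on the scope.
 * One period of the square wave corresponds to one handler call,
 * i.e. half_buffer_len conversions. */
double rate_from_scope_hz(double square_wave_hz, unsigned half_buffer_len)
{
    return square_wave_hz * (double)half_buffer_len;
}

/* Fraction of CPU time spent in the handler. The pin is low while the
 * handler runs, so the load is the low part of the duty cycle. */
double cpu_load_from_duty(double high_fraction)
{
    return 1.0 - high_fraction;
}
```

For instance, a 50 Hz square wave with a half buffer of 1000 conversions implies 50,000 conversions per second, and a signal that is high 75% of the time implies a 25% CPU load.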

Also remember that each ADC is typically set up to cycle through a series of inputs. So if it’s doing a new sample every 17 usecs (say) and you’ve programmed it to cycle through 3 inputs, then it’s only sampling each pin every 51 usecs so the sampling rate of any given signal is 1 / 51 usecs.

Clear. Let’s take it to 9 CTs then, 154usecs or 6,515Hz.
Scoping, experimental data, that’s right.
Another method was found today after fixing a bug in Trys’ 3CT program.
125 50Hz AC cycles is 2.5 seconds. In that time a counter gave us a value of around 48,400 samples for three channels. 2.5 * 58.63kHz / 3ch ≈ 48,860. This is out by around 1%. I wonder if the scope will display a lower sampling rate than calculated; I’ve made a note to check this. I’ll tag @TrystanLea here so we’re on the same page.
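The counter cross-check can be written out as a sketch. It assumes the ADC cycles through the channels back to back, so each channel gets an equal share of the conversions; the function names are hypothetical.

```c
/* Samples expected per channel over a number of mains cycles, when the
 * ADC cycles through num_channels inputs back to back. */
double expected_samples(double mains_cycles, double mains_hz,
                        double conversion_rate_hz, unsigned num_channels)
{
    double seconds = mains_cycles / mains_hz;
    return seconds * conversion_rate_hz / (double)num_channels;
}

/* Shortfall of the measured count relative to the expected, in percent. */
double shortfall_percent(double expected, double measured)
{
    return (expected - measured) / expected * 100.0;
}
```

With 125 cycles at 50 Hz, a conversion rate of 58,632 Hz and 3 channels, the expected count is about 48,860, so a measured 48,400 is short by a little under 1%.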

The bug was resolved by giving sprintf a 64-bit format specifier for the counter variable (%lld instead of %d; strictly, %llu or PRIu64 is the matching specifier for a uint64_t)
and then by removing the linker flag from the makefile: -specs=nano.specs
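For reference, a portable way to print a 64-bit counter is the PRIu64 macro from inttypes.h, sketched below with a hypothetical helper. As far as I know the cut-down printf in newlib-nano does not handle 64-bit conversions, which would fit with the nano.specs change being needed, but treat that as an assumption to verify.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit sample counter into buf. PRIu64 expands to the
 * correct conversion for uint64_t on the target. With plain %d, printf
 * reads only part of the argument, so the number printed is wrong and
 * any arguments after it can be misread too. */
int format_counter(char *buf, size_t len, uint64_t count)
{
    return snprintf(buf, len, "count=%" PRIu64, count);
}
```

Calling it with a value above 2^32, such as 5000000000, is a quick way to confirm the full 64 bits are being printed.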

N.B. I’ve found before that the scope can be connected directly at the ADC input: as the sampling charges a capacitor the voltage drops, and the dip can be picked up clearly.

Here perhaps?