STM32 Development


I have to say it doesn’t feel that way :slight_smile: .

I was only keen to use the Arduino IDE as it was familiar, if it’s making life harder I’ll look at other methods.

Bearing in mind the previous steps will be in a GUI, I assume this would then be on the command line from the root of that project folder that cubeMX created?

I wonder if the toolchain will all be there. I think all the other options (as opposed to Makefile) will assume you have that specific environment set up, so would the generic “Makefile” option make any assumptions? I guess the best thing for me to do is just try it.

Thanks for the info @dBC. I look forward to reading your next installment :grin:

Yes, assuming your setup is similar to mine (are you a Linux user?). I’ve a directory called STM32CubeMX in my home directory. Under there is a Projects directory and under there is a directory for each project I’ve created. It’s in each of those project directories that the Makefile is created, and you run make from the same directory you find the Makefile in.

It’ll assume the toolchain is installed somewhere on your system. If you type

$ arm-none-eabi-gcc --version

do you get something sensible? If it’s not installed, you just need to follow the normal install instructions for your system to install the arm-none-eabi devtools. Actually, one of the bugs in the Makefile is that it looks in the root directory for the tools. There’s an easy fix for that. Just edit the Makefile to put the correct path in for BINPATH. There’s an even better fix I found in the forums that I’ll include in the next installment. Also, you’re likely to get lots of “mutiply defined symbols” errors at link time, due to the other bug in the Makefile. It duplicates source files in the C_SOURCES variable. Just edit it to remove the duplicates.

Code seems to be working, now I just have to write it all up! There were a few differences between the ADC on your F3 compared to my F0 that I needed to come up to speed with. And speaking of speed, it’s ridiculously fast!

To whet your appetite, my demo program uses just one of your ADCs (ADC2) in standalone mode. It sets it up to continuously sample 11 analog pins. Each conversion takes just 194 nsecs. Instead of just stepping through the 11 sequentially, it treats one as special, which is something you might want to do if, say, you have one V input and 10 I inputs. The sampling sequence is: 1,3,4,1,5,6,1,7,8,1,9,11,1,12,14. That way you’ve always got a V reading that’s not too stale compared to each I reading. So each sequence is 15 readings long, and it delivers them to you in batches of 50 (x15). So you get an interrupt every ~145 usecs to tell you there are 15x50 readings in one half of your array. Meanwhile it’s dumping the next 15x50 into the other half of your array. Compare that to analogRead()!
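For a feel of what consuming that interleaved sequence might look like, here’s a hypothetical sketch (names and layout invented for illustration, not taken from the actual demo code) that splits one 15-reading sequence into its V and I readings. In that sequence, every third slot (positions 0, 3, 6, 9, 12) holds the special “V” channel:

```c
#include <stdint.h>

#define SEQ_LEN 15

/* Demultiplex one 15-reading sequence.  Positions 0,3,6,9,12 hold the
 * interspersed "V" channel (channel 1); the other ten slots hold the
 * ten "I" channels, in sequence order. */
void demux_sequence(const uint16_t seq[SEQ_LEN],
                    uint16_t v_out[5], uint16_t i_out[10])
{
    int v = 0, i = 0;
    for (int n = 0; n < SEQ_LEN; n++) {
        if (n % 3 == 0)                 /* every third slot is channel 1 */
            v_out[v++] = seq[n];
        else
            i_out[i++] = seq[n];
    }
}
```

In a real program you’d run this (or something like it) over each of the 50 sequences in the half-buffer the DMA interrupt just handed you.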

It’s all a bit manic, most notably because at that sampling rate you have just 20.8 nsecs to charge the S&H cap, and you only have 145 usecs to do all your maths on 750 readings. And that’s just one ADC, you have 3 more to play with! But the write up will include instructions on how to wind that back to something more manageable. I figured the demo should at least show what it can do.


I’ve followed your steps @dBC with STMCubeMX, installation of arm-none-eabi-gcc, edits to makefile, make and copy to the ‘mass storage device’ and it looks like it uploaded ok! :slight_smile:

Source code so far:

ADC Tutorial Part1, the GUI.

OK, here are the instructions for creating this demo project with the GUI tool STM32CubeMX.

  1. Fire up the tool and in the Peripherals section on the left of the Pinout panel, expand ADC2 and enable IN1, IN3, IN4, IN5, IN6, IN7, IN8, IN9, IN11, IN12 and IN14 as Single-ended like this:


As you do, you should see the corresponding pins on the CPU in the panel to the right turn green, like this:

When it comes time to wire something to a pin you’ll need to know where to find it on the Nucleo board. So for example, the CPU pic above tells you that ADC2_IN1 is found on PA4. There’s another pic on the card that came with the Nucleo that tells you where to find that on the connectors. All CPU pins are accessible, some via the Arduino connectors and others via the Morpho connectors. Here’s a blown up version of the Nucleo pic. It tells you you’ll find PA4 on the Arduino connector at A2, or on the Morpho connector at pin 32.

  2. Next configure the peripherals via the “Configuration” tab up the top.

2.1 USART2 is used for debug printfs. You can connect to them with a terminal emulator program such as minicom. Point it at the serial USB device that appears when you plug in the Nucleo (/dev/ttyACM0 on my system). I prefer to run at 115200,8N1 like this:

2.2 ADC2. So far, in step 1, we’ve just configured which channels of ADC2 are enabled. The bulk of the ADC config happens here:

I think you need to change Number of Conversions to something other than 1 before it’ll let you enable Scan Conversion Mode. We also want Continuous Conversion Mode, DMA Continuous Requests, and End of Conversion set to End of Sequence. Because we’ve specified 15 conversions, you’ll get 15 Rank entries. Open up each of them and select which channel each should scan. I’ve left 3 open in the pic so you can get the gist of it. The order I went with was: 1,3,4,1,5,6,1,7,8,1,9,11,1,12,14.

You’ll also note Sampling Time for each channel, defaulting to 1.5 Cycles (the fastest). There are a bunch of possible values in that pop-up menu right up to 601.5 Cycles. These are ADC cycles, and the default ADC clock setting is 72MHz (see Clock Configuration tab). That Sampling Time is how long it will spend charging the S&H cap (20.8 nsecs to 8.35 usecs). Total conversion time, for a single pin in the sequence, is Sampling Time + 12.5 cycles. Total conversion time for the sequence is 15x that (assuming they’re all set to the same sampling time - they don’t need to be). I didn’t drive any of the signals in my testing, nor did I look at the data, but I suspect you’ll need to increase Sampling Time to quite a bit more than 1.5 cycles for stable readings. It all comes down to the source impedance of your signal.
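The timing arithmetic above can be sanity-checked with a couple of helper functions (numbers straight from the text: 72 MHz ADC clock, fixed 12.5-cycle conversion time, 15-conversion sequence):

```c
/* Per-conversion time in nanoseconds: Sampling Time + 12.5 cycles,
 * at the default 72 MHz ADC clock. */
double conversion_ns(double sample_cycles) {
    return (sample_cycles + 12.5) / 72e6 * 1e9;
}

/* Time for one full sequence, in microseconds, assuming every channel
 * uses the same Sampling Time (they don't have to). */
double sequence_us(double sample_cycles, int seq_len) {
    return conversion_ns(sample_cycles) * seq_len / 1e3;
}
```

With the fastest 1.5-cycle sampling, conversion_ns(1.5) is ~194 ns and sequence_us(1.5, 15) is ~2.92 usecs, so 50 sequences (one half-buffer) take ~146 usecs, matching the figures quoted in this thread.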

Next, still inside ADC2 Configuration, click on the DMA Settings tab and add a DMA channel like this:

Circular tells the DMA controller to loop back to the beginning when it hits the end. We want it to increment the Memory address during the xfer, but not the Peripheral address, and we want it to do Half Word width xfers… the 12-bit conversion results are read from a 16-bit register in the ADC and the DMA buffer is an array of uint16_ts.
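For reference, those GUI selections should end up in the generated init code as DMA_InitTypeDef settings along these lines (a sketch reconstructed from the stock F3 HAL field names, not copied from actual CubeMX output, so expect minor differences):

```c
/* Sketch of the DMA channel setup CubeMX generates for the selections
 * above (hdma_adc2 is the handle name CubeMX typically uses). */
hdma_adc2.Init.Direction           = DMA_PERIPH_TO_MEMORY;
hdma_adc2.Init.PeriphInc           = DMA_PINC_DISABLE;        /* ADC data register stays put      */
hdma_adc2.Init.MemInc              = DMA_MINC_ENABLE;         /* walk through the buffer          */
hdma_adc2.Init.PeriphDataAlignment = DMA_PDATAALIGN_HALFWORD; /* 16-bit reads of the 12-bit result */
hdma_adc2.Init.MemDataAlignment    = DMA_MDATAALIGN_HALFWORD; /* uint16_t buffer entries          */
hdma_adc2.Init.Mode                = DMA_CIRCULAR;            /* wrap back to the start           */
```

You shouldn’t need to write any of this yourself; it’s just useful to recognise it in the generated dma.c when you’re checking the GUI did what you asked.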

Next, still inside the ADC2 Configuration, click on the NVIC Settings tab to configure interrupts like this:

We don’t actually need any interrupts from the ADC, because the DMA controller is doing all the heavy lifting (although you could enable them if you wanted to). Enabling the DMA interrupts means it will interrupt you when the xfer buffer is half-full and full.

2.3 NVIC. Now exit ADC2 Config and go back to the main Configuration panel and click on NVIC. We’ve already configured most of the interrupt stuff via the ADC2 panel, but here we add “Generate IRQ handler” for DMA2 like this:

  3. Now configure Project Settings under the Project pull-down at the top. Give your project a name, and choose Makefile as the Toolchain/IDE like this:

And under Code Generator, make these selections:

You can now hit the Save button (creates a .ioc file that stores everything you’ve just done), and follow with a Project->Generate Code. That’ll create the source tree for you. You’ve now got 99% of the project written including all HAL initialisation calls to set up the ADC, NVIC, DMA etc.

Coming next… adding some user code to kick it all into action. The one thing they don’t add for you is any of the HAL_Start calls, because they figure you need to choose how/when to do that.


Excellent news!

This is a great tutorial @dBC, looking forward to the next instalment!

Great stuff @dBC, I checked for the toolchain this morning and it’s not installed. I hope to install it and run though your guide later today.

Do you think I’ve picked the right MCU? I was looking for ADC speed and channels over anything else. The combined speed of all 4 ADCs does seem a bit OTT, but I understood that multiple ADCs were needed to get the channel count up, and I had thought multiple ADCs would allow simultaneous V and I samples to negate any interpolation. But I guess at these speeds not only is the gap to interpolate reduced, I dare say the phase angle between samples is so small the corrections might actually become negligible unless we slow it all down. (I haven’t done any maths to that effect, I’m just thinking out loud.)

Nice work @dBC
I suspect this would have been a very steep learning curve without your support!


I reckon it’ll do the job, and the price is right! BTW, I’ve not completed my tests yet, but my original motivation for following your lead and getting an F303 was to test out the maths capability of the M4. It’s impressive! It’ll do an FFT on 1024 floating point values in 1.3 msecs. That’ll get you to the 512 complex pairs and it takes another 284 usecs to calculate their magnitudes. I suspect it’ll have a lot of potential for people wanting to do further analysis of the data.

Yeh, that would probably be useful if you were using shunts to measure I, and resistor dividers to measure V. Simultaneous conversions would then mean there’s pretty much nowhere for phase errors to creep in. But since you’ll always have VTs and CTs between you and the signal, I think you’re always going to have to deal with phase errors.

Yeh, I suspect you will want to slow it all down. It’ll ultimately come down to the effective sample rate of each channel. I think on those “current” channels above (i.e. the 10 of the 11 channels that aren’t channel 1) I’m sampling at 345 kHz which is way faster than it needs to be. And the “voltage” signal (channel 1), I’m sampling at 1.7MHz if I’ve done the maths right. The demo is more a demo of the potential horsepower available… those crazy high clock rates are not recommended for measuring power. Apart from anything else, you’ll need so many data points just to capture a mains cycle because each one has so little reach along the time axis.
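Those per-channel rates follow directly from the sequence timing. A quick back-of-envelope check (15 conversions of 14 ADC cycles each at 72 MHz, with channel 1 appearing 5 times per sequence and every other channel once):

```c
/* Sequences completed per second: 15 conversions x 14 cycles each,
 * clocked at 72 MHz. */
double sequence_rate_hz(void) {
    return 72e6 / (15.0 * 14.0);          /* ~343 k sequences/sec */
}

/* Channel 1 ("voltage") appears 5 times per sequence. */
double v_rate_hz(void) {
    return 5.0 * sequence_rate_hz();      /* ~1.71 MHz */
}

/* Each of the ten "current" channels appears once per sequence. */
double i_rate_hz(void) {
    return sequence_rate_hz();            /* ~343 kHz */
}
```

That lands within a percent or so of the 345 kHz and 1.7 MHz figures quoted above, so the scope traces and the theory agree.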

On timing/phase errors, what we really want to do is start the conversions a defined time apart, or pick the sample pairs a defined time apart, so that the voltage and current appear to have been sampled simultaneously after taking their respective phase errors into account.
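One software flavour of that idea is to interpolate between successive voltage samples to estimate what V was at the instant each current sample was taken. This is a minimal, purely illustrative sketch (not from anyone’s actual code in this thread):

```c
/* Hypothetical sketch: estimate V at the instant I was sampled by
 * linearly interpolating between the two surrounding V readings.
 * frac (0..1) is how far the I sample sits in time between v0 and v1;
 * folding the transducers' phase-error correction into frac is how
 * you make the pair "appear" simultaneous. */
double v_at_i_sample(double v0, double v1, double frac) {
    return v0 + (v1 - v0) * frac;
}
```

At the high V sampling rates discussed above, successive V samples are so close together that linear interpolation between them should introduce very little error.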

There’s no advantage in sampling too fast, because the high order harmonics should be very small in amplitude, so there should be little energy in them. EN61000-3-2 specifies limits up to the 40th harmonic, so in rough and ready terms, that’s 4 k samples/s (per channel of course). And that’s before it gets through the transformers. Those will function as low pass filters to a certain extent.
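The arithmetic behind that 4 k samples/s figure is just Nyquist applied to the 40th harmonic of 50 Hz mains:

```c
/* Minimum sample rate to capture up to a given harmonic of the mains
 * frequency: the harmonic sits at mains_hz * n, and Nyquist needs at
 * least twice that. */
double min_sample_rate_hz(double mains_hz, int highest_harmonic) {
    return 2.0 * mains_hz * highest_harmonic;
}
```

So min_sample_rate_hz(50, 40) gives 4000 samples/s per channel, the rough-and-ready figure above (60 Hz mains would need 4800).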

I’ll go back and play with the 3-phase sketch. To check that, I was setting the phase correction in steps of 0.1°, and that seemed adequate. But I’ll do it again carefully and record the effect on power factor.

So I just installed gcc-arm-none-eabi-7-2017-q4-major using the windows installer, no problems reported, I checked the “add to PATH” checkbox during the install.

I then opened a new project in CubeMX and saved it, and then created source without changing any MCU configurations, using the same “code generator” settings as recommended by dBC and “makefile” selected. It has created a project folder and populated it with various files including a “makefile”.

However, issuing the make command results in

'make' is not recognized as an internal or external command,
operable program or batch file.

I’ve checked the arm-none-eabi-gcc version to check it’s installed

C:\Users\paulb\Desktop\stm32\stm32project1>arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors 6-2017-q2-update) 6.3.1 20170620 (release) [ARM/embedded-6-branch revision 249437]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO

and have also checked PATH and found an entry for C:\Program Files (x86)\GNU Tools ARM Embedded\6 2017-q2-update\bin\; but not make explicitly.

After spending some time searching and reading, I now believe “make” wasn’t installed as part of the toolchain and I need to install it separately. I have hit a bit of a wall here, as I can find very little about installing it on Windows (this decade) and no official/reputable source to download from. Some sites even suggest installing Cygwin as a step towards installing “make”. Am I doing something obviously wrong? I’m sure this shouldn’t be this hard.

Using Windows :wink:

On Unix, make is definitely a completely separate package unrelated to which compiler tools you’ve installed, so it would make sense if that’s also the case on Windows. I’m afraid I can’t be much more help than that on this one.

Don’t know if this helps -


That would be very nice if you could get the h/w to correct for sensor induced phase errors… that would save a lot of maths for the CPU. In theory it sounds feasible, and with 4 ADCs you’d have to think you’re in with a chance. In that “ADC2 Configuration” panel above you can see there’s an option for conversion trigger source. In my tutorial example I’ve set it to “Regular conversion launched by software”, but there are a lot of Timer trigger options in there as well.

I’m having trouble reconciling that with all the effort stm have put into “Synchronised Mode” though. I would have thought they’d be synchronised just by triggering them both from the same Timer event. Up until yesterday, I’d never seen an stm32 device with more than one ADC, so I’m no expert on multiple ADCs (and even in the tutorial, I just stick to one ADC). I think the whole scheduling of inputs across ADCs could be an interesting optimisation problem, especially to support 3 phase.

I concur. And there are disadvantages, including just the sheer amount of data you have to grind through to get the answer you’re seeking… and S&H cap charge times. From memory I think my energy IC does all its work at 8 k samples/s (per channel) which is roughly in the same ballpark as what you’re suggesting.

Certainly a lot closer to that than to the crazy-high sampling rates in the tutorial program. I keep expecting to have a “Doh!” moment when I realise I’m out by a few orders of magnitude, but it really does seem to be sampling as fast as I think it is. My inputs are just floating, so I connected an extremely weak (1M) pull-up to Ch3 (one of the slow “current” channels) and to Ch1 (the faster “voltage” channel - the one that I interspersed repeatedly through the conversion sequence) and checked them on the scope. They really do seem to be sampling at the rates I predicted above:

Ch3 Input pin:

Ch1 Input pin:

I don’t have a problem there. This particular use has to be pretty thin on the ground, as compared to all the possible places I can think of where simultaneous is an absolute requirement. Like your 'scope - and that’s without trying.

What we’re looking at doing is using the capabilities of the processor to overcome shortcomings in the transducers. We can’t get away from those, and we can’t significantly reduce them even, without a significant cost penalty.

What I mean though is that if you can trigger the individual ADCs off the same Timer signal, and they’re running with the same clock, why do they need a special “Synchronised Mode”? It seems they’d be naturally synchronised with just the existing tricks. Although Synchronised Mode does let you fetch the two results at once in a single 32-bit read, so I guess that’s helpful for bus bandwidth etc. At these crazy high sampling speeds I can imagine all those internal busses are starting to get a bit busy.

I was guessing that “Synchronised” is better than simply started at the same instant. I’ve no idea though whether that’s valid.

ADC Tutorial Part2, fixing the Makefile and flashing an image

This post is likely platform dependent. @pb66 if you want to add any of your Windows findings, feel free to just edit it into this post if you think appropriate.

The “Makefile” selection in the Toolchain/IDE above is a new feature, and still a bit buggy, at least on my Linux machine. Ideally, after Step 3 above, you’d be able to just type ‘make’ in your project directory, and it would build your image, e.g. build/OEM.bin, ready for flashing to your Nucleo.

If your link stage bombs out with lots of multiply defined symbols, open the Makefile in a text editor and check the list of files in the C_SOURCES variable. Remove all the duplicates.

If you get an error about the compiler not being found in the root directory, that’s because the BINPATH variable is not set. A solution I came across in the stm forums was to add this line to your Makefile, on the next line after the “all:” line and before the “build the application” comment:

-include $(TARGET).mak

Then you can create a file in the project directory, right alongside the Makefile called project_name.mak and put your own directives/fixes in there. Here’s my OEM.mak:

BINPATH = /usr/bin
CFLAGS += -std=gnu99
LDFLAGS += -Xlinker --no-wchar-size-warning

With all that done, you should be able to type ‘make’ and get a clean build. To flash the resultant binary onto your Nucleo board, simply copy it to the USB mass storage device like:

$ cp build/OEM.bin /media/dbc/NODE_F303RE/

One warning about the Makefile, it doesn’t seem to have built up the proper dependency rules based on which .c files include which .h files, so if you change a .h file you might want to do a

$ make clean
rm -fR .dep build

before issuing the make command. Note that ‘make clean’ completely nukes the build directory, so never put anything in there that you want to keep.

Coming up next… the /* USER CODE BEGIN */ additions to actually make the image do something, other than initialise HAL and hardware… finally!


I thought that might be the case.

Thanks for the links Paul.

I will take another look at this tomorrow, am I the only one using windows? I know dBC isn’t, Trystan and Glyn will both be using Linux, perhaps I can find a better (non-windows) way of doing it.

ADC Tutorial Part3, adding user code to make it useful.

Once you’ve got a clean build out of Part 2, it’s time to actually write your own code to make it do something useful. Actually, this ADC example doesn’t do anything useful other than start the ADC running and monitor how often the data comes in, but it hopefully forms a good starting point for getting the ADC started and then adding your own code. You should always add your code only in the areas provided in the generated source, denoted by comments like /* USER CODE BEGIN n */ … /* USER CODE END n */, thereby ensuring your code won’t get clobbered next time you need to regenerate via the GUI (double check you’ve ticked the relevant tick box in the panel in Step 3 of Part 1 to ensure that).

I’ll attach a tar file of the Src and Inc directories at the end so you can have something concrete that compiles. But it goes roughly like this:

in main.c before the infinite loop starts (so a bit like Arduino’s setup()):


  /* USER CODE BEGIN 2 */
  calibrate_ADC2();                                             // self-calibrate before first use (see adc.c below)
  start_ADC2();                                                 // kick off the free-running DMA conversions
  HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_SET);      // LED on
  snprintf(log_buffer, sizeof(log_buffer),
	   "\nOEM ADC Demo 1.0\n");
  debug_printf(log_buffer);
  /* USER CODE END 2 */

in main.c the infinite loop (so a bit like Arduino’s loop()):

  /* Infinite loop */
  while (1) {
    if (adc2_half_conv_complete && !adc2_half_conv_overrun) {
      HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_RESET);      // LED off
      adc2_half_conv_complete = false;                                // ready for the next batch
    }
    if (adc2_full_conv_complete && !adc2_full_conv_overrun) {
      HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_SET);        // LED on
      adc2_full_conv_complete = false;                                // ready for the next batch
    }
    // See if we've overrun and lost our place.
    if (adc2_half_conv_overrun || adc2_full_conv_overrun) {
      snprintf(log_buffer, sizeof(log_buffer), "Data overrun!!!\n");
      debug_printf(log_buffer);
      adc2_full_conv_complete = adc2_half_conv_complete =
	adc2_full_conv_overrun = adc2_half_conv_overrun = false;
    }
  }


Those four adc2_xxx_conv_xxx boolean flags are all flags set in the DMA ISR (below). The half version tells us the bottom half of the array has just been filled with conversions, and the full version tells us the top half of the array has just been filled with conversions. They’re cleared here in the main loop to acknowledge we’ve “processed them”. In fact, the only processing we do is turn the LED off when a new batch of data has arrived in the bottom half and turn it back on for the top half. It’s all happening way too fast for you to see that with the naked eye, but a scope on the LED reveals:

So a new batch of 50x15 readings is arriving every 146 usecs as theory predicts (50x15 x 14 cycles/conversion / 72MHz).

The rest happens in adc.c:


volatile uint16_t adc2_dma_buff[ADC2_DMA_BUFFSIZE];
volatile bool adc2_half_conv_complete, adc2_full_conv_complete;
volatile bool adc2_half_conv_overrun, adc2_full_conv_overrun;


ADC2_DMA_BUFFSIZE is 15x100 from adc.h (also user code; that’s not some HAL setting). The booleans need to be volatile because they’re set by the ISR. The DMA buffer needs to be volatile because it’s being continuously written to by the DMA controller.

Still in adc.c, we have the two functions we called from main.c before we entered the infinite loop:



void calibrate_ADC2 (void) {
  HAL_ADCEx_Calibration_Start(&hadc2, ADC_SINGLE_ENDED);
}

void start_ADC2 (void) {
  HAL_ADC_Start_DMA(&hadc2, (uint32_t*)adc2_dma_buff, ADC2_DMA_BUFFSIZE);
}


If you check the hardware reference manual for the device, you’ll see the ADCs have a pretty complicated self-calibration process. Fortunately, it’s all been implemented in the HAL by stm; just don’t forget to call it or you’ll get lousy results.

The start routine is what kicks it all off. From the moment you call that, there’ll be a new conversion written somewhere in adc2_dma_buff[] every 194 nsecs forever… well, until you call Stop, which this example never does. This call is where you associate your data buffer with that DMA channel and tell it how big it is. Recall we configured it as “Circular” in the GUI, which means instead of running off the end, it starts back at the beginning.

There’s no way of knowing where in the buffer the DMA is currently writing, except that we know: it writes sequentially from adc2_dma_buff[0] to adc2_dma_buff[1499] 16-bits at a time and generates a half-full interrupt as it writes to adc2_dma_buff[749] and a full interrupt as it writes to adc2_dma_buff[1499] but it just keeps on trucking whether you process the data or not. 194nsecs after writing to adc2_dma_buff[1499] it’ll be writing the next conversion to adc2_dma_buff[0]… on and on forever with no CPU intervention… even if you disable the notification interrupts, it just keeps trucking.
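The resulting processing pattern can be sketched like this (the sum is just a stand-in for real work; on the target the array is the live DMA buffer, while here it’s an ordinary array so the sketch is self-contained):

```c
#include <stdint.h>

#define ADC2_DMA_BUFFSIZE (15*100)

/* On target this is the buffer the DMA controller writes to; declared
 * volatile for exactly that reason. */
volatile uint16_t adc2_dma_buff[ADC2_DMA_BUFFSIZE];

/* Process whichever half the DMA controller has just finished filling,
 * while it carries on writing the other half.  bottom_half is non-zero
 * after the half-full interrupt, zero after the full interrupt. */
uint32_t process_half(int bottom_half) {
    int start = bottom_half ? 0 : ADC2_DMA_BUFFSIZE / 2;
    uint32_t sum = 0;                       /* stand-in for real maths */
    for (int n = start; n < start + ADC2_DMA_BUFFSIZE / 2; n++)
        sum += adc2_dma_buff[n];
    return sum;
}
```

The deadline is the one described above: you have until the DMA controller wraps around and starts overwriting the half you’re reading, i.e. one half-buffer period (~146 usecs in this demo).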

You can make the DMA buffer as big or as small as you like. The bigger you make it, the less frequent the interrupts will be and the more data you’ll have to process when they do occur, which is good for amortising the cost of the ISR overhead. Make it small enough and you’ll be doing nothing but servicing interrupts, especially at the crazy fast speeds this example is running the ADC.
[EDIT] - but make it a multiple of the number of channels in the ADC sequence thereby ensuring any particular position in the array will always correspond to the same channel. In this example it’s defined in adc.h like this:

/* USER CODE BEGIN Private defines */

#define ADC2_DMA_BUFFSIZE 15*100    // 15 samples in a sequence, 100 sequences
extern volatile uint16_t adc2_dma_buff[ADC2_DMA_BUFFSIZE];
extern volatile bool adc2_half_conv_complete, adc2_full_conv_complete;
extern volatile bool adc2_half_conv_overrun, adc2_full_conv_overrun;
/* USER CODE END Private defines */

Finally in adc.c are the handlers for the half-full and full-full interrupts:

void HAL_ADC_ConvHalfCpltCallback(ADC_HandleTypeDef* hadc) {
  // If the flag is already set, process level has been too slow
  // clearing it down.
  if (adc2_half_conv_complete) {
    adc2_half_conv_overrun = true;
    adc2_half_conv_complete = false;
  } else
    adc2_half_conv_complete = true;
}

void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef* hadc) {
  // If the flag is already set, process level has been too slow
  // clearing it down.
  if (adc2_full_conv_complete) {
    adc2_full_conv_overrun = true;
    adc2_full_conv_complete = false;
  } else
    adc2_full_conv_complete = true;
}



Normally that’s where you’d want to do some preliminary processing of the data in adc2_dma_buff[]. This example just notes its arrival so that the main loop can flash the LED.

You might wonder how just declaring two functions gets them linked into the ISR. Their names are special. The HAL has already defined two functions of the same name (that do nothing), but declared them WEAK. If you don’t provide these functions, the linker resolves to the WEAK versions. If you do, the WEAK version gives way to yours (no multiply defined symbol errors, because one is marked WEAK).

Note also there’s no ADC2 in the name, just ADC. These two functions get called for all ADCs and it’s up to you to determine which ADC just passed a boundary by looking at the handle that is passed in. This example doesn’t bother doing that because there is only one ADC enabled.
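If you did want to demux, the usual pattern is to compare the handle’s Instance field. Here’s a self-contained sketch with the HAL types mocked up so it runs on a host machine; on a real target you’d include the HAL headers and compare hadc->Instance against the ADC1/ADC2 base addresses from the device header instead:

```c
#include <stdbool.h>

/* Minimal stand-ins for the HAL types, just so this sketch compiles
 * and runs on a host.  These are NOT the real HAL definitions. */
typedef struct { int dummy; } ADC_TypeDef;
typedef struct { ADC_TypeDef *Instance; } ADC_HandleTypeDef;

ADC_TypeDef adc1_regs, adc2_regs;            /* stand-ins for the register blocks */
ADC_HandleTypeDef hadc1 = { &adc1_regs };
ADC_HandleTypeDef hadc2 = { &adc2_regs };

volatile bool adc1_full_conv_complete, adc2_full_conv_complete;

/* One shared callback serves all ADCs; demux on the handle passed in. */
void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef* hadc) {
    if (hadc->Instance == hadc2.Instance)
        adc2_full_conv_complete = true;
    else if (hadc->Instance == hadc1.Instance)
        adc1_full_conv_complete = true;
}
```

The demo skips this check because only ADC2 is enabled, but the moment you bring a second ADC online you’ll want it.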

There are other ways to do ISRs within the HAL boundaries. Have a look in Src/stm32f3xx_it.c. That’s where the actual ISRs live, i.e. those routines are what the vectors point to. And they too include /* USER CODE BEGIN */ so you can put your code in there, or you can use the Callback approach that this example uses.

There are two common gripes users have about HAL:

  1. uses a lot of RAM (all those init structures are static)
  2. interrupt latency

In this example I measured the latency to be about 2usecs and that’ll get slightly worse when you have to demux the incoming handle when you have more than one ADC pumping. If that’s an issue, moving your critical ISR code to Src/stm32f3xx_it.c can help.

Finally, debug_printf() is very simple. It lives in usart.c and looks like:


void debug_printf (char* p) {
  HAL_UART_Transmit(&huart2, (uint8_t*)p, strlen(p), 1000);
}


I broke it out like that because some of my projects have other output devices that I sometimes want to send the debug messages out over. That ‘1000’ isn’t the length of the buffer, but rather the maximum time this routine should wait (in msecs) to get the message out through the UART. It’s a simple block-until-done interface so be careful where you call it. uart2 is hardwired to the st-link programmer strip at the top of the Nucleo board, and those messages will get sent all the way to the host via /dev/ttyACM0 etc.
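Just how careful? At 115200,8N1 each character occupies 10 bits on the wire (start bit, 8 data bits, stop bit), so the worst-case block time is easy to bound:

```c
/* Worst-case time (in microseconds) a blocking UART transmit ties up
 * the CPU: 10 bits per character at 8N1 framing. */
double uart_block_us(int nchars, double baud) {
    return nchars * 10.0 / baud * 1e6;
}
```

So at 115200 baud each character costs roughly 87 usecs, and a 20-character message blocks for ~1.7 msecs, which is several entire half-buffer periods at this demo’s sampling rate. Hence the advice to be careful where you call it.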

When you fire up this program, you should see:

OEM ADC Demo 1.0

in your minicom window.

Oh, and finally, all these HAL routines return a status. This example ignores that and assumes the best, but probably shouldn’t.

Tar file of Src and Inc directories attached:
ADC_demo.tar.gz (12.5 KB)