Tangible Computing
24. The AVR Tool Chain




24.1 Overview

Computing systems are built on layers of abstraction. For us, the bottom layer is rarely below the level of a board containing the processors and other interface circuits. We tend not to look inside the (literal) black boxes in terms of chip hardware, but we do look inside them in terms of functional components. The Atmel Mega Block Diagram shows what the main processor on the Arduino looks like in terms of functional boxes, while the Atmel Mega CPU Diagram is a view of the Central Processing Unit.

The actual processor is programmed by loading the bits of the various instructions into the program memory, and the required data into the Data SRAM. This machine code is loaded into the machine with a bootloader. We don't like to write machine code if we can avoid it.

Instead, we would use assembly language, which lets us work with reasonable-looking names and write programs that are just a little above the binary loaded into the machine. The assembly language varies with each series of processor, so one needs to look at a variety of lengthy manuals.

The complete description of the Atmel ATMega 2560 (the Arduino Mega) can be found in the Atmel ATMega 2560 Processor Manual, and that of the Atmel ATMega 328p (the Arduino Uno) in the Atmel ATMega 328p Processor Manual.

Machine code is generally quite sensitive to where the instructions are placed in memory, since many addresses appear in the instructions. It is possible to write relocatable code that can be placed anywhere, but this is not that common anymore. The assembler usually produces object code, which is code that does not have to be placed into one specific portion of memory. This enables us to modularize our program into small chunks, usually for reuse. So there is a process of integrating a number of object code chunks into a final monolithic piece of machine code. The program that does this is called the linker.

So the first layers of our world look like this, with the program that translates each layer into the one below shown beside the | symbol:
Assembly Language
    | Assembler
Object Code
    | Linker
Machine Code binary programs
    | Boot loader
Processor Hardware
There are two big problems with assembly language: it is different for each series of processor, and it is only a little above the binary loaded into the machine. So we add another layer of abstraction, in the form of a higher-level programming language, in this case C/C++. The idea is that the high-level language can be translated, more or less faithfully, into code for a variety of target machines.

Here is our complete stack of abstractions:

Higher level programming language (C, C++, etc), .c, .cpp, and .h files
    | Compiler (target architecture specific) avr-gcc, avr-g++
Assembly Language, .s files
    | Assembler
Object Code, .o files
    | Library builder
Object Libraries, .a files
    | Linker, takes .o and .a files
Executables, portable .elf files
    | Boot converter (takes .elf {.a})
Machine Code Binary, .hex files
    | Boot loader (avrdude)
Bits in the memory of the processor


This is the so-called compiler tool chain.

24.2 Software Abstraction Layers

Now over time, compiler writers realized that there are a number of common services that need to be provided to programs. So they wrote these services and packaged them into libraries. The most famous is libc, the C support library. Things like malloc, printf, etc are part of libc, not the C language. The next most famous is libm, the math library.

So, we can take a different view of the world, and think in terms of functional abstractions, not language. The translation between layers is one of "using", in that the layer above is built assuming the existence of the layer below. So typically:
main program
libm
libc
Processor hardware
The Arduino environment takes this one step further, by abstracting the main program into two parts: setup and loop, and supplying a collection of Arduino core functionality (like delay, Serial, pinMode, etc).
setup(), loop()
Arduino core
main program
libm
libc
Processor hardware
So, let's take a look at the steps in the tool chain.

  1. Write the program Blink.cpp

  2. It compiles into assembly (conceptually), stored in Blink.s

  3. That gets translated into object code Blink.o, which is binary with some special symbols. Look at the output of avr-nm Blink.o to see the various symbols.

  4. After linking, there is a Blink.elf file. ELF (Executable and Linkable Format) is a standard file format for executables, object code, and dumps. Look at the output of avr-nm Blink.elf to see the resulting fixed location of symbols in memory.

  5. This is then converted into a Blink.hex file, which can be loaded by the avrdude program into the Arduino.
At any stage of the tool chain, things can go wrong! Also, the steps in the process can be quite complex, depending on which system you are running on and what target machine you are producing code for. To control this process we need some kind of build system. The one we will be using is called GNU make, and it is controlled by Makefiles.

24.3 Makefiles for the Arduino Environment

Compiling and loading to the Arduino boards is quite complicated. Cross-compilation is particularly complex, since you are running tools on one machine to produce code that is to be placed on another machine with a potentially different architecture and operating environment.

You need to specify the kind of board you are using, and what serial port it is attached to. If you are compiling different programs for different boards, keeping track of what board gets what program can be interesting. You have to manage details such as making sure that you don't try to upload to a board while the serial monitor is also talking to the board. You also need to specify what libraries your program requires, and make sure that the right versions of the libraries are used for the type of board you are targeting.

The arduino-ua toolchain comes with a complex Makefile that manages these concerns for you. All you need to do is supply a few pieces of information. The template for this file is in
arduino-ua/mkfiles/Makefile-template
which you copy over to your Makefile in the directory where you are doing code development.

The main things you need to supply are described in the Makefile below. If you don't need anything other than the default libraries, you can use it as is.

code/MakefilesUA/Makefile

    # Template makefile for arduino-ua toolchain
    # Version 2.0 - 2012-11-07
     
    # Remember to `make clean` before `make upload` to a different board type.
    # Actually, when you resume development it's a good idea to start with a
    # `make clean`.
     
    # This is needed only if you have a main .ino file from a use of the IDE.
    # Normally you will only have .cpp files.
    TARGET = 
     
    # Two common board types, mega2560 and uno.
    # Either set your board type here or supply it on the make command as in
    #               `make upload BOARD_TAG=uno`
    BOARD_TAG = mega2560 
     
    # Identify any of the extra libraries that are not in the core collection
    # here.  If you don't have any, leave it undefined.
    # For example:
    #       ARDUINO_LIBS = SPI Adafruit_GFX Adafruit_ST7735 \
    #          Adafruit_SD Adafruit_SD/utility SD/utility UAUtils
    ARDUINO_LIBS = UAUtils
     
    # if there is a ARDUINO_UA_ROOT environment variable, it defines the
    # root of the arduino_ua install.  If not, we assume it is $(HOME)
    ifndef ARDUINO_UA_ROOT
      ARDUINO_UA_ROOT=$(HOME)
    endif
     
    # now set up all the toolchain defaults for the arduino-ua install
    include $(ARDUINO_UA_ROOT)/arduino-ua/mkfiles/ArduinoUA.mk
     
    # This is magic to define the names MEGA or UNO in C/C++ files, on the
    # basis of the board type.
    BOARD_DEFINE := $(shell echo $(BOARD_TAG) | tr 'a-z' 'A-Z' | tr -d '0-9')
     
    # You can also add extra symbols to be defined, like DEBUG 
    DEFINITIONS = $(BOARD_DEFINE) 
     
    # this puts the -D arguments into the format required by the compiler.
    DEFINES := ${DEFINITIONS:%=-D%}
     
    # Define your compiler flags. Remember to `+=` the rule.
    #CFLAGS += -Wall -Werror -std=c99
    #CXXFLAGS += -Wall -Werror
    CPPFLAGS += $(DEFINES) 
     
    # Override the default optimization levels here
    # For example, set to optimize level 0 to get the minimum amount of optimization
    # CPP_OPTIMIZE = -O0
    # C_OPTIMIZE = -O0
    # LD_OPTIMIZE = -O0
     
    # Sometimes it is a good idea to see what code is being generated by the 
    # compiler.  Set GENERATE_ASM to 1 to do this, and you will get .s files
    # in the build-cli directory.  To do this only on demand, you can say
    #    make GENERATE_ASM=1
    # So for example if you only want to look at the assembly for the main 
    # program, compile everything, touch the code, and do a make with the 
    # GENERATE_ASM set in the command.
    # GENERATE_ASM = 1 


24.4 Makefiles in General

Until otherwise stated, assume that you are working in a unix-like system. The $ at the beginning of the line indicates a command line prompt.

Here is a simple C program, hello.c, that you want to compile and run.
code/Makefiles/hello.c

    #include <stdio.h>
    int main(int argc, char* argv[]){
        printf("Bonjour tout le monde\n");
        }


This program is self-contained, in that it uses only the standard C libraries, and nothing else. To compile this program and produce an executable called hello, you can simply do:
$ make hello
The make program is a very powerful build manager with many default rules, so that if you ask it to make something in particular, like hello, it will look in the current directory for a source file hello.c and attempt to compile it. To see what make will do, without actually performing any actions, you can use the -n option and try
$ make -n hello
If you tried the earlier make command, you will probably get output that looks like this:
make: `hello` is up to date.
This is because hello.c has not changed, and make will in general not attempt to build something for which the building blocks have not changed. Make's notion of "not changed" is very simple. If a component, like hello.c, is older than the target being built, that is, hello, then it assumes that the target is up to date.

We can edit and make a change to hello.c, or simply fake a change to it by touching its time stamp, as with
$ touch hello.c
Now when you do
$ make -n hello
the output (under Ubuntu Linux) is
cc hello.c -o hello
which indicates that the C compiler named cc is being used for compilation.

The default use of cc may not be what we want. Suppose we always want to use gcc. Then we need to create our own custom Makefile that contains the single line that sets the variable CC to have the value gcc
CC = gcc
Note, tab characters are special in Makefiles, and are used to define "recipes", so everything else (including the variable definition above) should start without a tab.

Now when we do
$ touch hello.c
$ make -n hello
the output (under any Linux) is
gcc hello.c -o hello


24.5 Makefiles For Collections Of Files

In a more complicated situation where programs depend on each other, you need to put this dependency information in your Makefile.

For example, suppose we have a main program, construct.c, that depends on two modules, count and thing.
code/Makefiles/construct.c

    #include <stdio.h>
    #include "thing.h"
    #include "count.h"
     
    #ifndef NUM_THINGS
    #define NUM_THINGS 10
    #endif
     
    int main(int argc, char* argv[]) {
        int j;
     
        printf("Making some things\n");
        for (j=0; j < NUM_THINGS; j++) {
            int thing = make_thing();
            printf("Made thing %d\n", thing);
            }
     
        printf("We made %d things\n", get_count());
     
        }

code/Makefiles/count.c

    /* shared global counter */
    static int count = 0;
     
    int get_count() {
        return count;
        }
     
    int inc_count() {
        count++;
        return count;
        }

code/Makefiles/count.h

    /* shared global counter */
    /* include guard: the declarations are seen once per compilation */
    #ifndef _COUNT_H_
    #define _COUNT_H_
    extern int get_count();
    extern int inc_count();
    #endif

code/Makefiles/thing.c

    /* thing factory */
    #include "count.h"
     
    int make_thing() {
        return inc_count();
        }
        

code/Makefiles/thing.h

    /* thing factory */
    /* include guard: the declaration is seen once per compilation */
    #ifndef _THING_H_
    #define _THING_H_
    extern int make_thing(void);
    #endif


If we just try to
$ make construct
we get an error.
gcc construct.c -o construct
/tmp/ccfV1GKk.o: In function `main':
construct.c:(.text+0x20): undefined reference to `make_thing'
construct.c:(.text+0x4a): undefined reference to `get_count'
collect2: ld returned 1 exit status
make: *** [construct] Error 1
The problem is that make has no idea that construct requires two other object modules:
thing.o and count.o
in order to build the final executable.

If we add this dependency to the Makefile
construct : thing.o count.o
Then we get:
$ make construct
gcc construct.c thing.o count.o -o construct
Which produces the executable.

Now, if we change any file, the appropriate files will be rebuilt:
$ touch construct.c count.c
$ make construct
gcc -c -o count.o count.c
gcc construct.c thing.o count.o -o construct
Now, thing.c depends on count.c, because changing the interface to count will potentially affect thing (and construct). But if we change count.c, make does not know that thing needs to be recompiled:
$ make construct
gcc -c -o count.o count.c
gcc construct.c thing.o count.o -o construct
So we should add the other functional dependencies
thing.o : count.o
Also, the *.h files define interface dependencies and so should be mentioned too, but as prerequisites of the *.o files that depend on them:
thing.o : thing.h
count.o : count.h


If you are making the intermediate .o object file for construct, then you need a dependency like this:
construct.o : count.h thing.h
IMPORTANT NOTE: Do not add a dependency like this:
program.c : foo.h
because if foo.h is changed, make will want to remake program.c, and unless you have a way of generating C source files, you have a problem, since make potentially removes all targets.

Here is our resulting Makefile, which contains only dependencies and variable definitions, no recipes.
code/Makefiles/Makefile

    CC = gcc
    construct : thing.o count.o
    thing.o : count.o
     
    count.o : count.h
    thing.o : thing.h count.h



Tangible Computing / Version 3.20 2013-03-25