From Zero to main(): Demystifying Firmware Linker Scripts | Interrupt

This is the second post in our Zero to main() series.

Last time, we talked about bootstrapping a C environment on an MCU before invoking our main function. One thing we took for granted was the fact that functions and data end up in the right place in our binary. Today, we’re going to dig into how that happens by learning about memory regions and linker scripts.

You may remember the following things happening auto-magically:

  1. We used variables like &_ebss, &_sdata, …etc. to know where each of our sections was placed in flash and to define where some needed to go in RAM.
  2. A pointer to our Reset_Handler was found at address 0x00000004 for the MCU to find.

While these things are true for many projects, they are held together at best by convention, at worst by generations of copy/paste engineering. You’ll find some MCUs have different memory maps, some startup scripts name those variables differently, and some programs have more or fewer segments.

Since they are not standardized, those things need to be specified somewhere in our project. In the case of projects linked with a Unix-ld-like tool, that somewhere is the linker script.

Once again, we will use our simple “minimal” program, available on GitHub.


Brief Primer on Linking

Linking is the last stage in compiling a program. It takes a number of compiled object files and merges them into a single program, filling in addresses so that everything is in the right place.

Prior to linking, the compiler will have taken your source files one by one and compiled them into machine code. In the process, it leaves placeholders for addresses as (1) it does not know where the code will end up within the broader structure of the program and (2) it knows nothing about symbols outside of the current file or compilation unit.

The linker takes all of those object files and merges them together along with external dependencies like the C Standard Library into your program. To figure out which bits go where, the linker relies on a linker script - a blueprint for your program. Lastly, all placeholders are replaced by addresses.

We can see this at play in our minimal program. Let’s follow what happens to our main function in minimal.c for example. The compiler builds it into an object file with:

$ arm-none-eabi-gcc -c -o build/objs/a/b/c/minimal.o minimal.c <CFLAGS>

We can dump symbols in minimal.o to look at main within it:

$ arm-none-eabi-nm build/objs/a/b/c/minimal.o
...
00000000 T main
...

As expected, it does not have addresses yet. We then link everything with:

$ arm-none-eabi-gcc <LDFLAGS> build/objs/a/b/c/minimal.o <other object files> -o build/minimal.elf

And dump the symbols in the resulting elf file:

$ arm-none-eabi-nm build/minimal.elf
...
00000294 T main
...

The linker has done its job, and our main symbol has been assigned an address.

The linker often does a bit more than that. For example, it can generate debug information, garbage collect unused sections of code, or run whole-program optimization (also known as Link-Time Optimization, or LTO). For the sake of this conversation, we will not cover these topics.

For more information on the linker, there’s a great thread on StackOverflow.

Anatomy of a Linker Script

A linker script contains four things:

  • Memory layout: what memory is available where
  • Section definitions: what part of a program should go where
  • Options: commands to specify architecture, entry point, …etc. if needed
  • Symbols: variables to inject into the program at link time

Memory Layout

In order to allocate program space, the linker needs to know how much memory is available, and at what addresses that memory exists. This is what the MEMORY definition in the linker script is for.

The syntax for MEMORY is defined in the binutils docs and is as follows:

MEMORY
  {
    name [(attr)] : ORIGIN = origin, LENGTH = len
    …
  }

Where

  • name is a name you want to use for this region. Names do not carry meaning, so you’re free to use anything you want. You’ll often find “flash” and “ram” as region names.
  • (attr) are optional attributes for the region, like whether it’s writable (w), readable (r), or executable (x). Flash memory is usually (rx), while RAM is (rwx). Marking a region as non-writable does not magically make it write protected: these attributes are meant to describe the properties of the memory, not set them.
  • origin is the start address of the memory region.
  • len is the size of the memory region, in bytes.

The memory map for the SAMD21G18 chip we’ve got on our board can be found in its datasheet in table 10-1, reproduced below.

SAMD21G18 Memory Map

Memory          Start Address   Size
Internal Flash  0x00000000      256 Kbytes
Internal SRAM   0x20000000      32 Kbytes

Transcribed into a MEMORY definition, this gives us:

MEMORY
{
  rom      (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}
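As a quick sanity check on the arithmetic, the sketch below mirrors the two regions in C and confirms that the datasheet sizes convert to the hex LENGTH values used above. The struct and function names (mem_region, region_end, region_contains, check_memory_map) are illustrative only and not part of any toolchain; region_contains merely approximates the kind of overflow check a linker performs when it places a section in a region.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mirror of one MEMORY entry; not a real toolchain type. */
struct mem_region {
    const char *name;
    uint32_t origin;  /* ORIGIN: first address of the region */
    uint32_t length;  /* LENGTH: size of the region in bytes */
};

/* Values transcribed from the SAMD21G18 memory map above. */
static const struct mem_region rom = { "rom", 0x00000000u, 256u * 1024u };
static const struct mem_region ram = { "ram", 0x20000000u,  32u * 1024u };

/* First address past the end of a region. */
static uint32_t region_end(const struct mem_region *r)
{
    return r->origin + r->length;
}

/* Returns 1 if [addr, addr + size) fits entirely inside the region. */
static int region_contains(const struct mem_region *r, uint32_t addr, uint32_t size)
{
    return addr >= r->origin
        && size <= r->length
        && addr - r->origin <= r->length - size;
}

/* The hex LENGTH values in the linker script match the datasheet sizes. */
static void check_memory_map(void)
{
    assert(rom.length == 0x00040000u);
    assert(ram.length == 0x00008000u);
    assert(region_end(&ram) == 0x20008000u);
}
```

For instance, an 8 kB block starting at 0x20006000 fits in ram, while the same block starting at 0x20007000 would run past the end of the region.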

Section Definitions

Code and data are bucketed into sections, which are contiguous areas of memory. There are no hard rules about how many sections you should have, or what they should be, but you typically want to put symbols in the same section if:

  1. They should be in the same region of memory, or
  2. They need to be initialized together.

In our previous post, we learned about two types of symbols that are initialized in bulk:

  1. Initialized static variables which must be copied from flash
  2. Uninitialized static variables which must be zeroed.

Our linker script concerns itself with two more things:

  1. Code and constant data, which can live in read-only memory (e.g. flash)
  2. Reserved sections of RAM, like a stack or a heap

By convention, we name those sections as follows:

  1. .text for code & constants
  2. .bss for uninitialized data
  3. .stack for our stack
  4. .data for initialized data

The ELF spec holds a full list. Your firmware will work just fine if you call them anything else, but your colleagues may be confused and some tools may fail in odd ways. The only constraint is that you may not call your section /DISCARD/, which is a reserved keyword.

First, let’s look at what happens to our symbols if we do not define any of those sections in the linker script.

MEMORY
{
  rom      (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

SECTIONS
{
    /* empty! */
}

The linker is perfectly happy to link our program with this. Probing the resulting elf file with objdump, we see the following:

$ arm-none-eabi-objdump -t build/minimal.elf

build/minimal.elf:     file format elf32-littlearm

SYMBOL TABLE:
no symbols

No symbols! While the linker is able to make assumptions that allow it to link in symbols with little information, it at least needs to know either what the entry point should be, or what symbols to put in the text section.

.text Section

Let’s start by adding our .text section. We want that section in ROM. The syntax is simple:

SECTIONS
{
    .text :
    {
    } > rom
}

This defines a section named .text, and adds it to the ROM. We now need to tell the linker what to put in that section. This is accomplished by listing all of the sections from our input object files we want in .text.

To find out what sections are in our object file, we can once again use objdump:

$ arm-none-eabi-objdump -h build/objs/a/b/c/minimal.o

build/objs/a/b/c/minimal.o:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000034  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000034  2**0
                  ALLOC
  3 .bss.cpu_irq_critical_section_counter 00000004  00000000  00000000  00000034  2**2
                  ALLOC
  4 .bss.cpu_irq_prev_interrupt_state 00000001  00000000  00000000  00000034  2**0
                  ALLOC
  5 .text.system_pinmux_get_group_from_gpio_pin 0000005c  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  6 .text.port_get_group_from_gpio_pin 00000020  00000000  00000000  00000090  2**1
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  7 .text.port_get_config_defaults 00000022  00000000  00000000  000000b0  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  8 .text.port_pin_set_output_level 0000004e  00000000  00000000  000000d2  2**1
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  9 .text.port_pin_toggle_output_level 00000038  00000000  00000000  00000120  2**1
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 10 .text.set_output 00000040  00000000  00000000  00000158  2**1
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 11 .text.main    0000002c  00000000  00000000  00000198  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE

We see that each of our symbols has its own section. This is due to the fact that we compiled our firmware with the -ffunction-sections and -fdata-sections flags. Had we not included them, the compiler would have been free to merge several functions into a single .text.<some identifier> section.

To put all of our functions in the .text section in our linker script, we use the following syntax: <filename>(<section>), where filename is the name of the input files whose symbols we want to include, and section is the name of the input sections. Since we want all .text... sections in all files, we use the wildcard *:

.text :
{
    KEEP(*(.vector*))
    *(.text*)
} > rom

Note the .vector input section, which contains functions we want to keep at the very start of our .text section. This is so the Reset_Handler is where the MCU expects it to be. We’ll talk more about the vector table in a future post.
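To get a feel for which input sections a pattern picks up, here is a small host-side sketch. It assumes that POSIX fnmatch() is a reasonable stand-in for the shell-style wildcards ld accepts for input section names; it is an approximation for experimenting on a desktop, not ld's actual matcher, and section_matches is a made-up helper name.

```c
#include <assert.h>
#include <fnmatch.h>  /* POSIX shell-style pattern matching */
#include <stddef.h>

/*
 * Returns 1 if an input section name matches a linker-script style
 * pattern, 0 otherwise. Approximation only: uses fnmatch() rather
 * than ld's own wildcard code.
 */
static int section_matches(const char *pattern, const char *section)
{
    assert(pattern != NULL && section != NULL);
    return fnmatch(pattern, section, 0) == 0;
}
```

With this helper, section_matches(".text*", ".text.main") and section_matches(".text*", ".text") both return 1, and section_matches(".vector*", ".vectors") returns 1 as well. A stricter pattern such as ".text.*" no longer matches a plain ".text" section, which is worth keeping in mind when choosing between the two spellings.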

Dumping our elf file, we now see all of our functions (but no data)!

$ arm-none-eabi-objdump -t build/minimal.elf

build/minimal.elf:     file format elf32-littlearm

SYMBOL TABLE:
00000000 l    d  .text  00000000 .text
...
00000000 l    df *ABS*  00000000 minimal.c
00000000 l     F .text  0000005c system_pinmux_get_group_from_gpio_pin
0000005c l     F .text  00000020 port_get_group_from_gpio_pin
0000007c l     F .text  00000022 port_get_config_defaults
0000009e l     F .text  0000004e port_pin_set_output_level
000000ec l     F .text  00000038 port_pin_toggle_output_level
00000124 l     F .text  00000040 set_output
00000000 l    df *ABS*  00000000 port.c
00000190 l     F .text  00000028 system_pinmux_get_config_defaults
00000000 l    df *ABS*  00000000 pinmux.c
00000208 l     F .text  0000005c system_pinmux_get_group_from_gpio_pin
00000264 l     F .text  00000110 _system_pinmux_config
00000164 g     F .text  0000002c main
000001b8 g     F .text  0000004e port_pin_set_config
00000374 g     F .text  00000040 system_pinmux_pin_set_config
...
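The addresses in this symbol table can be reproduced by hand. The sketch below is a toy model, not how ld is implemented: it places the .text.* input sections of minimal.o back to back, starting at address 0 and honoring each section's alignment, using the sizes and alignments from the objdump -h listing earlier. The names input_section, align_up and place_sections are made up for this illustration, and it assumes minimal.o is the first object placed in .text, as the dump suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One input section, with size and alignment as reported by objdump -h. */
struct input_section {
    const char *name;
    uint32_t size;   /* in bytes */
    uint32_t align;  /* in bytes, power of two (2**2 in objdump means 4) */
    uint32_t addr;   /* filled in by place_sections() */
};

/* Rounds addr up to the next multiple of align (a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Places sections back to back from `start`; returns the end address. */
static uint32_t place_sections(struct input_section *s, size_t n, uint32_t start)
{
    uint32_t loc = start;  /* plays the role of a location counter */
    for (size_t i = 0; i < n; i++) {
        loc = align_up(loc, s[i].align);
        s[i].addr = loc;
        loc += s[i].size;
    }
    return loc;
}

/* .text.* sections of minimal.o, in the order objdump -h lists them. */
static struct input_section minimal_text[] = {
    { ".text.system_pinmux_get_group_from_gpio_pin", 0x5c, 4, 0 },
    { ".text.port_get_group_from_gpio_pin",          0x20, 2, 0 },
    { ".text.port_get_config_defaults",              0x22, 2, 0 },
    { ".text.port_pin_set_output_level",             0x4e, 2, 0 },
    { ".text.port_pin_toggle_output_level",          0x38, 2, 0 },
    { ".text.set_output",                            0x40, 2, 0 },
    { ".text.main",                                  0x2c, 4, 0 },
};
```

Calling place_sections(minimal_text, 7, 0) puts .text.main at 0x164, which matches the address of main in the dump, and returns 0x190, which is where the first function from port.c begins.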

.bss Section

Now, let’s take care of our .bss. Remember, this is the section we put uninitialized static memory in. .bss must be reserved in the memory map, but there is nothing to load, as all variables are initialized to zero. As such, this is what it should look like:

SECTIONS {
    ...
    .bss (NOLOAD) :
    {
        *(.bss*)
        *(COMMON)
    } > ram
}

You’ll note that the .bss section also includes *(COMMON). This is a special input section where the compiler puts global uninitialized variables that go beyond file scope. int foo; goes there, while static int foo; does not. This allows the linker to merge multiple definitions into one symbol if they have the same name.

We indicate that this section is not loaded with the NOLOAD type. This is the only output section type commonly used in modern linker scripts.

.stack Section

We do the same thing for our .stack memory, since it is in RAM and not loaded. As the stack contains no symbols, we must explicitly reserve space for it by indicating its size. We also must align the stack on an 8-byte boundary per the ARM Procedure Call Standard (AAPCS).

In order to achieve these goals, we turn to a special variable ., also known as the “location counter”. The location counter tracks the current offset into a given memory region. As sections are added, the location counter increments accordingly. You can force alignment or gaps by setting the location counter forward. You may not set it backwards, and the linker will throw an error if you try.

We set the location counter with the ALIGN function, to align the section, and use simple assignment and arithmetic to set the section size:

STACK_SIZE = 0x2000; /* 8 kB */

SECTIONS {
    ...
    .stack (NOLOAD) :
    {
        . = ALIGN(8);
        . = . + STACK_SIZE;
        . = ALIGN(8);
    } > ram
    ...
}
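The three location-counter statements can be mimicked in plain C to see what they compute. This is only a sketch of the arithmetic: stack_bounds and reserve_stack are hypothetical names, and the starting address used in the example afterwards is arbitrary, since the real value depends on whatever was placed in ram before the stack.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SIZE 0x2000u  /* 8 kB, same value as in the linker script */

/* Rounds addr up to the next multiple of align (a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Lowest address of the reserved block and the first address past it. */
struct stack_bounds {
    uint32_t bottom;
    uint32_t top;
};

/* Mirrors:  . = ALIGN(8);  . = . + STACK_SIZE;  . = ALIGN(8);  */
static struct stack_bounds reserve_stack(uint32_t loc)
{
    struct stack_bounds b;
    loc = align_up(loc, 8);  /* . = ALIGN(8);        */
    b.bottom = loc;
    loc += STACK_SIZE;       /* . = . + STACK_SIZE;  */
    loc = align_up(loc, 8);  /* . = ALIGN(8);        */
    b.top = loc;
    return b;
}
```

If the location counter happened to sit at 0x20000005 when the .stack section begins, the block would span 0x20000008 to 0x20002008. On Cortex-M, where the stack grows downward, the initial stack pointer would typically be set to the top of that range.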

Only one more section to go!

.data Section

The .data section contains static variables which have an initial value at boot. You will remember from our previous article that since RAM isn’t persisted while power is off, those sections need to be loaded from flash. At boot, the Reset_Handler copies the data from flash to RAM before the main function is called.

To make this possible, every section in our linker script has two addresses, its load address (LMA) and its virtual address (VMA). In a firmware context, the LMA is where your JTAG loader needs to place the section and the VMA is where the section is found during execution.

You can think of the LMA as the address “at rest” and the VMA as the address during execution, i.e. when the device is on and the program is running.

The syntax to specify the LMA and VMA is relatively straightforward: every address has two parts: <VMA> AT <LMA>. In our case it looks like this:

.data :
{
    *(.data*);
} > ram AT > rom  /* "> ram" is the VMA, "> rom" is the LMA */

Note that instead of appending a section to a memory region, you could also explicitly specify an address like so:

.data ORIGIN(ram) /* VMA */ : AT(ORIGIN(rom)) /* LMA */
{
    . = ALIGN(4);
    _sdata = .;
    *(.data*);
    . = ALIGN(4);
    _edata = .;
}

Where ORIGIN(<region>) is a simple way to specify the start of a region. You can enter an address in hex as well.

And we’re done! Here’s our complete linker script with every section:

Complete Linker Script

MEMORY
{
  rom      (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

STACK_SIZE = 0x2000;

/* Section Definitions */
SECTIONS
{
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
    } > rom

    /* .bss section which is used for uninitialized data */
    .bss (NOLOAD) :
    {
        *(.bss*)
        *(COMMON)
    } > ram

    .data :
    {
        *(.data*);
    } > ram AT >rom

    /* stack section */
    .stack (NOLOAD):
    {
        . = ALIGN(8);
        . = . + STACK_SIZE;
        . = ALIGN(8);
    } > ram

    _end = . ;
}

You can find the full details on linker script sections syntax in the ld manual.

Variables

In the first post, our Reset_Handler relied on seemingly magic variables to know the address of each of our sections of memory. It turns out, those variables came from the linker script!

In order to make section addresses available to code, the linker is able to generate symbols and add them to the program.

You can find the syntax in the linker documentation. It looks exactly like a C assignment: symbol = expression;

Here, we need:

  1. _etext the end of the code in .text section in flash.
  2. _sdata the start of the .data section in RAM
  3. _edata the end of the .data section in RAM
  4. _sbss the start of the .bss section in RAM
  5. _ebss the end of the .bss section in RAM

They are all relatively straightforward: we can assign our symbols to the value of the location counter (.) at the start and at the end of each section definition.

The code is below:

    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > rom

    .bss (NOLOAD) :
    {
        _sbss = . ;
        *(.bss .bss.*)
        *(COMMON)
        _ebss = . ;
    } > ram

    .data :
    {
        _sdata = .;
        *(.data*);
        _edata = .;
    } > ram AT >rom

One quirk of these linker-provided symbols: you must use a reference to them, never the variables themselves. For example, the following gets us a pointer to the start of the .data section:

uint8_t *data_byte = &_sdata;

You can read more details about this in the binutils docs.
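To tie the symbols back to the startup code, here is a hedged, host-runnable sketch of the copy-and-zero step. The function init_ram and the fake_flash and fake_ram arrays are stand-ins invented for this example so it can run on a desktop. On the target you would instead declare the linker symbols (for instance extern uint8_t _etext, _sdata, _edata, _sbss, _ebss;) and pass their addresses, assuming the load address of .data immediately follows _etext as in the script above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies .data from its load address to its run-time address and
 * zeroes .bss. On target the call would look like:
 *     init_ram(&_etext, &_sdata, &_edata, &_sbss, &_ebss);
 */
static void init_ram(const uint8_t *data_lma,
                     uint8_t *sdata, uint8_t *edata,
                     uint8_t *sbss, uint8_t *ebss)
{
    assert(edata >= sdata && ebss >= sbss);
    memcpy(sdata, data_lma, (size_t)(edata - sdata));
    memset(sbss, 0, (size_t)(ebss - sbss));
}

/* Host-side stand-ins for flash and RAM contents (hypothetical). */
static const uint8_t fake_flash[4] = { 0xde, 0xad, 0xbe, 0xef };
static uint8_t fake_ram[8] = { 0xff, 0xff, 0xff, 0xff,
                               0xff, 0xff, 0xff, 0xff };
```

In this toy layout the first four bytes of fake_ram play the role of .bss and the last four the role of .data, mirroring the order of the two sections in the linker script. After init_ram(fake_flash, &fake_ram[4], &fake_ram[8], &fake_ram[0], &fake_ram[4]) the .bss bytes are zero and the .data bytes hold the values stored in fake_flash.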

Closing

I hope this post gave you confidence in writing your own linker scripts.

In my next post, we’ll talk about writing a bootloader to assist with loading and starting your application.

EDIT: Post written! - Writing a Bootloader from Scratch

As with previous posts, code examples are available on GitHub in the zero to main repository.

