Introduction
Typically, embedded systems use a non-volatile storage device to store the software. Nevertheless, modifiable data or applications must reside in dynamic memory to benefit from their read/write capability as well as the increased execution speed they provide.
The transfer of any code or data between the two types of memory is achieved by a small piece of code, the loader.
The ZynQ loader functionality is split into two small programs: the BootROM code and the First Stage Bootloader (FSBL). The former executes at start-up and runs first, transferring the FSBL from non-volatile storage to the dynamic On Chip Memory (OCM). The latter then executes from OCM and copies the application from non-volatile memory to external DDR memory, where the application then executes.
The mission, if you choose to accept it…
Some time ago, I was asked to look into the feasibility of a DDR-less ZynQ embedded system, with a view to executing applications from the 256 Kbyte OCM.
This DDR-less configuration implies allocating two regions of OCM, one to host the First Stage Bootloader (FSBL) and the other for the application. Obviously, the FSBL cannot copy the application over the region it executes from, as it would overwrite itself.
The FSBL therefore takes space out of the original 256 Kbyte OCM. Having the FSBL execute from ROM or Flash, known as executing in place, instead of being loaded into OCM becomes an attractive alternative that leaves most of the OCM free for the application.
Scope
Typically, due to the space limitation, the application executing from OCM is a bare metal executable, as an RTOS and the application code together would simply not fit.
Therefore, this application note examines how to enable the ZynQ eXecute In Place (XIP) feature for a bootloader compiled and linked together with a bare metal application in a flash image, stored in QSPI, booting in non-secure mode. The bootloader then transfers control to the bare metal application loaded in OCM.
The methods used can also be applied to NOR Flash and to secure mode.
This document does not fully cover each tool (the PS configuration Wizard, the compiler, the linker, the flash image generator, etc.) or each file (the FSBL, the linker command file, the flash image, etc.). Instead, it provides a practical approach to developing for XIP, to the components involved, and to how they relate to one another.
From Hardware configuration to Flash image.
This chapter starts with a succinct description of what is currently supported by the tools and follows on with how to enable XIP before there is full support in later versions of the tools.
Currently it takes just a few minutes for a developer to build and run a bare metal executable image on ZynQ.
Support for this is provided by the tools, and descriptions are available in the User Guides “Zynq-7000 All Programmable SoC Software Developers Guide” (UG821) and “Zynq-7000 All Programmable SoC Technical Reference Manual” (UG585).
The Hardware tools (ZynQ configuration Wizard) generate files that describe the ZynQ Processing System (PS) configuration.
These files are then the sources from which the SDK software tools generate the BSP, the FSBL source files, the flash image, etc. that represent the custom embedded system.
Nevertheless, the configuration Wizard does not yet allow a ZynQ PS to be declared without DDR memory attached, so the software tools generate a BSP and an FSBL with DDR memory support.
The standard flow to create a flash image is summarised by the following chart:
The Xilinx SDK tools facilitate bootloader development by providing the First Stage BootLoader (FSBL) targeted at the embedded system created by the Hardware tools.
The flash image, generated by the flash image generator bootgen, executable from the SDK tools, starts with a header (the bootheader) containing information on the FSBL location within that image (the Load address) and where it should be executed from (The Execution address).
The flash image is in the MCS file format, ready to be programmed to flash using Xilinx Flash programmer, another tool accessible from SDK.
The MCS file uses little-endian byte ordering, i.e. the least significant byte of a 4-byte word goes into memory first. First, here, means the lower memory address. This matches the ARM endianness.
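The byte ordering described above can be illustrated with a minimal C sketch. The helper name read_le32 is hypothetical and is not part of the FSBL or of the tools; it only shows how four consecutive bytes, lowest address first, form one 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit word from four bytes stored in little-endian order:
 * the byte at the lowest address is the least significant one.
 * read_le32 is a hypothetical helper used for illustration only. */
uint32_t read_le32(const uint8_t *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

With this layout, the four bytes 8C 35 01 00, read from increasing addresses, form the word 0x0001358C.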
Booting ZynQ
The BootROM code searches for the synchronisation word within the header, which validates the BootHeader, then determines whether the FSBL needs to be loaded into OCM or executed in place, as per the following diagram:
As far as the BootROM code is concerned, the flash image requirements to enable XIP can be determined from UG585, table 6-3, which describes the BootHeader fields.
Here is an example of MCS file out of Bootgen, colour matched with the Bootheader fields from table 6-3 of the TRM:
The fields of interest to enable XIP are the “Length of Image” and the “Total Image Length”, at word offsets 0x34 and 0x40 respectively in the MCS file.
If these fields hold the value 0x0 in the BootHeader, the BootROM code jumps to the start of the FSBL, further down the image, at the location given as an offset by the field “Source Offset”, word 0x30 in the BootHeader.
In this example, the Length of Image value is 0x0001358C, the Total Image Length equals 0x0001358C and the offset of the FSBL from the top of the image, Source Offset, is 0x00000A80.
One note concerning word offset 0x38 marked as “reserved”. The documentation stipulates that it needs to be initialised to 0x0 as a general rule, but it should be emphasised that this is particularly important to enable XIP, otherwise XIP will not work. The current MCS example shows that the “reserved” word is not 0x0 but 0x01000000.
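The XIP condition described above can be summarised by the following C sketch. It is an illustration of the rule stated in this document, not the BootROM implementation: the function and macro names are hypothetical, and the offsets are assumed to be byte offsets from the start of the BootHeader, as listed in UG585 table 6-3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the BootHeader words discussed in the text
 * (assumed to be relative to the start of the BootHeader). */
#define BH_SOURCE_OFFSET 0x30u
#define BH_IMAGE_LENGTH  0x34u
#define BH_RESERVED_38   0x38u
#define BH_TOTAL_LENGTH  0x40u
#define BH_MIN_SIZE      0x44u /* bytes needed to cover the last field */

/* Read one little-endian 32-bit word at a byte offset in the header. */
uint32_t bh_word(const uint8_t *hdr, size_t offset)
{
    assert(hdr != NULL);
    return (uint32_t)hdr[offset]
         | ((uint32_t)hdr[offset + 1] << 8)
         | ((uint32_t)hdr[offset + 2] << 16)
         | ((uint32_t)hdr[offset + 3] << 24);
}

/* True when the three fields named in the text are all 0x0,
 * i.e. when the image is marked for execution in place. */
bool bootheader_requests_xip(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < BH_MIN_SIZE) {
        return false;
    }
    return bh_word(hdr, BH_IMAGE_LENGTH) == 0u
        && bh_word(hdr, BH_RESERVED_38) == 0u
        && bh_word(hdr, BH_TOTAL_LENGTH) == 0u;
}

/* Offset of the FSBL from the top of the image ("Source Offset"). */
uint32_t bootheader_source_offset(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL && len >= BH_MIN_SIZE);
    return bh_word(hdr, BH_SOURCE_OFFSET);
}
```

Applied to the example values quoted above (0x0001358C, 0x01000000 and 0x0001358C), the check reports that the image is not marked for XIP. It reports XIP only once all three words are zero.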
While creating a ZynQ boot image with the Bootgen utility from SDK, various attributes can be used to modify the flash image.
Provision for XIP has been made by introducing the “-static” attribute but its functionality is not yet available through the tools. This attribute, attached to the FSBL file, would populate with 0x0 the required fields to enable XIP.
At the moment, bootgen populates the three fields, “Length of Image”, “reserved” and the “Total Image Length” with non-zero values, matching the description provided in UG585.
Again “-static” would zero these fields.
A simple Tcl script can replace the “-static” function if run on the MCS file output by bootgen.
The tcl script provided is mcs_dev.tcl.
The first instruction line contains the MCS source file name. This name should be changed to the one corresponding to the MCS source file.
set MCS_file hello_DDR_Less_std
For convenience, the script should be placed in the same folder as the MCS source file.
mcs_dev.tcl modifies the required fields for XIP, recalculates the checksums of the modified lines and of the header, then copies the source MCS file content with the modified fields into a new file whose name is the MCS file name with an “XIP_” prefix.
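For reference, the two checksum calculations can be sketched in C as follows. This is not the code of mcs_dev.tcl, and the function names are hypothetical. The line checksum follows the Intel HEX rule used by MCS files. The header checksum is an assumption based on the TRM description (sum of the 32-bit words at offsets 0x20 to 0x44 inclusive, then inverted) and should be verified against UG585 table 6-3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Checksum of one MCS (Intel HEX) record: two's complement of the low
 * byte of the sum of all record bytes (byte count, address, record
 * type and data), excluding the checksum byte itself. */
uint8_t mcs_record_checksum(const uint8_t *record, size_t count)
{
    uint8_t sum = 0;
    assert(record != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        sum = (uint8_t)(sum + record[i]);
    }
    return (uint8_t)(0u - sum);
}

/* Number of 32-bit words from offset 0x20 to 0x44 inclusive. */
#define BH_CHECKSUM_WORDS 10u

/* ASSUMPTION: BootHeader checksum computed as the inverted sum of the
 * words at offsets 0x20..0x44; check UG585 before relying on it. */
uint32_t bootheader_checksum(const uint32_t *words)
{
    uint32_t sum = 0;
    assert(words != NULL);
    for (size_t i = 0; i < BH_CHECKSUM_WORDS; i++) {
        sum += words[i]; /* unsigned arithmetic wraps modulo 2^32 */
    }
    return ~sum;
}
```

Any line whose data bytes are changed needs its trailing checksum byte recomputed, otherwise the flash programmer is likely to reject the file. Zeroing header words also changes the header checksum, which is why the script updates both.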
Bootgen places the FSBL at an offset (the Source Offset in the BootHeader) from the beginning of the executable image it creates. This offset is a function of the number and size of the other input files that are stitched together to create the final image, so it can change if partitions are added to or removed from the original image. Bootgen, however, provides an option to fix this offset at a convenient address so that it does not change when bootgen is run again. When a custom offset is used, bootgen checks that it is valid and issues an error otherwise.
As for the FSBL start-of-execution address, at offset 0x3C in the BootHeader, Bootgen extracts it from the FSBL ELF file, where it has been calculated by the linker from the linker script directives.
The FSBL and linker Script
The linker script cannot be generic, because it must define the specific memory used for XIP, as well as other application-specific information.
Minor changes to the FSBL source allow a maximum re-use of the code, and therefore of its functionality.
The FSBL source code modifications include removing any code relative to the DDR memory.
ps7_init.c:
commenting out -> ps7_config (ps7_ddr_init_data)
commenting out -> if (ps7_config (ps7_ddr_init_data) == -1) return -1;
main.c:
commenting out ->
//#ifndef FSBL_PERF
//Status = Check_ddr_init();
//if (Status == XST_FAILURE) {
//fsbl_printf(DEBUG_GENERAL,"DDR_INIT_FAIL \r\n");
///* Error Handling here */
//OutputStatus(DDR_INIT_FAIL);
//FsblFallback();
//}
//#endif
The FSBL executable image comprises multiple program sections. Some of them require careful placement and handling.
The FSBL linker script is used for describing, via directives and commands, the physical memory and where each section of code resides and executes.
Because of this capability, a program section can have both a load address (LMA) and a run address (VMA).
The FSBL program code (.text section) resides in and executes from flash for XIP, whereas the FSBL program data (.data section) resides in Flash at power up but must reside in dynamic memory (OCM) during execution, because it contains modifiable data that needs read/write memory accesses.
Since the FSBL executes in place, the BootROM code does not copy it to OCM. While executing, the FSBL therefore has to copy its own data section to OCM, which is both readable and writeable.
This operation is done first thing, with the following lines of code placed at the start of main:
In main.c:
extern char _image_start, _dataLMA, _dataVMA_start, _dataVMA_end, _vectorscopy, __vectors_start, __vectors_end;

static void copy(char *src, char *dstStart, char *dstEnd) {
    while (dstStart < dstEnd) {
        *dstStart++ = *src++;
    }
}

#define WRITE_VEC_BASE_ADDR(value) mtcp(XREG_CP15_VEC_BASE_ADDR,value)
#define READ_VEC_BASE_ADDR(value) value = mfcp(XREG_CP15_VEC_BASE_ADDR)

int main(void)
{
#ifdef XPAR_PS7_DDR_0_S_AXI_BASEADDR
    u32 BootModeRegister = 0;
    u32 HandoffAddress;
    volatile u32 RebootStatusRegister = 0;
    volatile u32 MultiBootReg = 0;
    u32 ImageStartAddress = 0;
    u32 PartitionNumber = 0;
    u32 Status = XST_SUCCESS;
#endif
    copy(&_dataLMA,&_dataVMA_start,&_dataVMA_end);
_dataLMA: represents the start address of the data section in flash (the Load Memory Address)
_dataVMA_start: represents the start of the data section in OCM (The Virtual Memory Address)
_dataVMA_end : represents the end of the data section in OCM.
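The copy can be tried out on a host machine with the sketch below, in which ordinary arrays stand in for the flash and OCM regions. The name copy_section is hypothetical. On the target, the three addresses come from the linker script symbols described above rather than from arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Same loop as the FSBL copy() helper: bytes are read from the load
 * address and written until the run-address range [dst_start, dst_end)
 * is filled. The length is implied by the two destination symbols. */
void copy_section(const char *src, char *dst_start, const char *dst_end)
{
    assert(src != NULL && dst_start != NULL && dst_end != NULL);
    while (dst_start < dst_end) {
        *dst_start++ = *src++;
    }
}
```

A typical host-side call is copy_section(flash_image, ocm, ocm + n), where flash_image and ocm are hypothetical arrays and n is the section size. Note that, on the target, any initialised global variable holds an undefined value until this copy has run, which is why the call is the first statement of main and why the routine itself uses only its arguments and local variables.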
The FSBL linker script is customized so that, at link time, it provides values for the three addresses needed to copy the data section. The output of the linking process is an ELF executable.
The allocation information for the sections, such as section size and section run address, is part of the ELF file section header.
Bootgen extracts this information to create the flash image BootHeader and partition headers for the BootROM code to read.
The FSBL source code and linker script are provided in the XIP SDK package along with the mcs_dev.tcl file.
The XIP functionality has been developed using SDK 14.4.
As new functionalities and modifications are brought to the FSBL at every release of the tools, it is recommended to use XIP first with SDK 14.4 before migrating to later versions of the tool.
Practical example using the package for XIP.
1. Open the workspace with SDK 14.4.
2. Recompile the FSBL by cleaning the project. The FSBL is compiled with the debug options ON so that messages are output while executing. A warning appears:
section `.data' can't be allocated in segment 0
but ld (the GNU linker) will place the data section correctly at address 0x0. The warning arises because the VMA for .data is outside the virtual addresses covered by the linker script program header. (The warning should probably say something to that effect rather than "can't be allocated" when the section *is* allocated there.)
This warning can therefore be safely ignored.
To prove it, in the SDK project for xip_FSBL, simply double click the ELF file under Debug or Binaries to display it, then search for the following symbols:
.data is of size 0x1104.
_dataLMA symbol has value 0xfc009388, representing the address where the data section is loaded in Flash.
_dataVMA_start symbol has value 0x00000000, representing the address in OCM where the data section starts when the FSBL executes.
_dataVMA_end symbol has value 0x00001104, representing the address in OCM where the data section ends when the FSBL executes.
3. A “Hello World” application from the SDK tools is already generated. The only change compared to the default is the MEMORY region defined in the linker script. The region targeted is the free space in OCM, away from the data section, heap and stack.
- The heap is allocated the address range 0x1420 (symbol _heap_start) to 0x3420 (symbol _heap_end) with size 0x2000.
- The stack is allocated the address range 0x3420 (symbol _stack_end) up to 0x10020 (symbol __abort_stack) with size 0xcc00.
- The application can then be placed anywhere above address 0x3420 in OCM and its heap and stack could be placed in the portion of OCM from 0x0 to 0x3420 as these sections only start “existing” when the application is executed.
4. It is now possible to create a ZynQ boot image with the FSBL and the “Hello world” application.
5. The MCS file output by bootgen should be opened in an editor, and the “Source Offset” field should then be used as the start of the memory range available to the FSBL in its linker script. In the current case Source Offset = 0xA80.
6. From a Xilinx DOS prompt, executing the command xtclsh mcs_dev.tcl creates XIP_xip_Hello.mcs, which can be programmed into Flash with the Flash programmer from SDK.
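The OCM ranges quoted in steps 2 and 3 can be cross-checked with a short C sketch. It assumes that the .data, heap and stack ranges listed there all belong to the FSBL. The type and function names are hypothetical, and each range is treated as half-open, i.e. the end address is excluded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
typedef struct {
    uint32_t start;
    uint32_t end;
} region_t;

/* Size of a region in bytes. */
uint32_t region_size(region_t r)
{
    assert(r.end >= r.start);
    return r.end - r.start;
}

/* Two half-open ranges overlap when each starts before the other ends. */
bool regions_overlap(region_t a, region_t b)
{
    return a.start < b.end && b.start < a.end;
}

/* Values quoted in the practical example (assumed to be FSBL regions). */
const region_t fsbl_data  = { 0x00000000u, 0x00001104u }; /* _dataVMA_start.._dataVMA_end */
const region_t fsbl_heap  = { 0x00001420u, 0x00003420u }; /* _heap_start.._heap_end */
const region_t fsbl_stack = { 0x00003420u, 0x00010020u }; /* _stack_end..__abort_stack */
```

The computed sizes are 0x1104, 0x2000 and 0xCC00 bytes, which agree with the figures given in the text, and no two of the three ranges overlap. The same helpers can be reused to check a candidate MEMORY region for the application against these ranges before linking.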
© Copyright 2013 Xilinx