From b0679cbeeebba6424cdced880338e0076b26515d Mon Sep 17 00:00:00 2001 From: Ludovic Vanasse Date: Sun, 20 Oct 2024 15:00:46 -0400 Subject: [PATCH] Doc: Migrate NuttX Protected Build Migrate https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Protected+Build to official wiki Signed-off-by: Ludovic Vanasse --- Documentation/guides/index.rst | 3 +- Documentation/guides/protected_build.rst | 449 +++++++++++++++++++++++ 2 files changed, 451 insertions(+), 1 deletion(-) create mode 100644 Documentation/guides/protected_build.rst diff --git a/Documentation/guides/index.rst b/Documentation/guides/index.rst index bd0def0189..af2102a040 100644 --- a/Documentation/guides/index.rst +++ b/Documentation/guides/index.rst @@ -46,4 +46,5 @@ Guides versioning_and_task_names.rst logging_rambuffer.rst ipv6.rst - integrate_newlib.rst \ No newline at end of file + integrate_newlib.rst + protected_build.rst \ No newline at end of file diff --git a/Documentation/guides/protected_build.rst b/Documentation/guides/protected_build.rst new file mode 100644 index 0000000000..c5ad95e06d --- /dev/null +++ b/Documentation/guides/protected_build.rst @@ -0,0 +1,449 @@ +===================== +NuttX Protected Build +===================== + +.. warning:: + Migrated from : + https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Protected+Build + +The Traditional "Flat" Build +============================ + +The traditional NuttX build is a "flat" build. By flat, I mean that when +you build NuttX, you end up with a single "blob" called ``nuttx``. All of the +components of the build reside in the same address space. All components +of the build can access all other components of the build. + +The "Two Pass" Protected Build +============================== + +The NuttX protected build, on the other hand, is a "two-pass" build and +generates two "blobs": (1) a separately compiled and linked `kernel` blob +called, again, `nuttx` and separately compiled and linked `user` blob called +in ``nuttx_user.elf`` (in the existing build configurations). The user blob +is created on pass 1 and the kernel blob is created on pass2. + +These two make commands are identical: + +.. code-block:: bash + + make + make pass1 pass2 + +But the second is clearer and I prefer to use it for the protected build. +In the second case, the user and kernel blobs are built separately; in the +first, the kernel and user blob builds may be intermixed and somewhat +confusing. You can also build the kernel and user blobs separately with +one of the following commands: + +.. code-block:: bash + + make pass1 + make pass2 + +At the end of the build, there will be several files in the top-level NuttX build directory. From Pass 1: + +* ``nuttx_user.elf``. The pass1 user-space ELF file +* ``nuttx_user.hex``. The pass1 Intel HEX format file (selected in ``defconfig``) +* ``User.map``. Symbols in the user-space ELF file + +From Pass 2: + +* ``nuttx``. The pass2 kernel-space ELF file +* ``nuttx.hex``. The pass2 Intel HEX file (selected in ``defconfig``) +* ``System.map``. Symbols in the kernel-space ELF file + +The Memory Protection Unit +========================== + +If the MCU supports a Memory Protection Unit (MPU), then the logic within +the kernel blob all execute in kernel-mode, i.e., with all privileges. +These privileged threads can access all memory, all CPU instructions, +and all MCU registers. The logic executing within the user-mode blob, +on the other hand, all execute in user-mode with certain restrictions +as enforced by the MCU and by the MPU. The MCU may restrict access to +certain registers and machine instructions; with the MPU, access to all +kernel memory resources are prohibited from the user logic. This includes +the kernel blob's FLASH, .bss/.data storage, and the kernel heap memory. + +Advantages of the Protected Build +================================= + +The advantages of such a protected build are (1) security and (2) +modularity. Since the kernel resources are protected, it will be much +less likely that a misbehaving task will crash the system or that a +wild pointer access will corrupt critical memory. This security also +provides a safer environment in which to execute 3rd party software +and prevents "snooping" into the kernel memory from the hosted applications. + +Modularity is assured because there is a strict control of the exposed +kernel interfaces. In the flat build, all symbols are exposed and there +is no enforcement of a kernel API. With the protected build, on the +other hand, all interactions with the kernel from the user application +logic must use `system calls` (or `syscalls`) to interface with the OS. A +system call is necessary to transition from user-mode to kernel-mode; +all user-space operating system interfaces are via syscall `proxies`. +Then, while in kernel mode, the kernel system call handler will +perform the OS service requested by the application. At the +conclusion of system processing, user-privileges are restored +and control is return to the user application. Since the only +interactions with the kernel can be through support system calls, +modularity of the OS is guaranteed. + +User-Space Proxies/Kernel-Space Stubs +===================================== + +The same OS interfaces are exposed to the application in both the "flat" +build and the protected build. The difference is that in the protected +build, the user-code interfaces with a `proxy` for the OS function. For +example, here is what a proxy for the OS ``getpid()`` interface: + +.. code-block:: c + + #include + #include + pid_t getpid(void) + { + return (pid_t)sys_call0(SYS_getpid); + } + +Thus the ``getpid()`` proxy is a stand-in for the real OS ``getpid()`` interface +that executes a system call so the kernel code can perform the real +``getpid()`` operation on behalf of the user application. Proxies are +auto-generated for all exported OS interfaces using the CSV file +``syscall/syscall.csv`` and the program ``tools/mksyscalls``. Similarly, +on the kernel-side, there are auto-generated `stubs` that map the +system calls back into real OS calls. These, however, are internal +to the OS and the implementation may be architecture-specific. +See the ``README.txt`` files in those directories for further information. + +Combining Intel HEX Files +========================= + +One issue that you may face is that the two pass builds creates two +FLASH images. Some debuggers that I use will allow me to write each +image to FLASH separately. Others will expect to have a single Intel +HEX image. In this latter case, you may need to combine the two Intel +HEX files into one. Here is how you can do that: + +1) The `tail` of the ``nuttx.hex`` file should look something like this + (with my comments and spaces added): + +.. code-block:: bash + + $ tail nuttx.hex + # 00, data records + ... + :10 9DC0 00 01000000000800006400020100001F0004 + :10 9DD0 00 3B005A0078009700B500D400F300110151 + :08 9DE0 00 30014E016D0100008D + # 05, Start Linear Address Record + :04 0000 05 0800 0419 D2 + # 01, End Of File record + :00 0000 01 FF + +Use an editor such as vi to remove the 05 and 01 records. + +2) The `head` of the ``nuttx_user.hex`` file should look something like this + (again with my comments and spaces added): + +.. code-block:: bash + + $ head nuttx_user.hex + # 04, Extended Linear Address Record + :02 0000 04 0801 F1 + # 00, data records + :10 8000 00 BD89 01084C800108C8110208D01102087E + :10 8010 00 0010 00201C1000201C1000203C16002026 + :10 8020 00 4D80 01085D80010869800108ED83010829 + ... + +Nothing needs to be done here. The ``nuttx_user.hex`` file should be fine. + +3) Combine the edited nuttx.hex and un-edited ``nuttx_user.hex`` file to produce + a single combined hex file: + +.. code-block:: bash + + $ cat nuttx.hex nuttx_user.hex >combined.hex + +Then use the ``combined.hex`` file with for FLASH/JTAG tool. If you do this +a lot, you will probably want to invest a little time to develop a tool +to automate these steps. + +Files and Directories +===================== + +Here is a summary of directories and files used by the STM32F4Discovery +protected build: + +* ``boards/arm/stm32/stm32f4discovery/configs/kostest``. This is the kernel + mode OS test configuration. The two standard configuration files + can be found in this directory: (1) ``defconfig`` and (2) ``Make.defs``. +* ``boards/arm/stm32/stm32f4discovery/kernel``. This is the first past + build directory. The Makefile in this directory is invoked to + produce the pass1 object (``nuttx_user.elf`` in this case). The + second pass object is created by ``arch/arm/src/Makefile``. Also + in this directory is the file ``userspace.c``. The user-mode blob + contains a header that includes information need by the kernel + blob in order to interface with the user-code. That header is + defined in by this file. +* ``boards/arm/stm32/stm32f4discovery/scripts``. Linker scripts for + the kernel mode build are found in this directory. This includes + (1) ``memory.ld`` which hold the common memory map, (2) ``user-space.ld`` + that is used for linking the pass1 user-mode blob, and (3) + ``kernel-space.ld`` that is used for linking the pass1 kernel-mode blob. + +Alignment, Regions, and Subregions +================================== + +There are some important comments in the ``memory.ld`` +file that are worth duplicating here: + +"The STM32F407VG has 1024Kb of FLASH beginning at address +0x0800:0000 and 192Kb of SRAM. SRAM is split up into three blocks: + +* "112KB of SRAM beginning at address 0x2000:0000 +* "16KB of SRAM beginning at address 0x2001:c000 +* "64KB of CCM SRAM beginning at address 0x1000:0000 + +"When booting from FLASH, FLASH memory is aliased to address +0x0000:0000 where the code expects to begin execution by jumping +to the entry point in the 0x0800:0000 address range. + +"For MPU support, the kernel-mode NuttX section is assumed to +be 128Kb of FLASH and 4Kb of SRAM. That is an excessive amount +for the kernel which should fit into 64KB and, of course, can +be optimized as needed... Allowing the additional memory does +permit addition debug instrumentation to be added to the kernel +space without overflowing the partition. + +"Alignment of the user space FLASH partition is also a critical +factor: The user space FLASH partition will be spanned with a +single region of size 2||n bytes. The alignment of the user-space +region must be the same. As a consequence, as the user-space +increases in size, the alignment requirement also increases. + +"This alignment requirement means that the largest user space +FLASH region you can have will be 512KB at it would have to be +positioned at 0x08800000. If you change this address, don't +forget to change the ``CONFIG_NUTTX_USERSPACE`` configuration +setting to match and to modify the check in ``kernel/userspace.c``. + +"For the same reasons, the maximum size of the SRAM mapping is +limited to 4KB. Both of these alignment limitations could be +reduced by using multiple MPU regions to map the FLASH/SDRAM +range or perhaps with some clever use of subregions." + +Memory Management +================= + +At present, there are two options for memory management in the +NuttX protected build: + +Single User Heap +---------------- + +By default, there is only a single user-space heap and heap +allocator that is shared by both kernel- and user-modes. +PROs: Simple and makes good use of the heap memory space, +CONs: Awkward architecture and no security for kernel-mode +allocations. + +Dual, Partitioned Heaps +----------------------- + +Two configuration options can change this behavior: + +* ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager interfaces + so that multiple heaps can be supported. +* ``CONFIG_MM_KERNEL_HEAP=y``. Uses the multi-heap capability to enable + a kernel heap + +If this both options are defined defined, the two heap partitions and +two copies of the memory allocators are built: + +One un-protected heap partition that will allocate user accessible memory +that is shared by both the kernel- and user-space code. That allocator +physically resides in the user address space so that it can be called +directly by both the user- and kernel-space code. There is a header at +the beginning of the user-space blob; the kernel-space code gets +address of the user-space allocator from this header. + +And another protected heap partition that will allocate protected +memory that is only accessible from the kernel code. This allocator +is built into the kernel block. This separate protected heap is +required if you want to support security features. + +NOTE: There are security issues with calling into the user space +allocators in kernel mode. That is a security hole that could be +exploit to gain control of the system! Instead, the kernel code +should switch to user mode before entering the memory allocator +stubs (perhaps via a trap). The memory allocator stubs should +then trap to return to kernel mode (as does the signal handler now). + +The Traditional Approach +------------------------ + +A more traditional approach would use something like the interface +``sbrk()``. The ``sbrk()`` function adds memory to the heap space +allocation of the calling process. In this case, there would +still be kernel- and user-mode instances of the memory allocators. +Each would ``sbrk()`` as necessary to extend their heap; the pages +allocated for the kernel-mode allocator would be protected but +the pages allocated for the user-mode allocator would not. +PROs: Meets all of the needs. CONs: Complex. Memory losses +due to quantization. + +This approach works well with CPUs that have very capable +Memory Management Units (MMUs) that can coalesce the +srbk-ed chunks to a contiguous, `virtual` heap region. +Without an MMU, the sbrk-ed memory would not be +contiguous; this would limit the sizes of allocations +due to the physical pages. + +Many MCUs will have Memory Protection Units (MPUs) that can +support the security features (only). However these lower +end MPUs may not support sufficient mapping capability to +support this traditional approach. The ARMv7-M MPU, for +example, only supports eight protection regions to manage +all FLASH and SRAM and so this approach would not be +technically feasible for th ARMv7-M family (Cortex-M3/4). + +Comparing the "Flat" Build Configuration with the Protected Build Configuration +=============================================================================== + +Compare, for example the configuration +``boards/arm/stm32/stm32f4discovery/configs/ostest`` and the +configuration ``boards/arm/stm32/stm32f4discovery/configs/kostest``. +These two configurations are identical except that one builds a +"flat" version of OS test and the other builds a kernel version +of the OS test. See the file ``boards/arm/stm32/stm32f4discovery/README.txt`` +for more details about those configurations. + +The configurations can be compared using the ``cmpconfig`` tool: + +.. code-block:: bash + + cd tools + make -f Makefile.host cmpconfig + cd .. + tools/cmpconfig boards/arm/stm32/stm32f4discovery/configs/ostest/defconfig boards/arm/stm32/stm32f4discovery/configs/kostest/defconfig + +Here is a summary of the meaning of all of the important differences in the +configurations. This should be enough information for you to convert any +configuration from a "flat" to a protected build: + +* ``CONFIG_BUILD_2PASS=y``. This enables the two pass build. +* ``CONFIG_BUILD_PROTECTED=y``. This option enables the "two pass" + protected build. +* ``CONFIG_PASS1_BUILDIR="boards/arm/stm32/stm32f4discovery/kernel"``. + This tells the build system the (relative) location of the pass1 build directory. +* ``CONFIG_PASS1_OBJECT=""``. In some "two pass" build configurations, + the build system need to know the name of the first pass object. + This setting is not used for the protected build. +* ``CONFIG_NUTTX_USERSPACE=0x08020000``. This is the expected location + where the user-mode blob will be located. The user-mode blob + contains a header that includes information need by the kernel + blob in order to interface with the user-code. That header will + be expected to reside at this location. +* ``CONFIG_PASS1_TARGET="all"``. This is the build target to use for + invoking the pass1 make. +* ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager + interfaces so that multiple heaps can be supported. +* ``CONFIG_MM_KERNEL_HEAP=y``. NuttX supports the option of using a + single user-accessible heap or, if this options is defined, + two heaps: (1) one that will allocate user accessible memory + that is shared by both the kernel- and user-space code, and + (2) one that will allocate protected memory that is only + accessible from the kernel code. Separate heap memory is required + if you want to support security features. +* ``CONFIG_MM_KERNEL_HEAPSIZE=8192``. This determines an approximate + size for the kernel heap. The standard heap space is partitioned + into a kernel- and user-heap space. This size of the kernel heap + is only approximate because the user heap is subject to stringent + alignment requirements. Because of the alignment requirements, the + actual size of the kernel heap could be considerable larger than this. +* ``CONFIG_BOARD_EARLY_INITIALIZE=y``. This setting enables a special, + `early` initialization call to initialize board-specific resources. +* ``CONFIG_BOARD_LATE_INITIALIZE=y``. This setting enables a special + initialization call to initialize `late` board-specific resources. + The difference between ``CONFIG_BOARD_EARLY_INITIALIZE`` and + ``CONFIG_BOARD_LATE_INITIALIZE`` is that the ``CONFIG_BOARD_EARLY_INITIALIZE`` + logic runs earlier in initialization before the full operating + system is up and running. ``CONFIG_BOARD_LATE_INITIALIZE``, on the + other hand, runs at the completion of initialization, just before + the user applications are started. Neither ``CONFIG_BOARD_EARLY_INITIALIZE`` + nor ``CONFIG_BOARD_LATE_INITIALIZE`` are used in the OS test + configuration but other configurations (such as NSH) + require some application-specific initialization before + the application can run. In the "flat" build, such initialization + is performed as part of the application start-up sequence. + These includes such things as initializing device drivers. + These same initialization steps must be performed in kernel + mode for the protected build and ``CONFIG_BOARD_LATE_INITIALIZE``. + See ``boards/arm/stm32/stm32f4discovery/src/up_boot.c`` for an + example of such board initialization code. +* ``CONFIG_NSH_ARCHINITIALIZE`` is not defined. The setting + ``CONFIG_NSH_ARCHINITIALIZE`` does not apply to the OS test + configuration, however, this is noted here as an example + of initialization that cannot be performed in the protected build. + +Architecture-Specific Options: + +* ``CONFIG_SYS_RESERVED=8``. The user application logic + interfaces with the kernel blob using system calls. + The architecture-specific logic may need to reserved a + few system calls for its own internal use. The ARMv7-M + architectures all require 8 reserved system calls. +* ``CONFIG_SYS_NNEST=2``. System calls may be nested. The + system must retain information about each nested system + call and this setting is used to set aside resources for + nested system calls. In the current architecture, a maximum + nesting level of two is all that is needed. +* ``CONFIG_ARMV7M_MPU=y``. This settings enables support for + the ARMv7-M Memory Protection Unit (MPU). The MPU is used + to prohibit user-mode access to kernel resources. +* ``CONFIG_ARMV7M_MPU_NREGIONS=8``. The ARMv7-M MPU supports 8 + protection regions. + +Size Expansion +============== + +The protected build will, or course, result in a FLASH image that is +larger than that of the corresponding "flat" build. How much larger? +I don't have the numbers in hand, but you can build +``boards/arm/stm32/stm32f4discovery/configs/nsh`` and +``boards/arm/stm32/stm32f4discovery/configs/kostest`` and compare +the resulting binaries for yourself using the ``size`` command. + +Increases in size are expected because: + +* The syscall layer is included in the protected build but not the flat + build. +* The kernel-size _syscal_l stubs will cause all enabled OS code to be + drawn into the build. In the flat build, only those OS interfaces + actually called by the application will be included in the final objects. +* The dual memory allocators will increase size. +* Code duplication. Some code, such as the C library, will be + duplicated in both the kernel- and user-blobs, and +* Alignment. The alignments required by the MPU logic will leave + relatively large regions of FLASH (and perhaps RAM) is not usable. + +Performance Issues +================== + +The only performance differences using the protected build should +result as a consequence of the `sycalls` used to interact with the +OS vs. the direct C calls as used in the flat build. If your +performance is highly dependent upon high rate OS calls, then +this could be an issue for you. But, in the typical application, +OS calls do not often figure into the critical performance paths. + +The `syscalls` are, ultimately, software interrupts. If the platform +does not support prioritized, nested interrupts then the `syscall` +execution could also delay other hardware interrupt processing. +However, `sycall` processing is negligible: they really just +configure to return to in supervisor mode and vector to the +`syscall` stub. They should be lightning fast and, for the typical +real-time applications, should cause no issues. \ No newline at end of file