Automation Architecture

This document describes how the Makefile, shell scripts, libraries, and configuration files compose into the jetson-rt-stack build pipeline. The pipeline flashes a Jetson Orin NX 16GB (p3767-0000 module on a p3768 Orin Nano devkit carrier) with a PREEMPT_RT kernel, an Axelera Metis NPU, NVMe root, and a ZED X camera device-tree overlay, on L4T R36.4.3 / JetPack 6.2.

The five numbered phase scripts run in order: extract and patch, build in Docker, bake the rootfs, flash to NVMe, then validate over SSH. This page is for anyone extending or debugging the pipeline; the sections below cover the layering, the script inventory, the configuration sources, and the composition targets that chain the phases together. For a first-time walkthrough see Quickstart; phase internals live in Build (Phases 1 to 2) and Flash (Phase 4).

Layered architecture

┌─────────────────────────────────────────────────────────────────┐
│  Makefile                                                       │
│   Thin wrappers + composition (ignite, ignite-no-flash, all)    │
│   Kconfig targets: menuconfig, defconfig, savedefconfig         │
│   Routes Docker-needed targets through the container            │
└─────────────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  scripts/*.sh                                                   │
│   Phase scripts (00_doctor → 01_extract → 02_build → 03_bake    │
│                  → 04_flash → 05_post_flash_validate)           │
│   Plus helpers: pre_flash_audit, verify_vermagic, gather_logs,  │
│                 show_versions, install_zed_sdk, jetson_*        │
└─────────────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  scripts/lib/                                                   │
│   config.sh: loads versions.env + .config, derives paths        │
│   log.sh: uniform colored logging + result helpers              │
│   plugin.sh: plugin loader and hook dispatcher                  │
└─────────────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  versions.env + .config                                         │
│   versions.env: version pins, tarballs, toolchain, target HW    │
│   .config: feature flags (RT, camera, power, plugins)           │
│                  generated by make menuconfig / make defconfig  │
└─────────────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  plugins/                                                       │
│   zedx/plugin.sh: ZED X integration hooks (post_extract,        │
│                        post_defconfig, pre_bake, post_bake)     │
│   axelera/plugin.sh: Axelera Metis + Voyager hooks              │
│   <custom>/: bring-your-own hardware plugin                     │
└─────────────────────────────────────────────────────────────────┘

Script inventory

Each numbered phase script performs the work described in the pipeline overview: 01_extract_and_patch.sh unpacks L4T and injects the AV defconfig and patches; 02_build_kernel.sh cross-compiles the Image, modules, and DTBs in Docker; 03_bake_rootfs.sh stages boot args, drivers, and services into the rootfs; 04_flash_nvme.sh writes to NVMe; and 05_post_flash_validate.sh checks the running board over SSH.

Script                     Phase        When                       Runs in
00_doctor.sh               preflight    before Phase 1             host
01_extract_and_patch.sh    1            make extract               host (writes latest_jetson/)
02_build_kernel.sh         2            make build                 Docker (cross-compile)
03_bake_rootfs.sh          3            make bake                  host (sudo for rootfs writes)
04_flash_nvme.sh           4            make flash                 host (Jetson in recovery mode)
05_post_flash_validate.sh  5            make verify                host (SSH to Jetson)
pre_flash_audit.sh         gate         make audit                 host
verify_vermagic.sh         gate         called by audit + Phase 2  host or Docker
gather_logs.sh             utility      make logs                  host
show_versions.sh           utility      make versions              host
install_zed_sdk.sh         target-side  called by first-boot       Jetson
jetson_first_boot.sh       target-side  systemd one-shot           Jetson
jetson_rt_tune.sh          target-side  systemd per-boot           Jetson
verify_tuning.sh           target-side  called by make verify      Jetson

Configuration

versions.env

KEY=VALUE pinning of:

  • L4T / JetPack / kernel base
  • Bootlin toolchain identifier
  • All required tarball filenames
  • External vendor tree directory names (see Third-party dependencies for where each tarball and tree comes from)
  • CUDA / PyTorch / numpy / ZED SDK versions
  • USB IDs (APX 0955:7323, RNDIS 0955:7035)
  • Target hostname / IP
  • RT tuning values (isolated cores, boot args)
  • Critical kernel patch values (PCIe retries, CMA size)

Edit versions.env to change a pin globally. Every script that references a pinned value reads it through lib/config.sh. CI enforces consistency between this file and the docs, so never introduce a conflicting value.

scripts/lib/config.sh

Sources versions.env and derives:

  • REPO_ROOT: absolute path to the project root
  • BUILD_WORKSPACE, L4T_DIR, SOURCE_DIR, KERNEL_SRC, ROOTFS, etc.
  • AXELERA_DRIVER_DIR, VOYAGER_SDK_DIR, ZEDX_DRIVER_DIR, ZED_SDK_DIR
  • TARBALL_*_PATH (full paths to the L4T tarballs)
  • TOOLCHAIN_DIR, CROSS_COMPILE_BIN
  • PCIE_DESIGNWARE_H, DEFCONFIG_PATH, EXTLINUX_CONF, KERNEL_IMAGE

Sourced idempotently. Sets JETSON_AV_CONFIG_LOADED=1 so re-sourcing is cheap.
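
The guard can be pictured with a minimal sketch like the one below. It is not the actual config.sh: the function name config_load, the CONFIG_LOAD_COUNT counter, the REPO_ROOT fallback to the current directory, and the BUILD_WORKSPACE value are all assumptions made for illustration. Only the JETSON_AV_CONFIG_LOADED variable comes from the description above.

```shell
# Hypothetical sketch of an include guard; the real config.sh may differ.
config_load() {
    # Return early if a previous call already populated the environment.
    if [ "${JETSON_AV_CONFIG_LOADED:-0}" = "1" ]; then
        return 0
    fi

    # Assumption: REPO_ROOT falls back to the current directory in this sketch.
    REPO_ROOT="${REPO_ROOT:-$(pwd)}"

    # Source the pins only if the file exists, so the sketch runs anywhere.
    if [ -f "${REPO_ROOT}/versions.env" ]; then
        . "${REPO_ROOT}/versions.env"
    fi

    # Assumption: an illustrative derived path, not the real derivation.
    BUILD_WORKSPACE="${REPO_ROOT}/latest_jetson"

    # Counter exists only to make the guard's effect visible.
    CONFIG_LOAD_COUNT=$(( ${CONFIG_LOAD_COUNT:-0} + 1 ))
    JETSON_AV_CONFIG_LOADED=1
}

config_load
config_load   # no-op: the guard variable is already set
echo "config loaded ${CONFIG_LOAD_COUNT} time(s)"
```

With a guard of this shape, a script and every helper it sources can each load the config without repeating the work.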

scripts/lib/log.sh

Provides:

  • log::section "Title": bold banner
  • log::step "msg", log::info "msg": normal info
  • log::ok "msg": green success
  • log::warn "msg": yellow non-fatal
  • log::fail "msg": red failure (exits 1 unless NO_EXIT=1)
  • log::pass "label" / log::xfail "label" "why": for audit-style gates
  • log::kv "label" "value": key/value row

Colors are disabled automatically when stdout is not a TTY or when NO_COLOR=1 is set.
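
A minimal sketch of the color switch, assuming it is implemented with a TTY test on file descriptor 1. The helper names color_enabled and info_line are hypothetical stand-ins, not functions from log.sh.

```shell
# Hypothetical helpers; the real log.sh may structure this differently.
color_enabled() {
    # Colors are off when NO_COLOR=1 or when stdout (fd 1) is not a terminal.
    [ "${NO_COLOR:-0}" != "1" ] && [ -t 1 ]
}

info_line() {
    if color_enabled; then
        # Green text, then reset.
        printf '\033[32m%s\033[0m\n' "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Inside a command substitution stdout is a pipe, so no escape codes are emitted.
plain="$(info_line "build complete")"
printf '%s\n' "$plain"
```

Piping a script through tee or redirecting it to a file takes the same non-TTY branch, which keeps captured logs free of escape sequences.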

Composition orchestration

make all

extract → build → bake

No flash, no audit, no verify. Suitable for CI jobs that only need to confirm the build completes end to end.

make ignite-no-flash

doctor → extract → build → bake → audit

Full host-side pipeline. Every gate green is the precondition for flashing. Hardware not required.

make ignite

doctor → extract → build → bake → audit → flash → (sleep 90) → verify

End-to-end, including hardware. In Makefile terms the target runs doctor, all, and audit, then the flash and verify steps; all expands to extract, build, and bake. 04_flash_nvme.sh pauses until the operator presses ENTER to confirm the Jetson is in force recovery mode, then writes to NVMe.

verify runs after a 90 s wait, which covers boot to sshd (about 60 s) but not full first-boot provisioning. jetson_first_boot.sh builds the /opt/av-env Python environment (PyTorch, Voyager SDK wheels) and needs internet access, so the verify venv-import check fails until provisioning has completed on a networked boot. The service re-runs on each boot until it succeeds.

Once verify is green, the camera and NPU stack can be exercised end to end: Samples has the run commands, and Benchmarks has the measured throughput plus the reproducible bench harness.

Targeted re-runs

You changed                                Run
A kernel CONFIG flag                       make all && make audit && make flash && make verify
A patch in 01_extract_and_patch.sh         make clean && make all && make audit && make flash && make verify
Just the headers .deb (no kernel changes)  make headers && make bake && make flash && make verify
A first-boot script                        make bake && make flash && make verify
versions.env (e.g., TARGET_USB_IP)         make verify (no rebuild needed)

make clean removes latest_jetson/, so the Phase 1 idempotency guards cannot silently skip a step whose inputs have changed. When in doubt, run make clean && make all.
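
To see why a stale tree can mask a change, consider a sketch of the two guard styles Phase 1 is described as using: a directory test and a grep -q content test. The function names, the temporary directory, the defconfig path, and the CONFIG_PREEMPT_RT=y line are all illustrative assumptions, not excerpts from 01_extract_and_patch.sh.

```shell
# Hypothetical guards in the style of Phase 1; paths and contents are illustrative.
WORK="$(mktemp -d)"

extract_once() {
    # Directory guard: skip extraction if the tree already exists.
    if [ -d "$WORK/Linux_for_Tegra" ]; then
        echo "skip extract"
        return 0
    fi
    mkdir -p "$WORK/Linux_for_Tegra"
    echo "extracted"
}

append_once() {
    # Content guard: append a line only if the file does not already contain it.
    file="$1"; line="$2"
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        echo "skip patch"
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
    echo "patched"
}

CFG="$WORK/Linux_for_Tegra/defconfig"
extract_once                              # first run does the work
extract_once                              # second run is skipped by the guard
append_once "$CFG" "CONFIG_PREEMPT_RT=y"
append_once "$CFG" "CONFIG_PREEMPT_RT=y"
```

Because the directory guard short-circuits on any existing tree, an edit to what gets extracted or patched takes effect only after the tree is removed.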

Failure-mode contracts

Every gate has an explicit exit-code contract:

Script                     Exit 0                               Exit 1                               Exit 2            Exit ≥3
00_doctor.sh               all checks pass (warnings OK)        required check failed                (n/a)             (n/a)
02_build_kernel.sh         build complete, vermagic gate green  build failed or vermagic mismatch    (n/a)             (n/a)
pre_flash_audit.sh         every gate green                     any gate red                         (n/a)             (n/a)
verify_vermagic.sh         match                                mismatch                             no modules found  (n/a)
04_flash_nvme.sh           flash complete                       flash failed                         (n/a)             (n/a)
05_post_flash_validate.sh  all checks pass                      any check failed                     (n/a)             (n/a)

CI pipelines hard-stop on any non-zero exit from these scripts. The make ignite target gets set -e-like behavior from make itself: any recipe that exits non-zero aborts the target.
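
A caller that needs to tell the verify_vermagic.sh outcomes apart could branch on the exit code, roughly as sketched here. The describe_vermagic_status function is hypothetical, and fake_verify_vermagic is a stand-in that always returns 2 so the sketch runs without the repository.

```shell
# Hypothetical wrapper: maps the documented verify_vermagic.sh exit codes to messages.
describe_vermagic_status() {
    case "$1" in
        0) echo "match" ;;
        1) echo "mismatch" ;;
        2) echo "no modules found" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}

# Stand-in for scripts/verify_vermagic.sh; always reports "no modules found".
fake_verify_vermagic() { return 2; }

# Capture the status without tripping set -e in a calling script.
status=0
fake_verify_vermagic || status=$?
echo "verify_vermagic: $(describe_vermagic_status "$status")"
```

Capturing the status with || keeps the caller alive long enough to report which failure occurred before it exits non-zero itself.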

Idempotency contract

Script                     Idempotent                                       State guard
00_doctor.sh               yes (read-only)                                  n/a
01_extract_and_patch.sh    yes                                              per-step [ -d ... ] and grep -q checks
02_build_kernel.sh         partial                                          make re-uses cache; make clean to force rebuild
03_bake_rootfs.sh          yes                                              overwrites payloads; cleans extlinux duplicates first
04_flash_nvme.sh           yes (always rewrites)                            n/a
pre_flash_audit.sh         yes (read-only)                                  n/a
verify_vermagic.sh         yes (read-only)                                  n/a
05_post_flash_validate.sh  yes (read-only on host; read-only over SSH)      n/a
jetson_first_boot.sh       NO                                               /home/j/.jetson_initialized marker
jetson_rt_tune.sh          yes                                              none (re-applies tuning every call)

jetson_first_boot.sh is the only script marked non-idempotent, and that is by design: it is a systemd one-shot guarded by the /home/j/.jetson_initialized marker. The marker is written only when the script runs to completion, so a partial run (for example, /opt/av-env provisioning aborting on a boot without internet) leaves the marker absent and the service runs again on the next boot. To force a re-run after a completed pass:

sudo rm /home/j/.jetson_initialized
sudo systemctl start jetson-first-boot.service
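
The marker behavior can be sketched as follows. This is not the actual jetson_first_boot.sh: the first_boot and provision functions and the NETWORK_UP switch are hypothetical, and the marker is placed in a temporary directory instead of /home/j so the sketch is safe to run anywhere.

```shell
# Hypothetical sketch of a marker-guarded one-shot; not the actual first-boot script.
# The documented marker is /home/j/.jetson_initialized; a temp path is used here.
MARKER="$(mktemp -d)/.jetson_initialized"

# Placeholder for the real provisioning; succeeds only when NETWORK_UP=1.
provision() { [ "${NETWORK_UP:-1}" = "1" ]; }

first_boot() {
    if [ -f "$MARKER" ]; then
        echo "already initialized"
        return 0
    fi
    # A failed provisioning step returns before the marker is written.
    provision || return 1
    # Reached only when every step above succeeded.
    touch "$MARKER"
    echo "initialized"
}

NETWORK_UP=0
first_boot || echo "provisioning failed; marker not written"
NETWORK_UP=1
first_boot    # succeeds and writes the marker
first_boot    # guarded: does nothing further
```

Writing the marker as the last step is what makes a failed boot retry automatically, and removing it is what the two commands above rely on.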

Hooks for CI

Recommended GitHub Actions / GitLab CI flow:

build:
  steps:
    - run: make doctor      # exit non-zero if missing prerequisites
    - run: make all         # full build (Docker)
    - run: make audit       # gate
    - run: |
        sha256sum latest_jetson/Linux_for_Tegra/kernel/Image \
                  latest_jetson/Linux_for_Tegra/staging/kernel-headers/*.deb \
                  > artifact.sha256
    - artifacts:
        paths:
          - latest_jetson/Linux_for_Tegra/BUILD_MANIFEST.json
          - artifact.sha256
          - latest_jetson/Linux_for_Tegra/kernel/Image
          - latest_jetson/Linux_for_Tegra/staging/kernel-headers/*.deb

The flash + verify steps require physical hardware, so they live in a separate manual job or a self-hosted runner with a Jetson attached.

Adding a new automation step

  1. Decide where the script lives (scripts/).
  2. Source lib/config.sh and lib/log.sh at the top.
  3. Use log::section, log::step, log::pass, log::xfail for output.
  4. If it’s a gate, set GATE_FAILED=1 on any failure, keep running the remaining checks, and exit 1 at the end if GATE_FAILED is non-zero.
  5. Add a Makefile target wrapping it.
  6. Document it in Runbook if it is user-facing.
  7. Add it to the appropriate composition target if part of the standard pipeline.
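
The gate pattern from step 4 might look like the skeleton below. It is a sketch under stated assumptions: report_pass and report_fail are stand-ins for log::pass and log::xfail so the example runs without scripts/lib/, and the check and run_gate helpers and the two example checks are hypothetical. A real script would source lib/config.sh and lib/log.sh first and finish with exit 1 when GATE_FAILED is set.

```shell
# Hypothetical gate skeleton. report_pass / report_fail stand in for
# log::pass and log::xfail so the sketch runs without scripts/lib/log.sh.
report_pass() { printf 'PASS  %s\n' "$1"; }
report_fail() { printf 'FAIL  %s (%s)\n' "$1" "$2"; }

GATE_FAILED=0

check() {
    # Usage: check "label" command [args...]
    label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        report_pass "$label"
    else
        report_fail "$label" "command failed: $*"
        GATE_FAILED=1
    fi
}

run_gate() {
    check "shell available" command -v sh
    check "example artifact present" test -e /nonexistent/example-artifact
    # All checks run before the verdict, so one pass reports every red gate.
    [ "$GATE_FAILED" -eq 0 ]
}

gate_status=0
run_gate || gate_status=1
echo "gate exit status would be ${gate_status}"
```

Accumulating failures in GATE_FAILED, instead of exiting at the first red check, lets one run surface every problem at once.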