These docs track the upcoming v1.2.0 dev branch. For current released docs, use v1.1.0.

Introduction

v1.2.0 validation scope

The heavily tested path is detection, training and inference for YOLO9 and RF-DETR, including RF-DETR segmentation.

Other model families, tasks, and multi-GPU workflows are available but experimental.

LibreYOLO is an MIT-licensed object detection toolkit. v1.2.0 ships a broad catalogue, but the validated support surface is intentionally narrow:

  • YOLO9 detection - the CNN path.
  • RF-DETR detection - the transformer path.
  • RF-DETR segmentation - the heavily tested segmentation path.

We recommend those paths as the default choice for new projects because they receive the heaviest testing around detection, training and inference. Other supported families and tasks work through the same unified LibreYOLO() factory, but they are experimental in v1.2.0. Use them if you have a specific reason.

python
from libreyolo import LibreYOLO, SAMPLE_IMAGE

# Default: YOLO9 detection
model = LibreYOLO("LibreYOLO9c.pt")
result = model(SAMPLE_IMAGE, conf=0.25, save=True)

print(f"Detected {len(result)} objects")
print(result.boxes.xyxy)
print(result.saved_path)

Key features

  • Heavy testing and recommended defaults for YOLO9 detection, RF-DETR detection, and RF-DETR segmentation
  • Unified LibreYOLO() factory for checkpoints, exported artifacts, and runtime loading
  • Detection, segmentation, pose, and gaze tasks through one consistent API
  • Image, directory, and video inference (with optional tiled inference for large frames)
  • Built-in multi-object tracking via ByteTrack
  • ONNX, TorchScript, TensorRT, OpenVINO, NCNN, and CoreML export with embedded metadata, plus matching runtime backends
  • COCO-compatible validation with mAP metrics, plus segmentation and pose validators
  • Ultralytics-style libreyolo command-line tool for predict / train / val / export
  • Accepts many input types: file paths, URLs, PIL images, NumPy arrays, PyTorch tensors, and raw bytes

Compatibility

Use this matrix as the quick v1.2.0 support map. A checkmark means the path is supported in the validated documentation surface, exp means the path exists but is experimental, and empty cells are not currently supported or should not be relied on.

Model familyv1.2.0 statusInferenceTrainingDetectionSegmentationPoseGazeONNXTorchScriptTensorRTOpenVINONCNNCoreML
YOLO9Validated detect, single GPUexpNot currently supportedNot currently supported
RF-DETRValidated detect + segment, single GPUNot currently supportedNot currently supportedexpNot currently supportedexp
YOLOXExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpexp
YOLO9-E2EExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
YOLO-NASExperimentalexpexpexpNot currently supportedexpNot currently supportedexpexpexpexpexpNot currently supported
D-FINEExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpNot currently supportedNot currently supported
DEIMExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpNot currently supportedNot currently supported
DEIMv2ExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpNot currently supportedNot currently supported
RT-DETRExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpNot currently supportedexp
PicoDetExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
ECExperimentalexpexpexpexpexpNot currently supportedexpexpexpexpNot currently supportedNot currently supported
RT-DETRv2ExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
RT-DETRv4ExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
DAMO-YOLOExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
RTMDetExperimentalexpexpexpNot currently supportedNot currently supportedNot currently supportedexpexpexpexpexpNot currently supported
L2CSExperimental, inference-onlyexpNot currently supportedNot currently supportedNot currently supportedNot currently supportedexpNot currently supportedNot currently supportedNot currently supportedNot currently supportedNot currently supportedNot currently supported

CoreML exports produce .mlpackage bundles and require libreyolo[coreml]. CoreML inference is macOS only, INT8 is not supported, and embedded CoreML NMS is not available for RF-DETR, D-FINE, DEIM, DEIMv2, or EC.

Installation

Requirements

  • Python 3.10+
  • PyTorch 1.13+ and torchvision 0.11+

From PyPI

bash
pip install libreyolo

These docs track the upcoming v1.2.0 dev branch. Until v1.2.0 is published to PyPI, use a source install for the features documented on this page.

From source

bash
git clone https://github.com/LibreYOLO/libreyolo.git
cd libreyolo
git checkout dev
pip install -e .

Optional dependencies

bash
# ONNX export and inference
pip install libreyolo[onnx]
# or: pip install onnx onnxsim onnxruntime

# RT-DETR compatibility extra (currently no extra packages)
pip install libreyolo[rtdetr]

# RF-DETR support
pip install libreyolo[rfdetr]
# or: pip install transformers

# TensorRT export and inference (NVIDIA GPU)
pip install libreyolo[tensorrt]
# Installs TensorRT CUDA 12 Python packages on Linux/Windows.
# Host driver/CUDA compatibility still matters.

# OpenVINO export and inference (Intel CPU/GPU/VPU)
pip install libreyolo[openvino]
# INT8 export also needs: pip install nncf

# NCNN export and inference
pip install libreyolo[ncnn]
# or: pip install pnnx ncnn

# ByteTrack API compatibility extra
pip install libreyolo[tracking]
# Tracking dependencies are part of the base install in v1.2.0 dev.

# CoreML export and inference (macOS only for runtime)
pip install libreyolo[coreml]
# or: pip install coremltools

# L2CS gaze optional auto-download helper
pip install libreyolo[gaze]
# Optional parity with the upstream RetinaFace-based L2CS pipeline
pip install libreyolo[gaze-retinaface]

# Install every optional LibreYOLO extra
pip install libreyolo[all]

If using uv, the most reliable path is an isolated venv per extra:

bash
# ONNX environment
uv venv .venv-onnx
uv pip install --python .venv-onnx/bin/python -e '.[onnx]'

# RT-DETR environment
uv venv .venv-rtdetr
uv pip install --python .venv-rtdetr/bin/python -e '.[rtdetr]'

# Repeat with .[rfdetr], .[openvino], .[ncnn], .[coreml], .[gaze], .[tracking], or .[tensorrt] as needed

This avoids mutating the project environment and keeps optional dependencies isolated. Vendor-specific extras such as TensorRT, OpenVINO, NCNN, and CoreML may still require platform-specific native packages.

Quickstart

For the most tested path, pick single-GPU YOLO9 detection, RF-DETR detection, or RF-DETR segmentation. They load through the same factory, accept the same inputs, and return the same Results object, so you can swap between them without changing surrounding code.

YOLO9 - CNN flagship

python
from libreyolo import LibreYOLO, SAMPLE_IMAGE

# Use the official checkpoint name and let the factory resolve the details
model = LibreYOLO("LibreYOLO9c.pt")

# Run on a single image (SAMPLE_IMAGE ships with the package)
result = model(SAMPLE_IMAGE)

print(f"Found {len(result)} objects")
print(result.boxes.xyxy)  # bounding boxes (N, 4)
print(result.boxes.conf)  # confidence scores (N,)
print(result.boxes.cls)  # class IDs (N,)

RF-DETR - transformer flagship

python
from libreyolo import LibreYOLO, SAMPLE_IMAGE

# Same factory, same call shape - just point at an RF-DETR checkpoint
model = LibreYOLO("LibreRFDETRs.pt")
result = model(SAMPLE_IMAGE)

print(f"Found {len(result)} objects")
print(result.boxes.xyxy)

Save annotated output

python
result = model(SAMPLE_IMAGE, save=True)
print(result.saved_path)  # e.g. runs/detect/predict/parkour.jpg

Process a directory

python
results = model("images/", save=True, batch=4)
for r in results:
    print(f"{r.path}: {len(r)} detections")

Available Models

Recommended validated path: YOLO9 detection or RF-DETR detection / segmentation

Detection, training and inference for these models receive the heaviest testing. Treat other families, tasks, and multi-GPU workflows as experimental in v1.2.0.

LibreYOLO ships a small validated v1.2.0 surface plus a broader catalogue of supported models. Every model loads through the same LibreYOLO() factory, but only the validated paths below should be treated as heavily tested.

YOLO9 - CNN flagship

Recommended
Default: LibreYOLO9c.pt
Heavily tested: detection, training and inference
Experimental: segment, multi-GPU

| Size | Code | Input size | Use case | Detection checkpoint |
| --- | --- | --- | --- | --- |
| Tiny | "t" | 640 | Fast inference | LibreYOLO9t.pt |
| Small | "s" | 640 | Balanced | LibreYOLO9s.pt |
| Medium | "m" | 640 | Higher accuracy | LibreYOLO9m.pt |
| Compact | "c" | 640 | Best accuracy | LibreYOLO9c.pt |

Experimental segmentation checkpoints: LibreYOLO9t-seg.pt, LibreYOLO9s-seg.pt, LibreYOLO9m-seg.pt, LibreYOLO9c-seg.pt. See the Segmentation section.

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreYOLO9c.pt")
# Experimental segmentation variant
# model = LibreYOLO("LibreYOLO9c-seg.pt")

RF-DETR - transformer flagship

Recommended
Recommended transformer path
Heavily tested: detection, segmentation, training and inference
Experimental: multi-GPU

| Size | Code | Input size | Use case | Detection checkpoint |
| --- | --- | --- | --- | --- |
| Nano | "n" | 384 | Edge | LibreRFDETRn.pt |
| Small | "s" | 512 | Balanced | LibreRFDETRs.pt |
| Medium | "m" | 576 | Higher accuracy | LibreRFDETRm.pt |
| Large | "l" | 704 | Maximum accuracy | LibreRFDETRl.pt |

Heavily tested segmentation checkpoints: LibreRFDETRn-seg.pt, LibreRFDETRs-seg.pt, LibreRFDETRm-seg.pt, LibreRFDETRl-seg.pt, LibreRFDETRx-seg.pt, LibreRFDETRxx-seg.pt. See the Segmentation section.

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreRFDETRs.pt")
# Segmentation variants exist for every RF-DETR size
# model = LibreYOLO("LibreRFDETRs-seg.pt")

Additional supported families

Detection-capable families that share the same factory and API surface as the validated paths. These are experimental in v1.2.0. Each checkpoint name links to its Hugging Face model card on the LibreYOLO org; pass any name to LibreYOLO() and the factory will fetch it on first use.

| Family | Status | Tasks | Checkpoints |
| --- | --- | --- | --- |
| YOLOX | Experimental | detect | LibreYOLOXn.pt, LibreYOLOXt.pt, LibreYOLOXs.pt, LibreYOLOXm.pt, LibreYOLOXl.pt, LibreYOLOXx.pt |
| YOLO9-E2E | Experimental | detect | LibreYOLO9E2Et.pt, LibreYOLO9E2Es.pt, LibreYOLO9E2Em.pt, LibreYOLO9E2Ec.pt |
| YOLO-NAS | Experimental | detect, pose | LibreYOLONASs.pt, LibreYOLONASm.pt, LibreYOLONASl.pt, LibreYOLONASn-pose.pt, LibreYOLONASs-pose.pt, LibreYOLONASm-pose.pt, LibreYOLONASl-pose.pt |
| D-FINE | Experimental | detect | LibreDFINEn.pt, LibreDFINEs.pt, LibreDFINEm.pt, LibreDFINEl.pt, LibreDFINEx.pt |
| DEIM | Experimental | detect | LibreDEIMn.pt, LibreDEIMs.pt, LibreDEIMm.pt, LibreDEIMl.pt, LibreDEIMx.pt |
| DEIMv2 | Experimental | detect | LibreDEIMv2atto.pt, LibreDEIMv2femto.pt, LibreDEIMv2pico.pt, LibreDEIMv2n.pt, LibreDEIMv2s.pt, LibreDEIMv2m.pt, LibreDEIMv2l.pt, LibreDEIMv2x.pt |
| RT-DETR | Experimental | detect | LibreRTDETRr18.pt, LibreRTDETRr34.pt, LibreRTDETRr50.pt, LibreRTDETRr50m.pt, LibreRTDETRr101.pt, LibreRTDETRl.pt, LibreRTDETRx.pt |
| RT-DETRv2 | Experimental | detect | LibreRTDETRv2r18.pt, LibreRTDETRv2r34.pt, LibreRTDETRv2r50.pt, LibreRTDETRv2r50m.pt, LibreRTDETRv2r101.pt |
| RT-DETRv4 | Experimental | detect | LibreRTDETRv4s.pt, LibreRTDETRv4m.pt, LibreRTDETRv4l.pt, LibreRTDETRv4x.pt |
| PicoDet | Experimental | detect | LibrePICODETs.pt, LibrePICODETm.pt, LibrePICODETl.pt |
| EdgeCrafter | Experimental | detect, pose, segment | LibreECs.pt, LibreECm.pt, LibreECl.pt, LibreECx.pt, LibreECs-pose.pt, LibreECm-pose.pt, LibreECl-pose.pt, LibreECx-pose.pt, LibreECs-seg.pt, LibreECm-seg.pt, LibreECl-seg.pt, LibreECx-seg.pt |
| DAMO-YOLO | Experimental | detect | LibreDAMOYOLOns.pt, LibreDAMOYOLOnm.pt, LibreDAMOYOLOnl.pt, LibreDAMOYOLOt.pt, LibreDAMOYOLOs.pt, LibreDAMOYOLOm.pt, LibreDAMOYOLOl.pt |
| RTMDet | Experimental | detect | LibreRTMDett.pt, LibreRTMDets.pt, LibreRTMDetm.pt, LibreRTMDetl.pt, LibreRTMDetx.pt |

Hosting note: YOLO-NAS checkpoints (plain text above) are hosted on Deci's CDN under their proprietary weights license, not on the LibreYOLO Hugging Face org. The factory still downloads them automatically on first use.

Specialized models

| Family | Status | Tasks | Checkpoints |
| --- | --- | --- | --- |
| L2CS | Experimental | gaze (inference-only) - see Gaze Estimation | LibreL2CSr50.pt |

L2CS architecture sizes include r18, r34, r50, r101, and r152, but the upstream-published Gaze360 checkpoint is ResNet-50. Install libreyolo[gaze] for the optional Google Drive helper, or pass a local checkpoint path for other sizes.

Factory function

Use the LibreYOLO() factory for every model and runtime. Give it an official checkpoint name or exported artifact path, then let it choose the right model family, task, class count, and runtime:

python
from libreyolo import LibreYOLO

# Default: YOLO9 detection
model = LibreYOLO("LibreYOLO9c.pt")

# Flagship: RF-DETR
model = LibreYOLO("LibreRFDETRs.pt")

# Segmentation checkpoints use the same factory path
model = LibreYOLO("LibreRFDETRs-seg.pt")  # validated segmentation
model = LibreYOLO("LibreYOLO9c-seg.pt")  # experimental segmentation

# Exported deployment formats
model = LibreYOLO("model.onnx")  # ONNX Runtime
model = LibreYOLO("model.engine")  # TensorRT
model = LibreYOLO("model.mlpackage")  # CoreML (macOS)
model = LibreYOLO("model_openvino/")  # OpenVINO (directory)
model = LibreYOLO("model_ncnn/")  # NCNN (directory)

For recognized official checkpoint filenames, LibreYOLO can auto-download missing weights. For custom filenames, point at an explicit local path. Experimental families still load through the same factory, but keep new projects on YOLO9 detection or RF-DETR detection/segmentation, and use the experimental families only if you have a specific reason.

Tasks & Filenames

LibreYOLO uses a uniform filename convention so the factory can detect family, size, and task from the checkpoint name alone:

text
Libre<FAMILY><size>[-<task>].pt

Task suffixes

| Task | Canonical name | Filename suffix |
| --- | --- | --- |
| Detection | "detect" | (none - implicit) |
| Instance segmentation | "segment" | -seg |
| Pose estimation | "pose" | -pose |
| Classification | "classify" | -cls |
| Gaze estimation | "gaze" | -gaze |

The factory accepts aliases at the API boundary ("detection", "seg", "keypoints", etc.) and maps them to the canonical task names; filenames use only the suffixes listed in the table.
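To make the naming convention concrete, here is a minimal sketch of how a name could be split into family, size, and task with a regular expression. It is not LibreYOLO's resolver: `parse_checkpoint_name`, `FAMILIES`, and `SUFFIX_TO_TASK` are hypothetical names introduced for illustration, and only a few families from this page are listed.

```python
import re

# Hypothetical lookup tables for illustration; LibreYOLO's own resolver may differ.
FAMILIES = ["YOLO9E2E", "YOLO9", "YOLOX", "RFDETR", "RTDETR", "EC", "L2CS"]
SUFFIX_TO_TASK = {"seg": "segment", "pose": "pose", "cls": "classify", "gaze": "gaze"}

# Longer family names go first so "YOLO9E2E" is not matched as "YOLO9".
_FAMILY_ALTERNATION = "|".join(sorted(FAMILIES, key=len, reverse=True))
_NAME_PATTERN = re.compile(
    rf"^Libre(?P<family>{_FAMILY_ALTERNATION})"
    r"(?P<size>[A-Za-z0-9]+?)"
    r"(?:-(?P<suffix>[a-z]+))?\.pt$"
)


def parse_checkpoint_name(filename: str) -> dict:
    """Split a checkpoint filename into family, size, and suffix-derived task."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognized checkpoint name: {filename!r}")
    suffix = match.group("suffix")
    if suffix is not None and suffix not in SUFFIX_TO_TASK:
        raise ValueError(f"unknown task suffix: -{suffix}")
    return {
        "family": match.group("family"),
        "size": match.group("size"),
        # None means "no suffix": the family default applies (detect, or gaze for L2CS).
        "task": SUFFIX_TO_TASK[suffix] if suffix else None,
    }


print(parse_checkpoint_name("LibreRFDETRs-seg.pt"))
```

Matching against a list of known families, rather than a generic pattern, is what keeps the family and size unambiguous when they are written without a separator.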

Resolution precedence

When you load a model, the task is resolved in this order:

text
explicit task= → checkpoint["task"] → filename suffix → family default

python
from libreyolo import LibreYOLO

# 1. Filename suffix decides → segment
model = LibreYOLO("LibreRFDETRs-seg.pt")

# 2. Override regardless of filename
model = LibreYOLO("custom_weights.pt", task="segment")

# 3. Detection is implicit
model = LibreYOLO("LibreYOLO9c.pt")  # task="detect"
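The precedence chain amounts to "first source that has a value wins". A minimal sketch of that logic follows; `resolve_task` and its parameter names are hypothetical and stand in for whatever the factory does internally.

```python
from typing import Optional


def resolve_task(
    explicit: Optional[str],
    checkpoint_task: Optional[str],
    suffix_task: Optional[str],
    family_default: str,
) -> str:
    """Return the first task that is set, following the documented order."""
    for candidate in (explicit, checkpoint_task, suffix_task):
        if candidate is not None:
            return candidate
    return family_default


# A "-seg" filename with no explicit task and no task stored in the checkpoint
print(resolve_task(None, None, "segment", "detect"))
```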

Per-family task support

| Family | v1.2.0 status | Default | Supported tasks |
| --- | --- | --- | --- |
| YOLO9 | detect single-GPU heavily tested; segment and multi-GPU experimental | detect | detect, segment |
| RF-DETR | detect and segment single-GPU heavily tested; multi-GPU experimental | detect | detect, segment |
| YOLOX | experimental | detect | detect |
| YOLO9-E2E | experimental | detect | detect |
| YOLO-NAS | experimental | detect | detect, pose |
| D-FINE / DEIM / DEIMv2 | experimental | detect | detect |
| RT-DETR / RT-DETRv2 / RT-DETRv4 | experimental | detect | detect |
| PicoDet | experimental | detect | detect |
| EdgeCrafter (EC) | experimental | detect | detect, pose, segment |
| DAMO-YOLO / RTMDet | experimental | detect | detect |
| L2CS | experimental | gaze | gaze (inference-only) |

Examples

text
# Detection (implicit)
LibreYOLO9c.pt
LibreRFDETRs.pt
LibreRTDETRr50.pt

# Segmentation
LibreYOLO9c-seg.pt
LibreRFDETRs-seg.pt
LibreECm-seg.pt

# Pose
LibreYOLONASn-pose.pt
LibreECs-pose.pt

# Gaze
LibreL2CSr50.pt  # gaze is L2CS's only task - suffix optional

Deprecated aliases

LibreYOLORTDETR and LibreYOLORFDETR are old names for LibreRTDETR and LibreRFDETR respectively. They still resolve with a DeprecationWarning - update imports when convenient.

Prediction

The single-GPU prediction path is heavily tested for YOLO9 detection, RF-DETR detection, and RF-DETR segmentation. Other families and tasks use the same API but are experimental in v1.2.0.

Basic prediction

python
result = model("image.jpg")

All prediction parameters

python
result = model(
    "image.jpg",
    conf=0.25,  # confidence threshold (default: 0.25)
    iou=0.45,  # NMS IoU threshold (default: 0.45)
    imgsz=640,  # input size override (default: model's native)
    device="auto",  # "auto", "cpu", "mps", "0", "cuda:0", ...
    classes=[0, 2, 5],  # filter to specific class IDs (default: all)
    max_det=300,  # max detections per image (default: 300)
    augment=False,  # test-time augmentation where implemented
    save=True,  # save annotated image (default: False)
    batch=4,  # directory batch size
    stream=False,  # video only: yield frame results instead of a list
    vid_stride=1,  # video only: process every N-th frame
    show=False,  # video only: display annotated frames
    tiling=False,  # large-image tiled detection
    overlap_ratio=0.2,  # tile overlap ratio
    output_path="out/",  # where to save (default: runs/detect/predict*/)
    color_format="auto",  # "auto", "rgb", or "bgr"
    output_file_format="png",  # output format: "jpg", "png", "webp"
)

model.predict(...) is an alias for model(...).
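For readers unfamiliar with the iou parameter: it is compared against the intersection over union of pairs of boxes during non-maximum suppression. The sketch below shows the standard IoU formula for two xyxy boxes; `box_iou` is an illustrative helper, not a LibreYOLO function.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half their width
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

In typical NMS, a lower-scoring box is dropped when its IoU with an already kept box of the same class exceeds the threshold, so a lower iou value suppresses more aggressively.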

Supported input formats

LibreYOLO accepts images in any of these formats:

python
# File path (string or pathlib.Path)
from pathlib import Path
result = model("photo.jpg")
result = model(Path("photo.jpg"))

# URL
result = model("https://example.com/image.jpg")
result = model("s3://bucket/image.jpg")
result = model("gs://bucket/image.jpg")

# PIL Image
from PIL import Image
img = Image.open("photo.jpg")
result = model(img)

# NumPy array (HWC or CHW, RGB or BGR, uint8 or float32)
import numpy as np
arr = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
result = model(arr)

# OpenCV (BGR) - specify color_format
import cv2
frame = cv2.imread("photo.jpg")
result = model(frame, color_format="bgr")

# PyTorch tensor (CHW or NCHW)
import torch
tensor = torch.randn(3, 640, 640)
result = model(tensor)

# Raw bytes
with open("photo.jpg", "rb") as f:
    result = model(f.read())

# BytesIO
from io import BytesIO
with open("photo.jpg", "rb") as f:
    result = model(BytesIO(f.read()))

# Directory of images
results = model("images/", batch=4)

Working with results

Every prediction returns a Results object (or a list of them for directories):

python
result = model("image.jpg")

# Number of detections
len(result)  # e.g., 5

# Bounding boxes in xyxy format (x1, y1, x2, y2)
result.boxes.xyxy  # tensor of shape (N, 4)

# Bounding boxes in xywh format (center_x, center_y, width, height)
result.boxes.xywh  # tensor of shape (N, 4)

# Confidence scores
result.boxes.conf  # tensor of shape (N,)

# Class IDs
result.boxes.cls  # tensor of shape (N,)

# Combined data: [x1, y1, x2, y2, conf, cls]
# Tracking adds a track_id column before conf/cls.
result.boxes.data  # shape (N, 6), or (N, 7) when tracked

# Metadata
result.orig_shape  # (height, width) of original image
result.path  # source file path (or None)
result.names  # {0: "person", 1: "bicycle", ...}

# Move to CPU / convert to numpy
result_cpu = result.cpu()
boxes_np = result.boxes.numpy()
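The two box layouts above carry the same information. As a dependency-free sketch of the relationship between them (the helper names are illustrative, not part of the LibreYOLO API):

```python
def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) corners -> (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0, float(x2 - x1), float(y2 - y1))


def xywh_to_xyxy(box):
    """(center_x, center_y, width, height) -> (x1, y1, x2, y2) corners."""
    cx, cy, w, h = box
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


corners = (10, 20, 50, 100)
centered = xyxy_to_xywh(corners)
print(centered)
print(xywh_to_xyxy(centered))
```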

Class filtering

Filter detections to specific class IDs:

python
# Only detect people (class 0) and cars (class 2)
result = model("image.jpg", classes=[0, 2])

Tiled Inference

For images much larger than the model's input size (e.g., satellite imagery, drone footage), tiled inference splits the image into overlapping tiles, runs detection on each, and merges results.

Tiling is detection-only in v1.2.0 dev. It rejects segmentation masks, and it cannot be combined with augment=True.

python
result = model(
    "large_aerial_image.jpg",
    tiling=True,
    overlap_ratio=0.2,  # 20% overlap between tiles (default)
    save=True,
)

# Extra metadata on tiled results
result.tiled  # True
result.num_tiles  # number of tiles used
result.saved_path  # output directory when save=True
result.tiles_path  # directory containing per-tile crops
result.grid_path  # grid visualization image

When save=True with tiling, LibreYOLO saves:

  • final_image.jpg - full image with all merged detections drawn
  • grid_visualization.jpg - image showing tile grid overlay
  • tiles/ - individual tile crops
  • metadata.json - tiling parameters and detection counts

If the image is already smaller than the model's input size, tiling is skipped automatically.
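To give a feel for how overlap_ratio shapes the tile grid, here is a rough sketch of one common way to lay out overlapping tiles. It assumes square tiles equal to the model input size and a stride of tile * (1 - overlap_ratio), with the last tile on each axis pushed flush to the image edge. LibreYOLO's actual tiling scheme may differ; `tile_origins` and `tile_boxes` are hypothetical helpers.

```python
from itertools import product


def tile_origins(length: int, tile: int, overlap_ratio: float) -> list[int]:
    """Start offsets of tiles along one axis (assumed layout, for illustration)."""
    if length <= tile:
        return [0]  # the whole axis fits in a single tile
    stride = max(1, round(tile * (1.0 - overlap_ratio)))
    origins = list(range(0, length - tile, stride))
    # The final tile sits flush with the far edge, so its overlap may exceed the ratio.
    origins.append(length - tile)
    return origins


def tile_boxes(width: int, height: int, tile: int = 640, overlap_ratio: float = 0.2):
    """Tile rectangles as (x1, y1, x2, y2), row by row."""
    xs = tile_origins(width, tile, overlap_ratio)
    ys = tile_origins(height, tile, overlap_ratio)
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y, x in product(ys, xs)
    ]


boxes = tile_boxes(1600, 1000)
print(len(boxes), boxes[0], boxes[-1])
```

Detections from each tile would then be shifted by the tile origin and merged, typically with NMS, to remove duplicates in the overlap regions.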

Video Inference

Pass a video file to any flagship model and LibreYOLO detects the container format from the file extension. Supported: .mp4, .avi, .mov, .mkv, .webm, .gif, and other common containers.

Save annotated video

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreYOLO9c.pt")
results = model("clip.mp4", save=True)
# Saved under runs/detect/predict*/clip.mp4

Stream results (memory-flat)

For long videos, pass stream=True to get a generator. Each iteration yields the Results for one frame - no full list buffered in RAM.

python
for result in model("long_clip.mp4", stream=True):
    print(f"frame {result.frame_idx}: {len(result)} detections")

Frame subsampling

python
# Process every 2nd frame (halves compute and saved fps)
results = model("clip.mp4", vid_stride=2, save=True)
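The arithmetic behind the stride is simple. The sketch below assumes that frame 0 is always kept and that the saved frame rate is the source rate divided by the stride, as the comment above suggests. Both helpers are hypothetical.

```python
def kept_frame_indices(total_frames: int, vid_stride: int) -> list[int]:
    """Indices of the frames processed when every vid_stride-th frame is kept."""
    return list(range(0, total_frames, vid_stride))


def output_fps(source_fps: float, vid_stride: int) -> float:
    """Frame rate that keeps playback speed unchanged after subsampling."""
    return source_fps / vid_stride


print(kept_frame_indices(10, 2))
print(output_fps(30.0, 2))
```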

Live preview

python
# Display annotated frames in an OpenCV window while processing
results = model("clip.mp4", show=True)

VideoSource / VideoWriter for custom pipelines

When you need full control of decoding and encoding - custom frame transforms, mixing tracker output, writing to a non-default codec - use the building blocks directly:

python
from libreyolo import LibreYOLO
from libreyolo.utils.video import VideoSource, VideoWriter

model = LibreYOLO("LibreYOLO9c.pt")

with VideoSource("clip.mp4", vid_stride=1) as src, \
     VideoWriter("out.mp4", fps=src.fps, width=src.width, height=src.height) as out:
    for frame_bgr, frame_idx in src:
        result = model(frame_bgr, color_format="bgr")
        # ... draw, transform, etc.
        out.write_frame(frame_bgr)

Tracking

LibreYOLO ships a ByteTrack multi-object tracker that consumes Results from any detector and adds persistent track IDs. It is most tested with single-GPU YOLO9 detection and RF-DETR detection; other detection families are experimental in v1.2.0.

Install

bash
pip install libreyolo[tracking]  # compatibility extra; tracking deps ship in base dev install

Video tracking helper

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreYOLO9c.pt")

for result in model.track(
    "clip.mp4",
    track_conf=0.25,
    iou=0.45,
    save=True,  # writes runs/track/<video_stem>.mp4 by default
    vid_stride=1,
):
    print(result.frame_idx, result.track_id)

model.track() is a generator for video files. It runs detection frame by frame, uses the lower ByteTrack confidence internally for recovery, and yields Results with result.track_id and result.boxes.id populated.

Basic loop

python
from libreyolo import LibreYOLO, ByteTracker
from libreyolo.utils.video import VideoSource

model = LibreYOLO("LibreYOLO9c.pt")
tracker = ByteTracker()

with VideoSource("clip.mp4") as src:
    for frame_bgr, frame_idx in src:
        result = model(frame_bgr, color_format="bgr", conf=0.1)
        tracked = tracker.update(result)

        for i in range(len(tracked.boxes)):
            track_id = int(tracked.boxes.id[i])
            xyxy = tracked.boxes.xyxy[i].tolist()
            cls = int(tracked.boxes.cls[i])
            print(f"frame {frame_idx} - id {track_id} cls {cls} {xyxy}")

After tracker.update(), the returned Results object (tracked in the loop above) holds the track IDs in boxes.id, and its boxes.is_track is True.

TrackConfig knobs

python
from libreyolo import ByteTracker, TrackConfig

cfg = TrackConfig(
    track_high_thresh=0.25,  # first-stage match threshold
    track_low_thresh=0.1,  # second-stage (low-conf recovery)
    new_track_thresh=0.25,  # minimum conf to start a new track
    match_thresh=0.8,  # IoU cost cutoff (stage 1)
    match_thresh_low=0.5,  # IoU cost cutoff (stage 2)
    match_thresh_unconfirmed=0.7,  # IoU cost cutoff for unconfirmed tracks
    track_buffer=30,  # frames to keep lost tracks before removal
    frame_rate=30,  # scales track_buffer
    fuse_score=True,  # multiply IoU by detection score
    minimum_consecutive_frames=1,  # frames to confirm a new track
)
tracker = ByteTracker(config=cfg)
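The two confidence thresholds reflect ByteTrack's two-stage association: high-confidence detections are matched to tracks first, and low-confidence ones are used afterwards to recover tracks that were left unmatched. The sketch below shows only the split, under the assumption that the boundaries are inclusive at the high threshold and exclusive at the low one. `split_by_confidence` is an illustrative helper, not LibreYOLO code.

```python
def split_by_confidence(scores, track_high_thresh=0.25, track_low_thresh=0.1):
    """Return (high, low) lists of detection indices; the rest are discarded."""
    high = [i for i, s in enumerate(scores) if s >= track_high_thresh]
    low = [
        i for i, s in enumerate(scores)
        if track_low_thresh < s < track_high_thresh
    ]
    return high, low


high, low = split_by_confidence([0.9, 0.2, 0.05, 0.25])
print("stage 1 candidates:", high)
print("stage 2 candidates:", low)
```

This is also why the basic loop above runs the detector with conf=0.1: detections below the usual display threshold still need to reach the tracker for the second stage to have anything to work with.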

Reset between clips

python
tracker.reset()  # clears tracked / lost / removed lists and the ID counter

Segmentation

RF-DETR segmentation is the heavily tested segmentation path in v1.2.0. YOLO9 segmentation and EdgeCrafter segmentation are available through the same -seg suffix, but they are experimental.

Run segmentation

python
from libreyolo import LibreYOLO

# RF-DETR segmentation, heavily tested on single GPU
model = LibreYOLO("LibreRFDETRs-seg.pt")
result = model("photo.jpg")

# YOLO9 segmentation is available but experimental in v1.2.0
# model = LibreYOLO("LibreYOLO9c-seg.pt")

# Segmentation returns boxes + masks
print(result.boxes.xyxy)  # bounding boxes (N, 4)
print(result.boxes.cls)  # class IDs (N,)
print(result.masks.data.shape)  # (N, H, W) tensor of binary masks

Mask representations

python
# Raw bitmasks
result.masks.data  # tensor (N, H, W) - original image resolution

# Polygon contours (one ndarray of (M, 2) per instance)
result.masks.xy  # absolute pixel coords
result.masks.xyn  # normalized to [0, 1]

# Move / convert like Boxes
result.masks.cpu()
result.masks.numpy()
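The relationship between the absolute and normalized polygon coordinates can be sketched as a division by the original image size. The helper below is illustrative only and assumes the (height, width) ordering that result.orig_shape uses.

```python
def normalize_polygon(points, orig_shape):
    """Scale (x, y) pixel coordinates into [0, 1] using (height, width)."""
    height, width = orig_shape
    return [(x / width, y / height) for x, y in points]


# A polygon on a 640x480 image (orig_shape is height first)
print(normalize_polygon([(320, 240), (640, 480)], orig_shape=(480, 640)))
```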

Save annotated output

save=True draws boxes and translucent mask overlays automatically.

python
model("photo.jpg", save=True)

Training segmentation

RF-DETR segmentation uses the RF-DETR COCO-format training pipeline and is part of the heavily tested single-GPU scope. YOLO9 segmentation and EdgeCrafter segmentation training are available but experimental in v1.2.0.

Pose Estimation

Pose (human keypoint) estimation is supported on YOLO-NAS (-pose) and EdgeCrafter (-pose). Each pose model is single-class ("person") with 17 COCO keypoints.

Run pose

python
from libreyolo import LibreYOLO

# YOLO-NAS pose
model = LibreYOLO("LibreYOLONASs-pose.pt")
result = model("people.jpg")

# EdgeCrafter pose
# model = LibreYOLO("LibreECs-pose.pt")

# Per-person bbox + 17 keypoints
print(result.boxes.xyxy)  # person boxes (N, 4)
print(result.keypoints.xy.shape)  # (N, 17, 2) pixel coordinates

Keypoint API

python
result.keypoints.xy  # (N, K, 2) absolute pixel coords
result.keypoints.xyn  # (N, K, 2) normalized to [0, 1]
result.keypoints.conf  # (N, K) per-keypoint confidence (None if model doesn't emit it)
result.keypoints.has_visible  # (N, K) bool - conf > 0

result.keypoints.cpu()
result.keypoints.numpy()

Save annotated output

python
model("people.jpg", save=True)  # draws boxes + skeleton

Pose training is supported for YOLO-NAS; EdgeCrafter pose is currently inference-only. YOLO9 and RF-DETR don't ship pose checkpoints yet.

Gaze Estimation

Gaze direction estimation is provided by the LibreL2CS family, an L2CS-Net port with a ResNet trunk and two angle-bin classification heads. It is a two-stage model: an upstream face detector locates faces, then the gaze head predicts per-face pitch and yaw in radians. It is inference-only and experimental in v1.2.0.

Install

bash
pip install libreyolo[gaze]  # optional Google Drive helper for Gaze360 weights

The published L2CS ResNet-50 weights are trained on Gaze360 and are not mirrored by LibreYOLO. Without the optional helper, pass a local checkpoint path or follow the manual download instructions printed by LibreL2CS.

Two-stage inference

python
from libreyolo import LibreYOLO
from libreyolo.models.l2cs.face import resolve_face_detector

# Gaze head
gaze = LibreYOLO("LibreL2CSr50.pt")

# Wire any LibreYOLO detector trained on faces
face = LibreYOLO("path/to/face-detector.pt")
gaze.face_detector = resolve_face_detector(face)

result = gaze("portrait.jpg")
print(result.boxes.xyxy)  # face boxes
print(result.gaze.data)  # (N, 2) tensor - pitch, yaw in radians

Decode angles

python
import math

for i in range(len(result.gaze)):
    pitch_rad, yaw_rad = result.gaze.data[i].tolist()
    pitch_deg = pitch_rad * 180.0 / math.pi
    yaw_deg = yaw_rad * 180.0 / math.pi
    print(f"face {i}: pitch={pitch_deg:.1f} deg, yaw={yaw_deg:.1f} deg")

From the CLI: libreyolo predict model=LibreL2CSr50.pt source=portrait.jpg --face-detector path/to/face.pt.

Training

The heavily tested training paths are single-GPU YOLO9 detection, RF-DETR detection, and RF-DETR segmentation. Other model-family trainers, YOLO9 segmentation training, and multi-GPU workflows are available but experimental in v1.2.0.

YOLO9 - CNN flagship training

python
from libreyolo import LibreYOLO

# Fine-tune from a pretrained checkpoint (recommended)
model = LibreYOLO("LibreYOLO9c.pt")

results = model.train(
    data="coco128.yaml",  # path to data.yaml (required)

    # Schedule
    epochs=300,  # default: 300
    batch=16,
    imgsz=640,

    # Optimizer
    lr0=0.01,  # initial learning rate
    optimizer="SGD",  # "SGD", "Adam", "AdamW"

    # System
    device="0",  # "" | "cpu" | "cuda" | "0" | "0,1"
    workers=8,
    seed=0,

    # Output
    project="runs/train",
    name="yolo9_exp",
    exist_ok=False,

    # Training features
    amp=True,  # automatic mixed precision
    patience=50,  # early stopping patience
    resume=False,  # resume from loaded checkpoint
)

print(f"Best mAP50-95: {results['best_mAP50_95']:.3f}")
print(f"Best checkpoint: {results['best_checkpoint']}")

After training completes, the model instance is automatically reloaded with the best weights so you can call model(...) immediately. YOLO9 segmentation training is supported via LibreYOLO("LibreYOLO9c-seg.pt"), but it is experimental in v1.2.0.

RF-DETR - transformer flagship training

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreRFDETRs.pt")

results = model.train(
    data="path/to/data.yaml",
    epochs=100,
    batch_size=4,  # NOTE: RF-DETR uses batch_size, not batch
    lr=1e-4,
    output_dir="runs/train/rfdetr_exp",
)

RF-DETR has its own training signature (batch_size, lr, output_dir) but it uses LibreYOLO's dataset config loader. Pass a data.yaml for detection or segmentation; COCO/Roboflow-style annotation layouts can be referenced from that config.

Training results dict

python
{
    "final_loss": 2.31,
    "best_mAP50": 0.682,
    "best_mAP50_95": 0.451,
    "best_epoch": 87,
    "save_dir": "runs/train/yolo9_exp",
    "best_checkpoint": "runs/train/yolo9_exp/weights/best.pt",
    "last_checkpoint": "runs/train/yolo9_exp/weights/last.pt",
}

Resuming training

python
# Load the checkpoint with the factory, then resume
model = LibreYOLO("runs/train/yolo9_exp/weights/last.pt")
results = model.train(data="coco128.yaml", resume=True)

Custom dataset YAML format

data.yaml
path: /path/to/dataset
train: images/train
val: images/val
test: images/test  # optional

nc: 3
names: ["cat", "dog", "bird"]

Additional training paths

Other families have trainer hooks, but they are not the recommended path in v1.2.0. Keep new work on YOLO9 detection or RF-DETR detection/segmentation; use experimental trainers only for compatibility, benchmark reproduction, or targeted research. DAMO-YOLO, PicoDet, RTMDet, and EC training require an explicit allow_experimental=True acknowledgement.

Training from a YAML config

Every model.train(...) call accepts cfg="train.yaml" to load all parameters from a file. Explicit kwargs still override YAML values, so you can use a YAML file for the baseline and override individual fields per run.

python
model = LibreYOLO("LibreYOLO9c.pt")
results = model.train(cfg="configs/yolo9_finetune.yaml")
# Override individual fields:
# results = model.train(cfg="configs/yolo9_finetune.yaml", epochs=50)
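
The precedence rule can be pictured as a layered dictionary merge. The sketch below is illustrative only: DEFAULTS and resolve_train_args are hypothetical stand-ins, not LibreYOLO internals, and the YAML values are written as a plain dict so the example runs without a config file.

```python
from typing import Any

# Hypothetical baseline values for illustration; not LibreYOLO's real defaults table.
DEFAULTS: dict[str, Any] = {"epochs": 300, "batch": 16, "imgsz": 640}


def resolve_train_args(yaml_values: dict[str, Any], **kwargs: Any) -> dict[str, Any]:
    """Merge defaults < YAML values < explicit kwargs (the later layer wins)."""
    return {**DEFAULTS, **yaml_values, **kwargs}


# Values as they might be parsed from configs/yolo9_finetune.yaml.
from_yaml = {"epochs": 100, "batch": 8}

resolved = resolve_train_args(from_yaml, epochs=50)
print(resolved)  # {'epochs': 50, 'batch': 8, 'imgsz': 640}
```

Here epochs comes from the explicit kwarg, batch from the YAML layer, and imgsz from the baseline.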

Gradient accumulation

Pass nbs (nominal batch size) to opt into gradient accumulation. The trainer steps the optimizer every nbs / batch forward passes, which lets you train at the recipe's reference batch size on smaller hardware.

python
# Effective batch 64 on a single GPU that only fits batch=8
model.train(data="coco128.yaml", batch=8, nbs=64)
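
The arithmetic behind the example is small enough to check by hand. In the sketch below, accumulation_steps is a hypothetical helper, and the rounding rule (nearest integer, never below 1) is an assumption borrowed from common trainer implementations rather than something confirmed for LibreYOLO.

```python
def accumulation_steps(nbs: int, batch: int) -> int:
    """Forward passes per optimizer step: nbs / batch, rounded, at least 1 (assumed rule)."""
    return max(round(nbs / batch), 1)


batch = 8
steps = accumulation_steps(nbs=64, batch=batch)

print(steps)          # 8 forward passes between optimizer steps
print(steps * batch)  # 64 samples contribute to each optimizer step
```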

Distributed training (DDP, experimental)

YOLO9 and RF-DETR support multi-GPU training through PyTorch DistributedDataParallel, but multi-GPU is outside the heavily tested v1.2.0 scope. Launch the training script with torchrun:

bash
# 4-GPU node
torchrun --nproc_per_node=4 train_yolo9.py

# Multi-node - see PyTorch's torchrun docs for --nnodes / --rdzv_endpoint

train_yolo9.py
from libreyolo import LibreYOLO

model = LibreYOLO("LibreYOLO9c.pt")
# device is left at its default "" (auto-detect); torchrun sets the rank
model.train(data="coco128.yaml", epochs=300, batch=16)

Validation

Run COCO-standard evaluation on a validation set. The heavily tested validation paths are single-GPU YOLO9 detection, RF-DETR detection, and RF-DETR segmentation.

python
results = model.val(
    data="coco128.yaml",  # dataset config
    batch=16,
    imgsz=640,
    conf=0.001,  # low conf for mAP calculation
    iou=0.6,  # NMS IoU threshold
    split="val",  # "val", "test", or "train"
    save_json=False,  # save predictions as COCO JSON
    verbose=True,  # print per-class metrics
)

print(f"mAP50: {results['metrics/mAP50']:.3f}")
print(f"mAP50-95: {results['metrics/mAP50-95']:.3f}")

Validation results dict

By default, LibreYOLO uses COCO evaluation and returns precision, recall, AP/AR metrics, and per-image timing:

python
{
    "metrics/mAP50-95": 0.489,  # COCO primary metric (AP@[.5:.95])
    "metrics/mAP50": 0.721,  # AP@0.5 (PASCAL VOC style)
    "metrics/mAP75": 0.534,  # AP@0.75 (strict)
    "metrics/precision": 0.68,
    "metrics/recall": 0.61,
    "metrics/precision(B)": 0.68,  # bbox aliases
    "metrics/recall(B)": 0.61,
    "metrics/mAP50(B)": 0.721,
    "metrics/mAP50-95(B)": 0.489,
    "metrics/mAP_small": 0.291,
    "metrics/mAP_medium": 0.532,
    "metrics/mAP_large": 0.648,
    "metrics/AR1": 0.362,  # Average Recall (max 1 det)
    "metrics/AR10": 0.571,
    "metrics/AR100": 0.601,
    "metrics/AR_small": 0.387,
    "metrics/AR_medium": 0.641,
    "metrics/AR_large": 0.739,
    "speed/preprocess_ms": 1.2,
    "speed/inference_ms": 6.8,
    "speed/postprocess_ms": 0.9,
    "speed/total_ms": 8.9,
    "speed/total_s": 12.3,
    "speed/images_seen": 1382,
}

Segmentation validation returns mask metrics with (M) suffixes alongside bbox metrics with (B) suffixes. Pose validation returns COCO keypoint metrics through PoseValidator.
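
Because the keys are namespaced with a metrics/ or speed/ prefix, the flat dict is easy to regroup for logging. The following is a minimal sketch on a hand-written subset of the sample values; group_results is a hypothetical helper, not part of LibreYOLO. In the sample values the three per-stage timings add up to total_ms, which the last lines check.

```python
from collections import defaultdict


def group_results(results: dict[str, float]) -> dict[str, dict[str, float]]:
    """Split 'prefix/name' keys into {prefix: {name: value}}."""
    grouped: dict[str, dict[str, float]] = defaultdict(dict)
    for key, value in results.items():
        prefix, _, name = key.partition("/")
        grouped[prefix][name] = value
    return dict(grouped)


# Subset of the sample validation dict, copied by hand.
sample = {
    "metrics/mAP50-95": 0.489,
    "metrics/mAP50": 0.721,
    "speed/preprocess_ms": 1.2,
    "speed/inference_ms": 6.8,
    "speed/postprocess_ms": 0.9,
    "speed/total_ms": 8.9,
}

grouped = group_results(sample)
stages = ("preprocess_ms", "inference_ms", "postprocess_ms")
stage_sum = sum(grouped["speed"][stage] for stage in stages)

print(sorted(grouped))      # ['metrics', 'speed']
print(f"{stage_sum:.1f}")   # 8.9
```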

Export

Export PyTorch models to ONNX, TorchScript, TensorRT, OpenVINO, NCNN, or CoreML for deployment. The heavily tested export and runtime-backend paths are single-GPU YOLO9 detection, RF-DETR detection, and RF-DETR segmentation. Other families and tasks are experimental.

Quick export

python
# ONNX (default)
model.export()

# TorchScript
model.export(format="torchscript")

# TensorRT (requires NVIDIA GPU + TensorRT)
model.export(format="tensorrt")

# OpenVINO (optimized for Intel hardware)
model.export(format="openvino")

# NCNN (via PNNX)
model.export(format="ncnn")

# CoreML (.mlpackage, macOS runtime)
model.export(format="coreml")

All export parameters

python
path = model.export(
    format="onnx",  # "onnx", "torchscript", "tensorrt", "openvino", "ncnn", or "coreml"
    output_path="model.onnx",  # output file (auto-generated if None)
    imgsz=640,  # input resolution (default: model's native)
    opset=None,  # ONNX opset (auto: 13, or 17 for wrappers that need it)
    simplify=True,  # run onnxsim graph simplification
    dynamic=True,  # enable dynamic batch axis
    half=False,  # export in FP16
    batch=1,  # batch size for static graph
    device=None,  # device to trace on (default: model's current device)
    int8=False,  # INT8 quantization (TensorRT / OpenVINO only)
    data=None,  # calibration dataset for INT8
    fraction=1.0,  # fraction of calibration data to use
    allow_download_scripts=False,  # allow data.yaml download hooks during calibration
    workspace=4.0,  # TensorRT workspace size (GB)
    min_batch=1,  # TensorRT dynamic profile minimum batch
    opt_batch=1,  # TensorRT dynamic profile optimal batch
    max_batch=8,  # TensorRT dynamic profile maximum batch
    hardware_compatibility="none",  # TensorRT compatibility mode
    gpu_device=0,  # GPU device index for TensorRT
    trt_config=None,  # optional TensorRT YAML config path
    compute_units="all",  # CoreML routing: all, cpu_only, cpu_and_gpu, cpu_and_ne
    nms=False,  # CoreML embedded NMS where supported
    iou=0.45,  # CoreML embedded NMS IoU threshold
    conf=0.25,  # CoreML embedded NMS confidence threshold
    verbose=False,  # verbose logging
)

OpenVINO INT8 export additionally requires nncf. NCNN export writes a directory containing model.ncnn.param, model.ncnn.bin, and metadata.yaml. CoreML export writes a .mlpackage bundle, requires coremltools, and does not support INT8.
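
The format constraints above can be checked before starting a long export job. This is a hedged sketch of such a pre-flight check: check_export_args is a hypothetical helper written for this example, and it only encodes the rules stated in this section (INT8 is limited to TensorRT and OpenVINO). LibreYOLO's own validation may differ.

```python
ALL_FORMATS = {"onnx", "torchscript", "tensorrt", "openvino", "ncnn", "coreml"}
INT8_FORMATS = {"tensorrt", "openvino"}  # per the int8 parameter comment above


def check_export_args(fmt: str, int8: bool = False) -> None:
    """Raise ValueError for combinations this section documents as unsupported."""
    if fmt not in ALL_FORMATS:
        raise ValueError(f"unknown export format: {fmt!r}")
    if int8 and fmt not in INT8_FORMATS:
        raise ValueError(f"INT8 is not supported for format {fmt!r}")


check_export_args("openvino", int8=True)  # passes silently

try:
    check_export_args("coreml", int8=True)
except ValueError as err:
    message = str(err)
    print(message)  # INT8 is not supported for format 'coreml'
```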

ONNX metadata

Exported ONNX files include embedded metadata:

| Key | Example value |
| --- | --- |
| libreyolo_version | "1.2.0" |
| model_family | "yolox" |
| model_size | "s" |
| nb_classes | "80" |
| names | '{"0": "person", "1": "bicycle", ...}' |
| imgsz | "640" |
| dynamic | "True" |
| half | "False" |

This metadata is automatically read back when loading the exported file with LibreYOLO("model.onnx").
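
All example values in the table are strings, so a reader has to convert them back to typed values. The sketch below shows one way that conversion could look; parse_metadata is a hypothetical helper and the raw dict is a shortened hand-written sample, so this is not a description of how LibreYOLO itself reads the file.

```python
import json


def parse_metadata(raw: dict[str, str]) -> dict[str, object]:
    """Convert string-valued ONNX metadata into typed Python values."""
    return {
        "libreyolo_version": raw["libreyolo_version"],
        "model_family": raw["model_family"],
        "model_size": raw["model_size"],
        "nb_classes": int(raw["nb_classes"]),
        # JSON object keys are always strings, so cast class IDs back to int.
        "names": {int(k): v for k, v in json.loads(raw["names"]).items()},
        "imgsz": int(raw["imgsz"]),
        "dynamic": raw["dynamic"] == "True",
        "half": raw["half"] == "True",
    }


# Shortened two-class sample in the same string encoding as the table.
raw = {
    "libreyolo_version": "1.2.0",
    "model_family": "yolox",
    "model_size": "s",
    "nb_classes": "2",
    "names": '{"0": "person", "1": "bicycle"}',
    "imgsz": "640",
    "dynamic": "True",
    "half": "False",
}

meta = parse_metadata(raw)
print(meta["names"])    # {0: 'person', 1: 'bicycle'}
print(meta["dynamic"])  # True
```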

TorchScript Inference

Run an exported .torchscript model through the same runtime-backend prediction API.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model.torchscript")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

ONNX Inference

Run inference using ONNX Runtime instead of PyTorch. Useful for deployment environments without PyTorch.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model.onnx")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

Auto-metadata

If the ONNX file was exported by LibreYOLO, class names and class count are read automatically from the embedded metadata:

python
# Export with metadata
model.export(format="onnx", output_path="model.onnx")

# Load - names and nb_classes auto-populated
onnx_model = LibreYOLO("model.onnx")
print(onnx_model.names)  # {0: "person", 1: "bicycle", ...}
print(onnx_model.nb_classes)  # 80

For ONNX files without metadata (e.g., exported by other tools), specify nb_classes manually:

python
model = LibreYOLO("external_model.onnx", nb_classes=20)

Device selection

python
# Auto-detect (CUDA if available, else CPU)
model = LibreYOLO("model.onnx", device="auto")

# Force CPU
model = LibreYOLO("model.onnx", device="cpu")

# Force CUDA
model = LibreYOLO("model.onnx", device="cuda")

Prediction parameters

Runtime artifacts loaded through LibreYOLO() support the shared runtime prediction API:

python
result = model(
    "image.jpg",
    conf=0.25,
    iou=0.45,
    imgsz=640,
    classes=[0, 2],
    max_det=300,
    save=True,
    output_path="output/annotated.jpg",  # final file path when save=True
    color_format="auto",
)

Runtime backends do not expose PyTorch-only options such as tiling, overlap_ratio, or output_file_format.

Runtime backends also handle saving a little differently from the PyTorch wrappers: if you set output_path, pass a final file path, not a directory. If you omit it, the current backend default is under runs/detections/.

TensorRT Inference

Run inference using TensorRT for maximum throughput on NVIDIA GPUs. Requires CUDA plus the TensorRT Python bindings.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model.engine")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

TensorRT artifacts loaded through LibreYOLO() support the same core runtime prediction API as ONNX and OpenVINO, including the same file-path-only output_path behavior for save=True.

OpenVINO Inference

Run inference using OpenVINO, optimized for Intel CPUs, GPUs, and VPUs.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model_openvino/")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

OpenVINO directories loaded through LibreYOLO() read metadata.yaml when present and support the same core runtime prediction API.

NCNN Inference

Run inference using NCNN for lightweight deployment on CPU or Vulkan-capable GPU targets.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model_ncnn/")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

An NCNN export directory contains model.ncnn.param, model.ncnn.bin, and usually metadata.yaml.

CoreML Inference

Run an exported .mlpackage through CoreML on macOS. CoreML routes execution with compute_units instead of PyTorch device strings.

python
from libreyolo import LibreYOLO

model = LibreYOLO("model.mlpackage", compute_units="all")

result = model("image.jpg", conf=0.25, iou=0.45, save=True)
print(result.boxes.xyxy)

Supported compute_units values are all, cpu_only, cpu_and_gpu, and cpu_and_ne.

CLI

Installing LibreYOLO registers a libreyolo command on your PATH (entry point in pyproject.toml). The CLI mirrors the Python API and follows Ultralytics-style key=value syntax.

Subcommands

| Command | Purpose |
| --- | --- |
| predict | Run inference on images, directories, or videos |
| train | Train a model on a dataset |
| val | Evaluate a model on a dataset |
| export | Export to ONNX / TorchScript / TensorRT / OpenVINO / NCNN / CoreML |
| checks | Print Python, torch, CUDA, GPU, and optional-package info |
| models | List registered model families and CLI shortcut names |
| formats | List supported export formats |
| cfg | Print the default training configuration YAML |
| info | Load a model and print resolved family, size, task, device, and classes |
| metadata | Inspect raw checkpoint metadata from a .pt file |
| version | Print LibreYOLO + Python + torch versions |

Model name shortcuts

The CLI accepts short names (yolo9-c) that resolve to weight filenames (LibreYOLO9c.pt) - discoverable via libreyolo models. You can also pass any explicit checkpoint path.

Common options

| Command | Important options |
| --- | --- |
| predict | conf, iou, imgsz, classes, max_det, half, batch, tiling, overlap_ratio, output_file_format, project, name, exist_ok, face_detector |
| train | epochs, batch, imgsz, lr0, optimizer, scheduler, workers, seed, resume, amp, allow_download_scripts, dry_run |
| val | split, batch, imgsz, conf, iou, max_det, half, data_dir, use_coco_eval, project, name, exist_ok, save_json, allow_download_scripts |
| export | format, imgsz, batch, half, int8, dynamic, simplify, opset, data, fraction, device, allow_download_scripts, verbose |

Predict

bash
# Flagship: YOLO9
libreyolo predict model=yolo9-c source=image.jpg conf=0.25 save=true

# Flagship: RF-DETR
libreyolo predict model=rfdetr-s source=image.jpg save=true

# Video - saved under runs/detect/predict*/
libreyolo predict model=yolo9-c source=clip.mp4 save=true

# Tiled inference for very large images
libreyolo predict model=yolo9-c source=aerial.jpg tiling=true save=true

# Gaze (requires a face detector)
libreyolo predict model=LibreL2CSr50.pt source=portrait.jpg \
  --face-detector path/to/face.pt save=true

Train

bash
libreyolo train model=yolo9-c data=coco128.yaml epochs=300 batch=16 device=0

# Dry-run prints the resolved config without launching training
libreyolo train model=yolo9-c data=coco128.yaml --dry-run

Validate

bash
libreyolo val model=runs/train/exp/weights/best.pt data=coco128.yaml split=val

Export

bash
libreyolo export model=runs/train/exp/weights/best.pt format=onnx dynamic=true
libreyolo export model=best.pt format=tensorrt half=true
libreyolo export model=best.pt format=openvino int8=true data=coco128.yaml
libreyolo export model=best.pt format=coreml

Machine-readable output

Every command accepts --json (structured stdout for piping into scripts or agents) and --quiet (suppress stderr progress lines). The core predict, train, val, and export commands also accept --help-json to dump their parameter schema as JSON.

bash
libreyolo predict model=yolo9-c source=img.jpg --json | jq .

libreyolo train --help-json > train_schema.json
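
From Python, the same structured output can be consumed with the standard library. The sketch below is a generic pattern, not a LibreYOLO API: run_json is a hypothetical helper, and the demo command is a stand-in so the snippet runs without the CLI installed. The shape of the JSON payload is not specified in this section, so the example does not assume any particular keys.

```python
import json
import subprocess
import sys


def run_json(cmd: list[str]) -> object:
    """Run a command and parse its stdout as JSON; raises if the command fails."""
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Stand-in command that prints JSON, used here instead of the real CLI.
demo_cmd = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
payload = run_json(demo_cmd)
print(payload)  # {'ok': True}

# With LibreYOLO installed, the call would presumably look like:
# payload = run_json(["libreyolo", "predict", "model=yolo9-c", "source=img.jpg", "--json"])
```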

API Reference

LibreYOLO (factory)

python
LibreYOLO(
    model_path: str,
    *,
    device: str = "auto",
    task: str | None = None,  # override only when a custom artifact is ambiguous
    nb_classes: int | None = None,  # mainly for external exported artifacts
    compute_units: str = "all",  # CoreML only: all, cpu_only, cpu_and_gpu, cpu_and_ne
) -> model wrapper or runtime backend

Prefer official checkpoint filenames and exported artifact paths, then let the factory resolve the details. It handles PyTorch checkpoints, .onnx, .torchscript, .engine, .tensorrt, .mlpackage, OpenVINO directories containing model.xml, and NCNN directories containing model.ncnn.param plus model.ncnn.bin. The task argument is for ambiguous custom artifacts; otherwise resolution comes from checkpoint metadata, filename suffix, and family default.
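
The path-based dispatch described above can be sketched with pathlib. This is an illustration of the documented rules only: guess_backend and SUFFIX_TO_BACKEND are hypothetical names, the order of the checks is an assumption, and the real factory also consults checkpoint metadata.

```python
import tempfile
from pathlib import Path

# Suffixes listed in the paragraph above, mapped to a backend label.
SUFFIX_TO_BACKEND = {
    ".onnx": "onnx",
    ".torchscript": "torchscript",
    ".engine": "tensorrt",
    ".tensorrt": "tensorrt",
    ".mlpackage": "coreml",
}


def guess_backend(model_path: str) -> str:
    """Guess the runtime from a path; fall back to a PyTorch checkpoint."""
    path = Path(model_path)
    suffix = path.suffix.lower()
    if suffix in SUFFIX_TO_BACKEND:
        return SUFFIX_TO_BACKEND[suffix]
    if path.is_dir():
        if (path / "model.xml").is_file():
            return "openvino"
        if (path / "model.ncnn.param").is_file() and (path / "model.ncnn.bin").is_file():
            return "ncnn"
    return "pytorch"


# Build a throwaway NCNN-style export directory to exercise the directory branch.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp) / "model_ncnn"
    export_dir.mkdir()
    (export_dir / "model.ncnn.param").touch()
    (export_dir / "model.ncnn.bin").touch()
    ncnn_guess = guess_backend(str(export_dir))

print(guess_backend("model.onnx"))      # onnx
print(guess_backend("LibreYOLO9c.pt"))  # pytorch
print(ncnn_guess)                       # ncnn
```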

Prediction (PyTorch model wrappers)

python
model(
    source,  # image input (see supported formats)
    *,
    conf: float = 0.25,
    iou: float = 0.45,
    imgsz: int = None,
    device: str = "auto",
    classes: list[int] = None,
    max_det: int = 300,
    augment: bool = False,
    save: bool = False,
    batch: int = 1,
    stream: bool = False,
    vid_stride: int = 1,
    show: bool = False,
    output_path: str = None,
    color_format: str = "auto",
    tiling: bool = False,
    overlap_ratio: float = 0.2,
    output_file_format: str = None,
) -> Results | list[Results] | Generator[Results, None, None]

Prediction (runtime backends)

python
backend(
    source,
    *,
    conf: float = 0.25,
    iou: float = 0.45,
    imgsz: int = None,
    classes: list[int] = None,
    max_det: int = 300,
    save: bool = False,
    batch: int = 1,
    output_path: str = None,  # final file path when save=True
    color_format: str = "auto",
) -> Results | list[Results]

If output_path is omitted for a runtime backend, the current default save location is runs/detections/.

Results

python
result = Results(
    boxes: Boxes | None,
    orig_shape: tuple[int, int],  # (height, width)
    path: str | None,
    names: dict[int, str],
    masks: Masks | None = None,
    keypoints: Keypoints | None = None,
    probs: Probs | None = None,
    obb: OBB | None = None,
    gaze: Gaze | None = None,
    speed: dict[str, float] | None = None,
    track_id = None,
    frame_idx: int | None = None,
)

len(result)  # number of detections
result.cpu()  # copy with tensors on CPU
result.cuda()  # copy with tensors on CUDA
result.numpy()  # copy with numpy arrays
result.summary()  # list[dict] with boxes, masks, gaze, and track_id when present
result.to_json()  # JSON string from summary()

Boxes

python
boxes = Boxes(boxes, conf, cls)

boxes.xyxy  # (N, 4) tensor - x1, y1, x2, y2
boxes.xywh  # (N, 4) tensor - cx, cy, w, h
boxes.conf  # (N,) tensor - confidence scores
boxes.cls  # (N,) tensor - class IDs
boxes.id  # (N,) track IDs when tracking, else None
boxes.is_track  # True when track IDs are attached
boxes.data  # (N, 6) [xyxy, conf, cls], or (N, 7) with track IDs

len(boxes)  # number of boxes
boxes.cpu()  # copy on CPU
boxes.numpy()  # copy as numpy arrays
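
The xyxy and xywh views describe the same rectangle in corner form and center form. A minimal pure-Python sketch of the conversion for a single box is shown below; the function names are illustrative and are not LibreYOLO helpers.

```python
Box = tuple[float, float, float, float]


def xyxy_to_xywh(box: Box) -> Box:
    """Corners (x1, y1, x2, y2) -> center form (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def xywh_to_xyxy(box: Box) -> Box:
    """Center form (cx, cy, w, h) -> corners (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


corners = (100.0, 50.0, 300.0, 250.0)
centered = xyxy_to_xywh(corners)

print(centered)                # (200.0, 150.0, 200.0, 200.0)
print(xywh_to_xyxy(centered))  # (100.0, 50.0, 300.0, 250.0)
```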

Task payloads

python
result.masks.data  # segmentation masks, (N, H, W)
result.masks.xy  # list of mask contours in pixel coordinates
result.masks.xyn  # normalized mask contours

result.keypoints.xy  # pose keypoint coordinates
result.keypoints.xyn  # normalized keypoint coordinates
result.keypoints.conf  # keypoint confidence when present

result.gaze.data  # (N, 2): pitch, yaw in radians
result.gaze.pitch_deg  # pitch in degrees
result.gaze.yaw_deg  # yaw in degrees
result.gaze.direction_3d  # approximate 3D direction vectors

model.export()

python
model.export(
    format: str = "onnx",  # "onnx", "torchscript", "tensorrt", "openvino", "ncnn", or "coreml"
    *,
    output_path: str | None = None,
    imgsz: int | None = None,
    opset: int | None = None,  # auto: 13, or 17 for wrappers that need it
    simplify: bool = True,
    dynamic: bool = True,
    half: bool = False,
    batch: int = 1,
    device: str | None = None,
    int8: bool = False,
    data: str | None = None,  # calibration data for INT8
    fraction: float = 1.0,  # fraction of calibration data
    allow_download_scripts: bool = False,
    workspace: float = 4.0,  # TensorRT workspace (GB)
    min_batch: int = 1,  # TensorRT dynamic profile minimum batch
    opt_batch: int = 1,  # TensorRT dynamic profile optimal batch
    max_batch: int = 8,  # TensorRT dynamic profile maximum batch
    hardware_compatibility: str = "none",
    gpu_device: int = 0,
    trt_config = None,  # optional TensorRT YAML config path
    compute_units: str = "all",  # CoreML only
    nms: bool = False,  # CoreML embedded NMS where supported
    iou: float = 0.45,  # CoreML embedded NMS IoU threshold
    conf: float = 0.25,  # CoreML embedded NMS confidence threshold
    verbose: bool = False,
) -> str  # path to exported file or directory

model.val()

python
model.val(
    data: str = None,  # path to data.yaml
    batch: int = 16,
    imgsz: int = None,
    conf: float = 0.001,
    iou: float = 0.6,
    workers: int = 4,
    allow_download_scripts: bool = False,
    device: str = None,
    split: str = "val",  # "val", "test", or "train"
    augment: bool = False,
    save_json: bool = False,
    verbose: bool = True,
) -> dict

Returns (COCO evaluation, default):

python
{
    "metrics/mAP50-95": float,  # COCO primary metric
    "metrics/mAP50": float,
    "metrics/mAP75": float,
    "metrics/mAP_small": float,
    "metrics/mAP_medium": float,
    "metrics/mAP_large": float,
    "metrics/AR1": float,
    "metrics/AR10": float,
    "metrics/AR100": float,
    "metrics/AR_small": float,
    "metrics/AR_medium": float,
    "metrics/AR_large": float,
}

model.train() (YOLO9)

python
model.train(
    data: str,  # path to data.yaml (required)
    *,
    epochs: int = 300,
    batch: int = 16,
    imgsz: int = 640,
    lr0: float = 0.01,
    optimizer: str = "SGD",
    device: str = "",
    workers: int = 8,
    seed: int = 0,
    project: str = "runs/train",
    name: str = "yolo9_exp",
    exist_ok: bool = False,
    resume: bool = False,
    amp: bool = True,
    patience: int = 50,
    allow_download_scripts: bool = False,
    callbacks = None,
) -> dict

Returns the standard LibreYOLO training dict with final_loss, best_mAP50, best_mAP50_95, best_epoch, save_dir, best_checkpoint, and last_checkpoint.

model.train() (RF-DETR)

python
model.train(
    data: str,  # path to data.yaml
    epochs: int = 100,
    batch_size: int = 4,
    lr: float = 1e-4,
    output_dir: str = "runs/train",
    resume: str = None,
    **kwargs,  # additional RF-DETR training args
) -> dict

Additional experimental trainers exist for YOLO-NAS, D-FINE, DEIM, DEIMv2, EC, PicoDet, DAMO-YOLO, RT-DETRv2/v4, and RTMDet. They follow the same model.train(data="...yaml", ...) shape but their defaults and experimental gates are family-specific.

Runtime artifact loading

Load exported artifacts through LibreYOLO(), the same way you load PyTorch checkpoints. The factory chooses ONNX Runtime, TorchScript, TensorRT, OpenVINO, NCNN, or CoreML from the path:

python
from libreyolo import LibreYOLO

model = LibreYOLO("model.onnx")
model = LibreYOLO("model.torchscript")
model = LibreYOLO("model.engine")
model = LibreYOLO("model_openvino/")
model = LibreYOLO("model_ncnn/")
model = LibreYOLO("model.mlpackage", compute_units="all")

Advanced integrations can reach lower-level runtime modules, but normal application code should stay on the factory path.

ValidationConfig

python
from libreyolo import ValidationConfig

config = ValidationConfig(
    data="coco128.yaml",
    data_dir=None,  # override dataset root directory
    split="val",  # "val", "test", or "train"
    batch_size=16,
    imgsz=640,
    conf_thres=0.001,
    iou_thres=0.6,
    max_det=300,
    iou_thresholds=(  # mAP IoU sweep
        0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95,
    ),
    device="auto",
    save_dir=None,
    save_json=False,
    verbose=True,
    num_workers=4,
    half=False,
    augment=False,  # test-time augmentation (TTA)
    allow_download_scripts=False,
    # Pose-only fields (PoseValidator)
    keypoints_json=None,
    images_dir=None,
    oks_sigmas=None,
)

# Load/save YAML
config = ValidationConfig.from_yaml("config.yaml")
config.to_yaml("config.yaml")
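
The iou_thresholds tuple is the usual COCO sweep from 0.50 to 0.95 in steps of 0.05. If you prefer to generate it rather than type it out, a small sketch in plain Python follows; rounding to two decimals avoids floating-point drift such as 0.6000000000000001.

```python
# Ten thresholds: 0.50, 0.55, ..., 0.95
iou_thresholds = tuple(round(0.50 + 0.05 * i, 2) for i in range(10))

print(iou_thresholds)
# (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95)
```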

Architecture Guide

This section is for contributors who want to understand the codebase internals.

Base class design

PyTorch model families inherit from BaseModel in libreyolo/models/base/model.py. Subclasses implement these abstract methods:

| Method | Purpose |
| --- | --- |
| _init_model() | Build and return the nn.Module |
| _get_available_layers() | Return layer-name to module mapping |
| _get_preprocess_numpy() | Return the NumPy preprocessor used for export / calibration |
| _preprocess() | Image to tensor conversion |
| _forward() | Model forward pass |
| _postprocess() | Raw output to detection dicts |

BaseModel provides the shared wrapper behavior: prediction, export, validation, size/name metadata, and training helpers. The actual single-image, batch, and tiled inference flow lives in libreyolo/models/base/inference.py, while deployment runtimes live under libreyolo/backends/.

Package structure

text
libreyolo/
  __init__.py                # Public API exports + deprecated-alias resolver
  tasks.py                   # Task types, suffix conventions, resolution rules
  assets/parkour.jpg         # SAMPLE_IMAGE
  models/
    __init__.py              # LibreYOLO() factory + model registry bootstrap
    base/
      model.py               # BaseModel - shared wrapper behaviour
      inference.py           # Shared prediction pipeline (image/dir/video/tiled)
    yolox/                   # LibreYOLOX (detect)
    yolo9/                   # LibreYOLO9 (detect, segment)
    yolo9_e2e/               # LibreYOLO9E2E (detect)
    yolonas/                 # LibreYOLONAS (detect, pose)
    dfine/                   # LibreDFINE (detect)
    deim/                    # LibreDEIM (detect)
    deimv2/                  # LibreDEIMv2 (detect)
    rtdetr/                  # LibreRTDETR (detect)
    rtdetrv2/                # LibreRTDETRv2 (detect)
    rtdetrv4/                # LibreRTDETRv4 (detect)
    rfdetr/                  # LibreRFDETR (detect, segment) - lazy-loaded
    ec/                      # LibreEC / EdgeCrafter (detect, pose, segment)
    picodet/                 # LibrePICODET (detect)
    damoyolo/                # LibreDAMOYOLO (detect)
    rtmdet/                  # LibreRTMDet (detect)
    l2cs/                    # LibreL2CS (gaze, inference-only)
  backends/
    base.py
    onnx.py                  # ONNX Runtime loader
    torchscript.py           # TorchScript loader
    tensorrt.py              # TensorRT loader
    openvino.py              # OpenVINO loader
    ncnn.py                  # NCNN loader
    coreml.py                # CoreML loader
  export/
    exporter.py              # BaseExporter and format registry
    onnx.py / torchscript.py / tensorrt.py / openvino.py / ncnn.py / coreml.py
    config.py / calibration.py
  training/
    trainer.py               # Shared trainer scaffolding
    config.py                # TrainConfig dataclass (single source of truth)
    augment.py / callbacks.py / distributed.py / ema.py / scheduler.py
    artifacts.py / train_config.yaml
    # Per-family trainers live in models/<family>/trainer.py
  validation/
    config.py                # ValidationConfig
    base.py / preprocessors.py
    detection_validator.py   # DetectionValidator, SegmentationValidator
    pose_validator.py        # PoseValidator
    coco_evaluator.py        # COCOEvaluator
  tracking/
    tracker.py               # ByteTracker
    config.py                # TrackConfig
    kalman_filter.py / matching.py / strack.py
  cli/
    __init__.py              # libreyolo entrypoint (Typer app)
    commands/                # predict / train / val / export / special
    aliases.py / config.py / parsing.py / output.py / errors.py
  utils/
    results.py               # Results, Boxes, Masks, Keypoints, Probs, OBB, Gaze
    image_loader.py          # Unified image loading
    video.py                 # VideoSource, VideoWriter, video inference loop
    general.py               # Path helpers, NMS, tiling utilities
    download.py / drawing.py / logging.py / predict_args.py
    serialization.py / box_ops.py
  data/
    dataset.py / pose_dataset.py / utils.py / yolo_coco_api.py
  config/
    datasets/                # Built-in dataset YAML configs (coco8, coco128, coco5000, coco, etc.)
    export/                  # TensorRT default YAML

Adding a new model family

  1. Create libreyolo/models/newmodel/model.py with a class inheriting BaseModel
  2. Set FAMILY, FILENAME_PREFIX, INPUT_SIZES, SUPPORTED_TASKS, and DEFAULT_TASK as needed
  3. Implement registry hooks such as can_load(), detect_size(), detect_nb_classes(), and detect_size_from_filename()
  4. Implement the model init, preprocess, forward, postprocess, train, and validation hooks that the family needs
  5. Create the supporting network and utilities under libreyolo/models/newmodel/
  6. Add the import to libreyolo/models/__init__.py; subclass registration happens when the import runs
  7. Export the class from libreyolo/__init__.py
  8. (Optional) Override val_preprocessor_class if validation preprocessing differs from the standard path

Export architecture

User code should export through model.export(...). Internally, BaseExporter in libreyolo/export/exporter.py owns the format registry, and concrete exporters register themselves through subclass registration.

python
from libreyolo import LibreYOLO

model = LibreYOLO("LibreYOLO9c.pt")
model.export(format="onnx")

To add a new export format, implement a new BaseExporter subclass with a unique format_name and import it from libreyolo/export/exporter.py so the registry is populated.
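
Subclass registration is commonly implemented with __init_subclass__, and the sketch below shows that general pattern. It is an assumption about the mechanism, not a copy of LibreYOLO's code: the BaseExporter here is a stand-in defined in the snippet, and _registry, create(), and MyFormatExporter are hypothetical names.

```python
from __future__ import annotations


class BaseExporter:
    """Stand-in base class for illustration; not libreyolo.export.exporter.BaseExporter."""

    format_name: str | None = None
    _registry: dict[str, type[BaseExporter]] = {}

    def __init_subclass__(cls, **kwargs: object) -> None:
        super().__init_subclass__(**kwargs)
        if cls.format_name is None:
            return  # abstract intermediate classes are not registered
        if cls.format_name in BaseExporter._registry:
            raise ValueError(f"duplicate export format: {cls.format_name!r}")
        BaseExporter._registry[cls.format_name] = cls

    @classmethod
    def create(cls, format_name: str) -> BaseExporter:
        """Instantiate the exporter registered under format_name."""
        if format_name not in cls._registry:
            raise ValueError(f"unknown export format: {format_name!r}")
        return cls._registry[format_name]()


class OnnxExporter(BaseExporter):
    format_name = "onnx"


class MyFormatExporter(BaseExporter):
    format_name = "myformat"  # hypothetical new format


print(sorted(BaseExporter._registry))                 # ['myformat', 'onnx']
print(type(BaseExporter.create("myformat")).__name__)  # MyFormatExporter
```

With this pattern, defining the class is enough to register it, which is why the module that defines a new exporter has to be imported somewhere before the format can be looked up.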

Dataset Format

Training and validation use dataset configs loaded through data.yaml. Detection, segmentation, pose, and RF-DETR training all enter through this loader; the label file contents differ by task.

data.yaml structure

data.yaml
path: /absolute/path/to/dataset  # dataset root
train: images/train  # directory path, relative to path
val: images/val  # directory path, relative to path
test: images/test  # optional

nc: 80  # number of classes
names: [  # class names
  "person", "bicycle", "car", "motorcycle", "airplane",
  "bus", "train", "truck", "boat", "traffic light",
  # ...
]

Config resolution and downloads

Dataset configs resolve from an explicit path, the current working directory, then built-ins under libreyolo/config/datasets/. Dataset roots default under ~/datasets and can be overridden with LIBREYOLO_DATASETS_DIR.

train, val, and test may be directories, .txt files, or lists of paths. YAML download hooks are guarded; pass allow_download_scripts=True only for trusted configs.
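
The lookup order for configs and the dataset-root override can be sketched as follows. This is an illustration of the documented order only: resolve_config and datasets_root are hypothetical helpers, the working directory and built-in directory are passed in explicitly so the example is testable, and the real loader may handle more cases.

```python
from __future__ import annotations

import os
import tempfile
from collections.abc import Mapping
from pathlib import Path


def datasets_root(env: Mapping[str, str] | None = None) -> Path:
    """Return LIBREYOLO_DATASETS_DIR when set, else ~/datasets."""
    env = os.environ if env is None else env
    override = env.get("LIBREYOLO_DATASETS_DIR")
    return Path(override).expanduser() if override else Path.home() / "datasets"


def resolve_config(name: str, cwd: Path, builtin_dir: Path) -> Path:
    """Try the explicit path, then the working directory, then the built-ins."""
    explicit = Path(name)
    for candidate in (explicit, cwd / name, builtin_dir / explicit.name):
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"dataset config not found: {name}")


with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp) / "project"
    builtin = Path(tmp) / "builtin"
    cwd.mkdir()
    builtin.mkdir()

    # Only the built-in copy exists at first, so it is the one that resolves.
    (builtin / "demo_cfg.yaml").write_text("nc: 1\n")
    found_builtin = resolve_config("demo_cfg.yaml", cwd, builtin).parent.name

    # A copy in the working directory takes precedence over the built-in.
    (cwd / "demo_cfg.yaml").write_text("nc: 2\n")
    found_cwd = resolve_config("demo_cfg.yaml", cwd, builtin).parent.name

print(found_builtin)  # builtin
print(found_cwd)      # project
```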

File-list variant

The same YAML format can also point train, val, or test at .txt files containing one image path per line:

coco.yaml
path: /absolute/path/to/coco
train: train2017.txt
val: val2017.txt
test: test-dev2017.txt

nc: 80
names: ["person", "bicycle", "car", "..."]

Directory layout

text
dataset/
  images/
    train/
      img001.jpg
      img002.jpg
    val/
      img003.jpg
  labels/
    train/
      img001.txt
      img002.txt
    val/
      img003.txt

Detection label format

One text file per image. Each line is one object:

text
<class_id> <center_x> <center_y> <width> <height>

All coordinates are normalized to [0, 1] relative to image dimensions.

Example (img001.txt):

img001.txt
0 0.5 0.4 0.3 0.6
2 0.1 0.2 0.05 0.1
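
To make the normalization concrete, the sketch below parses one label row and converts it to pixel-space corner coordinates for a given image size. parse_detection_row is a hypothetical helper written for this example, and the 640x480 image size is arbitrary.

```python
def parse_detection_row(row: str, img_w: int, img_h: int) -> tuple[int, tuple[float, float, float, float]]:
    """Parse '<class_id> <cx> <cy> <w> <h>' and return (class_id, pixel xyxy box)."""
    class_id, cx, cy, w, h = row.split()
    # Scale normalized values by the image dimensions.
    cx_px, cy_px = float(cx) * img_w, float(cy) * img_h
    w_px, h_px = float(w) * img_w, float(h) * img_h
    box = (cx_px - w_px / 2, cy_px - h_px / 2, cx_px + w_px / 2, cy_px + h_px / 2)
    return int(class_id), box


class_id, box = parse_detection_row("0 0.5 0.4 0.3 0.6", img_w=640, img_h=480)
print(class_id, [round(v, 1) for v in box])  # 0 [224.0, 48.0, 416.0, 336.0]
```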

Segmentation label format

Segmentation uses YOLO polygon rows. The dataset loader derives the bounding box from the polygon vertices and keeps the polygon rings when segment loading is enabled:

text
<class_id> <x1> <y1> <x2> <y2> ... <xn> <yn>
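
Deriving the box from the polygon amounts to taking the minimum and maximum of the vertex coordinates. The sketch below shows that step for one row; parse_segmentation_row is a hypothetical helper, and returning the box in normalized center form is an assumption chosen to match the detection label format.

```python
def parse_segmentation_row(row: str):
    """Parse '<class_id> <x1> <y1> ... <xn> <yn>' into (class_id, polygon, cxcywh box)."""
    parts = row.split()
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least three x/y pairs")

    xs, ys = coords[0::2], coords[1::2]
    polygon = list(zip(xs, ys))

    # The tightest axis-aligned box around the vertices.
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    bbox = ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)
    return class_id, polygon, bbox


seg_class, polygon, bbox = parse_segmentation_row("1 0.2 0.2 0.6 0.2 0.6 0.8 0.2 0.8")
print(seg_class, len(polygon), [round(v, 2) for v in bbox])  # 1 4 [0.4, 0.5, 0.4, 0.6]
```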

Pose label format

Pose labels append keypoints after the box. Add kpt_shape and flip_idx to data.yaml so the loader knows the keypoint count and horizontal flip permutation.

yaml
kpt_shape: [17, 3]
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

text
<class_id> <cx> <cy> <w> <h> <kx1> <ky1> <v1> ... <kxK> <kyK> <vK>
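
The two config fields map directly onto how a row is split and how a horizontal flip is applied. The sketch below uses a shortened three-keypoint layout so the numbers are easy to follow; parse_pose_row and hflip_keypoints are hypothetical helpers, and the flip convention (mirror x, then reorder by flip_idx) is the commonly used one, assumed here rather than taken from LibreYOLO's loader.

```python
def parse_pose_row(row: str, kpt_shape: tuple[int, int]):
    """Split a pose row into (class_id, cxcywh box, list of keypoint tuples)."""
    num_kpts, dims = kpt_shape
    parts = row.split()
    if len(parts) != 5 + num_kpts * dims:
        raise ValueError(f"expected {5 + num_kpts * dims} values, got {len(parts)}")
    class_id = int(parts[0])
    box = tuple(float(v) for v in parts[1:5])
    flat = [float(v) for v in parts[5:]]
    keypoints = [tuple(flat[i:i + dims]) for i in range(0, len(flat), dims)]
    return class_id, box, keypoints


def hflip_keypoints(keypoints, flip_idx):
    """Mirror normalized x coordinates, then swap left/right keypoints via flip_idx."""
    mirrored = [(1.0 - x, y, *rest) for x, y, *rest in keypoints]
    return [mirrored[i] for i in flip_idx]


# Toy layout: keypoint 0 is central, keypoints 1 and 2 are a left/right pair.
row = "0 0.5 0.5 0.4 0.6 0.5 0.3 2 0.25 0.5 2 0.75 0.5 1"
pose_class, pose_box, keypoints = parse_pose_row(row, kpt_shape=(3, 3))
flipped = hflip_keypoints(keypoints, flip_idx=[0, 2, 1])

print(keypoints)  # [(0.5, 0.3, 2.0), (0.25, 0.5, 2.0), (0.75, 0.5, 1.0)]
print(flipped)    # [(0.5, 0.3, 2.0), (0.25, 0.5, 1.0), (0.75, 0.5, 2.0)]
```

After the flip, the point at x=0.25 carries the visibility of what was the right-hand keypoint, which is the effect flip_idx is there to produce.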

Built-in datasets

LibreYOLO ships built-in dataset configs under libreyolo/config/datasets/ and can auto-download supported datasets on first use:

python
# These download automatically on first use
results = model.val(data="coco8.yaml")
results = model.train(data="coco128.yaml", epochs=10)