Inline Text
Overview
text3 is azul's text engine. It owns shaping, line breaking, BiDi
reordering, vertical writing modes, hyphenation, selection, and editing. The
older text2 path has been removed; text3 is the single live engine.
WIP — a few CSS Inline Layout Module Level 3 features (initial-letter,
text-box-trim, full ruby) are partially implemented; baseline alignment of
non-baseline vertical-align values uses approximate offsets.
The central type is TextShapingCache (re-exported as TextLayoutCache for
backward compatibility). The solver invokes it through
layout_ifc, which collects the IFC's Vec<InlineContent>,
builds UnifiedConstraints from CSS, and calls layout_flow. The result
is wrapped in a CachedInlineLayout and stored on the IFC root's LayoutNode
warm slab so the Layout solver can hit-test, select, and
re-render without re-shaping.
The resource side — font discovery, parsing, fallback chain resolution — is covered in Text Pipeline. This page is the in-engine shaping and layout pipeline.
The 5-stage pipeline
TextShapingCache::layout_flow is the top-level entry. Each stage is
independently cached:
InlineContent ──Stage 1─▶ LogicalItem
                          (per-char attribution)
                              │
                              ▼ Stage 2
                          VisualItem (BiDi reorder, UAX #9)
                              │
                              ▼ Stage 3
                          ShapedItem (HarfBuzz/allsorts; per-item cache)
                              │
                              ▼ Stage 4
                          ShapedItem' (text-orientation rotate for vertical-rl/lr)
                              │
                              ▼ Stage 5
                          PositionedItem in UnifiedLayout
                          (Knuth–Plass lines + final placement)
Stages 1–4 are independent of geometry; stage 5 takes a flow_chain: &[LayoutFragment] so the same shaped content can re-flow across columns or
pages without re-shaping.
pub fn layout_flow<T: ParsedFontTrait>(
    &mut self,
    content: &[InlineContent],
    style_overrides: &[StyleOverride],
    flow_chain: &[LayoutFragment],
    font_chain_cache: &HashMap<FontChainKey, FontFallbackChain>,
    fc_cache: &FcFontCache,
    loaded_fonts: &LoadedFonts<T>,
    debug_messages: &mut Option<Vec<LayoutDebugMessage>>,
) -> Result<FlowLayout, LayoutError>;
Caching architecture
TextShapingCache holds four maps covering Stages 1–3 (Stage 3 has two):
- logical_items caches Stage 1. Keyed by the CacheId (a u64) of &[InlineContent], value Arc<Vec<LogicalItem>>.
- visual_items caches Stage 2. Keyed by (logical_items_id, base_direction), value Arc<Vec<VisualItem>>.
- shaped_items caches Stage 3 (monolithic). Keyed by (visual_items_id, style_hash), value Arc<Vec<ShapedItem>>.
- per_item_shaped caches Stage 3 (incremental). Keyed by hash(text, bidi_level, script, style.layout_hash()), value Arc<PerItemShapedEntry>.
Stage 3 has two levels: a fast monolithic cache hit returns the entire
Vec<ShapedItem> if the visual-items + style hashes match. On a miss,
shape_visual_items_with_per_item_cache reuses individual cached items
per-key (keyed on text + bidi level + script + layout-affecting style) and
only re-shapes new items. Eviction runs every layout pass via
begin_generation:
pub fn begin_generation(&mut self) {
    if self.generation > 0 && !self.per_item_accessed.is_empty() {
        let accessed = &self.per_item_accessed;
        self.per_item_shaped.retain(|k, _| accessed.contains(k));
    }
    self.per_item_accessed.clear();
    self.generation += 1;
}
The cap is PER_ITEM_CACHE_MAX = 4096; exceeding it forces a generation
flush early.
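The eviction rule can be tried out in isolation. The following is a minimal sketch, not the real TextShapingCache: ToyPerItemCache, its String values, and get_or_insert_with are hypothetical stand-ins, and the PER_ITEM_CACHE_MAX cap is left out. Only the body of begin_generation mirrors the listing above.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the per-item shaping cache.
pub struct ToyPerItemCache {
    per_item_shaped: HashMap<u64, String>,
    per_item_accessed: HashSet<u64>,
    generation: u64,
}

impl ToyPerItemCache {
    pub fn new() -> Self {
        Self {
            per_item_shaped: HashMap::new(),
            per_item_accessed: HashSet::new(),
            generation: 0,
        }
    }

    /// Same logic as the listing above: keep only entries touched
    /// during the previous layout pass.
    pub fn begin_generation(&mut self) {
        if self.generation > 0 && !self.per_item_accessed.is_empty() {
            let accessed = &self.per_item_accessed;
            self.per_item_shaped.retain(|k, _| accessed.contains(k));
        }
        self.per_item_accessed.clear();
        self.generation += 1;
    }

    /// Returns the cached value and `true` on a hit; otherwise calls
    /// `shape`, stores the result, and returns it with `false`.
    pub fn get_or_insert_with(
        &mut self,
        key: u64,
        shape: impl FnOnce() -> String,
    ) -> (String, bool) {
        self.per_item_accessed.insert(key);
        if let Some(v) = self.per_item_shaped.get(&key) {
            return (v.clone(), true);
        }
        let v = shape();
        self.per_item_shaped.insert(key, v.clone());
        (v, false)
    }

    pub fn len(&self) -> usize {
        self.per_item_shaped.len()
    }
}
```

In this sketch an entry survives a pass only if that pass looked it up. A pass that touches nothing evicts nothing, because of the is_empty guard.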
InlineContent and LogicalItem
InlineContent is the externally-visible inline-level "atom":
pub enum InlineContent {
    Text(StyledRun),
    Image(InlineImage),
    Space(SpaceConfig),
    LineBreak(LineBreakConfig),
    Tab { style: Arc<StyleProperties> },
    Marker { run: StyledRun, position_outside: bool },
    Shape(InlineShape),
    Ruby { base: Vec<InlineContent>, text: Vec<InlineContent>, style: Arc<StyleProperties> },
}
StyledRun carries a String plus an Arc<StyleProperties> (font selectors,
size, weight, decoration, color). Arc makes per-item cache entries cheap
to share between similar runs.
Stage 1 (create_logical_items) splits Text runs by script boundaries,
applies style_overrides (per-character style changes for selection, IME
preedit, search highlighting), and tags each LogicalItem with the source
span and style.
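To make the script-boundary split concrete, here is a rough sketch. It is not create_logical_items: ToyScript, script_of, and split_by_script are hypothetical, the classifier only knows two Unicode blocks, and attaching neutral characters to the preceding run is an assumption made for this illustration.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyScript {
    Latin,
    Hebrew,
    Arabic,
    Common,
}

/// Crude classifier (assumption): two blocks are recognised, every other
/// alphabetic character is lumped into `Latin`, the rest is `Common`.
fn script_of(c: char) -> ToyScript {
    match c as u32 {
        0x0590..=0x05FF => ToyScript::Hebrew,
        0x0600..=0x06FF => ToyScript::Arabic,
        _ if c.is_alphabetic() => ToyScript::Latin,
        _ => ToyScript::Common,
    }
}

/// Splits `text` into byte ranges that each hold a single script.
/// `Common` characters (spaces, digits, punctuation) join the current run.
pub fn split_by_script(text: &str) -> Vec<(ToyScript, Range<usize>)> {
    let mut runs: Vec<(ToyScript, Range<usize>)> = Vec::new();
    for (i, c) in text.char_indices() {
        let end = i + c.len_utf8();
        let script = script_of(c);
        if let Some((current, range)) = runs.last_mut() {
            // A run that so far holds only Common characters adopts
            // the first real script it meets.
            if *current == ToyScript::Common {
                *current = script;
            }
            if script == ToyScript::Common || script == *current {
                range.end = end;
                continue;
            }
        }
        runs.push((script, i..end));
    }
    runs
}
```

Each returned range could then become one logical item carrying its source span, which is the role the text describes for LogicalItem.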
BiDi (Stage 2)
reorder_logical_items runs Unicode BiDi (UAX #9) using the unicode-bidi
crate. The base direction comes from CSS direction, except when
unicode-bidi: plaintext is set:
let base_direction = if unicode_bidi_val == UnicodeBidi::Plaintext {
    let has_strong = logical_items.iter().any(|item| {
        if let LogicalItem::Text { text, .. } = item {
            matches!(unicode_bidi::get_base_direction(text.as_str()),
                     Direction::Ltr | Direction::Rtl)
        } else { false }
    });
    if has_strong { get_base_direction_from_logical(&logical_items) }
    else { first_constraints.direction.unwrap_or(BidiDirection::Ltr) }
} else {
    first_constraints.direction.unwrap_or(BidiDirection::Ltr)
};
CSS Writing Modes § 8.3: plaintext auto-detects from the first strong
character; empty paragraphs fall back to the containing block's direction.
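The decision can be sketched without the unicode-bidi crate. This is a simplification, not the engine's code: Dir, strong_direction, and resolve_base_direction are hypothetical names, and the strong-character test is a rough range check rather than the full UAX #9 bidi-class lookup.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Dir {
    Ltr,
    Rtl,
}

/// Rough strong-direction test (assumption): alphabetic characters in a few
/// right-to-left blocks count as RTL, all other alphabetic characters as LTR,
/// and everything else (digits, spaces, punctuation) is neutral.
fn strong_direction(c: char) -> Option<Dir> {
    if !c.is_alphabetic() {
        return None;
    }
    match c as u32 {
        0x0590..=0x08FF | 0xFB1D..=0xFDFF | 0xFE70..=0xFEFF => Some(Dir::Rtl),
        _ => Some(Dir::Ltr),
    }
}

/// With `plaintext` set, the first strong character decides; a paragraph
/// without strong characters falls back to the CSS `direction`.
/// Without `plaintext`, the CSS `direction` is used directly.
pub fn resolve_base_direction(
    paragraph_texts: &[&str],
    css_direction: Option<Dir>,
    plaintext: bool,
) -> Dir {
    let fallback = css_direction.unwrap_or(Dir::Ltr);
    if !plaintext {
        return fallback;
    }
    paragraph_texts
        .iter()
        .flat_map(|t| t.chars())
        .find_map(strong_direction)
        .unwrap_or(fallback)
}
```

A paragraph of digits and punctuation only has no strong character, so it takes the fallback branch just like an empty paragraph.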
Shaping (Stage 3)
shape_visual_items and shape_visual_items_with_per_item_cache drive the
shaper through the ParsedFontTrait abstraction. The default implementation
uses allsorts for OpenType shaping
with HarfBuzz-equivalent ligatures, kerning, contextual forms, and complex
script support.
Font fallback: shaping a cluster goes through a FontFallbackChain resolved
from the cluster's script + style. Each fallback level is checked for
codepoint coverage; the first font that covers all codepoints in the cluster
wins. The fallback chain is built once per (font-family, weight, style)
stack by collect_and_resolve_font_chains_with_registration and cached on
FontManager.font_chain_cache. The full resolution pipeline is described in
Text Pipeline.
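The "first font that covers all codepoints wins" rule is easy to illustrate. The sketch below assumes a chain is an ordered slice and coverage is a plain set of characters. ToyFont, toy_font, and pick_font are hypothetical and do not reflect the FontFallbackChain API.

```rust
use std::collections::HashSet;

/// Hypothetical font record: a name plus the characters it can render.
pub struct ToyFont {
    pub name: &'static str,
    pub coverage: HashSet<char>,
}

/// Convenience constructor for the sketch.
pub fn toy_font(name: &'static str, chars: &str) -> ToyFont {
    ToyFont { name, coverage: chars.chars().collect() }
}

/// Walks the chain in priority order and returns the first font that
/// covers every codepoint of the cluster, or `None` if no font does.
pub fn pick_font<'a>(chain: &'a [ToyFont], cluster: &str) -> Option<&'a ToyFont> {
    chain
        .iter()
        .find(|font| cluster.chars().all(|c| font.coverage.contains(&c)))
}
```

Checking the whole cluster rather than single codepoints matters for sequences such as a base letter plus a combining mark: a font that has only the base letter is skipped, so the two parts are never rendered from different fonts.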
ShapedItem variants:
pub enum ShapedItem {
    Cluster(ShapedCluster),
    Object { ... },
    CombinedBlock { ... },
    Tab { ... },
    Break { ... },
}
ShapedCluster.source_node_id: Option<NodeId> lets selection and editing map
glyph runs back to their source DOM node. Object and other generated items
lack a direct source_node_id; the IFC's ContentIndex mapping recovers it.
Text-orientation transform (Stage 4)
For writing-mode: vertical-rl/vertical-lr and text-orientation: upright | sideways | mixed, glyph clusters are rotated and offset before line
breaking. The transform uses constraints from the first fragment only;
multi-fragment flows with mixed writing modes are noted as a TODO in
text3/cache.rs.
Line breaking and flow (Stage 5)
text3/knuth_plass.rs implements Knuth–Plass total-fit line breaking. The
breaker walks ShapedItems, accumulating "boxes" (clusters) and "glue"
(spaces), then minimises a total-badness metric across all line-break
combinations. Tightness, looseness, and text-wrap: balance are all knobs
in the badness function.
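The total-fit idea can be shown with a much smaller model than the real breaker. The sketch below is an assumption-laden simplification: boxes are fixed integer widths, glue is a constant space with no stretch or shrink, and "badness" is replaced by the squared leftover space on every line except the last. total_fit_breaks is a hypothetical name, not a function from text3/knuth_plass.rs.

```rust
/// Returns the index of the first word on each line, chosen so that the sum
/// of squared leftover space over all lines except the last is minimal.
/// Returns `None` if some word is wider than `line_width`.
pub fn total_fit_breaks(widths: &[u32], space: u32, line_width: u32) -> Option<Vec<usize>> {
    let n = widths.len();
    // best[i]: minimal cost of laying out words i..n.
    // next[i]: index of the first word of the line after the one starting at i.
    let mut best = vec![u64::MAX; n + 1];
    let mut next = vec![n; n + 1];
    best[n] = 0;

    for i in (0..n).rev() {
        let mut used = 0u32;
        for j in i..n {
            // Width of a line holding words i..=j, with one space between words.
            used += widths[j] + if j > i { space } else { 0 };
            if used > line_width {
                break;
            }
            if best[j + 1] == u64::MAX {
                continue; // the remainder cannot be laid out from here
            }
            let slack = u64::from(line_width - used);
            // The last line is free: trailing space there is not penalised.
            let line_cost = if j + 1 == n { 0 } else { slack * slack };
            let cost = line_cost + best[j + 1];
            if cost < best[i] {
                best[i] = cost;
                next[i] = j + 1;
            }
        }
    }

    if best[0] == u64::MAX {
        return None;
    }
    let mut starts = Vec::new();
    let mut i = 0;
    while i < n {
        starts.push(i);
        i = next[i];
    }
    Some(starts)
}
```

For widths [3, 2, 2, 5] with a space of 1 and a line width of 6, a greedy breaker would produce lines {3, 2}, {2}, {5} at a cost of 16, while this function picks {3}, {2, 2}, {5} at a cost of 10. That difference is the point of optimising over all break combinations at once.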
perform_fragment_layout runs once per LayoutFragment (one fragment per
column or per page). A BreakCursor tracks where the previous fragment
stopped; the next fragment picks up from that cursor. This is how
multi-column and paged inline layout works without re-shaping.
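A cursor-driven fragment loop might look roughly like this. It is a sketch under stated assumptions: ToyBreakCursor, ToyFragment, layout_fragment, and layout_chain are hypothetical, items are reduced to integer widths, and a greedy first-fit breaker stands in for Knuth–Plass to keep the example short.

```rust
use std::ops::Range;

/// Hypothetical cursor: index of the next item that has not been placed yet.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct ToyBreakCursor {
    pub next_item: usize,
}

/// Hypothetical fragment: one column or page with a line budget.
pub struct ToyFragment {
    pub width: u32,
    pub max_lines: usize,
}

/// Fills one fragment starting at `cursor`. Returns the lines as ranges of
/// item indices plus the cursor from which the next fragment continues.
pub fn layout_fragment(
    widths: &[u32],
    fragment: &ToyFragment,
    cursor: ToyBreakCursor,
) -> (Vec<Range<usize>>, ToyBreakCursor) {
    let mut lines = Vec::new();
    let mut i = cursor.next_item;
    while i < widths.len() && lines.len() < fragment.max_lines {
        let start = i;
        let mut used = 0;
        // Always place at least one item per line so that an over-wide
        // item cannot stall the loop.
        while i < widths.len() && (i == start || used + widths[i] <= fragment.width) {
            used += widths[i];
            i += 1;
        }
        lines.push(start..i);
    }
    (lines, ToyBreakCursor { next_item: i })
}

/// Runs every fragment in order, threading the cursor through the chain.
pub fn layout_chain(widths: &[u32], chain: &[ToyFragment]) -> Vec<Vec<Range<usize>>> {
    let mut cursor = ToyBreakCursor::default();
    chain
        .iter()
        .map(|fragment| {
            let (lines, next) = layout_fragment(widths, fragment, cursor);
            cursor = next;
            lines
        })
        .collect()
}
```

The item widths are computed once and only read here, which is the property that lets the same shaped content flow through columns of different widths.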
UnifiedLayout is the output:
pub struct UnifiedLayout {
    pub items: Vec<PositionedItem>,
    pub bounds: LogicalRect,
    pub line_count: usize,
    pub baseline_offsets: Vec<f32>,
    // ...
}
pub struct PositionedItem {
    pub item: ShapedItem,
    pub position: LogicalPosition,
    pub line_index: u32,
    pub bidi_level: u8,
    // ...
}
UnifiedLayout is wrapped in Arc and stored on the IFC root's
LayoutNode.warm.inline_layout_result: Option<Arc<CachedInlineLayout>> (see
Layout).
UnifiedConstraints
The full per-IFC layout input. Built by layout_ifc from CSS getters
on the IFC root:
pub struct UnifiedConstraints {
    pub shape_boundaries: Vec<ShapeBoundary>,
    pub shape_exclusions: Vec<ShapeBoundary>,
    pub available_width: AvailableSpace,
    pub available_height: Option<f32>,
    pub writing_mode: Option<WritingMode>,
    pub direction: Option<BidiDirection>,
    pub text_orientation: TextOrientation,
    pub text_align: TextAlign,
    pub text_justify: JustifyContent,
    pub line_height: LineHeight,
    pub vertical_align: VerticalAlign,
    pub strut_ascent: f32,
    pub strut_descent: f32,
    pub strut_x_height: f32,
    pub ch_width: f32,
    pub overflow: OverflowBehavior,
    pub segment_alignment: SegmentAlignment,
    pub text_combine_upright: Option<TextCombineUpright>,
    pub exclusion_margin: f32,
    pub hyphenation: Hyphens,
    pub hyphenation_language: Option<Language>,
    pub text_indent: f32,
    pub text_indent_each_line: bool,
    pub text_indent_hanging: bool,
    pub initial_letter: Option<InitialLetter>,
    pub line_clamp: Option<NonZeroUsize>,
    pub text_wrap: TextWrap,
    pub columns: u32,
    pub column_gap: f32,
    pub hanging_punctuation: bool,
    pub overflow_wrap: OverflowWrap,
    pub text_align_last: TextAlign,
    pub word_break: WordBreak,
    pub white_space_mode: WhiteSpaceMode,
    pub line_break: LineBreakStrictness,
    pub unicode_bidi: UnicodeBidi,
}
available_width: AvailableSpace is the cache-validity key. A layout shaped
under MinContent cannot be reused for Definite(actual_column_width) —
the line breaks would be at the wrong positions. This was the root cause of
the table-cell width bug fixed by storing constraints alongside the layout
in CachedInlineLayout. AvailableSpace::default() returns MaxContent,
never Definite(0.0) — a zero-width container would make every word
overflow to its own line.
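The validity check can be pictured as follows. This is a sketch only: the three variants and the MaxContent default come from the text above, while the enum layout, ToyCachedLayout, is_valid_for, and comparing Definite widths after rounding are assumptions made for this illustration.

```rust
/// Local mirror of the width modes named in the text (layout assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AvailableSpace {
    Definite(f32),
    MinContent,
    MaxContent,
}

impl Default for AvailableSpace {
    /// MaxContent rather than Definite(0.0): a zero-width default would put
    /// every word on its own line.
    fn default() -> Self {
        AvailableSpace::MaxContent
    }
}

/// Hypothetical cached result that remembers the width it was laid out for.
pub struct ToyCachedLayout {
    pub available_width: AvailableSpace,
    pub line_count: usize,
}

impl ToyCachedLayout {
    /// A cached layout is reusable only for the same sizing mode.
    /// Definite widths are compared after rounding (assumption).
    pub fn is_valid_for(&self, requested: AvailableSpace) -> bool {
        match (self.available_width, requested) {
            (AvailableSpace::Definite(a), AvailableSpace::Definite(b)) => a.round() == b.round(),
            (AvailableSpace::MinContent, AvailableSpace::MinContent) => true,
            (AvailableSpace::MaxContent, AvailableSpace::MaxContent) => true,
            _ => false,
        }
    }
}
```

A layout measured under MinContent during table-cell sizing is rejected when the cell later asks for Definite(200.0), which forces the line breaks to be recomputed for the real column width.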
PartialEq on UnifiedConstraints uses round_eq for floats so jitter
from CSS recomputation does not invalidate the cache. Hash uses f.round() as usize for the same reason.
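The pairing of a rounded PartialEq with a rounded Hash can be sketched on a three-field struct. ToyConstraints and hash_of are hypothetical, and the body of round_eq is an assumption: the text names the helper but does not show it.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cut-down constraints struct with two float fields.
#[derive(Debug, Clone, Copy)]
pub struct ToyConstraints {
    pub text_indent: f32,
    pub column_gap: f32,
    pub columns: u32,
}

/// Assumed definition: two floats are equal if they round to the same value.
fn round_eq(a: f32, b: f32) -> bool {
    a.round() == b.round()
}

impl PartialEq for ToyConstraints {
    fn eq(&self, other: &Self) -> bool {
        round_eq(self.text_indent, other.text_indent)
            && round_eq(self.column_gap, other.column_gap)
            && self.columns == other.columns
    }
}

impl Hash for ToyConstraints {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the same rounded values that `eq` compares, so that values
        // which compare equal always hash equal.
        (self.text_indent.round() as usize).hash(state);
        (self.column_gap.round() as usize).hash(state);
        self.columns.hash(state);
    }
}

/// Helper for the sketch: hashes one value with the default hasher.
pub fn hash_of(c: &ToyConstraints) -> u64 {
    let mut hasher = DefaultHasher::new();
    c.hash(&mut hasher);
    hasher.finish()
}
```

Rounding in both places keeps the two implementations consistent. A tolerance-based comparison would not offer that: 0.49 and 0.51 are close but round to different integers, so they could compare equal while hashing differently.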
FontManager and the font chain cache
FontManager<T> is parameterised over the parsed-font type (FontRef for
production, MockFont for tests).
pub struct FontManager<T> {
    pub fc_cache: FcFontCache,
    pub parsed_fonts: Arc<Mutex<HashMap<FontId, T>>>,
    pub font_chain_cache: HashMap<FontChainKey, FontFallbackChain>,
    pub embedded_fonts: Mutex<HashMap<u64, FontRef>>,
    pub font_hash_to_families: HashMap<u64, StyleFontFamilyVec>,
    pub registry: Option<Arc<FcFontRegistry>>,
    pub last_resolved_font_stacks_sig: Option<u64>,
}
fc_cache is a rust-fontconfig v4.1 shared handle (internally
Arc<RwLock>); cloning is cheap and builder-thread writes are immediately
visible. registry is the optional scout-on-demand path: when present,
chain resolution lazy-parses families the DOM needs, dropping peak RSS by
the common-stack metadata size (~15 MiB on macOS) for headless renders that
don't touch every system font.
last_resolved_font_stacks_sig is the rolling-hash signature of
compact_cache.prev_font_hashes at the moment the chain cache was last
populated. LayoutWindow.layout_dom_recursive reads this to skip the
resolver when the DOM's font stacks haven't changed since the last
successful resolution.
FontContext is the application-wide shared font state — owned by App.
FontManager is the per-window one — owned by LayoutWindow. They share
the same parsed_fonts Arc. FontContext::pre_resolve_chains_for_dom is the
warmup hook: a headless renderer or PDF generator can pre-resolve all font
chains for a DOM before the first layout, avoiding a layout-time spike. The
function uses scripts_present_in_styled_dom to limit Unicode-fallback
fonts to the scripts actually present — for an ASCII-only page, this skips
the ~300 MiB Arial-Unicode / CJK / Arabic pull-in entirely.
Hyphenation
Behind feature = "text_layout_hyphenation". Uses the
hyphenation crate with TeX
patterns. Languages are loaded lazily; each UnifiedConstraints carries
hyphenation: Hyphens (Auto/None/Manual) and hyphenation_language: Option<Language>. Stage 5 inserts soft-hyphen break opportunities into the
Knuth–Plass break list before line breaking.
When the feature is off, text3::cache::Standard becomes a no-op stub
returning empty breaks, so the rest of the pipeline compiles unchanged.
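How the three Hyphens modes could feed break opportunities to the line breaker is sketched below. None of this is the text3 API: hyphen_breaks and no_op_patterns are hypothetical, the pattern dictionary is passed in as a closure instead of using the hyphenation crate, and the mode semantics follow the CSS hyphens property (none suppresses all hyphenation, manual honours only explicit soft hyphens, auto adds dictionary breaks).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hyphens {
    None,
    Manual,
    Auto,
}

pub const SOFT_HYPHEN: char = '\u{00AD}';

/// Returns the byte offsets in `word` at which a hyphenated break is allowed.
/// `auto_patterns` stands in for a pattern-dictionary lookup (hypothetical).
pub fn hyphen_breaks(
    word: &str,
    mode: Hyphens,
    auto_patterns: &dyn Fn(&str) -> Vec<usize>,
) -> Vec<usize> {
    // Explicit opportunities: the position right after each U+00AD.
    let manual = || {
        word.char_indices()
            .filter(|(_, c)| *c == SOFT_HYPHEN)
            .map(|(i, c)| i + c.len_utf8())
            .collect::<Vec<usize>>()
    };
    match mode {
        Hyphens::None => Vec::new(),
        Hyphens::Manual => manual(),
        Hyphens::Auto => {
            let mut breaks = manual();
            breaks.extend(auto_patterns(word));
            breaks.sort_unstable();
            breaks.dedup();
            breaks
        }
    }
}

/// Stand-in for the feature-off stub: never proposes a break.
pub fn no_op_patterns(_word: &str) -> Vec<usize> {
    Vec::new()
}
```

Passing no_op_patterns models the build without the feature: Auto then degrades to the explicit soft hyphens only, and callers do not need a separate code path.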
Selection
Selection types live alongside text3 and in azul-core:
- TextCursor { cluster_id: GraphemeClusterId, affinity: CursorAffinity }: locates a cursor between two grapheme clusters, with affinity choosing the visual side at line breaks.
- SelectionRange { anchor, focus }: the same TextCursor type at both ends.
- ContentIndex: a (run_index, cluster_offset) pair indexed against a UnifiedLayout. Maps cleanly to a (NodeId, byte_offset) via ShapedCluster.source_node_id.
hit_test_cursor_position(layout, point) returns the TextCursor at a
screen position. cursor_to_pixel_position(layout, cursor) is the inverse,
used to draw the caret. Both walk layout.items in source order.
Editing
text3/edit.rs operates directly against UnifiedLayout:
- apply_text_changeset(&mut layout, changeset) mutates the items vec for a stream of inserts/deletes given as cluster-indexed operations.
- recompute_line_breaks(&mut layout, available_width) reruns Knuth–Plass over the modified items without re-shaping unaffected clusters.
This is the fast path used by LayoutWindow::try_incremental_text_relayout
for keystroke-by-keystroke text edits. It bypasses
solver3::layout_document entirely when the IFC's height does not change.
If the height changes (e.g. the line wraps), the path falls back to a normal
layout_document call so the BFC parent can reposition siblings.
DirtyTextNode holds the in-progress Vec<InlineContent> for an edited
text node before it's committed back into the DOM:
pub struct DirtyTextNode {
    pub content: Vec<InlineContent>,
    pub cursor: Option<TextCursor>,
    pub needs_ancestor_relayout: bool,
}
needs_ancestor_relayout = true means the IFC's height changed and the
parent BFC needs to re-flow.
IME preedit injection
LayoutWindow.pre_preedit_content: Option<Vec<InlineContent>> stores a
snapshot of the pre-edit inline content. When IME preedit text changes (e.g.
during CJK composition), the renderer injects the preedit text into a clean
copy of the original content, preserving the user's existing input. Without
the snapshot, repeated setMarkedText calls would accumulate stale
preedits.
LayoutContext.preedit_text: Option<String> is the per-render preedit
string. cursor_locations: Vec<(DomId, NodeId, TextCursor)> carries
multi-cursor positions for both visible cursors and preedit anchors.
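The reason for the snapshot can be seen in a few lines. This is a toy model, not LayoutWindow: ToyImeState and its methods are hypothetical, and content is a plain String instead of Vec<InlineContent>.

```rust
/// Hypothetical IME state holding the snapshot taken before composition.
pub struct ToyImeState {
    pre_preedit_content: Option<String>,
}

impl ToyImeState {
    pub fn new() -> Self {
        Self { pre_preedit_content: None }
    }

    /// Called on every preedit update. `displayed` is whatever is currently
    /// shown, which may already contain an earlier preedit string.
    /// `cursor_byte` must lie on a char boundary of the snapshot.
    pub fn set_marked_text(&mut self, displayed: &str, cursor_byte: usize, preedit: &str) -> String {
        // Take the snapshot only on the first update of a composition.
        let snapshot = self
            .pre_preedit_content
            .get_or_insert_with(|| displayed.to_string());
        // Always inject into a clean copy of the snapshot.
        let mut out = snapshot.clone();
        out.insert_str(cursor_byte, preedit);
        out
    }

    /// Composition finished or was cancelled: drop the snapshot.
    pub fn commit(&mut self) {
        self.pre_preedit_content = None;
    }
}
```

With committed text "ab" and the cursor at byte 1, the updates "k" and then "ka" yield "akb" and "akab". Injecting the second update into the displayed text instead of the snapshot would give "akakb", which is the accumulation the snapshot prevents.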
Layout-vs-render style equivalence
StyleProperties::layout_eq compares only the fields that affect glyph
positions (font, size, letter-spacing, word-spacing). Color, decoration,
background, and shadow are not compared. TextShapingCache::use_old_layout
uses this to decide whether a cached layout can be reused when constraints
plus content match but rendering-only properties changed:
pub fn use_old_layout(
    old_constraints: &UnifiedConstraints,
    new_constraints: &UnifiedConstraints,
    old_content: &[InlineContent],
    new_content: &[InlineContent],
) -> bool;
A pure color change on a paragraph thus keeps the same UnifiedLayout and
only triggers display-list regeneration.
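The split between layout-affecting and render-only fields can be sketched like this. ToyStyle, StyleChange, and classify_change are hypothetical, and the field list is a cut-down version of what the text names. Only the idea of layout_eq comes from the source.

```rust
/// Hypothetical cut-down style record.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyStyle {
    // Fields that move glyphs.
    pub font_family: String,
    pub font_size: f32,
    pub letter_spacing: f32,
    pub word_spacing: f32,
    // Fields that only change how glyphs are painted.
    pub color: [u8; 4],
    pub underline: bool,
}

impl ToyStyle {
    /// Compares only the fields that affect glyph positions.
    pub fn layout_eq(&self, other: &Self) -> bool {
        self.font_family == other.font_family
            && self.font_size == other.font_size
            && self.letter_spacing == other.letter_spacing
            && self.word_spacing == other.word_spacing
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum StyleChange {
    /// Nothing changed.
    Unchanged,
    /// Keep the cached layout, regenerate the display list only.
    RepaintOnly,
    /// Glyph positions may differ: shape and lay out again.
    Relayout,
}

pub fn classify_change(old: &ToyStyle, new: &ToyStyle) -> StyleChange {
    if old == new {
        StyleChange::Unchanged
    } else if old.layout_eq(new) {
        StyleChange::RepaintOnly
    } else {
        StyleChange::Relayout
    }
}
```

Full equality is checked first so that an untouched style skips the repaint as well. Only a difference confined to color or decoration lands in RepaintOnly.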
The IFC call site
layout_ifc is the bridge from box layout to text layout. It:
- Resolves the IFC root's DOM ID (anonymous boxes inherit from the parent or the first child with a DOM ID).
- Walks the IFC tree to collect Vec<InlineContent> and a child_map: BTreeMap<NodeIndex, ContentRange> so glyph clusters can be mapped back to layout nodes for hit-testing.
- Checks for a cached CachedInlineLayout with matching constraints. If present and available_width + has_floats match, returns it without re-running stages 1–5.
- Builds UnifiedConstraints from CSS and LayoutConstraints.
- Calls text_cache.layout_flow.
- Builds CachedInlineLayout::new_with_constraints and stores it on the IFC root's warm.inline_layout_result.
- Returns a LayoutOutput with the IFC's bounds and per-child positions for inline-blocks.
The first ~80 lines of layout_ifc are the cache-hit fast path; full
execution starts at the "Phase 1: Collect and measure all inline-level
children" comment.
Known gaps vs CSS Inline Layout Module Level 3
- § 3.3 initial-letter (drop caps): types in place, layout not wired.
- § 4 vertical-align: only baseline is fully supported. top, middle, bottom, text-top, text-bottom, super, and sub use approximate offsets; full table-cell/inline-block alignment is incomplete.
- § 6 text-box-trim / leading-trim: not implemented.
- Multi-fragment text orientation (mixed writing modes across columns) uses constraints from the first fragment only.
- Ruby layout: the Ruby variant exists but baseline alignment of base+text is approximate.
Coming Up Next
- Text Pipeline — font discovery, parsing, fallback chains
- Fragmentation — page breaks, widows, orphans, paged media
- Layout — solver3, formatting contexts, the per-frame relayout cycle