Compact Property Cache
Overview
The compact cache is a four-array, fixed-layout encoding of the ~50 layout-hot CSS properties. Layout reads them by node index in O(1), with no BTreeMap lookups and no cascade walks. The cache is built once per restyle by build_compact_cache_with_inheritance (see Cascade, Inheritance, Restyle) and read on every layout pass.
Properties that don't fit (background, box-shadow, transform, filter, content, transitions, ...) live on the slow CssPropertyCache::get_property_slow path and are not duplicated here. The HOT_FLAG_HAS_* and DOM_HAS_* bits are the negative fast path: stay on the slow path, but skip the walk entirely when the property is known unset.
This page documents the four arrays, the bit layout for tier 1, the sentinel encodings, the encoder side that populates them, and the procedure for adding a new compact-cached property.
Memory budget
For a 1000-node DOM:
- tier1_enums: Vec<u64> is 8 B per node, 8 KB for 1000 nodes.
- tier2_dims: Vec<CompactNodeProps> is 68 B per node, 68 KB for 1000 nodes.
- tier2_cold: Vec<CompactNodePropsCold> is 28 B per node, 28 KB for 1000 nodes.
- tier2b_text: Vec<CompactTextProps> is 24 B per node, 24 KB for 1000 nodes.
- Total: 128 B per node, 128 KB for 1000 nodes.
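The budget arithmetic can be sketched as a small helper. The constant and function names below are hypothetical; only the per-node byte counts come from the list above, and the sketch ignores Vec headers and the font maps.

```rust
// Per-node byte sizes as quoted in the memory budget (not measured here).
const TIER1_BYTES: usize = 8;
const TIER2_HOT_BYTES: usize = 68;
const TIER2_COLD_BYTES: usize = 28;
const TIER2B_TEXT_BYTES: usize = 24;

/// Bytes used per node across the four per-node arrays.
const fn bytes_per_node() -> usize {
    TIER1_BYTES + TIER2_HOT_BYTES + TIER2_COLD_BYTES + TIER2B_TEXT_BYTES
}

/// Payload bytes for a DOM of `node_count` nodes (hypothetical helper).
fn cache_bytes(node_count: usize) -> usize {
    bytes_per_node() * node_count
}
```

Under these numbers the cost grows linearly with node count, so a 10,000-node DOM would need roughly ten times the 1000-node figure.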
CompactLayoutCache
pub struct CompactLayoutCache {
    pub tier1_enums: Vec<u64>,
    pub tier2_dims: Vec<CompactNodeProps>,
    pub tier2_cold: Vec<CompactNodePropsCold>,
    pub tier2b_text: Vec<CompactTextProps>,
    pub font_dirty_nodes: Vec<usize>,
    pub prev_font_hashes: Vec<u64>,
    pub font_hash_to_families: BTreeMap<u64, Vec<StyleFontFamily>>,
    // …a few DOM-level flags described below
}
All four per-node arrays have length node_count and are indexed by NodeId::index(). Allocation happens in CompactLayoutCache::with_capacity(n) once per restyle.
font_dirty_nodes lists indices whose font_family_hash differs from prev_font_hashes. See Cascade, Inheritance, Restyle for how that drives incremental font resolution.
Tier 1: all enums in one u64
Every enum-valued layout property fits in a few bits. Tier 1 packs the 25 fields listed below, plus a populated flag, into a single u64:
[4:0] display 5 bits (22 variants)
[7:5] position 3 bits
[9:8] float 2 bits
[12:10] overflow_x 3 bits
[15:13] overflow_y 3 bits
[16] box_sizing 1 bit
[18:17] flex_direction 2 bits
[20:19] flex_wrap 2 bits
[23:21] justify_content 3 bits
[26:24] align_items 3 bits
[29:27] align_content 3 bits
[31:30] writing_mode 2 bits
[33:32] clear 2 bits
[37:34] font_weight 4 bits
[39:38] font_style 2 bits
[42:40] text_align 3 bits
[44:43] visibility 2 bits
[47:45] white_space 3 bits
[48] direction 1 bit
[51:49] vertical_align 3 bits
[52] border_collapse 1 bit
[55:53] align_self 3 bits
[58:56] justify_self 3 bits
[60:59] grid_auto_flow 2 bits
[62:61] justify_items 2 bits
[63] TIER1_POPULATED 1 bit
Bit 63 is a "this node has tier-1 data" flag. The bit layout treats 0 as "all defaults" (Display::Block, Position::Static, etc.), so a tier1_enums[i] whose field bits [62:0] are all zero but whose bit 63 is set is semantically the same as a fresh Default::default() node. A word that is exactly 0 (bit 63 clear) means "not yet populated" and forces a slow-path lookup.
encode_tier1(display, position, float, ..., border_collapse) -> u64 is the single producer; one decode_* function per field is the consumer:
#[inline(always)]
pub fn decode_display(t1: u64) -> LayoutDisplay {
    layout_display_from_u8(((t1 >> DISPLAY_SHIFT) & DISPLAY_MASK) as u8)
}
The _from_u8 / _to_u8 pairs are the per-enum codec. They use match rather than transmute so the compiler can prove every input maps to a defined output.
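As an illustration of the shift-and-mask scheme and the populated bit, here is a minimal sketch covering only two fields. The shift and mask values are read off the bit layout above; POSITION_SHIFT, POSITION_MASK, the reduced Position enum, its variant order, and encode_two_fields are assumptions made for the example, not the library's definitions.

```rust
const DISPLAY_SHIFT: u32 = 0;
const DISPLAY_MASK: u64 = 0x1F; // bits [4:0]
const POSITION_SHIFT: u32 = 5; // hypothetical constant name
const POSITION_MASK: u64 = 0x7; // bits [7:5]
const TIER1_POPULATED: u64 = 1 << 63;

// Reduced stand-in enum; the variant order is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Position {
    Static,
    Relative,
    Absolute,
    Fixed,
    Sticky,
}

// match-based codec: every input maps to a defined output.
fn position_to_u8(p: Position) -> u8 {
    match p {
        Position::Static => 0,
        Position::Relative => 1,
        Position::Absolute => 2,
        Position::Fixed => 3,
        Position::Sticky => 4,
    }
}

fn position_from_u8(v: u8) -> Position {
    match v {
        1 => Position::Relative,
        2 => Position::Absolute,
        3 => Position::Fixed,
        4 => Position::Sticky,
        _ => Position::Static, // 0 and unknown codes fall back to the default
    }
}

// Packs a raw display code and a position, and marks the word as populated.
fn encode_two_fields(display_code: u8, position: Position) -> u64 {
    TIER1_POPULATED
        | (((display_code as u64) & DISPLAY_MASK) << DISPLAY_SHIFT)
        | (((position_to_u8(position) as u64) & POSITION_MASK) << POSITION_SHIFT)
}

fn decode_display_code(t1: u64) -> u8 {
    ((t1 >> DISPLAY_SHIFT) & DISPLAY_MASK) as u8
}

fn decode_position(t1: u64) -> Position {
    position_from_u8(((t1 >> POSITION_SHIFT) & POSITION_MASK) as u8)
}

fn is_populated(t1: u64) -> bool {
    (t1 & TIER1_POPULATED) != 0
}
```

With this layout, a node whose fields are all defaults encodes to a word with only bit 63 set, which is how the sketch tells "all defaults" apart from "not yet populated".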
Tier 2 hot: CompactNodeProps
#[repr(C)]
pub struct CompactNodeProps {
    // Dimensions with unit info: u32 MSB-sentinel
    pub width: u32, pub height: u32,
    pub min_width: u32, pub max_width: u32,
    pub min_height: u32, pub max_height: u32,
    pub flex_basis: u32,
    pub font_size: u32,
    // Resolved px × 10: i16 MSB-sentinel
    pub padding_top: i16, pub padding_right: i16,
    pub padding_bottom: i16, pub padding_left: i16,
    pub margin_top: i16, pub margin_right: i16,
    pub margin_bottom: i16, pub margin_left: i16,
    pub border_top_width: i16, /* …4 sides */
    pub top: i16, pub right: i16, pub bottom: i16, pub left: i16,
    // Flex factors × 100: u16 MSB-sentinel
    pub flex_grow: u16,
    pub flex_shrink: u16,
    // Gap × 10: i16
    pub row_gap: i16,
    pub column_gap: i16,
}
68 bytes, layout-critical, accessed in every iteration of the constraint-solving loop. The integer encodings use the top of their unsigned / signed range as sentinels for "value doesn't fit, fall through to slow path".
Tier 2 cold: CompactNodePropsCold
28 bytes of paint-only and rare-but-typed properties: border colors as u32 RGBA, border radii as i16 × 10, z_index, border_styles_packed (4 bits per side), grid placement (grid_col/row_start/end as i16 with I16_AUTO sentinel), tab_size, border_spacing_h/v, opacity (× 254 with 255 = unset).
CompactNodePropsCold also carries two u8 "has-X" flag bytes:
pub hot_flags: u8,
// bit 0: has_transform
// bit 1: has_transform_origin
// bit 2: has_box_shadow
// bit 3: has_text_decoration
// bits 4-5: scrollbar_gutter (auto/stable/both-edges/mirror)
// bit 6: has_background
// bit 7: has_clip_path
pub extra_flags: u8,
// bit 0: has_any_scrollbar_css
// bit 1: has_counter
// bit 2: has_break
// bit 3: has_text_orientation
// bit 4: has_text_shadow
// bit 5: has_backdrop_filter
// bit 6: has_filter
// bit 7: has_mix_blend_mode
These are negative fast paths. When hot_flags & HOT_FLAG_HAS_TRANSFORM == 0, the renderer can skip the slow-cascade walk for transform entirely and use the identity matrix. The bit is set during cascade build only when the node actually declares the property.
CompactLayoutCache itself carries a parallel set of DOM-level flags (DOM_HAS_TEXT_INDENT, DOM_HAS_LINE_HEIGHT, ...). They mark "some node in this DOM declared this property". When clear, the cascade walks for that prop are skipped across the whole DOM.
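The negative fast path amounts to a bit test before any expensive lookup. The sketch below assumes the bit positions from the hot_flags comments above; SCROLLBAR_GUTTER_SHIFT, SCROLLBAR_GUTTER_MASK, and the helper functions are hypothetical names introduced for illustration.

```rust
const HOT_FLAG_HAS_TRANSFORM: u8 = 1 << 0; // bit 0
const HOT_FLAG_HAS_BOX_SHADOW: u8 = 1 << 2; // bit 2
const SCROLLBAR_GUTTER_SHIFT: u8 = 4; // hypothetical name, bits 4-5
const SCROLLBAR_GUTTER_MASK: u8 = 0b11; // hypothetical name

/// True only when the node declared a transform, so the slow walk is needed.
fn has_transform(hot_flags: u8) -> bool {
    (hot_flags & HOT_FLAG_HAS_TRANSFORM) != 0
}

/// Extracts the two-bit scrollbar_gutter code stored in bits 4-5.
fn scrollbar_gutter_bits(hot_flags: u8) -> u8 {
    (hot_flags >> SCROLLBAR_GUTTER_SHIFT) & SCROLLBAR_GUTTER_MASK
}

/// Counts how many nodes would still need a slow transform lookup.
fn count_transform_lookups(all_flags: &[u8]) -> usize {
    all_flags.iter().filter(|f| has_transform(**f)).count()
}
```

In a DOM where few nodes declare a transform, almost every node takes the cheap branch, which is the point of keeping the flag next to the other cold data.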
Tier 2b: CompactTextProps
#[repr(C)]
pub struct CompactTextProps {
    pub text_color: u32, // 0xRRGGBBAA
    pub font_family_hash: u64,
    pub line_height: i16, // ×10, I16_SENTINEL = "normal"
    pub letter_spacing: i16,
    pub word_spacing: i16,
    pub text_indent: i16,
}
24 bytes of IFC / text-shaping inputs. The whole struct is inheritable as a unit, so the cascade builder copies tier2b_text[parent] to tier2b_text[child] in one move before running per-node CSS. font_family_hash = 0 is the unset sentinel; the actual font-family list is looked up in font_hash_to_families, deduplicated across nodes.
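The copy-then-override step can be sketched as follows. TextProps is a stand-in for CompactTextProps, and inherit_pre_order, the parents slice, and the own_color overrides are hypothetical; the real builder applies per-node CSS from the cascade rather than from a slice. The sketch relies on pre-order indexing, so a parent index is always smaller than its child's.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct TextProps {
    text_color: u32,
    font_family_hash: u64,
    line_height: i16,
    letter_spacing: i16,
    word_spacing: i16,
    text_indent: i16,
}

/// Builds per-node text props: copy the parent's struct as a unit, then
/// apply the node's own declared color, if any.
/// Assumes pre-order indices (parents[i] < i) and equal slice lengths.
fn inherit_pre_order(
    parents: &[Option<usize>],
    own_color: &[Option<u32>],
    root: TextProps,
) -> Vec<TextProps> {
    let mut out = vec![root; parents.len()];
    for i in 0..parents.len() {
        if let Some(p) = parents[i] {
            out[i] = out[p]; // whole-struct copy from the parent
        }
        if let Some(c) = own_color[i] {
            out[i].text_color = c; // per-node override runs after the copy
        }
    }
    out
}
```

Because the struct is Copy and small, inheriting all text properties costs one struct copy per node regardless of how many of them the node overrides.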
Sentinel encoding
Three encodings, three sentinel schemes:
- u32 for dimensions with unit. Used for width, height, min / max-*, flex-basis, and font-size. Low 4 bits hold SizeMetric, upper 28 hold a signed value times 1000. Sentinels are U32_SENTINEL = 0xFFFFFFFF, U32_AUTO, U32_NONE, U32_INHERIT, U32_INITIAL, U32_MIN_CONTENT, and U32_MAX_CONTENT. The threshold is 0xFFFFFFF9.
- i16 × 10 for resolved px. Used for padding, margin, border-width, top / right / bottom / left, gap, radii, line-height, letter / word-spacing, text-indent, and grid-*. Range is -3276.7 to +3276.3 px (0x7FFB is the largest value below the threshold). Sentinels are I16_SENTINEL = 0x7FFF, I16_AUTO = 0x7FFE, I16_INHERIT = 0x7FFD, and I16_INITIAL = 0x7FFC. The threshold is 0x7FFC.
- u16 × 100 for flex factor. Used for flex-grow and flex-shrink. Range is 0.00 to 655.28. The sentinel is U16_SENTINEL = 0xFFFF. The threshold is 0xFFF9.
Any value at or above the threshold means "doesn't fit; ask the slow path". The getter on CssPropertyCache handles fallbacks.
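A minimal sketch of the i16 × 10 and u16 × 100 schemes, assuming the sentinel values listed above. The function names and the threshold constant names are hypothetical, as are the rounding and the lower clamp at -3276.7 px; the real encode_resolved_px_i16 and encode_flex_u16 may differ in signature and behavior.

```rust
const I16_SENTINEL: i16 = 0x7FFF;
const I16_AUTO: i16 = 0x7FFE;
const I16_THRESHOLD: i16 = 0x7FFC; // hypothetical name
const U16_SENTINEL: u16 = 0xFFFF;
const U16_THRESHOLD: u16 = 0xFFF9; // hypothetical name

/// Encodes px as tenths; anything outside the usable range becomes the sentinel.
fn encode_px_x10(px: f32) -> i16 {
    let scaled = (px * 10.0).round();
    if !scaled.is_finite() || scaled < -32767.0 || scaled >= I16_THRESHOLD as f32 {
        return I16_SENTINEL;
    }
    scaled as i16
}

/// Returns None for any sentinel (value at or above the threshold).
fn decode_px_x10(raw: i16) -> Option<f32> {
    if raw >= I16_THRESHOLD {
        None
    } else {
        Some(raw as f32 / 10.0)
    }
}

/// Encodes a flex factor as hundredths; negative or oversized values become the sentinel.
fn encode_flex_x100(factor: f32) -> u16 {
    let scaled = (factor * 100.0).round();
    if !scaled.is_finite() || scaled < 0.0 || scaled >= U16_THRESHOLD as f32 {
        return U16_SENTINEL;
    }
    scaled as u16
}

fn decode_flex_x100(raw: u16) -> Option<f32> {
    if raw >= U16_THRESHOLD {
        None
    } else {
        Some(raw as f32 / 100.0)
    }
}
```

A single comparison against the threshold is enough to detect every sentinel, which is why the sentinels are clustered at the top of each range.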
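encode_pixel_value_u32 packs the metric into the low 4 bits and the scaled value into the upper 28:
pub fn encode_pixel_value_u32(pv: &PixelValue) -> u32 {
    let metric = size_metric_to_u8(pv.metric) as u32;
    let raw = pv.number.number; // FloatValue is value × 1000
    // The value must fit in 28 signed bits, otherwise fall back to the slow path.
    if !(-134_217_728..=134_217_727).contains(&raw) {
        return U32_SENTINEL;
    }
    // Two's-complement value in bits [31:4], metric in bits [3:0].
    let value_bits = ((raw as i32) as u32) << 4;
    value_bits | metric
}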
The FloatValue × 1000 representation is what makes this work in const context — encoding / decoding uses no floats.
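The inverse direction is not shown on this page, so the following is a sketch of how a decoder could undo the packing, using integers only. encode_u32_sketch, decode_u32_sketch, and U32_THRESHOLD are hypothetical names; the metric is passed as a raw code rather than a SizeMetric.

```rust
const U32_SENTINEL: u32 = 0xFFFF_FFFF;
const U32_THRESHOLD: u32 = 0xFFFF_FFF9; // hypothetical name

/// Packs a value already scaled by 1000 plus a 4-bit metric code.
fn encode_u32_sketch(raw_x1000: i64, metric: u8) -> u32 {
    if !(-134_217_728..=134_217_727).contains(&raw_x1000) {
        return U32_SENTINEL; // does not fit in 28 signed bits
    }
    (((raw_x1000 as i32) as u32) << 4) | ((metric as u32) & 0xF)
}

/// Returns (value × 1000, metric code), or None for any sentinel.
fn decode_u32_sketch(encoded: u32) -> Option<(i32, u8)> {
    if encoded >= U32_THRESHOLD {
        return None;
    }
    let metric = (encoded & 0xF) as u8;
    // Arithmetic right shift on i32 restores the sign of the 28-bit value.
    let value = (encoded as i32) >> 4;
    Some((value, metric))
}
```

One caveat of this sketch: a raw value of -1 combined with a metric code of 9 or higher would land inside the sentinel range. Whether metric codes that high exist is not stated here, so treat that as an open question rather than a known bug.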
Border styles packed into u16
pub fn encode_border_styles_packed(
    top: BorderStyle, right: BorderStyle,
    bottom: BorderStyle, left: BorderStyle,
) -> u16 {
    (border_style_to_u8(top) as u16)
        | ((border_style_to_u8(right) as u16) << 4)
        | ((border_style_to_u8(bottom) as u16) << 8)
        | ((border_style_to_u8(left) as u16) << 12)
}
BorderStyle has 10 variants and fits in 4 bits. The decoders decode_border_top_style / _right_ / _bottom_ / _left_ extract a single side.
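The per-side decoders are not listed here; a plausible shape, working on raw 4-bit style codes instead of BorderStyle, is sketched below. pack_styles and unpack_side are hypothetical names, and the side numbering (0 = top through 3 = left) simply follows the shift order of the encoder above.

```rust
/// Packs four 4-bit style codes in the same order as the encoder above.
fn pack_styles(top: u8, right: u8, bottom: u8, left: u8) -> u16 {
    ((top as u16) & 0xF)
        | (((right as u16) & 0xF) << 4)
        | (((bottom as u16) & 0xF) << 8)
        | (((left as u16) & 0xF) << 12)
}

/// Extracts one side's code: 0 = top, 1 = right, 2 = bottom, 3 = left.
fn unpack_side(packed: u16, side: u32) -> u8 {
    ((packed >> (4 * side)) & 0xF) as u8
}
```

Each decoder is a single shift and mask, so reading one side never touches the other three.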
Color encoding
encode_color_u32(c: &ColorU) -> u32 packs RGBA as 0xRRGGBBAA. 0x00000000 is the unset sentinel, meaning fully-transparent black (rgba(0,0,0,0)) collides with unset. The decoder returns None for 0, and the renderer treats None as "use the parent's text color" or the property's initial value. In practice, fully-transparent black is rare enough that the collision is acceptable; callers who need to distinguish reach for the slow path.
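The collision is easy to see in a small sketch. The ColorU below is a local stand-in with four u8 channels, and the function names carry a _sketch suffix to mark them as illustrative rather than the library's own.

```rust
// Local stand-in for the color type; field names are an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ColorU {
    r: u8,
    g: u8,
    b: u8,
    a: u8,
}

/// Packs the channels as 0xRRGGBBAA.
fn encode_color_sketch(c: &ColorU) -> u32 {
    ((c.r as u32) << 24) | ((c.g as u32) << 16) | ((c.b as u32) << 8) | (c.a as u32)
}

/// Treats 0 as "unset", so rgba(0,0,0,0) cannot round-trip.
fn decode_color_sketch(v: u32) -> Option<ColorU> {
    if v == 0 {
        return None;
    }
    Some(ColorU {
        r: (v >> 24) as u8,
        g: (v >> 16) as u8,
        b: (v >> 8) as u8,
        a: v as u8,
    })
}
```

Opaque black still round-trips because its alpha byte is non-zero; only the fully-transparent black value is lost.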
Reading the cache
The fast-path getters live on CompactLayoutCache. Examples:
let cache: &CompactLayoutCache = ...;
let nid: usize = node_id.index();
// Tier 1: single shift+mask, no branches
let display: LayoutDisplay = decode_display(cache.tier1_enums[nid]);
// Tier 2: direct field
let pad_top_x10: i16 = cache.tier2_dims[nid].padding_top;
// Tier 2: with sentinel handling (get_* yields None when the slot holds a sentinel)
match cache.get_width(nid) {
    Some(pv) => { /* use pv */ }
    None => { /* slow path */ }
}
// Cold flag: short-circuit before the slow walk
if cache.tier2_cold[nid].hot_flags & HOT_FLAG_HAS_TRANSFORM != 0 {
    let transform = cache.get_transform_slow(node_data, nid, state);
}
get_*_raw returns the typed value directly; get_* returns None if the slot holds a sentinel (caller falls back to get_property_slow). The is_*_auto family (is_margin_top_auto, is_grid_col_start_auto, …) is a fast check for the Auto sentinel that doesn't require decoding.
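The fast-path-then-fallback pattern can be sketched with a closure standing in for the slow lookup. px_or_slow, is_auto_i16, and I16_THRESHOLD are hypothetical names; the sentinel values come from the Sentinel encoding section.

```rust
const I16_AUTO: i16 = 0x7FFE;
const I16_THRESHOLD: i16 = 0x7FFC; // hypothetical name

/// Cheap Auto check on the raw slot, no decoding required.
fn is_auto_i16(raw: i16) -> bool {
    raw == I16_AUTO
}

/// Decodes a px × 10 slot, calling `slow` only when the slot holds a sentinel.
fn px_or_slow<F: FnOnce() -> f32>(raw: i16, slow: F) -> f32 {
    if raw >= I16_THRESHOLD {
        slow()
    } else {
        raw as f32 / 10.0
    }
}
```

Passing the slow path as a closure keeps it lazy: for ordinary values the closure is never invoked, so the cost of the cascade walk is paid only for sentinel slots.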
Encoding side
build_compact_cache_with_inheritance populates the four arrays in pre-order. Per-node encoder calls:
result.tier1_enums[i] = encode_tier1(
    display, position, float, overflow_x, overflow_y, box_sizing,
    flex_direction, flex_wrap, justify_content, align_items, align_content,
    writing_mode, clear, font_weight, font_style, text_align,
    visibility, white_space, direction, vertical_align, border_collapse,
);
if let Some(val) = self.get_width(nd, &node_id, &default_state) {
    result.tier2_dims[i].width = encode_layout_width(val);
}
// …
Each get_* call cascades through CssPropertyCache (UA → global → cascaded → inline → user override). The result is an Option<CssPropertyValue<T>>; None leaves the slot at Default::default().
encode_layout_width and friends live in core/src/compact_cache_builder.rs because they need access to the cascade's CssPropertyValue resolution; pure encode helpers (encode_pixel_value_u32, encode_resolved_px_i16, encode_flex_u16, encode_color_u32) are in css/src/compact_cache.rs and have no core dependency.
Defaults
impl Default for CompactNodeProps {
    fn default() -> Self {
        Self {
            width: U32_AUTO, height: U32_AUTO,
            min_width: U32_AUTO, max_width: U32_NONE,
            min_height: U32_AUTO, max_height: U32_NONE,
            flex_basis: U32_AUTO, font_size: U32_INITIAL,
            padding_top: 0, /* …all 4 sides … */
            margin_top: 0, /* …all 4 sides … */
            border_top_width: 0,
            top: I16_AUTO, right: I16_AUTO, bottom: I16_AUTO, left: I16_AUTO,
            flex_grow: 0,
            flex_shrink: encode_flex_u16(1.0), // CSS default
            row_gap: 0,
            column_gap: 0,
        }
    }
}
CompactNodePropsCold::default() uses I16_SENTINEL for radii (no rounded corners → skip slow walk) and I16_AUTO for z_index and grid lines. CompactTextProps::default() uses I16_SENTINEL for line_height to mean "normal".
When the slow path is faster
The compact cache is a worthwhile trade-off because the slow get_property_slow walks five sources per call (user override → inline → cascaded → computed → default). For ~50 properties read on every layout pass on every node, that's hundreds of thousands of walks per frame on a non-trivial DOM. The compact cache turns those into array indexing.
For uncommon properties (transform, box-shadow, filter, content, transitions), the per-frame call count is low enough that the slow path is fine, and adding them to compact tiers would inflate per-node memory without measurable speedup. The HOT_FLAG_HAS_* and DOM_HAS_* bits are the compromise: stay slow-path, but skip the walk entirely when the property is known unset.
Adding a new compact-cached property
- Pick the tier:
  - Tier 1 if the property is an enum with ≤ 8 variants and the bit budget has room (3 spare bits at the top of u64).
  - Tier 2 hot if it's a numeric layout-critical dimension. Use i16 × 10 for resolved px, u32 if you need a unit (em / %, etc.).
  - Tier 2 cold if it's paint-only or rare. Add a HOT_FLAG_HAS_* bit if the value is usually unset.
  - Tier 2b if it's text / IFC and inheritable.
- Add the field to the appropriate struct in css/src/compact_cache.rs. Update Default.
- If new tier-1 bits, add *_SHIFT and *_MASK constants and update encode_tier1 + the matching decode_*.
- Add encode_* and decode_* helpers near the existing ones.
- Add encoder calls in core/src/compact_cache_builder.rs Step 3 and (if it's a global * rule target) Step 2.5.
- Add inheritance handling in Step 1 if the property inherits.
- Add a getter on CompactLayoutCache (get_<prop>_raw, get_<prop>, is_<prop>_auto).
- Update the slow-path fallback in CssPropertyCache::get_property_slow so callers who don't use the compact cache still work.
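To make the getter step concrete, here is a sketch of the three-getter pattern for a made-up i16 × 10 field called example_inset. MiniCache, MiniColdProps, and the field itself are hypothetical and exist only for this example; the default of I16_AUTO is likewise an assumption.

```rust
const I16_AUTO: i16 = 0x7FFE;
const I16_THRESHOLD: i16 = 0x7FFC; // hypothetical name

// Hypothetical per-node struct holding the new field.
#[derive(Debug, Clone, Copy)]
struct MiniColdProps {
    example_inset: i16, // px × 10, I16_AUTO when unset
}

impl Default for MiniColdProps {
    fn default() -> Self {
        Self { example_inset: I16_AUTO }
    }
}

// Hypothetical stand-in for the cache, with one per-node array.
struct MiniCache {
    cold: Vec<MiniColdProps>,
}

impl MiniCache {
    fn with_capacity(n: usize) -> Self {
        Self { cold: vec![MiniColdProps::default(); n] }
    }

    /// Raw slot contents, sentinels included.
    fn get_example_inset_raw(&self, i: usize) -> i16 {
        self.cold[i].example_inset
    }

    /// Decoded px, or None when the slot holds a sentinel.
    fn get_example_inset(&self, i: usize) -> Option<f32> {
        let raw = self.cold[i].example_inset;
        if raw >= I16_THRESHOLD { None } else { Some(raw as f32 / 10.0) }
    }

    /// Auto check that skips decoding.
    fn is_example_inset_auto(&self, i: usize) -> bool {
        self.cold[i].example_inset == I16_AUTO
    }
}
```

Choosing the default so that an untouched slot reads as Auto means nodes that never declare the property need no encoder call at all.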
See also
- Cascade, Inheritance, Restyle: how build_compact_cache_with_inheritance populates these arrays.
- CSS Parser: the source of the CssProperty values that get encoded.
- DOM Internals: NodeData::style (inline Css) is one input to the cascade.
- Styling Subsystem: parent overview of the styling pipeline.
Coming Up Next
- Cascade, Inheritance, Restyle: Selector matching, specificity, and computed values
- Layout Solver: How the encoded values feed the formatting-context engines
- DOM Internals: How the public Dom type is built and stored