Skip to content

perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724

Open
johnml1135 wants to merge 4 commits intomainfrom
001-render-speedup
Open

perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724
johnml1135 wants to merge 4 commits intomainfrom
001-render-speedup

Conversation

@johnml1135
Copy link
Contributor

@johnml1135 johnml1135 commented Mar 3, 2026

Render Performance Optimizations

Summary

This branch delivers comprehensive performance optimizations to the FieldWorks Views engine rendering pipeline, reducing warm render time by 99.99% (153ms → 0.01ms) and cold render time by ~10% (60.36ms → 54.32ms) across 15 benchmark scenarios.

All 15 scenarios pass with 0% pixel variance — no visual regressions.

Motivation

FieldWorks UI responsiveness is bottlenecked by the Views engine render pipeline. Switching between entries, refreshing displays, or navigating within editors triggers Reconstruct()Layout()DrawTheRoot() cycles. Profiling revealed:

  • Redundant layout passes: Layout ran twice per render (once inside Reconstruct, once externally) — both doing identical work
  • Redundant reconstruction: RefreshDisplay() called Reconstruct on ALL root sites, even when no data changed
  • Redundant GDI calls: Every font switch created/destroyed HFONT handles; every draw call set colors unconditionally
  • Redundant NFC normalization: Text offset conversion normalized NFC text that was already NFC, called 12+ times per paragraph

Benchmark Infrastructure

New test infrastructure

  • RenderBenchmarkHarness — creates real VwRootBox + VwGraphics in an offscreen HWND, measures CreateView/MakeRoot/PerformOffscreenLayout/DrawTheRoot independently
  • RenderTimingSuiteTests — 15 scenarios covering simple through extreme content (long-prose: 4 sections × 80 verses; lex-extreme: 25 entries × 10 senses)
  • RenderVerifyTests — pixel-perfect snapshot verification using Verify + ImageSharp
  • RenderBenchmarkResults / RenderBenchmarkReportWriter — generates markdown summary reports with per-scenario timing
  • DataTreeRenderTests — DataTree-level render scenarios (collapsed/expanded/deep/extreme/multiws/simple)

Test files

  • Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.cs
  • Src/Common/RootSite/RootSiteTests/RenderTimingSuiteTests.cs
  • Src/Common/RootSite/RootSiteTests/RenderVerifyTests.cs
  • Src/Common/RootSite/RootSiteTests/RenderBenchmarkTestsBase.cs
  • Src/Common/RenderVerification/ (supporting infrastructure)
  • Src/Common/Controls/DetailControlsTests/DataTreeRenderTests.cs
  • 15 verified baseline PNG snapshots

Phase 1: Warm Render Optimizations (153ms → 0.01ms)

PATH-L1: Width-Invariant Layout Guard

  • File: Src/views/VwRootBox.cpp, VwRootBox.h
  • Change: Added m_fNeedsLayout + m_dxLastLayoutWidth to VwRootBox::Layout(). If box tree hasn't changed and width is the same, skip the entire layout traversal.
  • Impact: Warm Layout: 50ms → 0.03ms

PATH-L4: Harness GDI Caching

  • File: Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.cs
  • Change: Cached GDI bitmap/graphics in PerformOffscreenLayout() instead of allocating per call.
  • Impact: Warm PerformOffscreenLayout: 27ms → 0.00ms

PATH-R1: Reconstruct Guard

  • File: Src/views/VwRootBox.cpp, VwRootBox.h
  • Change: Added m_fNeedsReconstruct flag. Reconstruct() returns early if no PropChanged, style change, or overlay change has occurred since last construction.
  • Impact: Warm Reconstruct: 100ms → 0.01ms

PATH-L5: Skip Reconstruct When Data Unchanged

  • Files: Src/views/Views.idh, Src/Common/SimpleRootSite/SimpleRootSite.cs
  • Change: Exposed NeedsReconstruct property via COM. SimpleRootSite.RefreshDisplay() checks this before performing expensive reconstruct cycle.
  • Impact: Eliminates managed overhead (SelectionRestorer, SuspendDrawing, COM interop) for all no-op refreshes across all root sites.

Phase 2: Cold Render Optimizations (60.36ms → 54.32ms)

PATH-R1-FIX: Construct Guard Bug Fix

  • File: Src/views/VwRootBox.cpp
  • Change: Construct() was leaving m_fNeedsReconstruct = true (set during Init), so the PATH-R1 guard never fired on the first warm render. Fixed by setting m_fNeedsReconstruct = false after successful construction.
  • Impact: Warm render 98.20ms → 0.01ms (the guard was previously dead code)

PATH-C1: HFONT LRU Cache

  • Files: Src/views/lib/VwGraphics.h, VwGraphics.cpp
  • Change: Added 8-entry LRU cache of (LgCharRenderProps → HFONT) in VwGraphics. On font change, check cache via memcmp of font-specific fields. Cache hit → reuse HFONT (skip CreateFontIndirect). Cache miss → create and add to cache. All cached HFONTs freed on destructor/ReleaseDC.
  • Impact: ~2% avg cold reduction. Best for multi-WS text with frequent font switching.

PATH-C2: Color State Caching

  • Files: Src/views/lib/VwGraphics.h, VwGraphics.cpp
  • Change: Track current foreground color, background color, and background mode. Only call GDI color functions when value actually changes. Cache invalidated on DC change.
  • Impact: <1% avg cold reduction

PATH-N1: NFC Normalization Bypass

  • Files: Src/views/lib/UniscribeSegment.h, UniscribeSegment.cpp, UniscribeEngine.cpp
  • Change: OffsetInNfc() and OffsetToOrig() performed COM Fetch + ICU NFC normalization on every call — 12+ times per FindBreakPoint invocation. For content-heavy views, this meant millions of redundant normalizations. Added NFC detection in CallScriptItemize() (comparing pre/post normalization length). When text is already NFC (common case — FieldWorks stores NFC), new overloads return identity offsets, completely skipping COM and ICU calls.
  • Impact: 8.2% avg cold reduction (59.17ms → 54.32ms). Best results: mixed-styles -27.1%, lex-deep -24.9%, lex-extreme -24.3%

Bug Fixes

VwPropertyStore Crash Fix

  • File: Src/views/VwPropertyStore.cpp
  • Change: Assert(m_chrp.ws) and AssertPtr(qws) at line 1336 fired when ws=0 reached VwPropertyStore during PropChanged-driven box tree rebuilds. Fixed by guarding get_EngineOrNull with if (m_chrp.ws) check and using AssertPtrN(qws) to allow null engine pointer. The code already defaulted to LTR directionality when qws was null.

VwDrawRootBuffered Coordinate Fixes

  • File: Src/views/VwRootBox.cpp
  • Change: Fixed coordinate mapping and trailing separator bars in VwDrawRootBuffered.

Other Fixes

Final Benchmark Results

Scenario Cold (ms) Warm (ms) Pixel Variance
simple 162.23 0.02 0%
medium 16.40 0.01 0%
complex 32.04 0.00 0%
deep-nested 17.42 0.01 0%
custom-heavy 19.28 0.01 0%
many-paragraphs 33.47 0.01 0%
footnote-heavy 66.78 0.01 0%
mixed-styles 73.71 0.01 0%
long-prose 159.94 0.01 0%
multi-book 24.01 0.01 0%
rtl-script 37.16 0.01 0%
multi-ws 45.60 0.01 0%
lex-shallow 17.51 0.00 0%
lex-deep 21.42 0.01 0%
lex-extreme 87.90 0.01 0%
Average 54.32 0.01 0%

Files Changed (Key Production Files)

File Changes
Src/views/VwRootBox.h Layout/Reconstruct guard fields, NeedsReconstruct property
Src/views/VwRootBox.cpp Layout/Reconstruct guards, dirty flags, Construct fix, coordinate fixes
Src/views/Views.idh NeedsReconstruct COM property on IVwRootBox
Src/views/VwPropertyStore.cpp Crash fix for ws=0 during PropChanged
Src/views/lib/VwGraphics.h HFONT cache, color cache structures
Src/views/lib/VwGraphics.cpp HFONT LRU cache, color change detection
Src/views/lib/UniscribeSegment.h NFC-aware overloads, CallScriptItemize extension
Src/views/lib/UniscribeSegment.cpp NFC bypass logic, NFC-aware offset functions
Src/views/lib/UniscribeEngine.cpp 12 FindBreakPoint call sites updated for NFC bypass
Src/Common/SimpleRootSite/SimpleRootSite.cs NeedsReconstruct check in RefreshDisplay
Src/Common/Controls/DetailControls/DataTree.cs HWND diagnostics, render scenario support
Src/Common/Controls/DetailControls/Slice.cs Render measurements

Future Optimization Paths

The remaining opportunity for significant cold render improvement is PATH-V1: Lazy VwLazyBox Construction — deferring off-screen paragraph construction until scrolled into view. This is the only path that can achieve the 50% cold reduction target, as the engine is fundamentally bottlenecked by ScriptShape/ScriptPlace kernel calls that cannot be made faster, only avoided.

See layout-optimization-paths.md for full analysis.


This change is Reviewable

@johnml1135 johnml1135 marked this pull request as ready for review March 3, 2026 21:20
Copilot AI review requested due to automatic review settings March 3, 2026 21:20
@github-actions
Copy link

github-actions bot commented Mar 3, 2026

⚠️ Commit Message Format Issues ⚠️
commit d6741a0bd9:
1: T1 Title exceeds max length (95>72): "perf: cold render optimizations — HFONT cache, color caching, NFC bypass, Reconstruct guard fix"

commit d0066f3247:
4: B1 Line exceeds max length (82>80): "- Optimize layout and paint paths with culling, guards, caching, and binary search"

commit c4678e8240:
1: T8 Title is too short (0<5)
2: B4 Second line is not empty: "feat(render): expand DataTree render scenarios and persist benchmark timings"
26: B1 Line exceeds max length (134>80): "- Update render-testing skill with explicit devil's advocate stage, retest/review, and mandatory TIMING_ENHANCEMENTS.md before commit."
28: B1 Line exceeds max length (107>80): "- Refine DataTree tab-index optimization safety by moving final ResetTabIndices(0) to CreateSlices finally."
30: B1 Line exceeds max length (82>80): "- Keep DeepSuspendLayout/DeepResumeLayout batching in DummyObjectSlice.BecomeReal."
32: B1 Line exceeds max length (102>80): "- Mark OS-3 and OS-5 complete in OpenSpec tasks and add TIMING_ENHANCEMENTS.md with measured outcomes."
47: B1 Line exceeds max length (85>80): "perf: layout path optimization — early-exit guard, construction skip, paint-path skip"

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Performance-focused updates introduce render verification/benchmark infrastructure and targeted optimizations in the DetailControls/DataTree rendering path to reduce redundant work and support pixel-perfect render validation.

Changes:

  • Added RenderVerification infrastructure (environment validation, diagnostics/trace toggling, benchmark result models, reporting & comparison utilities).
  • Optimized DataTree/Slice construction, layout, painting, and visibility paths to avoid repeated O(N) work.
  • Added/updated test harness assets (layouts/parts, new diagnostics tests, VS Code tasks) and snapshot testing support.

Reviewed changes

Copilot reviewed 68 out of 153 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
Src/Common/RenderVerification/RenderModels.cs Adds shared models for scenarios and timing results used by render verification/benchmarks.
Src/Common/RenderVerification/RenderEnvironmentValidator.cs Captures environment details and produces a hash/diff for deterministic snapshot comparisons.
Src/Common/RenderVerification/RenderDiagnosticsToggle.cs Adds a toggle for diagnostics/trace output with file-backed flags and a trace listener.
Src/Common/RenderVerification/RenderBenchmarkResults.cs Defines JSON-serializable benchmark run/result structures and flags loading.
Src/Common/RenderVerification/RenderBenchmarkReportWriter.cs Writes JSON + Markdown summary reports and recommendations for benchmark runs.
Src/Common/RenderVerification/RenderBenchmarkComparer.cs Compares runs to detect regressions/improvements and generates a summary.
Src/Common/RenderVerification/CompositeViewCapture.cs Implements two-pass DataTree capture with Views content overlay for pixel-perfect snapshots.
Src/Common/FieldWorks/FieldWorks.Diagnostics.dev.config Adds a new trace switch for Views render timing.
Src/Common/Controls/XMLViews/XMLViewsTests/XmlBrowseViewBaseTests.cs Updates test stub interface to include NeedsReconstruct.
Src/Common/Controls/DetailControls/VectorReferenceView.cs Avoids drawing a trailing separator bar after the last item.
Src/Common/Controls/DetailControls/Slice.cs Adds cached XML attribute parsing + width fast-path + DataTree method call updates.
Src/Common/Controls/DetailControls/PossibilityVectorReferenceView.cs Avoids drawing a trailing separator bar after the last item.
Src/Common/Controls/DetailControls/HwndDiagnostics.cs Adds helper methods to read USER/GDI handle counts for diagnostics/tests.
Src/Common/Controls/DetailControls/DetailControlsTests/TestResult.xml Adds a test runner output artifact (currently shows a failed/invalid run).
Src/Common/Controls/DetailControls/DetailControlsTests/TestParts.xml Extends test parts inventory to include senses sequences and collapsible slices.
Src/Common/Controls/DetailControls/DetailControlsTests/Test.fwlayout Updates test layouts to include CitationForm and senses parts/layouts.
Src/Common/Controls/DetailControls/DetailControlsTests/DetailControlsTests.csproj Adds snapshot/JSON deps and references RenderVerification project.
Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeTests.cs Updates expected template XML string for modified layout composition.
Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeHwndCountTest.cs Adds baseline diagnostics tests for handle growth and slice install count.
Src/Common/Controls/DetailControls/DataTree.cs Adds multiple layout/paint/visibility optimizations and diagnostics counters.
Directory.Packages.props Adds central package version for Verify.
AGENTS.md Adds note about external dependency repository (liblcm).
.vscode/tasks.json Adds tasks for render baseline tests and building tests via repo script.
.vscode/settings.json Allows the new “Build Tests” task as an approved VS Code task.
.github/skills/openspec-verify-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-sync-specs/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-onboard/SKILL.md Updates onboarding workflow text/commands and cross-platform command notes.
.github/skills/openspec-new-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-ff-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-explore/SKILL.md Updates explore-mode wording around “proposal” vs other workflows.
.github/skills/openspec-continue-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-bulk-archive-change/SKILL.md Updates text and generatedBy metadata version.
.github/skills/openspec-archive-change/SKILL.md Updates guidance to invoke the sync skill via Task tool.
.github/skills/openspec-apply-change/SKILL.md Updates generatedBy metadata version.
.github/prompts/opsx-onboard.prompt.md Mirrors onboarding prompt updates (CLI install check, command list).
.github/prompts/opsx-ff.prompt.md Clarifies template usage vs context/rules constraints.
.github/prompts/opsx-explore.prompt.md Mirrors explore prompt updates (“proposal” wording).
.github/prompts/opsx-bulk-archive.prompt.md Mirrors bulk-archive prompt copy update.
.github/prompts/opsx-archive.prompt.md Mirrors archive prompt sync invocation update.
.github/instructions/csharp.instructions.md Adds C# instructions file (currently specifies “latest C# / C# 14” guidance).
.gitattributes Marks Verify snapshot PNG extensions as binary.
.GitHub/skills/render-testing/SKILL.md Adds a render-testing skill doc with build/test/run instructions and baseline workflow.
.GitHub/skills/code-review-skill-main/scripts/pr-analyzer.py Adds a Python tool to analyze diffs and estimate PR review complexity.
.GitHub/skills/code-review-skill-main/reference/typescript.md Adds TypeScript review reference documentation.
.GitHub/skills/code-review-skill-main/reference/security-review-guide.md Adds security review reference documentation.
.GitHub/skills/code-review-skill-main/reference/java.md Adds Java review reference documentation.
.GitHub/skills/code-review-skill-main/reference/cpp.md Adds C++ review reference documentation.
.GitHub/skills/code-review-skill-main/reference/code-review-best-practices.md Adds code review best practices reference documentation.
.GitHub/skills/code-review-skill-main/reference/c.md Adds C review reference documentation.
.GitHub/skills/code-review-skill-main/assets/review-checklist.md Adds a condensed review checklist asset.
.GitHub/skills/code-review-skill-main/assets/pr-review-template.md Adds a PR review template asset.
.GitHub/skills/code-review-skill-main/SKILL.md Adds a top-level code review skill description/index.
.GitHub/skills/code-review-skill-main/README.md Adds documentation for the code-review skill bundle.
.GitHub/skills/code-review-skill-main/LICENSE Adds license file for the bundled code-review skill content.
.GitHub/skills/code-review-skill-main/CONTRIBUTING.md Adds contribution guidelines for the bundled code-review skill content.
.GitHub/skills/code-review-skill-main/.gitignore Adds gitignore for the bundled code-review skill content.
Comments suppressed due to low confidence (1)

.GitHub/skills/render-testing/SKILL.md:1

  • This doc hardcodes a user-specific NUnit console path and recommends dotnet build for a .NET Framework workflow. For portability and alignment with repo build guidance, prefer .\build.ps1 / .\test.ps1 (and avoid assuming C:\\Users\\johnm\\... exists). Also consider avoiding or clearly calling out commands with pipes/redirection in docs, since the repo guidance prefers wrapper scripts for piped commands.
---

Comment on lines +64 to +108
public void EnableDiagnostics(bool persist = false)
{
m_originalDiagnosticsState = DiagnosticsEnabled;
DiagnosticsEnabled = true;
TraceEnabled = true;

if (persist)
{
SaveFlags();
}

SetupTraceListener();
}

/// <summary>
/// Disables diagnostics and trace output.
/// </summary>
/// <param name="persist">Whether to persist the change to the flags file.</param>
public void DisableDiagnostics(bool persist = false)
{
DiagnosticsEnabled = false;
TraceEnabled = false;

if (persist)
{
SaveFlags();
}

RemoveTraceListener();
}

/// <summary>
/// Restores the original diagnostics state before Enable/Disable was called.
/// </summary>
public void RestoreOriginalState()
{
if (m_originalDiagnosticsState)
{
EnableDiagnostics(persist: false);
}
else
{
DisableDiagnostics(persist: false);
}
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RestoreOriginalState() is not idempotent when restoring to the enabled state because EnableDiagnostics() overwrites m_originalDiagnosticsState every time it is called. If DisableDiagnostics() is called and then RestoreOriginalState() enables diagnostics, the original state field can get reset to false, making subsequent restores incorrect. Fix by capturing m_originalDiagnosticsState only once (e.g., in ctor / first toggle), or by making RestoreOriginalState() call internal enable/disable helpers that do not mutate the stored original value.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +15
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<test-run id="0" runstate="Runnable" testcasecount="0" result="Failed" label="Invalid" total="0" passed="0" failed="0" warnings="0" inconclusive="0" skipped="0" asserts="0" engine-version="3.16.0.0" clr-version="4.0.30319.42000" start-time="2026-02-12 20:00:58Z" end-time="2026-02-12 20:00:58Z" duration="0.275532">
<command-line><![CDATA["C:\Users\johnm\.nuget\packages\nunit.consolerunner\3.16.0\tools\nunit3-console.exe" bin\x64\Debug\DetailControlsTests.dll --where "test =~ /DataTreeRender/" --workers=1]]></command-line>
<filter>
<test re="1">DataTreeRender</test>
</filter>
<test-suite type="Assembly" id="1-1" name="DetailControlsTests.dll" fullname="C:\Users\johnm\Documents\repos\FieldWorks.worktrees\ui-speed-up\Src\Common\Controls\DetailControls\DetailControlsTests\bin\x64\Debug\DetailControlsTests.dll" testcasecount="0" runstate="NotRunnable" result="Failed" label="Invalid">
<properties>
<property name="_SKIPREASON" value="File not found: C:\Users\johnm\Documents\repos\FieldWorks.worktrees\ui-speed-up\Src\Common\Controls\DetailControls\DetailControlsTests\bin\x64\Debug\DetailControlsTests.dll" />
</properties>
<reason>
<message>File not found: C:\Users\johnm\Documents\repos\FieldWorks.worktrees\ui-speed-up\Src\Common\Controls\DetailControls\DetailControlsTests\bin\x64\Debug\DetailControlsTests.dll</message>
</reason>
</test-suite>
</test-run> No newline at end of file
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be a local NUnit test-run output artifact containing user-specific absolute paths and a failed/invalid run. It should not be committed to source control; delete it from the repo and add it to an ignore list (or ensure test runners output to a non-tracked artifacts folder) to avoid noisy diffs and leaking local machine paths.

Copilot uses AI. Check for mistakes.
Comment on lines +40 to +48
public string GetEnvironmentHash()
{
var settingsJson = Newtonsoft.Json.JsonConvert.SerializeObject(CurrentSettings);
using (var sha256 = SHA256.Create())
{
var hashBytes = sha256.ComputeHash(Encoding.UTF8.GetBytes(settingsJson));
return Convert.ToBase64String(hashBytes);
}
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment hash is intended to be deterministic, but hashing JsonConvert.SerializeObject(CurrentSettings) can be unstable across runtimes/versions because JSON.NET property ordering is not guaranteed unless explicitly configured. That can cause spurious environment-hash mismatches even when the captured values are identical. Consider generating the hash from an explicit, stable field ordering (e.g., concatenating known fields in a fixed order), or configuring serialization with a contract resolver that enforces deterministic property ordering.

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +10
## C# Instructions
- Always use the latest version C#, currently C# 14 features.
- Write clear and concise comments for each function.
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guidance conflicts with the repo's .NET Framework/C# constraints (FieldWorks is limited to C# 7.3). As written, it encourages unsupported language features (file-scoped namespaces, switch expressions, nullable reference types, etc.), which can lead to uncompilable PRs. Update the instructions to match the actual language version/tooling used in this codebase (e.g., explicitly state C# 7.3 and avoid recommending C# 8+ features).

Copilot uses AI. Check for mistakes.
@github-actions
Copy link

github-actions bot commented Mar 3, 2026

NUnit Tests

    1 files  ±  0      1 suites  ±0   10m 8s ⏱️ + 4m 15s
4 552 tests +145  4 431 ✅ +111  87 💤 ±0  34 ❌ +34 
4 561 runs  +145  4 440 ✅ +111  87 💤 ±0  34 ❌ +34 

For more details on these failures, see this check.

Results for commit d6741a0. ± Comparison against base commit bcfbea0.

♻️ This comment has been updated with latest results.

# This is the 1st commit message:

feat(render): expand DataTree render scenarios and persist benchmark timings

standardize DataTree test data with predictable field naming across scenarios
add per-scenario timing capture and write results to datatree-timings.json
add Newtonsoft.Json to DetailControlsTests for timing serialization
add render-testing skill documentation for baseline and benchmark workflow
remove obsolete generated Build/FieldWorks.targets.bad file
update checked-in test result artifact for current render test run

fix: resolve build blocker + timing test layout gaps

- Remove duplicate Kernel project entry in FieldWorks.sln that blocked
  NuGet restore (first entry was missing EndProject)
- Add CitationForm and Senses parts to test Normal layout so timing
  tests produce real slices instead of 0
- Add LexSense-Detail-Senses part and subsense recursion to GlossSn
  layout so WorkloadGrowsWithComplexity test correctly validates
  that deeper scenarios produce more slices
- Include user edits to implementation-paths.md

OS-3: All 4 DataTreeTiming tests now pass

perf: add render-testing workflow + timing enhancement log

- Update render-testing skill with explicit devil's advocate stage, retest/review, and mandatory TIMING_ENHANCEMENTS.md before commit.

- Refine DataTree tab-index optimization safety by moving final ResetTabIndices(0) to CreateSlices finally.

- Keep DeepSuspendLayout/DeepResumeLayout batching in DummyObjectSlice.BecomeReal.

- Mark OS-3 and OS-5 complete in OpenSpec tasks and add TIMING_ENHANCEMENTS.md with measured outcomes.

Validation: build.ps1 -BuildTests; test.ps1 -TestFilter DataTreeTiming -NoBuild

perf: paint pipeline optimization — clip-rect culling + double-buffering

- HandlePaintLinesBetweenSlices: skip separator lines outside paint clip rect,
  reducing per-paint iteration from O(N) to O(visible) during scroll/partial
  repaints. Avoids O(N) XML attribute parsing on off-screen slices.
- DataTree constructor: enable OptimizedDoubleBuffer + AllPaintingInWmPaint
  to eliminate flicker during paint.
- Add DataTreeTiming_PaintPerformance test for paint-time regression detection.
- Update render-testing skill with test coverage assessment step.
- Update TIMING_ENHANCEMENTS.md with enhancements 3 and 4.

perf: layout path optimization — early-exit guard, construction skip, paint-path skip

Enhancement 5: Slice.SetWidthForDataTreeLayout returns early when
m_widthHasBeenSetByDataTree && Width == width, avoiding redundant
splitter resizing and event handler churn on every layout pass.

Enhancement 6: DataTree.OnSizeChanged skips the O(N) splitter
adjustment loop during ConstructingSlices. HandleLayout1(fFull=true)
handles width synchronization after construction completes.

Enhancement 7: HandleLayout1 paint path (fFull=false) skips
SetWidthForDataTreeLayout for non-visible slices, reducing width-sync
work from O(N) to O(K) per paint where K is visible slice count.

Also fixes RemoveDuplicateCustomFields test to match Test.fwlayout
Normal layout changes from prior commit.

perf: XML attribute caching + MakeSliceVisible high-water mark

Enhancement 8: Cache header/skipSpacerLine/sameObject XML attributes
on Slice as lazy bool? fields. Eliminates per-paint XML parsing in
HandlePaintLinesBetweenSlices (~2,700 XML lookups/sec at 60Hz with
15 visible slices reduced to 0 after first access).

Enhancement 9: Change MakeSliceVisible from static to instance method.
Pass known index from HandleLayout1 caller (avoids O(N) IndexOf).
Track m_lastVisibleHighWaterMark so the inner 'make preceding visible'
loop amortizes to O(N) total instead of O(N*V). Preserves the .NET
Framework bug workaround (LT-7307). Reset mark in CreateSlices.

perf: construction-path guards + optimization regression tests

Enhancement 10: Skip ResetTabIndices in RemoveSlice during ConstructingSlices.
Mirror of Enhancement 1 — the finally block in CreateSlices already does a
single ResetTabIndices(0) after all removals/insertions complete.

Enhancement 11: Skip AdjustSliceSplitPosition in InstallSlice during
ConstructingSlices. HandleLayout1(fFull=true) sets correct widths + splitter
positions after construction, making per-insert adjustment redundant.

Added 7 optimization regression tests (DataTreeOpt_ prefix):
- WidthStabilityAfterLayout (Enhancement 5)
- AllViewportSlicesVisible (Enhancement 9)
- XmlCacheConsistency (Enhancement 8)
- XmlCacheInvalidationOnConfigChange (Enhancement 8)
- SequentialPaintsProduceIdenticalOutput (Enhancements 3,4,7,8)
- SlicePositionsMonotonicallyIncreasing (Enhancement 3)
- IsHeaderNodeDelegatesToCachedProperty (Enhancement 8)

All 12 DataTree tests pass (5 timing + 7 optimization).

# This is the commit message #2:

feat: Enhancement 12 — binary search in HandleLayout1 paint path

Add FindFirstPotentiallyVisibleSlice binary search to skip above-viewport
slices during paint-path layout. Addresses JohnT's original TODO in OnPaint.

- HandleLayout1 now starts iteration at the first potentially visible slice
  (with -1 safety margin) instead of index 0 on paint-path calls
- Timing improvements: shallow -71%, deep -20%, extreme -7%, paint -11%
- 5 binary-search risk-reduction tests added (DataTreeOpt_ prefix)
- Regenerate snapshot baselines after layout/parts changes from 64ab221
- Add mandatory snapshot regression check (Step 6) to render-testing skill
- Update TIMING_ENHANCEMENTS.md with E12 measurements and devil's advocate

# This is the commit message #3:

The real optimization - HWND virtualization

# This is the commit message #4:

fix: correct VwDrawRootBuffered coordinate mapping + trailing separator bars

Three rendering quality fixes for the snapshot capture pipeline:

1. VwDrawRootBuffered coordinate fix (CompositeViewCapture.cs):
   GetCoordRects returns rcDst in rootSite-local coordinates (origin
   at HorizMargin, 0). The overlay code was passing clientRect with
   the rootSite's position relative to the DataTree (e.g. X=175),
   causing VwDrawRootBuffered to offset rcDst by (-175, -y) and
   render content at negative X — clipping or hiding it entirely.
   Fix: render each rootSite into a temp bitmap at local coords
   (0,0,w,h), then composite into the main bitmap at the correct
   DataTree-relative position.

2. EnsureAllSlicesInitialized uses BecomeRealInPlace (CompositeViewCapture.cs):
   Replaced manual handle-forcing with the production ViewSlice
   initialization path, which sets AllowLayout=true and triggers
   rootBox.Layout with the correct width.

3. Trailing separator bar removal (VectorReferenceView.cs,
   PossibilityVectorReferenceView.cs):
   DisplayVec called AddSeparatorBar after every item including the
   last, causing VwSeparatorBox to draw unwanted grey bars after
   single/final items like 'Main Dictionary'. Added guard to only
   add separators between items.

Baselines regenerated — all 6 scenarios show 40-70% more rendered
content. All 23 DataTree tests pass (timing + optimization + snapshots).

# This is the commit message #5:

Added tests before virtualization

# This is the commit message #6:

update renders

# This is the commit message #7:

Hwnd implementation started

# This is the commit message #8:

upgrade openspec

# This is the commit message #9:

hwnd implementation
- Add deterministic DataTree render scenarios, benchmark harness, and baselines
- Optimize layout and paint paths with culling, guards, caching, and binary search
- Fix snapshot compositing and separator rendering regressions
- Add timing and regression tests; update TIMING_ENHANCEMENTS.md
- Refresh render baselines and benchmark harness after layout updates
- Fix snapshot capture coordinates, initialization, and separator rendering
- Harden DataTree with regression tests for resize, navigation, and crash paths
- Remove generated test-result artifact and update timing notes
…ass, Reconstruct guard fix

Phase 2 cold render optimizations for the Views engine:

PATH-R1-FIX: Fix Reconstruct guard bug — Construct() left m_fNeedsReconstruct
true (from Init), so the warm Reconstruct guard never fired. Warm render dropped
from 98.20ms to 0.01ms (99.99% reduction).

PATH-C1: Add 8-entry HFONT LRU cache in VwGraphics to avoid redundant
CreateFontIndirect/DeleteObjectFont GDI kernel calls when cycling between the
same fonts (common in multi-writing-system text). ~2% cold reduction.

PATH-C2: Track current foreground/background colors and background mode in
VwGraphics. Skip redundant SetTextColor/SetBkColor/SetBkMode GDI calls when
values haven't changed. <1% cold reduction.

PATH-N1: NFC normalization bypass — OffsetInNfc/OffsetToOrig were performing
COM Fetch + ICU NFC normalization on every call (12+ times per FindBreakPoint).
Added NFC flag from CallScriptItemize; for NFC text (common case), return
identity offsets instantly. 8.2% cold reduction. Best on multi-run scenarios:
mixed-styles -27%, lex-deep -25%, lex-extreme -24%.

Overall cold render: 60.36ms -> 54.32ms (-10.0% average).
All 15 benchmark scenarios pass with 0% pixel variance.
@johnml1135 johnml1135 force-pushed the 001-render-speedup branch from 04900b0 to d6741a0 Compare March 5, 2026 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants