.. _uz_dataviewer_architecture:
===================
Developer reference
===================
Developer reference for how ``uz_dataviewer`` is structured, the design decisions behind it, and how to extend it.
End-user documentation is in :doc:`uz_dataviewer_usage`; packaging is in :doc:`uz_dataviewer_build`.
Core model
==========
The viewer is **state-driven and command-routed**:
- A single source of truth, ``AppState`` (``state.py``). Panels are (almost) pure render functions: each frame they read ``AppState`` and draw it.
- **Every discrete user action goes through the command registry** (``commands.py``). A command mutates ``AppState`` *and* echoes its canonical call to the console.
This model yields four properties:
#. **Scriptability** — anything clickable can be typed or replayed (``.uzscript``).
#. **Live transcript** — the console shows what happened, as commands.
#. **Save/restore** — sessions serialise to JSON or to a replayable command script.
#. **Testability** — app logic is exercised by dispatching command strings; no window required.
.. warning::
A feature whose only trigger is a direct widget mutation breaks this model.
Add a command and have the widget call it.
Frame flow
==========
``immapp.run()`` (hello_imgui) drives a classic immediate-mode loop.
Per frame:
.. code-block:: text
pre_new_frame() # app.py: poll async loads, focus requests, file dialogs
└─ docking layout renders each DockableWindow's gui_function:
├─ NavigationPanel.render(state) # left: runs + signals + normalize menu
├─ PlotsPanel.render(state) # center: subplot grid
├─ FftPanel.render(state) # center tab: FFT window
├─ HistogramPanel.render(state) # center tab: Histogram window
├─ NodesPanel.render(state) # center tab: node canvas
└─ Console.render(state) # bottom: log + command input
Immediate mode has **no retained widget state**, so anything that must persist across frames lives in ``AppState``/``SubplotCell``.
Transient "do this once next frame" requests use a **pending-flag pattern** (``cell.fit_pending``, ``cell.pending_x_lim``, ``cfg.compute_requested``): the renderer consumes and clears the flag.
Module map
==========
.. list-table::
:header-rows: 1
:widths: 25 45 30
* - Module
- Responsibility
- Key types
* - ``app.py``
- Docking layout, ``immapp`` runner, theme, Session menu, ``.uzscript``/file startup
- ``DataViewerApp``
* - ``api.py``
- GUI-free façade for headless/library use (load, FFT, node transforms) — see :doc:`uz_dataviewer_library`
- ``read``, ``Dataset``, ``fft``, ``fft_frame``, ``node``, ``kinds``
* - ``state.py``
- The single app state, grid/cells, analysis configs, async load orchestration
- ``AppState``, ``SubplotCell``, ``PlotType``,
``AnalysisConfig``/``FftConfig``/``HistogramConfig``
* - ``commands.py``
- Command registry, parser, dispatcher, the ~60 built-in commands
- ``CommandRegistry``, ``Command``, ``Param``
* - ``command_doc.py``
- Generates the docs command reference (CSV) from the live registry — see :ref:`uz_dataviewer_generated_command_reference`
- ``build_csv``, ``group_for``, ``GROUP_ORDER``
* - ``console.py``
- Bottom console: scrolling selectable log + command input/completion/history
- ``Console``
* - ``model.py``
- Loaded data + per-log time normalization
- ``Run``, ``Signal``, ``DataRegistry``
* - ``loader.py``
- CSV (Arrow) / Parquet loading, header cleaning, delimiter sniffing; lean large-log loading (float32-at-parse, CSV size guard, streaming Parquet, CSV→Parquet ``convert``) — see :ref:`uz_dataviewer_loader_large_logs`
- ``load_file``, ``parse_file``, ``ParsedRun``, ``parse_channel_name``,
``convert_csv_to_parquet``
* - ``downsample.py``
- Range-aware decimation: min/max pyramid + one-shot (pure NumPy)
- ``Pyramid``, ``decimate_range``, ``visible_slice``
* - ``analysis.py``
- GUI-free transforms (FFT)
- ``compute_fft``, ``FftResult``
* - ``transforms.py``
- GUI-free node transforms (math, FIR filter), pure NumPy
- ``math_node``, ``filter_node``
* - ``nodes.py``
- Dataflow graph + on-demand evaluation → derived runs; the transform **registry**
- ``NodeGraph``, ``Node``, ``evaluate``, ``is_stale``, ``TransformSpec``, ``REGISTRY``
* - ``plugins.py``
- Load external transform-node plugins (``@transform``); robust to missing dirs/files
- ``transform``, ``ParamSpec``, ``load_plugins``
* - ``session.py``
- JSON save/restore, ``.uzscript`` export/replay, CSV exports
- ``to_dict``/``apply_dict``, ``export_*``
* - ``webbridge.py``
- Browser integration (file input, array load, downloads)
- ``IS_WEB``, ``load_columns``, ``download``
* - ``panels/navigation.py``
- Left tree, drag sources, file dialog, normalize menu
- ``NavigationPanel``
* - ``panels/plots.py``
- Subplot grid: types, cursors, spy, axis linking, secondary axis, export
- ``PlotsPanel``
* - ``panels/analysis.py``
- Shared base for analysis windows (sources, follow-window, compute, stale, zoom, export)
- ``AnalysisPanel``, ``follow_combo``
* - ``panels/fft.py``
- FFT window
- ``FftPanel(AnalysisPanel)``
* - ``panels/histogram.py``
- Histogram window
- ``HistogramPanel(AnalysisPanel)``
* - ``panels/nodes.py``
- Node canvas (``imgui_node_editor``) issuing ``node_*`` commands
- ``NodesPanel``
Data model
==========
- A **``Run``** is one loaded file: a shared ``time`` axis (``float64``) plus one **``Signal``** per channel (``y`` as ``float32``, contiguous, ready for ImPlot and the pyramid).
- **``DataRegistry``** owns runs and hands out stable integer ids; ``SignalRef = (run_id, name)`` identifies a signal everywhere.
- **Time normalization** is per-log: ``Run.set_time_origin(target)`` keeps the original ``time_raw`` and derives ``time = time_raw - time_raw[0] + target``. It is reversible (``target=None`` restores raw) and re-derives only on change, never per frame.
Channel headers like ``CH8=8)ia`` are cleaned to ``ia`` by ``parse_channel_name``, which also detects a trailing unit token (``_rpm``, ``_us``, …) for axis labels.
.. _uz_dataviewer_loader_large_logs:
Loading & large logs (``loader.py``)
=====================================
``load_file`` dispatches by extension through ``parse_file`` → ``parse_csv`` / ``parse_parquet``, each returning a ``ParsedRun`` (plain arrays, no registry touch) that is turned into a ``Run`` via ``ParsedRun.register`` (``DataRegistry.add_run``).
On native, loads run on a ``ThreadPoolExecutor``: the worker only **parses** into a ``ParsedRun``, and the main thread **registers** it in ``state.poll_pending_loads``, so ``DataRegistry`` is only ever mutated from one thread (see :ref:`uz_dataviewer_arch_native_web`).
The loader keeps the **load-time memory peak** near the resident dataset size, so ~100M-point logs open without OOM:
- **float32 at parse, no intermediate copy.** ``_csv_column_types`` feeds Arrow ``ConvertOptions(column_types=…)`` so channels parse straight to ``float32`` (time stays ``float64``), avoiding the float64→float32 re-cast that doubled channel memory. ``_named_signals`` shares one naming/dedup pass across every load path.
- **Streaming CSV.** For files at/above ``CSV_STREAM_MIN_BYTES``, ``_parse_csv_streaming`` reads via Arrow's ``open_csv`` batch reader into **preallocated** numpy arrays (grown geometrically since a CSV's row count isn't known up front), ``include_columns`` projecting to just the named columns. Peak ≈ resident + one batch (~1.5×) instead of the bulk ``read_csv`` peak (~4× resident, measured). Small CSVs keep the fast multithreaded bulk path.
- **Releasing Arrow's pool.** After every parse, ``parse_file`` calls ``_release_arrow_pool`` (``pa.default_memory_pool().release_unused()``): Arrow otherwise retains freed parse scratch rather than returning it to the OS, pinning RSS at the parse high-water mark (measured ~1.2 GB reclaimed on the 9M-row log, ~10 GB on the 75M).
- **CSV size guard.** ``_guard_csv_size`` estimates the resident footprint from file size × column count and **refuses** a CSV above ``MAX_CSV_NUMERIC_BYTES``. Now that large CSVs stream, this is a **resident-RAM** ceiling (the full record must fit in memory), not a bulk-parse guard; it routes the user to ``convert(...)`` or fewer channels.
- **Streaming Parquet.** For files above ``PARQUET_STREAM_MIN_ROWS``, ``_parse_parquet_streaming`` reads the exact row count from ``ParquetFile.metadata.num_rows``, **preallocates** the output arrays, and fills them with ``iter_batches`` (peak ≈ resident + one row group, ~1.5× measured) instead of materialising the whole table. Small Parquet keeps the simple bulk path.
- **CSV→Parquet converter.** ``convert_csv_to_parquet`` stream-converts a CSV to Parquet with bounded memory (per-batch ``ParquetWriter.write_batch``), dropping JavaScope's empty trailing column and preserving raw headers so the result round-trips through the same name parsing. Exposed as the scriptable ``convert(src, [dst])`` command and the ``uz-dataviewer convert`` CLI.
The data stays **in RAM**, so FFT / histogram / node transforms operate on the full record.
Lifting the in-RAM bound entirely (out-of-core, web's hard ~4 GB case) is deferred — see :ref:`uz_dataviewer_web_large_logs`.
The command layer
=================
``CommandRegistry`` holds ``Command``s (a name, typed ``Param``s, a handler, help).
The grammar is a single function call, ``name(arg, arg, ...)`` (``parse_call``).
- **Param kinds** drive both coercion and canonical formatting: ``int``, ``float``, ``bool``, ``str``, ``plot`` (``plot_1``↔index), ``run`` (``run_2`` or a file label), ``fwindow`` (``custom``/``full``/``plot_N`` for analysis windows).
- **Three entry points**:
- ``execute(state, name, args)`` — used by UI handlers; coerces, runs, **echoes** the canonical call.
- ``dispatch(state, text)`` — used by the console/scripts; parses then executes; errors go to the console.
- ``echo(state, name, values)`` — logs a command **without** running it. Used for continuous gestures (zoom, cursor/spy drag) that ImPlot has *already* applied — the echo settles once on mouse-up rather than firing every frame.
- **Run resolution by label** lets scripts say ``add_signal(plot_1, Log.csv, ia)`` and survive run-id reassignment after a reload.
Panels never mutate ``AppState`` for a user action directly; they call ``state.commands.execute(...)`` (often via a small ``self._emit`` wrapper that logs errors).
This invariant keeps everything scriptable.
.. _uz_dataviewer_generated_command_reference:
Generated command reference (``command_doc.py``)
-------------------------------------------------
The command table in :ref:`uz_dataviewer_usage` is **generated from the live registry**, not hand-maintained.
The commands are closures registered inside ``register_builtins`` and the ``_register_analysis_common`` / ``_register_node_commands`` factories — built from name f-strings (``fft_remove``, ``hist_remove`` are *one* closure parameterised by ``prefix``), so there is no module-level symbol for ``sphinx.ext.autodoc`` to document.
``command_doc.build_csv`` instead instantiates a ``CommandRegistry`` and walks ``reg.all()``, taking the name, ``cmd.help`` (description), ``cmd.params`` (rendered ``name:kind``) and the handler's source location (``handler.__code__.co_filename``/``co_firstlineno`` → ``commands.py:577``) — that source column is exactly what makes an otherwise-unfindable closure locatable.
- **Grouping lives here, not in the registry.** ``group_for`` maps each command to a section (``GROUP_ORDER``) by name prefix (``fft_*``/``hist_*``/``node_*``) or an explicit table for the rest, and **raises** for any unassigned command. (``help`` lists commands flat; the curated grouping is documentation-only.)
- **Committed, not built at docs time.** The output ``uz_dataviewer_commands.csv`` sits beside ``uz_dataviewer_usage.rst`` and is included via ``.. csv-table:: :file:``, so the Sphinx build stays hermetic. Regenerate with ``python -m uz_dataviewer.command_doc``.
- **Drift-guarded.** ``tests/test_command_doc.py`` re-renders in memory and fails if the committed CSV is stale or a new command has no group — so editing a command (or even shifting its line) without regenerating breaks CI.
Plots panel internals (``panels/plots.py``)
============================================
A runtime grid of ``SubplotCell``s.
Cell types: ``line`` / ``scatter`` / ``stairs`` (time series) and ``xy`` (one signal vs another).
FFT and Histogram are **separate windows**, not cell types.
Per time-series cell, each frame:
#. Compute the data extent and apply any **pending** axis limits (``fit_pending``, ``pending_x_lim``, ``pending_y_lim``) with ``Cond_.Always``.
#. **Linked X:** the hovered cell becomes the ``_driver`` and publishes its X range to ``state.shared_x``; followers lock to it. Every time-series cell also publishes its own range to ``state.plot_x_ranges[plot_n]`` (used by analysis windows' "follow plot_N").
#. **Downsample** each signal to ~``max_points`` over the visible window (``decimate_range``, fed by ``visible_slice``) — this keeps multi-GB logs interactive. **XY cells are the exception**: they decimate by plain uniform stride (``_xy_stride``), not a min/max envelope, so a busy Lissajous figure can alias (acceptable for the phase-portrait case).
#. Draw it (line/scatter/stairs), optionally with per-sample markers (``Spec.marker``), routing the left/right (secondary ``ImAxis_.y2``) axis per signal.
#. **Cursors** (two ``drag_line_x``) and **spy** (``drag_rect`` + a ``canvas_only`` inset) draw on top. Their readouts and rectangles are recomputed only on move.
#. **Zoom echo:** when a pan/zoom settles, ``echo set_x_lim(plot_n, ...)``.
.. note::
**Cursor performance.** Cursor readouts originally used ``np.interp``, which up-casts the entire ``float32`` signal to ``float64`` on every call (~150 ms/frame on a 5 M-point log).
It was replaced with an O(log n) ``searchsorted`` lookup (``_value_at``) that touches only the two bracketing samples.
Analysis windows (``panels/analysis.py``)
==========================================
FFT and Histogram share ``AnalysisPanel``.
Subclasses set a command ``prefix`` (``fft``/``hist``) and a ``compute_command``, then implement ``_compute``, ``_draw``, ``_options_key``, and any ``_extra_controls``.
The base provides:
- A **sources list** fed by drag-drop (``*_source`` command) and the legend "Remove" (``*_remove``).
- A **time-window selector** (``follow_combo``): ``Custom`` (``0``), ``Full`` (``-1``), or ``plot_N`` (``≥1``), resolved by ``AppState.resolve_x_window``. Switching to a plot computes immediately; switching to Custom/Full does **not** (the window can be the whole log).
- **On-demand compute only** — never per frame. A compute happens on the *Compute* button, on a drag-in, or once when an input changes (``compute_requested``). Following a plot and then panning it goes **stale** rather than recomputing.
- A **stale indicator**: a ``_computed_key`` snapshot (sources + window + options) is compared to the current key each frame; mismatch shows "⚠ stale - press Compute".
- **Every control routed through commands** (``fft_follow``, ``fft_remove_dc``, ``hist_bins``, ``*_xlim`` on zoom, ``*_export``, …) and **Export** (OS dialog on desktop, browser download on web).
.. _uz_dataviewer_node_graph:
Node graph (``nodes.py``, ``transforms.py``, ``panels/nodes.py``)
==================================================================
A small dataflow engine, built as a **derived-signal factory**: a transform node, when evaluated, materializes its result as a *new run* in the registry (``Run.derived=True``) via ``AppState.upsert_derived_run``.
That derived signal then flows through the existing app unchanged — it shows in Navigation and is draggable into plots / FFT / Histogram.
Nodes *produce* data; nothing else in the app needs to know they exist.
- **Graph** (``NodeGraph``) of ``Node``s: a **source** wraps a ``(run, signal)`` ref; a **transform** (``fft`` / ``math`` / ``filter`` / ``shift``) pulls arrays from its inputs and computes. Links are validated against cycles; ids are persisted so derived-run labels (= node name) are stable across save/restore.
- **On-demand evaluation** (``evaluate``): topological order; each transform reads its inputs (a source from the registry, an upstream transform from its cache), computes via ``transforms``/``analysis``, bumps a ``version``, and upserts its derived run **in place** (so plot references survive a re-eval). Per-node errors are logged and skipped, not fatal. A node is **stale** (``is_stale``) when its key — params + each input's version/identity — differs from the last evaluated key (the same idea as the analysis windows' ``_computed_key``).
- **GUI-free + scriptable**: ``nodes.py``/``transforms.py`` import no GUI and raise plain ``ValueError``s, so the whole engine is driven and tested from command strings. The canvas (``panels/nodes.py``, ``imgui_node_editor``) is a thin layer that **only issues ``node_*`` commands** — drag-drop → ``node_source``, link → ``node_link``, a widget → ``node_set``, a drag → ``node_pos``, the button → ``node_eval``. So the graph round-trips through the console, ``.uzscript``, and the JSON session.
- **Transforms are pure NumPy** (windowed-sinc FIR filter, not SciPy) so native and web share one code path. FFT reuses ``analysis.compute_fft``; its derived run's x-axis is frequency.
- **Registry-driven.** Every transform (builtin *or* plugin) is a ``TransformSpec`` in ``REGISTRY``; the engine looks up available kinds, default params, input arity and the compute function there, so a plugin is indistinguishable from a builtin. External plugins (``plugins.py``, the ``@transform`` decorator) are loaded at startup from ``$UZ_DATAVIEWER_PLUGINS`` / ``~/.uz_dataviewer/nodes/`` — both optional; a missing dir, no files, or a broken plugin are all tolerated (the app runs with just the builtins). An unknown kind (plugin not installed) restores as a placeholder that keeps its params/links. See :doc:`uz_dataviewer_plugins`.
.. note::
Intentional v1 limits: binary math needs equal-length inputs (no resampling); a source node pointing at *another node's* derived output is a runtime convenience that may not fully survive restore (chaining is meant to go through ``node_link``).
.. _uz_dataviewer_downsampling:
Downsampling (``downsample.py``)
=================================
Pure NumPy — **identical on native and web** (the Rust ``tsdownsample`` dependency was removed in favour of the pyramid).
Every path produces a **min/max envelope at the signals' true sample positions** (so it looks like the waveform, not a blocky bucket-centre grid) and a **consistent ~``max_points`` output at every zoom**.
Two layers:
- **One-shot envelope (``_envelope_oneshot``)** — splits the visible slice into ~``max_points/2`` near-equal buckets, takes each bucket's argmin/argmax, and interleaves them in order. Vectorised, **O(visible points)** — used for windows up to ``PYRAMID_MIN_POINTS`` (1 M), a couple of ms at that size.
- **Multi-resolution pyramid (``Pyramid``)** — for larger windows. Built **once per signal on first display** (lazily, cached on ``Signal.pyramid``): geometric levels storing the per-bucket argmin/argmax **sample indices** (``64, 512, 4096, …``), each aggregated from the previous, so the build is O(n). A query picks the coarsest level that still has ≥ the budget of buckets, then groups those down to the budget keeping each group's true extreme — **O(output)**.
.. note::
**Hitting the budget exactly.** Both layers group their candidate buckets down to *exactly* ``max_points/2`` via ``_reduce_to`` — it assigns the ``k`` candidates to ``max_points/2`` near-equal groups (``grp = arange(k)*target//k``) and keeps each group's true extreme with one ``lexsort``.
This replaced an integer **group factor** (``ceil(k/target)``), which could only halve/third/… the count and so realised as few as ~50 % of the budget at certain zooms (e.g. a 5 M-sample window landed on ~1 220 of 2 000 points).
The output is now a stable ~``max_points`` at every zoom and on every path.
``_reduce_to`` only ever runs on an already-bucketed array (≲ ``target × factor``), so the query stays O(output); the pyramid *build* is untouched (still a fixed-stride O(n) pass).
Storing *indices* (not values) lets the envelope use true X/Y read from the run's ``time``/``y`` arrays, so it is also normalization-agnostic.
Measured (single float32 signal, ``max_points=10 000``, one core; reproduce with a short ``Pyramid.build`` / ``decimate_range`` timing loop):
.. list-table::
:header-rows: 1
:widths: 24 19 19 19 19
* - Signal
- Build (once)
- Query/frame (full)
- Query/frame (zoomed)
- Pyramid memory
* - 10 M pts (40 MB)
- ~15 ms
- ~0.9 ms
- ~0.15 ms
- 1.4 MB (3.6 %)
* - 50 M pts (200 MB)
- ~75 ms
- ~0.7 ms
- ~0.2 ms
- 7.1 MB (3.6 %)
Re-scanning the full 50 M extent every frame (the one-shot envelope, the work the pyramid avoids) costs **~23 ms/frame** — the pyramid is ~30× cheaper per frame at the price of a one-time build and ~3.6 % memory.
The output stays ~``max_points`` whether viewing the whole record or a slice.
*(Absolute times are hardware-dependent; the structural point is the O(output) vs O(visible) gap.)*
The FFT/Histogram windows use the same decimation: a multi-million-point spectrum is range-decimated per frame (``AnalysisPanel._plot_decimated``) rather than drawn in full, and a histogram is binned once at compute (not re-binned every frame).
Two accepted trade-offs:
- The pyramid is built **lazily on first display**, on the render thread, so the first frame after dropping in a large signal stalls for the build (~75 ms at 50 M). It is a one-off per signal and dwarfed by the file load; building it on the loader thread alongside parsing is noted as future work.
- The renderer **re-decimates every frame with no result cache**. Because a query is O(output) this is cheap even on a static view, and it keeps the code stateless (true to immediate mode). Caching decimated arrays keyed on ``(visible limits, max_points)`` would trim idle CPU at the cost of a cache to invalidate; simplicity was chosen.
Sessions (``session.py``)
==========================
Two complementary formats:
- **JSON snapshot** (``save_state``/``load_state``) — the exact view (grid, assignments, plot types, cursors/spy, secondary axes, FFT/Histogram config, time origins, file list). Restore reloads files then resolves assignments **by run label** (not id).
- **``.uzscript``** (``export_script``/``run_script``) — a human-readable, editable list of commands that reproduces the session when replayed. Normalised loads round-trip as ``load(path, start)``.
CSV exports (``export_data``, ``export_fft``, ``export_histogram``) also live here.
.. _uz_dataviewer_arch_native_web:
Native vs web (``webbridge.py``)
==================================
The same Python runs on the desktop and in the browser via Pyodide (CPython→WASM); a single flag, ``webbridge.IS_WEB``, gates the handful of edges that differ.
The UI, command layer, plots, FFT/Histogram, nodes and downsampler are **identical** on both — the differences are file I/O, threading, the WASM memory ceiling, and plugin loading.
Design-relevant points: loads are async on native and synchronous on web (Pyodide has no worker threads); the renderer never branches on the target because the **decimator is pure NumPy** (no native dependency to wheel for the browser); and the web build cannot hold a multi-GB log, so a large CSV is **stream-parsed into typed arrays** and **decimated on a memory budget** at load rather than materialised whole.
The user-facing comparison is in :doc:`uz_dataviewer_native_vs_web`; the wasm32 ~4 GB ceiling and the deferred out-of-core fix are analysed in :ref:`uz_dataviewer_web_large_logs`.
Web edges (``webbridge.py`` / ``build/gen_web.py``):
- **File I/O.** Native uses ``portable_file_dialogs`` plus a GLFW drop callback; web uses a hidden HTML ```` to open, ``webbridge.download`` (a Blob + synthetic anchor click) to save, and a second hidden picker (``uzSessionInput``) for session uploads, routed by extension in ``webbridge.load_uploaded_session`` (``.json`` → ``load_state``, ``.uzscript``/``.txt`` → ``run_script``).
- **Loading.** ``wireFileInput`` (``build/gen_web.py``) dispatches by type: a CSV ≤ ``FULL_LOAD_LIMIT`` (200 MB) **and any Parquet** take ``loadFull`` → written to the Pyodide FS and parsed by Arrow, same code as native; a larger CSV takes ``loadStreamingCsv``.
- **Streaming CSV decimation (CSV only).** ``loadStreamingCsv`` reads the ``File`` blob in chunks (the multi-GB text never sits in WASM memory) and pushes per-column typed arrays into ``webbridge.load_columns``. It estimates the numeric footprint (``estRows × (8 + 4·nDataCols)``) from a header sample and, if that exceeds ``MEM_BUDGET`` (~1.5 GB, the working budget below the ~4 GB heap ceiling), keeps every ``stride``-th row so the result fits — emitting the "decimated 1:N … use the native app" console note. **This safety net is CSV-only:** a large **Parquet** has no equivalent and goes through ``loadFull`` unguarded, so it loads at full resolution if it fits the heap or, if it doesn't, the oversize allocation **silently aborts the whole WASM runtime** ("loading does nothing"). The size estimate exists precisely to avoid that abort for the CSV path.
- **Node plugins are native-only.** ``plugins.load_plugins`` (called once at startup in ``app.py``) scans filesystem/env plugin dirs (``$UZ_DATAVIEWER_PLUGINS``, ``~/.uz_dataviewer/nodes/``) that a browser tab cannot populate, so on web it finds nothing and the node graph runs the builtins in ``nodes.REGISTRY`` only. The ``load_plugins`` command still exists but has no dirs to read. See :doc:`uz_dataviewer_plugins`.
- **Version pins.** ``PYODIDE_VERSION`` and ``IMGUI_BUNDLE_WHEEL`` (``build/gen_web.py``) must be a compatible pair — the ``imgui_bundle`` wheel is built against one specific Pyodide release, so bump them together (see the `imgui_bundle Pyodide docs `_). A true single-file, no-server build would require a C++/emscripten ``SINGLE_FILE`` rebuild and would drop the Python data stack (:ref:`uz_dataviewer_build_web`).
Extending the app
=================
**Add a command** — register it in ``register_builtins`` (``commands.py``):
.. code-block:: python
def my_action(state, a):
plot_n, value = a
state.cells[plot_n - 1].something = value
reg.add("my_action", [Param("plot", "plot"), Param("value", "float")], my_action, "help text")
Then have the widget call ``state.commands.execute(state, "my_action", [plot_n, value])`` (via the panel's ``_emit``) on change.
Persist it in ``session.py`` if it should be saved.
**Add a time-series option** — add a field to ``SubplotCell``, a command to toggle it, a checkbox in ``PlotsPanel._cell_header`` that emits the command, the rendering in ``_render_time_series``, and the session round-trip.
**Add an analysis window** — subclass ``AnalysisPanel``, set ``prefix``/``compute_command``, implement ``_compute``/``_draw``/``_options_key``/``_extra_controls``; register the per-type commands (the common ones come from ``_register_analysis_common``); add a ``DockableWindow`` in ``app.py`` and a config dataclass + serialisation.
**Add a plot type** — extend ``PlotType``, branch in ``PlotsPanel._render_cell``, and update ``session.py`` if it carries extra per-cell state.
**Add a node transform** — register a ``TransformSpec`` in ``nodes.py`` (a compute function + default params + arity), or, for a user-supplied one, ship a plugin file with the ``@transform`` decorator (see :doc:`uz_dataviewer_plugins`).
Builtins keep bespoke widgets in ``NodesPanel._params``; plugins render from their ``ParamSpec`` list automatically.
Commands and save/restore are generic (``node_set`` carries arbitrary params; the graph stores only the kind name), so nothing else needs touching.
Testing
=======
Tests are pure ``pytest``, no real window.
Two patterns:
- **Logic via commands** — dispatch command strings against an ``AppState`` and assert on state (e.g. ``test_commands.py``, ``test_session.py``).
- **Headless rendering** — create an ImGui/ImPlot context, set a backend-textures flag, and drive a few frames calling ``panel.render(state)`` (e.g. ``test_plot_types.py``). This exercises the real ImPlot calls (begin/end pairing, Spec markers, drag tools) without a GPU window. Destroy the contexts in a ``finally`` so tests don't leak across modules.
Run ``pytest`` from the project root.
.. _uz_dataviewer_design_decisions:
Design decisions
================
.. list-table::
:header-rows: 1
:widths: 35 65
* - Decision
- Why
* - Single ``AppState``, panels as renderers
- One source of truth; trivial to serialise and test; no scattered widget state.
* - Command layer is the only mutation path
- Scriptability, console transcript, save/restore-by-replay, and headless testing all
derive from it.
* - Gestures **echo**, discrete actions **execute**
- Zoom/drag are continuous; echoing once on settle keeps the console readable while
staying replayable.
* - Min/max **pyramid** (pure NumPy) for decimation
- Per-frame cost becomes O(output), not O(visible points) (~0.7 ms vs ~23 ms/frame at
50 M points), so multi-GB logs pan at full fps — built once at first view (~75 ms/50 M),
~3.6 % memory. Replaced the Rust ``tsdownsample`` dep so native and web run the same
code. Cost: lazy build hitches the first frame; the renderer re-decimates every frame
uncached (both accepted — see :ref:`uz_dataviewer_downsampling`).
* - XY cells decimate by uniform **stride**, not min/max
- An XY/phase plot has no monotone time axis to bucket against; stride keeps it simple
but can alias a dense Lissajous figure (the one place range-aware downsampling does not
apply).
* - FFT/Histogram are windows, on-demand
- Per-frame transforms of a large window would tank the frame rate; explicit Compute
keeps it snappy.
* - Nodes are a **derived-signal factory** (not a new viewer)
- Transforms emit ordinary runs, so plots/FFT/Histogram consume them unchanged. The
engine is GUI-free and command-driven (scriptable and headless-testable); the
``imgui_node_editor`` canvas only issues ``node_*`` commands.
* - ``searchsorted`` not ``np.interp`` for cursors
- ``np.interp`` up-casts the whole array each call — catastrophic per frame.
* - Session refs by run **label**
- Run ids are reassigned on reload; labels survive.
* - Per-log time normalization keeps ``time_raw``
- Reversible, cheap (recompute on change only), and signals are never touched.
* - Web: stream-parse CSV → arrays
- Avoids holding multi-GB text in 32-bit WASM memory; loads full resolution.
* - Pending-flag pattern (``fit_pending``, ``compute_requested``, …)
- The immediate-mode way to express "do X once next frame".
Repository layout
=================
.. code-block:: text
uz_dataviewer/
├── run.py # run from a source checkout (also replays *.uzscript)
├── src/uz_dataviewer/ # modules — see the module map above
│ └── panels/ # navigation, plots, analysis base, fft, histogram, nodes
├── tests/ # pytest (logic via commands + headless rendering)
├── docs/ # USAGE / ARCHITECTURE / BUILD / NATIVE_VS_WEB / PLUGINS / LIBRARY / ROADMAP
└── build/ # native (PyInstaller) + web (Pyodide) build flow