JustDoIt — Rendering Architecture & Roadmap

“The best terminal rendering codebase on the planet.”

Not a goal. A standard.

The Premise

ASCII/ANSI is the hardest rendering medium that exists. Fixed-width monospace grid. 95 printable characters. No subpixel positioning. No blend modes. No transparency. Color only where the terminal cooperates.

If you can make text alive in that medium — truly alive, with sharp edges, real physics, generative fills, and 24fps animation — you can make it alive anywhere. The terminal is where ideas get proven. Everything else is just a richer substrate.

This document describes how JustDoIt becomes the definitive implementation of that proof.

The Architecture Stack

┌─────────────────────────────────────────────────────────────────────┐
│                          INPUT LAYER                                 │
│  plain text  │  PIL image  │  AI-generated image  │  live capture   │
└──────────────┴─────────────┴──────────────────────┴─────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────────┐
│                         RENDER PIPELINE                              │
│                                                                      │
│   PATH A: Glyph-Dict (fast, fill-composable)                        │
│     text → font lookup → glyph mask → fill_fn(mask, t) → rows       │
│                                                                      │
│   PATH B: Image-Sample (quality, color, any source)                 │
│     image → [composite effects] → image_to_ascii() → (char, rgb)    │
│                                                                      │
│   PATH C: Hybrid (best of both)                                     │
│     text → PIL render → effect image → composite → sample           │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────────┐
│                       SPATIAL EFFECTS LAYER                          │
│  warp  │  distort  │  perspective  │  rotate  │  fisheye  │  zoom   │
└─────────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────────┐
│                        ANIMATION LAYER                               │
│  frame loop  │  time parameter t  │  simulation state  │  easing    │
└─────────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────────┐
│                         OUTPUT LAYER                                 │
│  terminal/ANSI  │  SVG  │  PNG  │  APNG  │  asciinema  │  WebGL    │
└─────────────────────────────────────────────────────────────────────┘

Every layer is independently swappable. Every transition between layers is a well-defined data contract. Nothing is hardwired.

Data Contracts

These are the types that flow between layers. Get these right and the whole system is composable.

Glyph (Path A output)

Glyph = list[str]          # list of rows, same width per font
GlyphMask = list[list[float]]  # 0.0=empty, 1.0=ink, float for anti-alias

Cell (Path B output, universal intermediate)

from dataclasses import dataclass

@dataclass
class Cell:
    char: str                          # single ASCII character
    fg: tuple[int, int, int] | None    # RGB foreground, None = default
    bg: tuple[int, int, int] | None    # RGB background, None = transparent
    bold: bool = False
    dim: bool = False

Grid (universal intermediate — what everything converges to)

Grid = list[list[Cell]]   # Grid[row][col]

This is the key type. Everything upstream produces a Grid. Everything downstream consumes one. Both paths converge here. The Grid is the universal handoff point.

Frame (animation unit)

@dataclass
class Frame:
    grid: Grid
    t: float          # normalized time [0.0, 1.0]
    index: int        # absolute frame number
    duration_ms: int  # how long to display this frame

AnimationSpec

@dataclass
class AnimationSpec:
    fps: float = 24.0
    frame_count: int = 48
    loop: bool = True
    bounce: bool = False   # forward + reverse = natural loop
    easing: str = "linear" # linear | ease-in | ease-out | ease-in-out
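
The four easing names in AnimationSpec could map to simple curves over normalized t. This is a minimal sketch: the names come from the spec above, but the specific curves (quadratic ease-in/out, smoothstep for ease-in-out) and the helper names EASINGS and apply_easing are assumptions, not the project's actual choices.

```python
from typing import Callable, Dict


def linear(t: float) -> float:
    return t


def ease_in(t: float) -> float:
    # Quadratic: starts slow, ends fast (assumed curve).
    return t * t


def ease_out(t: float) -> float:
    # Mirror of ease_in: starts fast, ends slow.
    return 1.0 - (1.0 - t) * (1.0 - t)


def ease_in_out(t: float) -> float:
    # Smoothstep: zero slope at both ends (assumed curve).
    return t * t * (3.0 - 2.0 * t)


# Hypothetical lookup table keyed by the names used in AnimationSpec.easing.
EASINGS: Dict[str, Callable[[float], float]] = {
    "linear": linear,
    "ease-in": ease_in,
    "ease-out": ease_out,
    "ease-in-out": ease_in_out,
}


def apply_easing(name: str, t: float) -> float:
    """Clamp t to [0.0, 1.0] and remap it through the named curve."""
    t = min(1.0, max(0.0, t))
    return EASINGS[name](t)
```

Every curve maps 0.0 to 0.0 and 1.0 to 1.0, so easing changes the pacing of a loop without changing where it starts or ends.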

Resolution Model

Resolution is a first-class parameter, not an afterthought.

Display Targets

# Named presets (extend freely)
DISPLAYS = {
    "terminal":  DisplayTarget(cols=80,   rows=24,  note="classic terminal"),
    "wide":      DisplayTarget(cols=220,  rows=50,  note="modern widescreen terminal"),
    "fhd":       DisplayTarget(px_w=1920, px_h=1080, cell_w=8, cell_h=16),
    "qhd":       DisplayTarget(px_w=2560, px_h=1440, cell_w=8, cell_h=16),
    "4k":        DisplayTarget(px_w=3840, px_h=2160, cell_w=8, cell_h=16),
    "4k-hidpi":  DisplayTarget(px_w=3840, px_h=2160, cell_w=16, cell_h=32),  # 2× cell, same grid as FHD
    "5k":        DisplayTarget(px_w=5120, px_h=2880, cell_w=8, cell_h=16),
}

Resolution-Derived Values

Given a DisplayTarget, everything else is computable:

cols     = px_w // cell_w          # ASCII grid width
rows     = px_h // cell_h          # ASCII grid height
canvas   = (px_w, px_h)            # PIL image size for Path B
font_pt  = fit_to_fill(text, cols) # largest font that fills the width
cell_ar  = cell_w / cell_h         # aspect ratio (~0.5 for monospace)
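
The derivation above can be sketched as a small dataclass. The field names follow the DISPLAYS presets; the defaults (cell_w=8, cell_h=16), the grid() and cell_aspect() methods, and the parse_display() helper for the NAME|WxH form are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DisplayTarget:
    cols: Optional[int] = None
    rows: Optional[int] = None
    px_w: Optional[int] = None
    px_h: Optional[int] = None
    cell_w: int = 8    # assumed default cell size in pixels
    cell_h: int = 16
    note: str = ""

    def grid(self) -> Tuple[int, int]:
        """Return (cols, rows); explicit values win over pixel-derived ones."""
        if self.cols is not None and self.rows is not None:
            return self.cols, self.rows
        if self.px_w is None or self.px_h is None:
            raise ValueError("need either cols/rows or px_w/px_h")
        return self.px_w // self.cell_w, self.px_h // self.cell_h

    def cell_aspect(self) -> float:
        return self.cell_w / self.cell_h


def parse_display(value: str, presets: Dict[str, DisplayTarget]) -> DisplayTarget:
    """Resolve a preset name, or an explicit 'WxH' pixel size such as 3840x2160."""
    if value in presets:
        return presets[value]
    w, sep, h = value.lower().partition("x")
    if not sep or not (w.isdigit() and h.isdigit()):
        raise ValueError(f"unknown display: {value!r}")
    return DisplayTarget(px_w=int(w), px_h=int(h))
```

With these definitions the "4k" preset yields a 480×135 grid, and both "fhd" and "4k-hidpi" yield 240×67, which matches the comment on the 4k-hidpi preset.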

The cell_w / cell_h problem

Monospace cells are not square (~0.5 aspect ratio). This matters for:

Image sampling (a “square” region in pixel space is not square in cell space)
Effect generation (noise, plasma, etc. must compensate or circles become ovals)
SDF computation (Euclidean distance in cell space ≠ pixel space)

Rule: All effects that care about geometry receive (cell_w, cell_h) and compensate internally. Callers never need to think about this.
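
One way an effect could compensate internally is to measure distances in pixel space rather than cell space. A minimal sketch, assuming 8×16 cells; cell_distance and circle_mask are hypothetical helper names, not existing project functions.

```python
import math
from typing import List


def cell_distance(col_a: float, row_a: float, col_b: float, row_b: float,
                  cell_w: int = 8, cell_h: int = 16) -> float:
    """Euclidean distance between two cell positions, measured in pixels."""
    dx = (col_a - col_b) * cell_w
    dy = (row_a - row_b) * cell_h
    return math.hypot(dx, dy)


def circle_mask(cols: int, rows: int, radius_px: float,
                cell_w: int = 8, cell_h: int = 16) -> List[List[float]]:
    """Binary mask of a disc centered on the grid; round in pixels, not in cells."""
    cx, cy = (cols - 1) / 2, (rows - 1) / 2
    return [
        [1.0 if cell_distance(c, r, cx, cy, cell_w, cell_h) <= radius_px else 0.0
         for c in range(cols)]
        for r in range(rows)
    ]
```

With 8×16 cells, two columns cover the same pixel distance as one row, so a disc of radius 32 px on a 9×5 grid spans 9 cells horizontally but only 5 vertically.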

The Three Render Paths

Path A — Glyph-Dict (current, extend don’t replace)

When to use: Animated fills (flame, plasma, turing, RD, slime). Per-letter effects. Low-latency animation. Terminal output.

Pipeline:

text → font.lookup(char) → GlyphMask → fill_fn(mask, t, **params) → Glyph → Grid

Strengths:

O(letters × fill_cells) per frame — scales with text length, not display resolution
Fill effects have per-letter masks — can treat each letter independently
Zero PIL dependency for pure fills
Works in actual terminals (not just SVG/PNG)

Limitations:

Letter shape quality bounded by font_size in the rasterizer
No color sourced from the fill (fills produce char choices, color is separate)
No arbitrary image input

Extension points:

fill_fn signature: (mask: GlyphMask, t: float, **params) -> Glyph — add t everywhere
Anti-aliased masks: float values in GlyphMask (currently binary 0/1) enable sub-cell effects
Per-fill color: fill_fn can return list[list[Cell]] instead of list[str] for color-aware fills

Path B — Image-Sampler (new, G09)

When to use: 4K gallery stills. AI image → ASCII. Photo → ASCII. Maximum quality text. Color-accurate output.

Pipeline:

PIL.Image → image_to_ascii(cell_w, cell_h) → Grid

Internally:

for each cell (row, col):
    pixel_block = image[row*cell_h:(row+1)*cell_h, col*cell_w:(col+1)*cell_w]
    zone_vec    = compute_6zone_coverage(pixel_block.to_grayscale())
    char        = nearest_char(zone_vec, char_db)
    fg_rgb      = mean_rgb(pixel_block)
    grid[row][col] = Cell(char, fg=fg_rgb)
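
The two inner steps, zone coverage and nearest-character lookup, can be sketched in pure Python. This assumes the six zones are a 2-column by 3-row split of the cell and that matching uses squared L2 distance; the function bodies, the Block type, and the toy char_db below are illustrative, not the contents of core/char_db.py.

```python
from typing import Dict, Sequence, Tuple

# Grayscale pixels in [0.0, 1.0], indexed block[y][x]; must be at least 2x3 pixels.
Block = Sequence[Sequence[float]]


def zone_coverage(block: Block, zone_cols: int = 2, zone_rows: int = 3) -> Tuple[float, ...]:
    """Mean brightness per zone in row-major order; 2x3 zones give a 6D vector."""
    h, w = len(block), len(block[0])
    vec = []
    for zr in range(zone_rows):
        y0, y1 = zr * h // zone_rows, (zr + 1) * h // zone_rows
        for zc in range(zone_cols):
            x0, x1 = zc * w // zone_cols, (zc + 1) * w // zone_cols
            pixels = [block[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            vec.append(sum(pixels) / len(pixels))
    return tuple(vec)


def nearest_char(vec: Sequence[float], char_db: Dict[str, Sequence[float]]) -> str:
    """Return the character whose stored vector is closest in squared L2 distance."""
    return min(
        char_db,
        key=lambda ch: sum((a - b) ** 2 for a, b in zip(vec, char_db[ch])),
    )


# Hand-made toy database (NOT measured from a real font).
TOY_DB = {
    " ": (0, 0, 0, 0, 0, 0),
    "-": (0, 0, 1, 1, 0, 0),
    "_": (0, 0, 0, 0, 1, 1),
    "#": (1, 1, 1, 1, 1, 1),
}
```

A block whose bottom third is lit produces the vector (0, 0, 0, 0, 1, 1) and resolves to "_" rather than a mid-density character, which is the difference between shape matching and plain brightness mapping.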

Strengths:

Harri-quality edge following (chars follow contours, not just brightness)
True RGB color per cell
Source-agnostic: text, AI image, photo, video frame — same code
PIL kerning, anti-aliasing, subpixel rendering included for free (text source)

Limitations:

DB build cost (one-time, cached)
Animation: DB lookup × cells × frames (needs numpy at 4K)
No per-letter fill effects (no mask isolation)

Performance envelope:

Display	Cells	numpy time/frame	pure-python time/frame
fhd	~16k	~8ms	~200ms
4k	~65k	~30ms	~800ms
4k anim	65k×24fps	~720ms/s	not viable

4K animation: pre-render to Grid list, write APNG. Don’t attempt real-time.

Path C — Hybrid (planned, unlocks everything)

When to use: Best quality + generative effects at any resolution. The future default.

Pipeline:

text
  → PIL render at canvas resolution (Path B quality)
  → composite effect image (flame/plasma/turing as PIL layer, aligned to mask)
  → image_to_ascii()
  → Grid (char from luminance, color from effect layer)

Why this is better than either A or B alone:

Letter shapes are PIL-quality (not density-mapped glyph dict)
Effect fills are visible as color in the output (flame = red/orange cells inside sharp letter edges)
Same effect code can run at any resolution — just generate the effect at canvas size
Effect simulation runs at a lower grid resolution for performance, upsampled to canvas before composite

The compositing model:

def composite_effect(
    text_image: PIL.Image,    # white text on black
    effect_image: PIL.Image,  # colorful effect at same size
    mode: str = "mask",       # "mask" | "multiply" | "screen" | "overlay"
) -> PIL.Image:
    """
    mode="mask":     effect visible only where text_image is ink (sharp edges)
    mode="multiply": effect dims with text coverage (glow bleeds slightly)
    mode="screen":   effect brightens on ink (neon glow look)
    mode="overlay":  photoshop-style — high contrast version of multiply
    """

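The four modes can be illustrated per pixel without PIL. This is a sketch under assumptions: "mask" is treated as a hard threshold at 128, and multiply, screen, and overlay use the standard 8-bit blend formulas with the text image as the base layer. composite_pixel is a hypothetical helper, not the body of composite_effect.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _blend_channel(text: int, effect: int, mode: str) -> int:
    """Blend one 0-255 channel of the text image with the effect image."""
    if mode == "mask":
        # Assumed: hard threshold, so letter edges stay sharp.
        return effect if text >= 128 else 0
    if mode == "multiply":
        return text * effect // 255
    if mode == "screen":
        return 255 - (255 - text) * (255 - effect) // 255
    if mode == "overlay":
        # Multiply in the dark half of the base, screen in the bright half.
        if text < 128:
            return 2 * text * effect // 255
        return 255 - 2 * (255 - text) * (255 - effect) // 255
    raise ValueError(f"unknown composite mode: {mode!r}")


def composite_pixel(text_px: RGB, effect_px: RGB, mode: str = "mask") -> RGB:
    """Composite a single pixel; text_px is white ink on black."""
    return (
        _blend_channel(text_px[0], effect_px[0], mode),
        _blend_channel(text_px[1], effect_px[1], mode),
        _blend_channel(text_px[2], effect_px[2], mode),
    )
```

Note that with these standard formulas a fully white ink pixel stays white under "screen" and "overlay", so only "mask" and "multiply" carry the effect color inside solid letters; the anti-aliased edge pixels are where the modes differ most visibly.
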
Effect Taxonomy

Effects fall into two families. Both should work on both paths.

Family 1: Field Effects (value at every point in space and time)

These generate a value f(x, y, t) for every position, so they work naturally as an image layer.

Effect	Generator	Notes
Noise	Perlin/Simplex	scale, octaves, seed
Plasma	Sum of sine waves	frequency, phase, palette
Flame	Upward noise convection	turbulence, cooling
Voronoi	Nearest-cell distance	n_cells, metric, preset
Wave	Directional sine	frequency, amplitude, direction
Fractal	Mandelbrot/Julia escape	center, zoom, max_iter
Gradient	Linear/radial/conic	stops, angle

Interface:

class FieldEffect:
    def sample(self, x: float, y: float, t: float) -> float:
        """Return value in [0.0, 1.0] at normalized position and time."""

    def to_image(self, w: int, h: int, t: float, palette: Palette) -> PIL.Image:
        """Rasterize to PIL image for compositing."""

    def to_mask_fill(self, mask: GlyphMask, t: float) -> Glyph:
        """Fast path for Path A — sample at cell centers."""

Same effect, three consumers: image composite (Path C), direct fill (Path A), future GPU shader.
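
As an illustration of the interface, here is a minimal plasma-style field with sample() and the Path A fast path. It is a sketch, not the project's plasma fill: the class name PlasmaField, the sine mix, the frequency parameter, and the RAMP density string are all assumptions, and to_image() is omitted to keep the example free of PIL.

```python
import math
from typing import List

GlyphMask = List[List[float]]
Glyph = List[str]

RAMP = ".:-=+*#%@"  # hypothetical density ramp, sparse to dense


class PlasmaField:
    def __init__(self, frequency: float = 3.0) -> None:
        self.frequency = frequency

    def sample(self, x: float, y: float, t: float) -> float:
        """Value in [0.0, 1.0] at normalized position (x, y) and time t."""
        phase = 2.0 * math.pi * t  # one full turn over t in [0, 1], so the loop closes
        f = self.frequency
        v = (math.sin(2.0 * math.pi * f * x + phase)
             + math.sin(2.0 * math.pi * f * y - phase)
             + math.sin(math.pi * f * (x + y) + phase))
        return (v / 3.0 + 1.0) / 2.0

    def to_mask_fill(self, mask: GlyphMask, t: float) -> Glyph:
        """Sample at cell centers; cells with no ink stay blank."""
        out = []
        for r, row in enumerate(mask):
            chars = []
            for c, ink in enumerate(row):
                if ink <= 0.0:
                    chars.append(" ")
                    continue
                v = self.sample((c + 0.5) / len(row), (r + 0.5) / len(mask), t)
                chars.append(RAMP[min(len(RAMP) - 1, int(v * len(RAMP)))])
            out.append("".join(chars))
        return out
```

Because t only enters through a phase of 2πt, the field at t=1.0 matches the field at t=0.0 up to floating-point error, which is the property a looping animation needs.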

Family 2: Simulation Effects (stateful, evolve over time)

These maintain state between frames. Can’t be computed independently per-frame.

Effect	State	Notes
Reaction-Diffusion	U, V grids	Gray-Scott equations
Turing	Activator/inhibitor grids	FitzHugh-Nagumo (FHN) model
Slime Mold	Agent positions	Physarum simulation
Strange Attractor	(x, y, z) trajectories	Lorenz, Rössler
L-System	Grammar state	rules, axiom, generations
Conway/CA	Cell grid	rule variants

Interface:

class SimulationEffect:
    def __init__(self, w: int, h: int, **params): ...
    def step(self, dt: float = 1.0) -> None:
        """Advance simulation by dt. Called once per frame."""
    def to_image(self, palette: Palette) -> PIL.Image:
        """Render current state to PIL image."""
    def to_mask_fill(self, mask: GlyphMask) -> Glyph:
        """Fast path for Path A."""
    def reset(self) -> None: ...

Critical: simulation effects run at a logical grid that is independent of output resolution. A Turing simulation might run on a 120×34 grid regardless of whether the output is FHD or 4K. The to_image() method upsamples to whatever canvas size is needed.
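
The stateful pattern and the resolution independence can be shown with the simplest entry in the table, a Conway cellular automaton. This is a sketch under assumptions: LifeSim, its wrap-around edges, and the to_rows() nearest-neighbour upsampler are illustrative stand-ins for the to_image() and to_mask_fill() methods, and dt is accepted but ignored.

```python
from typing import Dict, Iterable, List, Set, Tuple

Point = Tuple[int, int]


class LifeSim:
    """Conway's Game of Life on a fixed logical grid with wrap-around edges."""

    def __init__(self, w: int, h: int, alive: Iterable[Point] = ()) -> None:
        self.w, self.h = w, h
        self._initial: Set[Point] = set(alive)
        self.cells: Set[Point] = set()
        self.reset()

    def reset(self) -> None:
        self.cells = set(self._initial)

    def step(self, dt: float = 1.0) -> None:
        """Advance one generation (dt is unused in this discrete model)."""
        counts: Dict[Point, int] = {}
        for x, y in self.cells:
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx or dy:
                        key = ((x + dx) % self.w, (y + dy) % self.h)
                        counts[key] = counts.get(key, 0) + 1
        self.cells = {
            p for p, n in counts.items()
            if n == 3 or (n == 2 and p in self.cells)
        }

    def to_rows(self, out_w: int, out_h: int) -> List[str]:
        """Nearest-neighbour upsample of the logical grid to out_w x out_h chars."""
        rows = []
        for oy in range(out_h):
            sy = oy * self.h // out_h
            rows.append("".join(
                "#" if (ox * self.w // out_w, sy) in self.cells else " "
                for ox in range(out_w)
            ))
        return rows
```

The simulation state never changes size; only to_rows() knows about the output dimensions, so the same 5×5 run can be rendered at 10×10 or any other size.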

Animation Model

The time parameter `t`

Every effect and every fill function receives t: float — normalized time in [0.0, 1.0] for a looping animation, or absolute seconds for non-looping. This is the single threading parameter that makes everything animatable.

# Frame loop (simplified)
frames = []
for frame_idx in range(spec.frame_count):
    t = frame_idx / spec.frame_count  # [0.0, 1.0)

    # Path A
    grid = render_glyph(text, font, fill_fn=flame_fill, t=t)

    # Path B / C (alternative to Path A; a real render picks one pipeline)
    effect_img = flame_effect.to_image(canvas_w, canvas_h, t, palette)
    composited = composite_effect(text_img, effect_img, mode="mask")
    grid = image_to_ascii(composited, cell_w, cell_h)

    # spec.fps is a float, so round to get the integer duration_ms expects
    frames.append(Frame(grid, t=t, index=frame_idx, duration_ms=round(1000 / spec.fps)))

Bounce / pingpong

# Bounce: t goes 0→1→0, giving a seamless loop with no jump cut
if spec.bounce:
    cycle = frame_idx / (spec.frame_count / 2)
    t = cycle if cycle <= 1.0 else 2.0 - cycle
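
As a worked example, the same arithmetic can be wrapped in a function and checked against the default 48-frame spec. frame_time is a hypothetical helper name; the formula is the one shown above.

```python
def frame_time(frame_idx: int, frame_count: int, bounce: bool = False) -> float:
    """Normalized t for a frame; with bounce, t ramps 0 -> 1 -> 0 over the loop."""
    if not bounce:
        return frame_idx / frame_count          # [0.0, 1.0)
    cycle = frame_idx / (frame_count / 2)       # [0.0, 2.0)
    return cycle if cycle <= 1.0 else 2.0 - cycle
```

With frame_count=48, frame 0 gives t=0.0, frame 24 gives the peak t=1.0, and frame 47 gives t≈0.0417, one step short of 0.0, so wrapping back to frame 0 continues the motion without repeating a frame.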

Simulation effects and frame count

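Simulation effects are driven by step count, not by normalized t. The animation system calls effect.step() once per frame by default, then captures effect.to_image(), so a default render advances the simulation frame_count steps. The --sim-steps flag sets the number of steps per frame, and --sim-warmup runs extra steps before the first frame.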

CLI Design — Resolution-Aware and Maximally Flexible

Core flags (stable, don’t break)

python justdoit.py TEXT [options]

--font NAME          font name (block|slim|figlet/*|ttf:path)
--color NAME         ANSI color or rainbow
--fill NAME          fill effect name
--gap N              char gap between letters

Resolution flags (new)

--display NAME|WxH   named preset or explicit e.g. 3840x2160
--cell-w N           cell width in px (default: auto from display)
--cell-h N           cell height in px (default: auto from display)
--cols N             explicit grid width override
--rows N             explicit grid height override

Pipeline flags (new)

--pipeline glyph|image|hybrid|auto
    glyph:  Path A — fast, fill effects, terminal-safe
    image:  Path B — quality, color, any image source
    hybrid: Path C — best quality + generative effects
    auto:   pick based on other flags (default)

--source text|file:PATH|ai:PROMPT
    text:   render TEXT as ASCII art (default)
    file:   convert image file to ASCII
    ai:     generate image via AI then convert (future)

Effect flags (new — generalize existing)

--fill NAME           fill effect (existing, extended)
--fill-param K=V      arbitrary effect parameter (repeatable)
    e.g. --fill plasma --fill-param preset=tight --fill-param t_offset=0.5

--composite-mode mask|multiply|screen|overlay
    how effect layer composites with text in hybrid mode

--palette NAME|HEX,HEX,...
    color palette for effects that use one
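
A repeatable K=V flag maps naturally onto argparse's "append" action. The sketch below covers only TEXT, --fill, and --fill-param; build_parser and parse_fill_params are hypothetical names, and the int/float coercion of values is an assumption about how parameters might be typed.

```python
import argparse
from typing import Dict, List, Optional, Union

ParamValue = Union[int, float, str]


def _coerce(raw: str) -> ParamValue:
    """Try int, then float, else keep the string (assumed typing policy)."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_fill_params(pairs: Optional[List[str]]) -> Dict[str, ParamValue]:
    """Turn ['preset=tight', 't_offset=0.5'] into a keyword dict for fill_fn."""
    params: Dict[str, ParamValue] = {}
    for pair in pairs or []:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected K=V, got {pair!r}")
        params[key] = _coerce(raw)
    return params


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="justdoit.py")
    parser.add_argument("text")
    parser.add_argument("--fill")
    # action="append" collects every occurrence of the flag into a list.
    parser.add_argument("--fill-param", action="append", default=[], metavar="K=V")
    return parser
```

The resulting dict can be passed straight through as **params, which keeps the CLI open to parameters that individual fills define on their own.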

Animation flags (new)

--animate             enable animation output
--fps N               frames per second (default: 24)
--frames N            total frame count (default: 48)
--duration MS         total duration in ms (overrides --frames if set)
--loop                loop the animation (default: true)
--bounce              pingpong loop (forward + reverse)
--easing linear|ease-in|ease-out|ease-in-out

--sim-steps N         simulation steps per frame for sim effects (default: 1)
--sim-warmup N        warmup steps before first frame (default: 0)

Output flags (extend existing)

--output FILE         output file (.svg|.png|.apng|.cast|.txt)
--format auto|svg|png|apng|ansi|cast
    auto: infer from --output extension
--profile standard|wide|4k   gallery profile (existing)
--bg-color HEX        background color for image output
--font-size N         override computed font size for SVG/PNG output

Auto-resolution logic

--display 4k --pipeline auto --animate
  → pipeline = hybrid (animate + display > fhd → hybrid preferred)
  → canvas = 3840×2160
  → cell_w=8, cell_h=16 → grid = 480×135
  → font_pt = fit_to_fill(text, 480 cols)
  → output format: .apng (animate implies apng unless --format specified)

--display terminal --fill flame
  → pipeline = glyph (terminal + fill → fast path)
  → cols = terminal_size().cols
  → no PIL dependency, runs in actual terminal

`--pipeline auto` decision table

Output Formats

Format	Description	Color	Animation	Terminal-safe
ansi	ANSI escape codes to stdout	24-bit	no	yes
txt	plain text, no color	no	no	yes
svg	vector, per-char `<text>` elements	full RGB	no (multi-file)	no
png	rasterized via PIL	full RGB	no	no
apng	animated PNG, frame sequence	full RGB	yes	no
cast	asciinema v2 format	ANSI	yes	no
html	`<pre>` with `<span style="">`	full RGB	CSS animation	no

All output formats consume list[Frame] (or Frame for stills). The output layer is a pure function: frames → bytes.
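
For the ansi format, a writer along these lines would turn a Grid into 24-bit SGR escape sequences. It is a sketch rather than the project's writer: grid_to_ansi and cell_sgr are hypothetical names, Cell is redeclared here with defaults so the example is self-contained, and the output resets after every styled cell instead of coalescing runs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


@dataclass
class Cell:
    char: str
    fg: Optional[RGB] = None
    bg: Optional[RGB] = None
    bold: bool = False
    dim: bool = False


RESET = "\x1b[0m"


def cell_sgr(cell: Cell) -> str:
    """Build the SGR prefix for one cell; empty string if it has no styling."""
    codes = []
    if cell.bold:
        codes.append("1")
    if cell.dim:
        codes.append("2")
    if cell.fg is not None:
        codes.append("38;2;{};{};{}".format(*cell.fg))   # 24-bit foreground
    if cell.bg is not None:
        codes.append("48;2;{};{};{}".format(*cell.bg))   # 24-bit background
    if not codes:
        return ""
    return "\x1b[" + ";".join(codes) + "m"


def grid_to_ansi(grid: List[List[Cell]]) -> str:
    """Serialize a Grid to text, one line per row; call .encode() for bytes."""
    lines = []
    for row in grid:
        parts = []
        for cell in row:
            sgr = cell_sgr(cell)
            parts.append(sgr + cell.char + RESET if sgr else cell.char)
        lines.append("".join(parts))
    return "\n".join(lines) + "\n"
```

The function reads only its argument and returns a value, which is what lets the output layer stay a pure function that is easy to test without a terminal.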

Performance Strategy

Do not over-engineer early. Profile first.

The bottleneck hierarchy (measured, not assumed):

Simulation step — RD/Turing at large grids. Solution: run at 1/4 res, upsample.
DB nearest-neighbor search — 65k cells × 95 chars × 6D L2. Solution: numpy vectorized matmul, or precompute a quantized lookup table.
PIL render — text drawing. Solution: cache the text image; only re-render when text changes.
File I/O — APNG write. Solution: stream frames, don’t hold all in RAM.

Numpy is optional but practically required above FHD animation

Make it optional with a pure-Python fallback. Raise a warning (not an error) when numpy is missing and the user requests 4K animation. The pure-Python path exists for correctness testing and small renders.
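
The optional-dependency pattern could look like the following. This is a sketch: warn_if_slow, squared_distances, and the 16,000-cell threshold (roughly the FHD grid) are assumptions chosen for illustration.

```python
import warnings
from typing import List, Sequence

try:
    import numpy as np
except ImportError:  # numpy is optional; fall back to pure Python
    np = None


def warn_if_slow(cells: int, animate: bool, threshold: int = 16_000) -> bool:
    """Warn (never raise) when a large animated render will use the slow path."""
    if np is None and animate and cells > threshold:
        warnings.warn(
            f"numpy not installed: animating {cells} cells will be slow",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False


def squared_distances(vec: Sequence[float],
                      db_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Squared L2 distance from vec to every DB row, vectorized when possible."""
    if np is not None:
        diff = np.asarray(db_vectors, dtype=float) - np.asarray(vec, dtype=float)
        return (diff * diff).sum(axis=1).tolist()
    return [sum((a - b) ** 2 for a, b in zip(vec, row)) for row in db_vectors]
```

Both branches return the same values for the same input, so the pure-Python path can serve as the reference in correctness tests while numpy handles large renders.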

Caching layer

# Cache keys:
# - char_db: (charset, cell_w, cell_h)
# - text_image: (text, font_path, font_pt, canvas_size, fg, bg)
# - effect_image: for field effects — (effect_name, w, h, t, params_hash)

# Cache policy: LRU, max_size configurable, default 64 entries
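
For hashable keys like these, the standard library's functools.lru_cache already implements the policy. A minimal sketch, with build_char_db as a stand-in that returns placeholder vectors and params_key as a hypothetical helper for the params_hash component:

```python
from functools import lru_cache
from typing import Dict, Tuple


@lru_cache(maxsize=64)
def build_char_db(charset: str, cell_w: int, cell_h: int) -> Tuple[Tuple[str, Tuple[float, ...]], ...]:
    """Stand-in for the expensive DB build; keyed on (charset, cell_w, cell_h)."""
    # Placeholder vectors only: a real build would rasterize each character.
    return tuple((ch, (0.0,) * 6) for ch in charset)


def params_key(params: Dict[str, object]) -> Tuple[Tuple[str, object], ...]:
    """Order-independent hashable key for an effect's parameter dict."""
    return tuple(sorted(params.items()))
```

lru_cache fixes maxsize at decoration time, so a configurable max_size would need the decorator applied at startup (for example, build = lru_cache(maxsize=n)(build_uncached)) or a small custom cache.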

Extension Points (not optional — design for these from day one)

Custom fills

from justdoit.effects import register_fill

@register_fill("myeffect")
def my_effect(mask: GlyphMask, t: float, **params) -> Glyph:
    ...
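
Behind that decorator, a registry can be as small as a dict. The sketch below shows one possible shape; the _FILLS table, get_fill, the duplicate-name check, and the "solid" example fill are assumptions, not the contents of justdoit.effects.

```python
from typing import Callable, Dict, List

GlyphMask = List[List[float]]
Glyph = List[str]
FillFn = Callable[..., Glyph]

_FILLS: Dict[str, FillFn] = {}  # hypothetical module-level registry


def register_fill(name: str) -> Callable[[FillFn], FillFn]:
    """Decorator factory: store the function under `name` and return it unchanged."""
    def decorator(fn: FillFn) -> FillFn:
        if name in _FILLS:
            raise ValueError(f"fill {name!r} is already registered")
        _FILLS[name] = fn
        return fn
    return decorator


def get_fill(name: str) -> FillFn:
    try:
        return _FILLS[name]
    except KeyError:
        raise ValueError(f"unknown fill: {name!r}") from None


@register_fill("solid")
def solid_fill(mask: GlyphMask, t: float, **params) -> Glyph:
    """Trivial example fill: one character wherever the mask has ink."""
    ch = params.get("char", "#")
    return ["".join(ch if v > 0.0 else " " for v in row) for row in mask]
```

Because the decorator returns the original function, a registered fill stays directly callable and testable, and --fill NAME only needs a get_fill(name) lookup.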

Custom fonts

from justdoit.fonts import register_font

register_font("myfont", my_glyph_dict)
# or
register_font_from_ttf("myfont", "/path/to/font.ttf", size=16)
# or
register_font_from_image("myfont", "/path/to/sprite.png", char_w=8, char_h=16)

Custom output targets

from justdoit.output import register_output

@register_output("myformat", extensions=[".xyz"])
def write_myformat(frames: list[Frame], path: str, **kwargs) -> None:
    ...

Custom compositors

from justdoit.composite import register_compositor

@register_compositor("dissolve")
def dissolve(base: PIL.Image, effect: PIL.Image, t: float) -> PIL.Image:
    ...

The Path to “Best on the Planet”

What “best” actually means here

Not most features. Not most effects. Best means:

Correctness — characters follow glyph contours. Edges are sharp. Colors are accurate.
Composability — any fill, any font, any source, any output. No hidden incompatibilities.
Performance — fast enough to be used. Documented limits. Graceful degradation.
Extensibility — adding a new fill, font, or output is one file and one registration call.
Portability — zero hard dependencies for core functionality. PIL optional. numpy optional. GPU optional. If you have them, use them. If you don’t, still works.

Near-term (what Claude is building now)

✅ core/char_db.py — 6D zone shape DB
✅ core/image_sampler.py — image → ASCII grid
✅ core/image_pipeline.py — text/image → Grid entry points
4K gallery using image pipeline

Medium-term

Cell dataclass as universal intermediate type
t: float parameter threaded through all fill functions
FieldEffect base class + to_image() method on all field effects
--display / --pipeline / --animate CLI flags
Hybrid path (Path C) — text image + effect composite

Long-term

SimulationEffect base class — stateful, resolution-independent
Frame + AnimationSpec types
HTML output format
WebGL shader export (the pipeline is the product; the substrate is a parameter)
AI image source (--source ai:PROMPT)

Invariants — Never Break These

uv run pytest is always green. Every PR. No exceptions.
Core is zero-dependency. from justdoit import render must work with only stdlib.
Both paths co-exist. Never remove the glyph-dict path. It has real advantages.
All glyphs in a font have the same height. The rasterizer zips rows.
t: float is normalized [0.0, 1.0] for looping effects. Never assume absolute time.
The Grid is the handoff type. Everything produces Grids. Everything consumes Grids.
CLI flags are additive. New flags never change the behavior of old flags.
Patent flag protocol. Anything with no known prior art: stop, flag, don’t push. (See CLAUDE.md)

Last updated: 2026-04-24
Authors: Jonny Galloway, NumberOne
