End-to-end LLM creation studio

Train, refine, align, and test small language models from one focused desktop app.

Haiku Studio wraps our h2 Python training engine in a local Electron interface for project management, tokenizer building, pretraining, SFT, DPO, checkpoint testing, and kernel-level runtime visibility.

Hardware Detection | Tokenizer Training | Pretraining | Fine-Tuning | Direct Preference Optimization | Chat Lab
Haiku Studio
Active Project: haiku_studio
Tokenizer: 50k BPE
Checkpoint: DPO Ready
Device: CUDA / CPU
Pretraining run: projects/haiku_studio/checkpoints
System Kernel Output: Kernel Ready
0001  LOG  [project] Loaded tokenizer into data/tokenizer.json
0002  LOG  [train] eval_loss=2.184 checkpoint saved
0003  WARN [tokenizer] RAM guard enabled
Capabilities

A complete local workflow without burying the engine.

The desktop app stays close to the Python training scripts while making the project lifecycle visible and manageable.

01

Project workspaces

Create isolated project folders for tokenizers, configs, checkpoints, logs, caches, and datasets.
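
As a rough illustration of what an isolated project folder could look like on disk, here is a minimal sketch. The subfolder names come from the list above, but the function name `create_project` and the exact layout are assumptions, not the app's actual implementation.

```python
from pathlib import Path

# Hypothetical subfolder names, taken from the description above; the real layout may differ.
PROJECT_SUBDIRS = ("tokenizer", "config", "checkpoints", "logs", "cache", "datasets")


def create_project(root: Path, name: str) -> Path:
    """Create <root>/projects/<name>/ with one subfolder per artifact type."""
    project_dir = root / "projects" / name
    for sub in PROJECT_SUBDIRS:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    return project_dir
```

Calling `create_project(Path("."), "haiku_studio")` would produce `projects/haiku_studio/checkpoints`, `projects/haiku_studio/logs`, and so on.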

02

Tokenizer builder

Build a BPE tokenizer from one file or an entire corpus folder with RAM-aware input sampling.
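
One simple way to keep tokenizer training within a memory budget is to sample corpus files until a byte limit is reached. The sketch below shows that idea only; the function name `sample_corpus_files`, the `.txt` filter, and the budget parameter are assumptions, and the app's own sampling strategy may work differently.

```python
import random
from pathlib import Path


def sample_corpus_files(folder: Path, max_bytes: int, seed: int = 0) -> list[Path]:
    """Pick a random subset of .txt files whose total size stays within max_bytes."""
    files = sorted(p for p in folder.rglob("*.txt") if p.is_file())
    # Shuffle with a fixed seed so the same corpus yields the same sample.
    random.Random(seed).shuffle(files)
    chosen: list[Path] = []
    total = 0
    for path in files:
        size = path.stat().st_size
        if total + size > max_bytes:
            continue  # skip any file that would push the sample over budget
        chosen.append(path)
        total += size
    return chosen
```

The selected paths can then be handed to whichever BPE trainer the project uses.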

03

Pretraining

Launch pretraining runs, monitor metrics, save checkpoints, and resume from project-scoped artifacts.
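
Resuming usually starts with locating the most recent checkpoint in the project's checkpoint folder. The following is a hedged sketch of that lookup: the `.pt` extension matches the sample log output on this page, while the helper name `latest_checkpoint` and the use of file modification time are assumptions.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoint_dir: Path) -> Optional[Path]:
    """Return the most recently modified .pt file, or None if there is none."""
    candidates = [p for p in checkpoint_dir.glob("*.pt") if p.is_file()]
    if not candidates:
        return None
    # Assumption: the newest file on disk is the one to resume from.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```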

04

SFT

Fine-tune on user/bot dialogue data with completion-only masking and project-local outputs.
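
Completion-only masking means the loss is computed on the bot's reply and not on the user's prompt. A minimal sketch of how the labels might be built is shown below; `build_labels` is a hypothetical helper, and the value -100 is used because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_labels(prompt_ids: list[int], completion_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and completion tokens; mask the prompt in the labels."""
    input_ids = list(prompt_ids) + list(completion_ids)
    # Prompt positions are ignored by the loss; completion positions keep their token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels
```

For a three-token prompt and a two-token reply, only the last two label positions contribute to the loss.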

05

DPO alignment

Train from prompt/chosen/rejected preference pairs using a frozen reference checkpoint.
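
For readers unfamiliar with DPO, the per-pair loss compares how much the policy and the frozen reference model each prefer the chosen response over the rejected one. The sketch below computes the standard DPO loss for a single pair from summed log-probabilities; the function name, argument names, and the default `beta=0.1` are illustrative assumptions, not the engine's actual settings.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; the loss falls as the policy shifts probability toward the chosen response.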

06

Kernel diagnostics

Stream process output into the app with warning and error highlighting for serious runtime issues.
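
To illustrate how streamed lines could be classified for highlighting, here is a small sketch. The line shape (`<seq> <LEVEL> [<source>] <message>`) is inferred from the sample output on this page, and `classify_line` is a hypothetical helper, not the app's real parser.

```python
import re

# Assumed line shape, based on the sample output: "<seq> <LEVEL> [<source>] <message>".
LINE_RE = re.compile(r"^(\d+)\s+(LOG|WARN|ERR)\s+\[(\w+)\]\s+(.*)$")
HIGHLIGHT_LEVELS = {"WARN", "ERR"}


def classify_line(line: str) -> dict:
    """Parse one output line and flag warnings and errors for highlighting."""
    text = line.strip()
    match = LINE_RE.match(text)
    if match is None:
        # Unrecognized lines are passed through as plain log output.
        return {"seq": None, "level": "LOG", "source": None,
                "message": text, "highlight": False}
    seq, level, source, message = match.groups()
    return {"seq": int(seq), "level": level, "source": source,
            "message": message, "highlight": level in HIGHLIGHT_LEVELS}
```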

Workflow

Projects stay persistent.
Runtime folders stay clean.

Persistent project: projects/haiku_studio

Tokenizer, project config, checkpoints, logs, cache, and datasets.

Runtime staging: data/ + config/

Active project tokenizer and config are staged for the Python backend scripts.

Trainer output: checkpoints + logs

Results are written back to the active project rather than scattered globally.
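
The staging step described above can be pictured as a pair of file copies from the project folder into the runtime folders. This is a sketch under stated assumptions: `data/tokenizer.json` appears in the sample log output, but the project-side paths, the config file name `project.json`, and the helper `stage_project` are hypothetical.

```python
import shutil
from pathlib import Path


def stage_project(project_dir: Path, runtime_root: Path) -> list[Path]:
    """Copy the active project's tokenizer and config into data/ and config/."""
    # Source paths and the config file name are assumptions for illustration.
    staged = {
        project_dir / "tokenizer" / "tokenizer.json": runtime_root / "data" / "tokenizer.json",
        project_dir / "config" / "project.json": runtime_root / "config" / "project.json",
    }
    written = []
    for src, dst in staged.items():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # overwrites whatever the previous project staged
        written.append(dst)
    return written
```

Because the runtime folders hold only copies, switching projects overwrites the staged files without touching anything stored under `projects/`.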

Interface

Designed for visibility, not decoration.

Every tab maps to a real engine workflow: project setup, tokenizer creation, training, alignment, chat testing, and export.

Project Workspace
Checkpoint Target: projects/haiku_studio/checkpoints
Metrics Target: projects/haiku_studio/logs
Cache Target: projects/haiku_studio/cache
Tokenizer Lab
Single file / corpus folder
Auto-safe sampling
Kernel Output
0001  LOG  [train] saved projects/haiku_studio/checkpoints/model.pt
0002  WARN [tokenizer] corpus sampled under RAM guard
0003  ERR  [runtime] serious errors surface immediately
Pretraining: base model
SFT: dialogue tuning
DPO: preference alignment
Chat Lab

Test the best checkpoint from the active project without leaving the workspace.

HuggingFace Integration
  • Search for training data and sync with your corpus from within Haiku Studio.
  • Link your account and export weights to HuggingFace using your API key.
  • Better HuggingFace and other integrations coming soon!
Download

Install Haiku Studio or review the source.

Use the installer for the desktop workspace, or clone the repository and run the Python scripts directly.