Table Format Converter
A browser-only tool to read structured tabular data (CSV, Parquet, JSON, NDJSON,
Excel), inspect its structure, adjust the parsing heuristic, and export it to a
different format. Your data never leaves the browser — all
processing happens locally with DuckDB-WASM and SheetJS.
Workflow
- Drop a file onto the drop zone, or click to pick one.
- The app detects format and parsing parameters from a small sample
and shows a preview of the first 10 rows plus the schema.
- Adjust any parameter in the left panel; the preview re-runs live
(debounced by ~300 ms).
- Pick a target format and options, then click Export. Only at
this point is the full file read end-to-end.
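The debounced live preview in step 3 can be pictured with a small sketch. This is not the app's actual code: `debounce`, `Scheduler`, and `timeoutScheduler` are hypothetical names, and the injectable scheduler is an assumption made only so the behaviour can be exercised without real timers.

```typescript
// Hypothetical helper types: a scheduler runs `fn` after `ms` and returns a cancel function.
type Cancel = () => void;
type Scheduler = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by setTimeout / clearTimeout.
const timeoutScheduler: Scheduler = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

// Returns a wrapper that postpones `fn` until `waitMs` have passed
// without another call; only the arguments of the last call are used.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  schedule: Scheduler = timeoutScheduler,
): (...args: A) => void {
  let cancel: Cancel | null = null;
  return (...args: A) => {
    if (cancel) cancel(); // drop the previously scheduled run
    cancel = schedule(() => {
      cancel = null;
      fn(...args);
    }, waitMs);
  };
}
```

With a wrapper like this, a burst of parameter changes within 300 ms would trigger a single preview query, using the most recent parameter values.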
Sample-first performance
Dropping a file is instant regardless of its size, and the first preview
appears within ~100 ms even for multi-GB files:
- Files > 512 KB — registered as a lazy
FileReader handle (no bytes loaded). A small
512 KB head slice is pre-loaded into the WASM
heap and used for all preview / detection queries — they never touch
the rest of the file.
- Files ≤ 512 KB — loaded entirely into the heap
once and reused for both preview and export.
- Parquet — column-pruned + row-group-pruned by DuckDB,
so even full reads only fetch what the query actually needs.
- Excel — must be loaded fully (SheetJS doesn't stream).
The heuristic peeks at only the first 100 rows per sheet.
- Export — always reads the full file; this is
the only stage where the entire dataset is touched.
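The size rule above can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: `planLoad`, `readHead`, `LoadPlan`, and `BlobLike` are hypothetical names, and the sketch ignores the per-format cases (Parquet, Excel). A browser `File` satisfies `BlobLike` structurally, since `slice()` returns a view and bytes are only read when `arrayBuffer()` is awaited.

```typescript
// 512 KB: both the size threshold and the head-slice size described above.
const HEAD_BYTES = 512 * 1024;

interface LoadPlan {
  mode: "full" | "lazy-head";
  previewBytes: number; // bytes copied into the WASM heap for preview / detection
}

// Decide how much of the file has to be loaded eagerly.
function planLoad(fileSize: number, headBytes: number = HEAD_BYTES): LoadPlan {
  if (fileSize <= headBytes) {
    return { mode: "full", previewBytes: fileSize };
  }
  return { mode: "lazy-head", previewBytes: headBytes };
}

// Minimal structural stand-in for the browser's Blob / File (assumption).
interface BlobLike {
  size: number;
  slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

// Read only the bytes the plan asks for; the rest of the file stays untouched.
async function readHead(
  file: BlobLike,
  headBytes: number = HEAD_BYTES,
): Promise<Uint8Array> {
  const plan = planLoad(file.size, headBytes);
  const head = file.slice(0, plan.previewBytes);
  return new Uint8Array(await head.arrayBuffer());
}
```

Note that a head slice of a text format may end in the middle of a row, so a sampler built this way would probably need to discard the trailing partial line before running detection.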
Supported formats
- CSV / TSV — auto-detects encoding, BOM, line ending,
delimiter, quote, skiprows, header.
- Parquet — schema is read from metadata directly; no
heuristic needed.
- JSON (array) — full file is parsed (a JSON array is a
single value). For files > 100 MB the app suggests NDJSON instead.
- NDJSON / JSON Lines — schema inferred from the first ~100
lines.
- Excel (.xlsx) — sheet selection, data-block detection
(bounding box), header row, plus a visual range picker on the first
~30 × 15 cells.
- DuckDB (.duckdb / .ddb / .db) — attached read-only;
pick which table to read from. Schema and types come straight from
the file. Can also be produced as an export target.
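As an illustration of the kind of heuristic the CSV / TSV detection relies on, here is a minimal delimiter sniffer. It is a common consistency heuristic, not the app's or DuckDB's actual sniffer: `sniffDelimiter`, `countOutsideQuotes`, and the candidate list are assumptions, and quoted fields that span several lines are not handled.

```typescript
// Candidate delimiters to test (assumed list).
const CANDIDATES = [",", ";", "\t", "|"];

// Count occurrences of `delim` in `line`, ignoring those inside quoted fields.
function countOutsideQuotes(line: string, delim: string, quote: string = '"'): number {
  let inQuotes = false;
  let count = 0;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === quote) {
      inQuotes = !inQuotes; // an escaped "" toggles twice and cancels out
    } else if (ch === delim && !inQuotes) {
      count++;
    }
  }
  return count;
}

// Pick the delimiter that appears the same non-zero number of times on
// every sampled line; prefer the one that yields the most columns.
function sniffDelimiter(sample: string, maxLines: number = 100): string | null {
  const lines = sample
    .split(/\r\n|\n|\r/)
    .filter((l) => l.length > 0)
    .slice(0, maxLines);
  let best: string | null = null;
  let bestCount = 0;
  for (const delim of CANDIDATES) {
    const counts = lines.map((l) => countOutsideQuotes(l, delim));
    const first = counts.length > 0 ? counts[0] : 0;
    const consistent = first > 0 && counts.every((c) => c === first);
    if (consistent && first > bestCount) {
      best = delim;
      bestCount = first;
    }
  }
  return best;
}
```

When no candidate is consistent across the sample the function returns `null`, which is the point at which a manual override in the parameter panel becomes useful.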
Notes & limitations (v1)
- Excel sheets with multiple disjoint data blocks: only the first/largest is
detected. Use the range picker to select another block manually.
- Nested structures (Struct/List/Map) are shown as JSON strings in the preview;
Parquet export preserves native types.
- If SharedArrayBuffer is unavailable because the page is not
cross-origin isolated (typical for file://), DuckDB falls back to the
single-threaded MVP build; it is slightly slower but otherwise
functional.
- External libs (DuckDB-WASM, SheetJS) are loaded from public CDNs. Internet
access is therefore needed on first load; browsers cache aggressively
afterwards.
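The SharedArrayBuffer fallback described above amounts to a capability check at start-up. The sketch below shows one way such a check might look; `chooseBuild`, `detectCaps`, `RuntimeCaps`, and the build labels are hypothetical names, and DuckDB-WASM has its own bundle selection, which this does not reproduce.

```typescript
// Hypothetical labels for the two builds mentioned in the text.
type DuckDbBuild = "threaded" | "mvp";

interface RuntimeCaps {
  hasSharedArrayBuffer: boolean;
  crossOriginIsolated: boolean;
}

// Threads need SharedArrayBuffer, which browsers only expose to
// cross-origin isolated pages; anything else gets the MVP build.
function chooseBuild(caps: RuntimeCaps): DuckDbBuild {
  return caps.hasSharedArrayBuffer && caps.crossOriginIsolated ? "threaded" : "mvp";
}

// Probe the current runtime without assuming DOM typings are present.
function detectCaps(): RuntimeCaps {
  const g = globalThis as unknown as Record<string, unknown>;
  return {
    hasSharedArrayBuffer: g["SharedArrayBuffer"] !== undefined,
    crossOriginIsolated: g["crossOriginIsolated"] === true,
  };
}
```

A page opened from file:// cannot send the headers that enable cross-origin isolation, so `chooseBuild(detectCaps())` would be expected to return "mvp" there.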