Quantum chemistry needs a modern file format

Figure: From a directory of slices to a single container. (Diagram showing today's fragmented output files on the left and a single .qvf container with typed sections on the right.)

Quantum chemistry calculations in 2026 produce rich, many layered results. A single run can generate atomic structures, volumetric grids for densities and orbitals, band structures, vibrational modes, spectra, trajectories, and detailed provenance describing how the calculation was performed. In principle, all of this belongs together as one coherent dataset.

In practice, it does not.

Instead, results are fragmented across a collection of loosely related files: cube files for densities, XSF for periodic systems, XYZ for trajectories, vendor specific binaries for restart data, and a mix of text, JSON, or ad hoc logs for energies and convergence. Each file captures a slice of the calculation, but none captures the whole. There is no shared manifest, no consistent schema, and often no reliable record of provenance.

This fragmentation is not just inconvenient. It is a structural limitation that has persisted for decades.

The real problem: an aging ecosystem

Figure: The freeze line in QC visualization tools. (QC visualization landscape: tools sorted by last release year, showing active, stalled, and effectively dead categories.)

The issue is not only the number of formats, but the state of the tools that use them.

A look at the visualization tools that quantum chemists actually reach for shows a field that is partially active and partially frozen in place. Before listing them, one distinction matters. A pure orbital or density viewer exists to look at the output of a QC calculation: orbitals, densities, electrostatic potentials, NTOs, IBOs, ELF, and so on. A molecule editor with a built in renderer exists primarily to build and edit structures, and happens to visualize cubes as a secondary capability. Avogadro 2 is the latter. So is Gabedit. So is GaussView. So is every vendor GUI tied to a specific QC engine. The distinction matters because the editor category has steady commercial backing (Schrödinger, Gaussian Inc., SCM, Wavefunction, TURBOMOLE GmbH), while the pure viewer category is overwhelmingly academic and overwhelmingly thin.

Pure orbital and density viewers, still maintained

| Tool | Latest release | License |
| --- | --- | --- |
| Multiwfn | Apr 2026 (rolling), 3.8 stable (Jan 2026) | Free, permissive, academic and commercial |
| Pegamoid | Aug 2025 (last commit), 2.12.4 active 2024 to 2025 | GPL |
| MOrbVis | Active 2025, WebGPU based | Open source |
| IQmol | Jul 2024 | Open source, Q-Chem affiliated |
| Molden / gmolden | Oct 2023 | Free, proprietary terms |
| IboView | Jan 2022 | Free, custom license |
| QMForge | Feb 2020 | GPL. Slow to stall. |
| Speck | 2017 to 2020 era, sporadic | MIT |

Editors and full GUIs that include a renderer, still maintained

| Tool | Latest release | License |
| --- | --- | --- |
| Avogadro 2 | Apr 2026 | BSD 3 clause / GPL v2, open source |
| AMSview (AMS GUI) | 2026.1 | Commercial (SCM) |
| TmoleX | 2024.1 era | Commercial (TURBOMOLE, COSMOlogic) |
| Maestro / Jaguar GUI | 2026.1 era | Commercial (Schrödinger) |
| Chemcraft | 2024, ongoing | Commercial. Free non saving demo. |
| wxMacMolPlt | Jan 2024 | GPL |
| Spartan | Spartan’24 | Commercial (Wavefunction) |
| Vipster | 1.20.0 era, active 2023 to 2024 | GPL |
| Gabedit | Jul 2021 | GPL |
| GaussView | 2019 | Commercial (Gaussian, Inc.) |

Web and JavaScript stack, still maintained

| Tool | Latest release | License | Notes |
| --- | --- | --- | --- |
| Jmol / JSmol | Mar 2026 | LGPL 2.0 | Reads cube, molden, fchk |
| 3Dmol.js | Jan 2026 | BSD | Loads cube and VASP data, isosurfaces |
| Mol* (mol star) | Active 2024 to 2026 | MIT | Successor to NGL and LiteMol |
| NGL Viewer | Core 2.x maintained, NGLVieweR 1.4.0 (Nov 2024) | MIT | Primarily structural biology, reads cube |
| Miew (EPAM) | Active | MIT | Mainly biomolecular |
| iCn3D (NCBI) | Active | NIH | Mainly PDB, has cube support |

Periodic, crystallographic, and materials oriented

| Tool | Latest release | License |
| --- | --- | --- |
| Mercury (CCDC) | 2024.3.0 era | Free base, full version needs CSD license |
| VESTA | Aug 2022 | Free for academic and non commercial |
| Olex2 | 1.5 era, active | Free with citation |
| CrystalMaker | 11.x era | Commercial |
| Diamond | 5.x (Crystal Impact) | Commercial |
| XCrySDen | Oct 2019 | GPL v2 |

Biomolecular tools routinely used for orbital cubes

| Tool | Latest release | License |
| --- | --- | --- |
| PyMOL | Feb 2025 | Schrödinger commercial. Open source build available. |
| UCSF ChimeraX | 2024 to 2025 | Free academic, commercial fee |
| UCSF Chimera | 1.19 (2024). Considered legacy, succeeded by ChimeraX. | Free academic |
| VMD | 1.9.3 stable (Nov 2016). 1.9.4 RC and 2.0.0 alpha Unix in 2025. | Free non commercial, custom license |

The graveyard

A larger group of QC visualization tools has either explicitly retired, last released a decade or more ago, or sits in maintenance limbo with sporadic patches. Several of these were dominant in their era.

| Tool | Last release | Status |
| --- | --- | --- |
| MOLEKEL | 5.4 (Aug 2009) | Abandoned. Code on GitHub. GPL. |
| gOpenMol | 3.00 (2005, Windows binary 2010) | Dead. Original CSC Finland page gone. |
| Viewmol | 2.4.1 mid 2000s, last activity ~2009 | Dead. GPL. |
| RasMol | 2.7.5.1 (Jul 2009) | Effectively dead, but the protein chestnut. GPL. |
| ArgusLab | 4.0.1 mid 2000s | Dead. Long promised Qt and iPad ports never appeared. |
| QuteMol | 0.4.1 (Jun 2007) | Dead. GPL with citation. |
| ECCE | 7.0 official PNNL (Aug 2013), fork 7.3.2 beta ~2017 | Effectively dead. ECL 2.0. |
| GaussSum | 3.0.2 (2013) | Effectively dead. GPL. Built on cclib. |
| MoleCoolQt | 2.4 era, sporadic commits | Largely dormant. Charge density oriented. |
| Luscus | Source side ~2016 to 2018 | Stagnant. Academic Free License v3. |
| MOLDA | Early 2000s | Dead. Site gone for years. |
| PyMOlyze | Early 2000s | Dead. Rolled into QMForge philosophy. |
| MOLMOL | 2K.2 era (ETH Zurich) | Dead. NMR oriented but used for structures. |
| HyperChem | 8.0.10 (2011) | Effectively dead. Still sold by Hypercube. |
| Cerius2 | Last Accelrys release ~2007 | Dead. Replaced by Materials Studio. |
| InsightII | Last Accelrys release ~2005 | Dead. The SGI workstation era classic. |
| MacMolPlt (original) | Pre 7.0, mid 2000s, Carbon | Long superseded by wxMacMolPlt. |

What the landscape shows

The pure orbital viewers that still count as maintained are a short list: Multiwfn, Pegamoid (for OpenMolcas users), MOrbVis (a recent WebGPU upstart), Molden, and, more marginally given that its last release dates to January 2022, IboView. IQmol fits here too if classified as a viewer rather than a builder. That is essentially the full set. Everything else is either an editor with a renderer attached, a primarily biomolecular tool repurposed for cube files, or a vendor GUI tied to a specific QC engine. VMD straddles categories. The stable 1.9.3 from 2016 is what most people still run, while 1.9.4 release candidates and the 2.0.0 alpha have been in slow public development for nearly a decade.

Several widely used classics have stalled. XCrySDen last shipped in 2019, Gabedit in 2021, IboView in early 2022, VESTA in mid 2022. The truly free and open source options that a new project can realistically embed or extend are a narrow set: Avogadro 2 (BSD), Multiwfn (permissive even for commercial use), Jmol (LGPL), Gabedit (GPL), wxMacMolPlt (GPL), and Vipster (GPL). Tools that are free to download but carry restrictive or proprietary terms include Molden, VMD, IboView, and VESTA.

One format to rule them all

What this landscape calls for is not yet another viewer. It is a common file format that all of these tools, alive and yet to be written, can read.

QVF is meant to be that format. One container that consolidates what is currently spread across cube files, XSF, MOLDEN format, fchk, XYZ, ad hoc JSON output, and a long list of vendor binaries. One manifest with the structure, the volumetric fields, the bands, the spectra, the vibrations, the trajectory, and the full provenance, packaged so that any consumer can read what it understands and ignore the rest. One specification, openly licensed, that any code or viewer can implement without coordination.

The historical reason this never happened is that adding format support to a dozen aging tools was a multi year effort with no single party in a position to drive it. That cost is no longer the obstacle it was. Given the schema, a few example files, and the test suite that ships with the spec, adding a reader or writer for a clearly specified format is now the kind of task a coding agent can finish in an afternoon. Quantum chemistry is one of the fields where this leverage is most consequential. The codes are large, the maintainers are few, and the time saved is time the maintainers never had to spare.

Many of the existing visualization tools were developed by PhD students or small research groups to solve a specific problem, published alongside a paper, and then left behind as those researchers moved on or shifted focus. Over time, their dependencies age, their build systems break, and their assumptions about input formats drift out of sync with modern codes.

At the same time, quantum chemistry packages themselves evolve. Output formats change subtly, or sometimes significantly, between versions. In other cases, formats remain nominally the same while their contents shift in undocumented ways. Parsing output becomes brittle, requiring constant patching and format specific workarounds.

The result is a fragile ecosystem. Visualization tools support different and only partially overlapping formats. No single format captures all relevant data. Output parsing is error prone and version dependent. Reproducibility suffers because provenance is incomplete or lost. Moving data between tools requires manual intervention.

Even when formats are well established, they are difficult to extend. Adding new data types or metadata often breaks compatibility, so formats stagnate. The path of least resistance becomes creating yet another file type rather than improving an existing one.

The cost of text based legacy formats

A particularly visible symptom of this legacy is the continued reliance on text based formats for large numerical data.

Formats like Gaussian cube store volumetric grids as ASCII text. This was reasonable when grid sizes were small. It is no longer defensible. A 200³ grid contains 8 million values and can exceed 180 MB as text, while the same data stored as binary float32 is 32 MB and loads far faster because no text has to be parsed back into numbers. Larger grids routinely cross into gigabyte scale text files.
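The arithmetic behind these figures is short enough to check by hand. The sketch below computes the binary sizes exactly and then works out how many bytes per value the quoted 180 MB text figure implies. The function names are illustrative, and the text size of a real file depends on the precision and padding a particular writer uses.

```python
def binary_size(n: int, bytes_per_value: int) -> int:
    """Size in bytes of an n x n x n grid stored as raw fixed-size values."""
    return n ** 3 * bytes_per_value


def implied_text_width(text_bytes: float, n: int) -> float:
    """Average bytes per value (digits, padding, newlines) for a text file."""
    return text_bytes / n ** 3


n = 200
print(n ** 3)                         # 8000000 grid points
print(binary_size(n, 4) / 1e6)        # 32.0 MB as float32
print(binary_size(n, 8) / 1e6)        # 64.0 MB as float64
# 180 MB is the text figure quoted above, not a measured value.
print(implied_text_width(180e6, n))   # 22.5 bytes per value
```

A text file written with narrower fields will be smaller than 180 MB, but every value still costs several times the 4 bytes that float32 needs.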

Figure: Same data, four encodings. (Bar chart comparing the on-disk size of a 200³ volumetric grid in four encodings: Cube ASCII 180 MB, float64 binary 64 MB, float32 binary 32 MB, float32 plus zstd 15 MB.)

These formats persist not because they are efficient or well suited to modern workloads, but because they are entrenched.

What a modern format needs to do

A modern file format for quantum chemistry must reflect how the field actually works today.

It should package the entire result of a calculation, including structure, grids, spectra, trajectories, and provenance, into a single file that can be shared in one step. It must support random access so that tools can read only the data they need without loading everything. It should be machine readable, explicitly typed, and unambiguous, with clear units and schema validation.

Equally important, it must be designed to evolve. New types of data should be addable without breaking existing tools. Unknown data should be safely ignored but not lost. The format should not depend on heavy, specialized libraries, so that it can be implemented across languages and environments, including lightweight scripts and browser based tools.
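One way to read "safely ignored but not lost" is that a tool which edits one part of a file must carry everything else through untouched, including parts it has never heard of. The sketch below shows that idea with a ZIP archive standing in for the container. The member names are made up for illustration.

```python
import io
import zipfile


def replace_member(blob: bytes, name: str, new_data: bytes) -> bytes:
    """Rewrite one member and copy every other member through unchanged."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(blob)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                if info.filename == name:
                    data = new_data
                else:
                    # Unknown members are copied without being interpreted.
                    data = src.read(info.filename)
                dst.writestr(info.filename, data)
    return out.getvalue()


# A small archive with one member this tool understands and one it does not.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as archive:
    archive.writestr("structure.json", b'{"atoms": 2}')
    archive.writestr("x_acme/extra.bin", b"\x00\x01\x02\x03")

updated = replace_member(original.getvalue(), "structure.json", b'{"atoms": 3}')
with zipfile.ZipFile(io.BytesIO(updated)) as archive:
    print(archive.read("structure.json"))
    print(archive.read("x_acme/extra.bin"))
```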

In short, it should be a format that supports both stability and change.

Enter QVF

QVF (Quantum Visualization Format) is designed to meet these requirements.

A QVF file is a self contained ZIP archive. At its core is a single manifest.json that describes all data contained in the file. The rest of the archive consists of typed sections: structure, volumetric data, band structures, spectra, vibrations, trajectories, and provenance.

Large numerical arrays are stored as raw binary (typically little endian float32), making them compact and fast to load. Smaller or structured data is stored as JSON. Every section is explicitly labeled with a “kind” drawn from a controlled vocabulary, allowing tools to declare what they support and ignore what they do not.
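To make the layout concrete, here is a minimal sketch of a writer built only on the Python standard library. The manifest field names (format, sections, path, dtype, byte_order, shape), the member paths, and the "volume" kind string are assumptions made for illustration. The authoritative names are whatever the QVF specification defines.

```python
import hashlib
import io
import json
import struct
import zipfile


def pack_float32_le(values):
    """Encode a flat sequence of floats as little endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def build_container(density, shape) -> bytes:
    """Return the bytes of a ZIP archive with a manifest and two sections."""
    grid_bytes = pack_float32_le(density)
    structure = {
        "units": {"length": "angstrom"},
        "atoms": [
            {"element": "H", "position": [0.0, 0.0, 0.0]},
            {"element": "H", "position": [0.0, 0.0, 0.74]},
        ],
    }
    manifest = {  # hypothetical field names, not taken from the spec
        "format": "qvf",
        "provenance": {"program": "example-writer", "version": "0.0"},
        "sections": [
            {"kind": "structure", "path": "structure/structure.json"},
            {
                "kind": "volume",
                "path": "volumes/density.bin",
                "dtype": "float32",
                "byte_order": "little",
                "shape": list(shape),
                "sha256": hashlib.sha256(grid_bytes).hexdigest(),
            },
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        archive.writestr("structure/structure.json", json.dumps(structure))
        archive.writestr("volumes/density.bin", grid_bytes)
    return buffer.getvalue()


blob = build_container([0.0, 0.25, 0.5, 1.0, 0.5, 0.25, 0.0, 0.0], (2, 2, 2))
with zipfile.ZipFile(io.BytesIO(blob)) as archive:
    print(archive.namelist())
```

The grid costs exactly four bytes per value before compression, and the JSON members stay readable with any ZIP tool and a text editor.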

Because ZIP provides a central directory, any section can be accessed directly without scanning the entire file. A visualization tool that only needs the structure reads a few kilobytes, even if the file also contains hundreds of megabytes of volumetric data.
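The access pattern looks roughly like this in Python. The example builds a throwaway archive in memory so that it runs on its own. With a file on disk, the same calls seek straight to the requested members. The manifest layout is again a hypothetical stand-in for the real schema.

```python
import io
import json
import zipfile


def make_example_archive() -> bytes:
    """Build a throwaway archive: a tiny structure plus a 4 MB dummy volume."""
    manifest = {
        "sections": [  # hypothetical manifest layout
            {"kind": "structure", "path": "structure/structure.json"},
            {"kind": "volume", "path": "volumes/density.bin"},
        ]
    }
    structure = {"atoms": [{"element": "He", "position": [0.0, 0.0, 0.0]}]}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest))
        archive.writestr("structure/structure.json", json.dumps(structure))
        archive.writestr("volumes/density.bin", bytes(4_000_000))
    return buffer.getvalue()


def read_structure_only(blob: bytes):
    """Return the structure and the size of the members never read."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        # Opening the archive parses the central directory, not the members.
        manifest = json.loads(archive.read("manifest.json"))
        entry = next(s for s in manifest["sections"] if s["kind"] == "structure")
        structure = json.loads(archive.read(entry["path"]))
        touched = {"manifest.json", entry["path"]}
        untouched = sum(
            info.file_size
            for info in archive.infolist()
            if info.filename not in touched
        )
    return structure, untouched


structure, untouched = read_structure_only(make_example_archive())
print(structure["atoms"][0]["element"], untouched)  # He 4000000
```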

The format is fully self describing. Units are explicit. Provenance is mandatory. Every binary section includes a sha256 checksum for integrity. Provenance also carries an explicit agent model trail for AI driven workflows, so a calculation produced by an agent loop records the same audit information as one run by hand at a terminal.
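An integrity check along these lines takes only a few lines. This sketch assumes that each binary section records its digest in a sha256 field of its manifest entry. That field name and the surrounding layout are illustrative guesses, not the spec.

```python
import hashlib
import io
import json
import zipfile


def build_archive(payload: bytes, recorded_digest: str) -> bytes:
    """Build a one-section archive whose manifest records the given digest."""
    manifest = {
        "sections": [  # hypothetical manifest layout
            {
                "kind": "volume",
                "path": "volumes/density.bin",
                "sha256": recorded_digest,
            }
        ]
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.json", json.dumps(manifest))
        archive.writestr("volumes/density.bin", payload)
    return buffer.getvalue()


def verify_checksums(blob: bytes) -> dict:
    """Map each checksummed member path to True (intact) or False (mismatch)."""
    results = {}
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        for section in manifest["sections"]:
            expected = section.get("sha256")
            if expected is None:
                continue  # sections without a recorded digest are skipped
            actual = hashlib.sha256(archive.read(section["path"])).hexdigest()
            results[section["path"]] = actual == expected
    return results


payload = bytes(range(16))
good = build_archive(payload, hashlib.sha256(payload).hexdigest())
bad = build_archive(payload, "0" * 64)  # deliberately wrong digest
print(verify_checksums(good))
print(verify_checksums(bad))
```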

Figure: Inside a .qvf file. (Cross section of a .qvf archive: manifest.json at the top followed by typed section directories for structure, volumes, bands, spectra, vibrations, trajectory, and vendor namespaces, with a consumer reading via the central directory.)

Designed for an ecosystem, not a single tool

One of the central design goals of QVF is to enable an ecosystem rather than a single implementation.

A QVF reader does not need to understand everything in the file. It reads the manifest, identifies the section types it supports, and processes only those. Unsupported sections are not errors. They are simply reported as present but unused. This allows the format to grow over time without breaking older tools.
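The reader contract described above can be sketched as a simple partition of the manifest. The kind strings and the sections list are placeholders chosen for illustration. A real reader would take them from the kind registry in the specification.

```python
def plan_read(manifest: dict, supported_kinds: set) -> tuple:
    """Split the manifest's sections into kinds to process and kinds to skip."""
    used, unused = [], []
    for section in manifest["sections"]:
        target = used if section["kind"] in supported_kinds else unused
        target.append(section["kind"])
    return used, unused


# Hypothetical manifest content for a bands plotter to inspect.
manifest = {
    "sections": [
        {"kind": "structure"},
        {"kind": "bands"},
        {"kind": "spectra"},
        {"kind": "x_acme.widget"},
    ]
}

used, unused = plan_read(manifest, {"structure", "bands"})
print("processing:", used)
print("present but unused:", unused)
```

Nothing in the unused list raises an error. The reader reports it and moves on.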

Vendors and projects can introduce their own extensions under a namespaced convention (x_<vendor>.*) without requiring coordination. At the same time, a shared core vocabulary ensures interoperability for common data types.
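A reader can tell core kinds from vendor extensions with a single pattern match. The grammar below (lowercase vendor name, then a dot, then a local name) and the core kind strings are assumptions for the sake of the example. Only the x_<vendor>.* prefix convention comes from the description above.

```python
import re

# Assumed grammar: "x_" + vendor name + "." + local name.
VENDOR_KIND = re.compile(r"^x_(?P<vendor>[a-z0-9-]+)\.(?P<name>[A-Za-z0-9_.]+)$")

# Placeholder core vocabulary; the real registry lives in the specification.
CORE_KINDS = {"structure", "volume", "bands", "spectra", "vibrations", "trajectory"}


def classify_kind(kind: str) -> str:
    """Return 'core', 'vendor:<name>', or 'unknown' for a section kind."""
    if kind in CORE_KINDS:
        return "core"
    match = VENDOR_KIND.match(kind)
    if match:
        return f"vendor:{match['vendor']}"
    return "unknown"


for kind in ["structure", "x_acme.orbital_labels", "x_.broken", "mystery"]:
    print(kind, "->", classify_kind(kind))
```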

Capability is declared by kind, not by version. There is no level 1 or level 2 tiering. A minimal structure viewer supports structure. A bands plotter supports structure plus bands. SemVer rules pinned to kinds keep older readers working as the registry grows: minor versions add new optional kinds, while major versions are rare and deliberate.
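In code, the version rule reduces to comparing major numbers and treating a newer minor as a hint that unfamiliar kinds may be present. This is a minimal sketch that assumes a three part MAJOR.MINOR.PATCH string. Where the file records its spec version is not shown here.

```python
def parse_version(text: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def can_read(reader_spec: str, file_spec: str) -> bool:
    """A reader can open a file that shares its major version."""
    return parse_version(reader_spec)[0] == parse_version(file_spec)[0]


def may_contain_unknown_kinds(reader_spec: str, file_spec: str) -> bool:
    """A newer minor version may add optional kinds the reader does not know."""
    reader, written = parse_version(reader_spec), parse_version(file_spec)
    return reader[0] == written[0] and written[1] > reader[1]


print(can_read("1.0.0", "1.3.0"))                   # True
print(may_contain_unknown_kinds("1.0.0", "1.3.0"))  # True
print(can_read("1.3.0", "2.0.0"))                   # False
```

Under SemVer itself, major version zero is reserved for initial development, so these guarantees only hold firmly from 1.0.0 onward.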

This model reflects a key reality: no single tool will ever cover all use cases. The format must allow many tools, with different capabilities, to operate on the same file.

Figure: Capability is declared by kind, not by version. (Matrix showing which QVF section kinds each consumer type supports: structure only viewer, bands plotter, spectra tool, reference viewer, and validator.)

Scope and practical limits

QVF is intentionally scoped as a visualization and analysis container. It stores evaluated data, including grids, spectra, and trajectories, but does not attempt to encode the full internal state of a quantum chemistry calculation, such as basis set integrals or wavefunction coefficient matrices. Including those would dramatically increase file size and blur the line between data container and simulation engine.

Similarly, extremely large time series of volumetric data is out of scope for the initial version. These constraints keep file sizes manageable, typically tens to a few hundred megabytes, while covering the vast majority of practical use cases.

What vibe-qc commits to

vibe-qc will adopt QVF as its primary visualization output format and maintain the QVF specification going forward. The writer is implemented as part of vibe-qc’s existing output infrastructure, so every converged calculation produces a .qvf file alongside the standard log output. A validation tool, qvf-validate, ships with vibe-qc and lets any other producer check that its output is conformant.

The specification is open and the license is MPL 2.0, the same license as vibe-qc itself. SemVer governance, growth of the kind registry, and the consumer contract are vibe-qc maintained responsibilities. Vendor extensions remain entirely independent of the central spec.

A reference viewer is on the roadmap. It will ship with vibe-qc and support the full v1 section vocabulary: structure with unit cell and optional symmetry overlay, isosurface rendering for the volumetric kinds with viewer suggested defaults, animated vibrational modes, IR and Raman spectra with adjustable broadening, band structure plots along the declared k path, and trajectory playback. The viewer is not written yet. The format is written first on purpose, so that the viewer can be built against a stable specification and other codes and tools can adopt the format in parallel rather than waiting on a single reference implementation.

We strongly encourage other quantum chemistry codes and visualization tools to adopt QVF. The writer side is light: a few hundred lines of code on top of any modern ZIP library. The reader side is lighter still. Parse the manifest, decompress the binary members you care about, render. Anything outside the kinds you support is something you label as present but unused and move on from.

Feedback, adoption notes, and proposals for new kinds are welcome at mpei@vibe-qc.com.

Toward a better default

The goal of QVF is not to replace every existing format overnight, but to provide a better default.

A single file that contains everything needed to inspect, share, and reproduce a calculation. A format that is fast, explicit, and extensible. A foundation that multiple tools can build on without constant reinvention.

For a field that has long relied on fragmented, aging infrastructure, that shift is overdue.



The full v0.4 QVF specification, including the JSON Schema sketch, the kind registry, and the considered-and-rejected alternatives, lives in the vibe-qc repository at docs/design_qcv_format.md.
