
Validation Report

Summary of descriptor accuracy against RDKit on a ChEMBL-derived corpus.

Environment: Python 3.12, Apple M-series, chematic v0.4.26, RDKit 2026.03.3


Descriptor Accuracy (4,999-molecule ChEMBL subset)

| Descriptor | Agreement | Tolerance | Notes |
|---|---|---|---|
| Molecular weight | 100% | ±0.001 Da | 175-mol reference (avg. MW vs Descriptors.MolWt) |
| Heavy atom count | 100% (4999/4999) | exact | |
| H-bond donors (HBD) | 100% (4999/4999) | exact | |
| H-bond acceptors (HBA) | 100% (4999/4999) | exact | |
| TPSA | 100% (4999/4999) | ±0.1 Ų | |
| LogP (Crippen) | 100% (4999/4999) | exact* | max Δ = 1.10e-13 |
| MR (molar refractivity) | 100% (4999/4999) | ±0.01 | |
| Fsp3 | 100% (4999/4999) | ±0.001 | |
| Aromatic ring count | 100% (4999/4999) | exact | |
| Aliphatic ring count | 100% (4999/4999) | exact | |
| Saturated ring count | 100% (4999/4999) | exact | |
| Rotatable bonds | 100% (4999/4999) | exact | |
| Num heteroatoms | 100% (4999/4999) | exact | |
| Num spiro atoms | 100% (4999/4999) | exact | |
| Num bridgehead atoms | 100% (4999/4999) | exact | bond-intersection algorithm |
| Num amide bonds | 100% (4999/4999) | exact | |
| Arom./aliph. heterocycles | 100% (4999/4999) | exact | |
| [nH] SMARTS match | 100% (4999/4999) | precision & recall | TP=467 TN=4532 FP=0 FN=0 |
| Num stereocenters (legacy) | 99.98% (4998/4999) | exact† | vs CalcNumAtomStereoCenters |
| Num stereocenters (new CIP) | 98.7% (4932/4999) | exact† | vs FindPotentialStereo |

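All 19 metrics tested on the 4,999-molecule ChEMBL corpus reach at least 98.0% agreement (molecular weight is checked separately against a 175-molecule reference set). chematic's stereocenter count is calibrated between the legacy (99.98%) and new-CIP (98.7%) oracles.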
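The [nH] SMARTS row reports confusion-matrix counts instead of a tolerance. As a minimal sketch, precision and recall can be recomputed from those counts; the helper name `precision_recall` is hypothetical and is not taken from bench5k.py.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall); an empty denominator yields 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Counts copied from the [nH] SMARTS row of the table.
tp, tn, fp, fn = 467, 4532, 0, 0

precision, recall = precision_recall(tp, fp, fn)
total = tp + tn + fp + fn  # every molecule falls into exactly one cell

print(f"precision={precision:.4f} recall={recall:.4f} n={total}")
```

With FP = 0 and FN = 0, both ratios are 1.0, and the four cells sum to the 4,999 molecules in the corpus.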


Stereocenters — Oracle Calibration

chematic's stereocenter count is calibrated between two RDKit oracles:

| Oracle | Agreement | Count | Notes |
|---|---|---|---|
| Legacy CalcNumAtomStereoCenters | 99.98% | 4998/4999 | 1 molecule where chematic is more accurate (legacy under-counts) |
| New CIP FindPotentialStereo | 98.7% | 4932/4999 | 67 molecules where chematic correctly agrees with legacy (new CIP over-counts cage systems) |
| Consensus (all three agree) | 98.6% | 4931/4999 | molecules where legacy, new CIP, and chematic all agree |

Oracle disagreements: 68 molecules where legacy ≠ new CIP.

  • 1 where legacy under-counts a pseudoasymmetric polyester (chematic and new CIP both correctly return 4; legacy returns 2)
  • 67 where new CIP over-counts cage/adamantane-like systems (chematic and legacy correctly agree on fewer stereocenters)
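The counts in this section are tied together by simple arithmetic, which can be checked as follows. This is a sketch using only the numbers reported above; the variable names are illustrative, and it assumes (as the breakdown states) that chematic matches exactly one oracle in each of the 68 disagreements.

```python
TOTAL = 4999

# Reported breakdown of the 68 molecules where legacy != new CIP.
sides_with_new_cip = 1   # legacy under-counts; chematic matches new CIP
sides_with_legacy = 67   # new CIP over-counts; chematic matches legacy
oracle_disagreements = sides_with_new_cip + sides_with_legacy

# Molecules where the two oracles agree with each other and with chematic.
consensus = TOTAL - oracle_disagreements

# chematic agrees with an oracle on the consensus set plus the
# disagreements in which it sides with that oracle.
agree_legacy = consensus + sides_with_legacy
agree_new_cip = consensus + sides_with_new_cip


def pct(n: int, digits: int) -> float:
    """Agreement as a percentage of the corpus, rounded."""
    return round(100 * n / TOTAL, digits)


print(consensus, agree_legacy, agree_new_cip)
print(pct(agree_legacy, 2), pct(agree_new_cip, 1), pct(consensus, 1))
```

The derived values reproduce the reported 4931, 4998, and 4932 counts and the 99.98%, 98.7%, and 98.6% agreement figures.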


Reproduce

# Requires RDKit and a SMILES file
.venv/bin/python scripts/bench5k.py ~/Downloads/SMILES.csv
.venv/bin/python scripts/bench5k.py ~/Downloads/SMILES.csv --detail
.venv/bin/python scripts/bench5k.py ~/Downloads/SMILES.csv --json validation/results/bench5k_latest.json
python3 scripts/gen_validation_report.py validation/results/bench5k_latest.json

Reference TSV files: scripts/rdkit_reference_*.tsv (generated by scripts/gen_rdkit_reference.py).


* LogP: max |Δ| = 1.10e-13, which is within float64 rounding error, so the table reports the match as exact. bench5k.py uses ±0.01 as the test threshold.

† Stereocenters: see Oracle Calibration section above.
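The LogP footnote distinguishes bitwise equality from agreement within a threshold. The sketch below shows one way to count agreement under an absolute tolerance. The function names and sample values are made up for illustration and are not taken from bench5k.py; only the ±0.01 threshold and the 1.10e-13 deviation come from this report.

```python
import math


def within_tolerance(value: float, reference: float, abs_tol: float) -> bool:
    """True when |value - reference| <= abs_tol (no relative tolerance)."""
    return math.isclose(value, reference, rel_tol=0.0, abs_tol=abs_tol)


def agreement(pairs: list[tuple[float, float]], abs_tol: float) -> tuple[int, int]:
    """Return (matches, total) for (value, reference) pairs."""
    hits = sum(within_tolerance(v, r, abs_tol) for v, r in pairs)
    return hits, len(pairs)


# Hypothetical (value, reference) pairs, not real benchmark data.
pairs = [
    (2.5, 2.5 + 1.10e-13),  # differs only at the rounding-noise level
    (1.234, 1.239),         # differs by 0.005, inside +/-0.01
    (0.50, 0.52),           # differs by 0.02, outside +/-0.01
]

hits, total = agreement(pairs, abs_tol=0.01)
print(f"{hits}/{total} within tolerance")

# Strict equality rejects the first pair even though it passes the threshold.
print(pairs[0][0] == pairs[0][1])
```

A deviation of 1.10e-13 fails a strict `==` comparison but passes any practical threshold, which is why a tolerance-based test is the safer choice for floating-point descriptors.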


Known Limitations

  • Kekulization: 1 of 5,000 tested molecules fails: [H][H] (no heavy atoms; IUPAC InChI library constraint). chematic returns KekuleError explicitly for this input.
  • Aromaticity model: chematic applies Hückel 4n+2 per SSSR ring, whereas RDKit uses fused-ring delocalization. The difference is visible in pyridone, quinolone, and indolizine.
  • InChI: The pure-Rust implementation is approximate. Use the native-inchi feature for standard-compliant InChI/InChIKey.
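The 4n+2 rule named in the aromaticity limitation can be written as a small check. This is only a sketch of the electron-count test. The function name is hypothetical, and this is not chematic's ring-perception code, which also has to decide how many π electrons each ring atom contributes.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """True when the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Benzene has 6 pi electrons (n = 1); cyclobutadiene has 4 (no valid n).
for count in (2, 4, 6, 8, 10):
    print(count, satisfies_huckel(count))
```

Because the test is applied to one ring at a time, a fused system whose individual rings fail the count can still be treated as aromatic by a model that considers the combined ring system, which is the kind of disagreement the limitation describes.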

Validation corpus: ChEMBL-derived 5,000-molecule SMILES set. Details: benchmark.md · rdkit-comparison.md