Coverage

What's in Nullary

Real coverage stats across 7 modalities, refreshed continuously. Every record cites its source.

122,979,405 negative findings

Refresh cadence varies per source — shown on each card below.

39.1M of the small-molecule findings carry both a compound structure and a UniProt target; this subset is the substrate used in the validation paper to train 398 calibrated per-target bioactivity models. CRISPR negatives reproduce at 98% across DepMap and BioGRID ORCS (Analytics suite paper, Section 5).

By modality

Seven modalities, honest depth

Coverage depth differs by modality based on what exists in public databases. The tier label and counts below reflect the live state of the data.

Small molecules

Rich

85.2M records

Inactive compounds, failed selectivity panels, ADMET liabilities, and resistance mutations across all druggable target families.

  • ChEMBL · CC-BY-SA 3.0 · quarterly refresh
  • PubChem BioAssay · Public Domain · monthly refresh
  • BindingDB · CC-BY 3.0 · quarterly refresh

e.g. “Inactive kinase inhibitors against EGFR”

CRISPR

Rich

37.6M records

Failed essentiality screens, non-performing guide RNAs, and ancestry-specific failures across human cell lines.

  • DepMap · CC-BY 4.0 · quarterly refresh
  • BioGRID ORCS · CC-BY 4.0 · monthly refresh
  • GenomeCRISPR · planned

e.g. “Failed TP53 knockouts in primary T cells”

Clinical trials

Moderate

104K records

Terminated, withdrawn, and suspended clinical trials across all therapeutic areas, with intervention metadata and termination reasons.

  • AACT / ClinicalTrials.gov · Public Domain · daily refresh
  • EudraCT · Public Domain · weekly refresh
  • Drugs@FDA · Public Domain · weekly refresh

e.g. “Terminated Phase 3 oncology trials in 2024”

Antibodies

Moderate

3.6K records

Developability failures, including aggregation, polyreactivity, expression, and immunogenicity issues, plus terminated antibody trials.

Coverage note: Public-database coverage at the molecular level is thin by nature, since most antibody developability data is proprietary. Enterprise-tier extraction from supplementary materials is in progress.

  • Thera-SAbDab · CC-BY 4.0 · weekly refresh
  • SAbDab · planned
  • FLAb · per-study, mostly CC-BY · quarterly refresh
  • OAS · planned

e.g. “Antibodies with polyreactivity failures against PD-1”

Peptides

Moderate

18.8K records

Stability failures, oral bioavailability dead ends, immunogenicity issues, and proteolytic degradation for therapeutic peptides.

  • THPdb · Academic-use · quarterly refresh
  • PepLife · planned
  • DrugBank peptide subset · planned

e.g. “Peptides with failed oral bioavailability”

PROTACs

Thin

824 records

Ternary complex failures, failed degradation, hook effect, and cell permeability issues for targeted protein degraders.

Coverage note: The PROTAC discipline is ~7 years old; public coverage reflects the field's youth. Enterprise-tier extraction from supplementary materials is in progress.

  • PROTAC-DB · CC-BY 4.0 · quarterly refresh
  • PROTACpedia · Academic-use · manual export

e.g. “Failed PROTACs targeting BTK”

Oligonucleotides

Thin

144 records

Target engagement failures, delivery issues, hepatotoxicity, and immunogenicity for antisense oligonucleotides, siRNAs, and related modalities.

Coverage note: Most ASO/siRNA failure data is proprietary or in patents, so public coverage is thin. Enterprise-tier extraction from supplementary materials is in progress.

  • AOBase · planned
  • DrugBank oligonucleotide subset · planned

e.g. “Failed ASOs against MALAT1”

Vaccines

Not yet live

Immunogenicity failures, animal efficacy failures, durability issues, and safety signals across vaccine platforms.

Status: Ingestion in progress. Expected live in v1.1.

  • VIOLIN · Academic-use · quarterly refresh
  • FDA CBER public records · Public Domain · weekly refresh

Cross-modality view

Every record is queryable across modalities. Search for a target like EGFR and Nullary returns failed small molecules, failed CRISPR screens, failed antibody candidates, and terminated trials — all in one unified response with full citations.
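As a rough illustration of what handling a unified response might look like on the client side, the sketch below groups a few made-up records by modality for a single target. The record fields (`target`, `modality`, `source`, `summary`), the sample values, and the helper `group_by_modality` are all assumptions for illustration; they are not Nullary's actual schema or API.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only
# and do not represent real Nullary data.
RECORDS = [
    {"target": "EGFR", "modality": "small_molecule", "source": "ChEMBL",
     "summary": "inactive kinase inhibitor"},
    {"target": "EGFR", "modality": "crispr", "source": "DepMap",
     "summary": "failed essentiality screen"},
    {"target": "EGFR", "modality": "clinical_trial",
     "source": "AACT / ClinicalTrials.gov", "summary": "terminated trial"},
    {"target": "TP53", "modality": "crispr", "source": "BioGRID ORCS",
     "summary": "failed knockout"},
]


def group_by_modality(records, target):
    """Return {modality: [records]} for records whose target matches (case-insensitive)."""
    grouped = defaultdict(list)
    for rec in records:
        if rec["target"].upper() == target.upper():
            grouped[rec["modality"]].append(rec)
    return dict(grouped)


egfr = group_by_modality(RECORDS, "egfr")
for modality, recs in sorted(egfr.items()):
    # Each record keeps its source so the citation travels with the finding.
    print(modality, [r["source"] for r in recs])
```

Keeping the source on every record, rather than on the group, means a citation is still attached when results are regrouped or filtered downstream.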

Licensing

License matrix

Every record carries license metadata from its source so you can attribute or filter appropriately.

| Source | License | Attribution required | Commercial use |
|---|---|---|---|
| ChEMBL | CC-BY-SA 3.0 | Yes | Yes (share-alike) |
| PubChem BioAssay | Public Domain | No | Yes |
| BindingDB | CC-BY 3.0 | Yes | Yes |
| DepMap | CC-BY 4.0 | Yes | Yes |
| BioGRID ORCS | CC-BY 4.0 | Yes | Yes |
| GenomeCRISPR | CC-BY 4.0 | Yes | Yes |
| ClinicalTrials.gov (AACT) | Public Domain | No | Yes |
| EudraCT | Public Domain | No | Yes |
| Drugs@FDA | Public Domain | No | Yes |
| Thera-SAbDab | CC-BY 4.0 | Yes | Yes |
| SAbDab | CC-BY 4.0 | Yes | Yes |
| FLAb | Per-study, mostly CC-BY | Varies | Varies |
| OAS | CC-BY 4.0 | Yes | Yes |
| THPdb | Academic-use | Yes | Check terms |
| PepLife | Academic-use | Yes | Check terms |
| PROTAC-DB | CC-BY 4.0 | Yes | Yes |
| PROTACpedia | Academic-use | Yes | Check terms |
| AOBase | Academic-use | Yes | Check terms |
| Retraction Watch | CC-BY 4.0 | Yes | Yes |
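
A minimal sketch of filtering by license, assuming the matrix above is available as a plain dictionary. The rows are transcribed from a subset of the matrix; the dictionary layout and the helpers `commercially_clear` and `needs_attribution` are hypothetical and only show one conservative way to read the "Commercial use" and "Attribution required" columns.

```python
# A subset of the license matrix, transcribed from the rows above:
# source -> (license, attribution_required, commercial_use)
LICENSE_MATRIX = {
    "ChEMBL": ("CC-BY-SA 3.0", "Yes", "Yes (share-alike)"),
    "PubChem BioAssay": ("Public Domain", "No", "Yes"),
    "FLAb": ("Per-study, mostly CC-BY", "Varies", "Varies"),
    "THPdb": ("Academic-use", "Yes", "Check terms"),
    "PROTAC-DB": ("CC-BY 4.0", "Yes", "Yes"),
}


def commercially_clear(source):
    """True only when the matrix says 'Yes' (including the share-alike case)."""
    _, _, commercial = LICENSE_MATRIX[source]
    return commercial.startswith("Yes")


def needs_attribution(source):
    """Conservative reading: anything other than an explicit 'No' needs attribution."""
    return LICENSE_MATRIX[source][1] != "No"


clear = sorted(s for s in LICENSE_MATRIX if commercially_clear(s))
review = sorted(s for s in LICENSE_MATRIX if not commercially_clear(s))
print("commercial use OK:", clear)
print("needs manual review:", review)
```

Sources marked "Varies" or "Check terms" land in the review list rather than being silently dropped, and share-alike sources such as ChEMBL still pass the commercial check but carry their own obligations on derived works.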

Coverage depth varies by modality based on what exists in public databases. Small molecules and CRISPR have decades of accumulated public data. Newer modalities (PROTACs, oligonucleotides, ADCs) have thin public coverage because the fields themselves are younger and much of the failure data remains proprietary. The Enterprise tier extracts from supplementary materials, patents, and conference abstracts to expand coverage for under-served modalities.
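
To make the skew concrete, the sketch below sums the rounded per-modality counts shown on the cards and compares them with the headline total. The counts are copied from this page; the variable names and the rounding-slack estimate (half a unit in the last displayed digit of each card) are assumptions added for illustration.

```python
# Rounded per-modality counts as displayed on the cards above.
ROUNDED_COUNTS = {
    "Small molecules": 85_200_000,
    "CRISPR": 37_600_000,
    "Clinical trials": 104_000,
    "Antibodies": 3_600,
    "Peptides": 18_800,
    "PROTACs": 824,
    "Oligonucleotides": 144,
}
HEADLINE_TOTAL = 122_979_405  # "negative findings" figure at the top of the page

# Assumed rounding slack: half a unit in the last displayed digit of each card.
ROUNDING_SLACK = 50_000 + 50_000 + 500 + 50 + 50

rounded_sum = sum(ROUNDED_COUNTS.values())
gap = HEADLINE_TOTAL - rounded_sum

# Share of the headline total held by the two "Rich" modalities.
rich_share = (ROUNDED_COUNTS["Small molecules"] + ROUNDED_COUNTS["CRISPR"]) / HEADLINE_TOTAL

print(f"sum of rounded counts: {rounded_sum:,}")
print(f"gap to headline total: {gap:,} (slack allowed: {ROUNDING_SLACK:,})")
print(f"small molecules + CRISPR share: {rich_share:.2%}")
```

The gap between the rounded sum and the headline total falls inside the assumed rounding slack, so the card figures and the total are mutually consistent at the displayed precision.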

Coverage as of 2026-05-25. The papers reference the 24–25 May 2026 snapshot; current numbers may differ slightly as ingestion continues.