About Nullary

Mission

The hardest data to obtain in drug discovery, the record of what has already been tried and failed, sits in internal databases at Pfizer, Roche, Merck, Novartis, and the rest of the top 20 pharma companies. Those companies use this institutional knowledge as a structural advantage: their program portfolios are shaped by decades of accumulated failure data that competitors don't have.

Smaller biotechs, academic labs, and computational chemists outside the top 20 don't have this. They run virtual screens against targets that have been hit 5,000 times before. They commit to programs without knowing which approaches have already failed. They train ML models on positive data alone or on computational decoys, because real measured negatives are scattered across 25+ public databases or locked behind enterprise licenses.

Nullary aggregates the public-database record of negative results (122M findings across small molecules, CRISPR, antibodies, peptides, PROTACs, oligonucleotides, and clinical trials) into a single queryable layer with full provenance. Two technical reports validate the methodology and report their own negative findings honestly. See /research.

The mission is simple: democratize access to the institutional knowledge that big pharma has used as a competitive moat. Make negative-results intelligence available to every team that needs it (biotech computational leads, academic researchers, ML practitioners, individual scientists) at prices that fit their budgets.

This is the asymmetry we're fixing.

Why this matters

ML models for drug discovery trained on positive results alone perform worse than models trained on both positive and negative results. Selectivity prediction, ADMET prediction, antibody developability prediction, and target tractability prediction all improve with labeled negatives. The data exists; nobody has assembled it as a queryable layer until now.

Beyond ML, working drug discovery teams query for “what's been tried against this target” before launching a new program. That query should return everything, successes and failures alike, in seconds, with full citations. Today it takes hours across multiple databases and is often impossible for older or obscure targets.
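As a rough illustration of that kind of query, here is a minimal Python sketch. It is not Nullary's actual API or schema: the `Finding` fields, the `tried_against` function, and the sample records are all hypothetical, chosen only to show one target lookup that returns both outcomes, with each row carrying its source citation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One measured result plus its provenance (hypothetical schema)."""
    target: str      # target name, e.g. a gene symbol
    modality: str    # e.g. "small molecule", "CRISPR", "antibody"
    outcome: str     # must be "positive" or "negative" in this sketch
    source: str      # originating public database
    source_id: str   # record identifier within that database


def tried_against(findings, target):
    """Return every finding for a target, grouped by outcome."""
    result = {"positive": [], "negative": []}
    for f in findings:
        # Case-insensitive match on the target name.
        if f.target.upper() == target.upper():
            result[f.outcome].append(f)
    return result


# Made-up records for illustration only; names and identifiers are placeholders.
FINDINGS = [
    Finding("TARGET_A", "small molecule", "negative", "ChEMBL", "example-1"),
    Finding("TARGET_A", "CRISPR", "positive", "DepMap", "example-2"),
    Finding("TARGET_B", "antibody", "negative", "PubChem", "example-3"),
]

report = tried_against(FINDINGS, "target_a")
for outcome, rows in report.items():
    for f in rows:
        # Each printed row ends with its citation: source database and record id.
        print(f"{outcome}: {f.modality} [{f.source}:{f.source_id}]")
```

Keeping the source and record identifier on every row is what lets a single answer cover both successes and failures while still being traceable to the original database entry.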

Who built this

Nullary is built by the Nullary team. We come from computational and biotech backgrounds, and we care about negative results because the absence of this data has personally cost us time and led us into dead-ended programs.

Contact

Acknowledgements

Nullary builds on data from ChEMBL (EMBL-EBI), PubChem (NCBI), DepMap (Broad Institute), ClinicalTrials.gov (NIH), and 20+ other public sources. Every record cites its origin. This work would not be possible without decades of investment in open scientific data by these institutions.