AI as a Co-Scientist: What Google’s New Research Means for Internal Auditing

On February 19, 2025, Google Research unveiled a groundbreaking innovation: the AI co-scientist, a multi-agent system powered by Gemini 2.0. Designed to collaborate with researchers, the system can generate novel hypotheses, develop research plans, and propose biomedical solutions that have since been validated experimentally. While its initial applications are rooted in the sciences, the underlying principles have significant implications for the future of internal auditing.

The AI co-scientist is not merely an advanced literature review tool. It is a dynamic, reasoning system that mimics the scientific method through a coalition of specialized agents—such as Generation, Reflection, Ranking, and Evolution. These agents interact, debate, and refine outputs based on feedback and defined research goals. In practice, this allows the system to operate as a virtual research partner, iterating on ideas and improving the quality of its hypotheses over time.
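To make the pattern concrete, here is a minimal sketch of that generate-reflect-rank-evolve loop in Python. It is purely illustrative: the agent names come from Google's description of the system, but everything else below is a stand-in (in the real system each agent is an LLM-driven component built on Gemini 2.0, not a simple function).

```python
# Illustrative sketch of a generate -> reflect -> rank -> evolve loop.
# NOT Google's implementation; the logic here is a placeholder for
# what are, in practice, LLM-driven agents debating and refining ideas.

import random
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    score: float = 0.0      # assigned by the Ranking step
    critique: str = ""      # assigned by the Reflection step

def generate(goal: str, n: int = 4) -> list[Hypothesis]:
    # Generation agent: propose candidate hypotheses for the research goal.
    return [Hypothesis(f"Candidate {i} for: {goal}") for i in range(n)]

def reflect(h: Hypothesis) -> Hypothesis:
    # Reflection agent: critique each hypothesis (placeholder note here).
    h.critique = "check novelty and supporting evidence"
    return h

def rank(pool: list[Hypothesis]) -> list[Hypothesis]:
    # Ranking agent: score and order hypotheses (the real system uses
    # tournament-style comparisons; a random score stands in here).
    for h in pool:
        h.score = random.random()
    return sorted(pool, key=lambda h: h.score, reverse=True)

def evolve(best: Hypothesis) -> Hypothesis:
    # Evolution agent: refine the top hypothesis using its critique.
    return Hypothesis(f"{best.text} (refined: {best.critique})")

def co_scientist(goal: str, rounds: int = 3) -> Hypothesis:
    pool = generate(goal)
    for _ in range(rounds):
        pool = rank([reflect(h) for h in pool])
        pool.append(evolve(pool[0]))   # feed refinements back into the pool
    return rank(pool)[0]

print(co_scientist("repurpose existing drugs for leukemia").text)
```

The point of the loop is that no single pass produces the final answer; candidate ideas are repeatedly critiqued, compared, and rebuilt against the stated research goal.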

What makes this system particularly noteworthy is its real-world validation. Google applied it to complex biomedical problems—like drug repurposing for leukemia and identifying treatment targets for liver fibrosis—and several of its AI-generated proposals were experimentally confirmed by independent labs. In one case, the AI even independently rediscovered a scientific mechanism that was, at the time, unpublished but under study by a leading research group. This level of foresight and alignment with expert thinking suggests that such systems can generate high-quality, non-obvious insights from vast, interdisciplinary data sets.

For internal auditors, this development signals a powerful shift in how emerging technologies can support decision-making and risk assessment. Imagine applying similar multi-agent reasoning systems to identify control weaknesses, simulate fraud scenarios, or assess compliance risks across complex regulatory landscapes. Instead of relying solely on retrospective analyses or manual sampling, auditors could soon collaborate with AI systems capable of generating and testing control hypotheses in real time—grounded in both historical data and current business context.
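As a purely hypothetical illustration, the snippet below sketches what that generate-and-test pattern could look like for one control area. The control hypotheses, data fields, and thresholds are invented for the example; they are not drawn from Google's system, any audit standard, or real client data.

```python
# Hypothetical example: candidate control-weakness hypotheses, each paired
# with a test run against transaction data, then ranked by supporting
# evidence so the auditor can prioritize follow-up instead of sampling blindly.

from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float
    approver: str
    requester: str

def self_approval(t: Transaction) -> bool:
    # Possible segregation-of-duties breach: requester approved own request.
    return t.approver == t.requester

def missing_approval(t: Transaction) -> bool:
    # Possible bypass of approval controls on high-value payments.
    return t.amount > 10_000 and not t.approver

hypotheses = {
    "Users can approve their own requests": self_approval,
    "High-value payments bypass approval": missing_approval,
}

transactions = [
    Transaction(12_500, "", "j.doe"),
    Transaction(800, "a.lee", "j.doe"),
    Transaction(4_200, "j.doe", "j.doe"),
]

for claim, test in hypotheses.items():
    hits = [t for t in transactions if test(t)]
    print(f"{claim}: {len(hits)} supporting transaction(s)")
```

In a multi-agent setting, the hypothesis list itself would be generated, critiqued, and refined by the system rather than hand-written, with the test results feeding back into the next round of proposals.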

Moreover, the AI co-scientist shows how advanced reasoning models can improve with additional test-time compute: the longer the system spends working on a problem, the better its outputs become, measured not only by automated metrics but also through expert human evaluation. This pattern of iterative improvement driven by structured feedback loops closely mirrors how internal audit functions mature over time, through repeated cycles of planning, testing, feedback, and refinement.

Find out more in Google Research's announcement of the AI co-scientist.