Evo Mechanistic Interpretability Visualizer

The Arc Institute and Goodfire have partnered to analyze Evo genomic foundation models using state-of-the-art mechanistic interpretability techniques. Goodfire developed a specialized interpreter model, a sparse autoencoder, that reveals how Evo models process information internally. This interpreter can identify meaningful units of model cognition (called "features") and steer these features to generate specific DNA sequences. Using a dataset of 100 bacterial genomes, the team created a visualizer that displays significant features corresponding to known genomic concepts. These features are ranked by how accurately their activations classify the matching genomic annotations, with many achieving very high F1 scores when tested on a comprehensive genome dataset.
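
To make the ranking step concrete, here is a minimal sketch of how features could be scored against an annotation with F1. It is not the team's actual pipeline: the function names, the activation threshold, and the toy data are all assumptions made for illustration. Each feature's per-nucleotide activations are binarized, compared with a per-nucleotide annotation mask, and the features are sorted by the resulting F1.

```python
from typing import Dict, List, Sequence, Tuple


def f1_score(pred: Sequence[bool], label: Sequence[bool]) -> float:
    """F1 between a predicted mask and an annotation mask (one entry per nucleotide)."""
    tp = sum(1 for p, y in zip(pred, label) if p and y)
    fp = sum(1 for p, y in zip(pred, label) if p and not y)
    fn = sum(1 for p, y in zip(pred, label) if not p and y)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def rank_features_by_f1(
    activations: Dict[int, Sequence[float]],
    label: Sequence[bool],
    threshold: float = 0.0,  # hypothetical cutoff: a feature "fires" above this value
) -> List[Tuple[int, float]]:
    """Return (feature_id, F1) pairs sorted from best to worst."""
    scores = []
    for feature_id, values in activations.items():
        fired = [v > threshold for v in values]
        scores.append((feature_id, f1_score(fired, label)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy annotation: positions 2..5 belong to some annotated region (invented data).
label = [False, False, True, True, True, True, False, False]

# Toy activations for three hypothetical features over the same 8 positions.
activations = {
    0: [0.0, 0.0, 1.2, 0.9, 1.1, 0.8, 0.0, 0.0],  # fires exactly on the region
    1: [0.9, 0.8, 0.7, 0.6, 0.0, 0.0, 0.0, 0.0],  # overlaps half of the region
    2: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # never fires
}

ranking = rank_features_by_f1(activations, label)
for feature_id, score in ranking:
    print(f"feature {feature_id}: F1 = {score:.2f}")
```

F1 is a natural choice for this kind of ranking because annotated regions usually cover a small fraction of a genome, so plain accuracy would reward a feature that never fires.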