Although these increases in performance may not be obvious to a non-expert, they are meaningful, and larger models could have even greater latent capabilities. Base ESM3 models can be prompted to perform difficult tasks despite not having been explicitly optimized for these objectives: the properties we evaluate generative outputs on (high pTM, low backbone cRMSD, and adherence to multimodal prompting) are only seen by the model indirectly during pre-training. Finetuning could therefore elicit even greater capability differences with larger models.
Preference-tuned models solve double the atomic coordination tasks compared to base models (Fig. 3A). While the base models show differences in the fraction of tasks solved (9.5% for 1.4B, 19.0% for 7B, 26.8% for 98B; Fig. 3A), a much larger capability difference is revealed through alignment (Fig. 3B). The preference-tuned models are much better at generating the correct structures, with a $2.5$-$3.5\times$ improvement in the fraction of tasks solved.
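A fraction-of-tasks-solved comparison like the one above can be sketched in a few lines. This is a minimal illustration only: the field names, data, and success thresholds (pTM and cRMSD cutoffs) are assumptions for the sketch, not the paper's exact criteria or values.

```python
from dataclasses import dataclass

# Hypothetical per-generation record; field names are illustrative,
# not the paper's actual data format.
@dataclass
class Generation:
    task_id: str
    ptm: float    # predicted TM-score of the generated structure
    crmsd: float  # backbone cRMSD to the prompt, in angstroms

def solved_tasks(generations, ptm_min=0.8, crmsd_max=1.5):
    """A task counts as solved if any generation for it passes both
    thresholds (assumed cutoffs; the paper's may differ)."""
    return {g.task_id for g in generations
            if g.ptm >= ptm_min and g.crmsd <= crmsd_max}

def fraction_solved(generations, all_task_ids):
    return len(solved_tasks(generations)) / len(all_task_ids)

# Toy example: base vs. preference-tuned outputs on the same tasks.
tasks = ["t1", "t2", "t3", "t4"]
base = [Generation("t1", 0.85, 1.2), Generation("t2", 0.60, 3.0),
        Generation("t3", 0.70, 2.5), Generation("t4", 0.75, 1.8)]
tuned = [Generation("t1", 0.90, 0.9), Generation("t2", 0.84, 1.1),
         Generation("t3", 0.82, 1.4), Generation("t4", 0.78, 1.6)]

improvement = fraction_solved(tuned, tasks) / fraction_solved(base, tasks)
print(improvement)  # 3.0 here: 3/4 tasks solved vs. 1/4
```

In practice, each task would have many sampled generations, so the set-based `solved_tasks` (any passing sample counts) mirrors how a best-of-N evaluation is typically scored.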
These results demonstrate that preference tuning extracts latent capability in the models. The capability of larger models to solve challenging tasks becomes far more apparent after alignment. Since alignment can be performed with arbitrary objectives, this indicates a general ability to respond to finetuning that greatly improves with scale.