Algorithm that gets ‘under the hood’ of AI models could effectively steer their responses

Nature News

Beaglehole, D., Radhakrishnan, A., Boix-Adserà, E. & Belkin, M. Science 391, 787–792 (2026).
Subramani, N., Suresh, N. & Peters, M. E. In Findings of the Association for Computational Linguistics: ACL 2022 (eds Muresan, S., Nakov, P. & Villavicencio, A.) 566–581 (ACM, 2022).
Marks, S. & Tegmark, M. In Proc. 1st Conf. Lang. Model. (COLM, 2024).
Radhakrishnan, A., Beaglehole, D., Pandit, P. & Belkin, M. Science 383, 1461–1467 (2024). …

Original source: Nature News