I am currently in the final year of my PhD, dividing my time between University College London (UCL) and Meta AI (FAIR). At UCL, I am supervised by Tim Rocktäschel and am a member of the UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab. I am also part of the ELLIS PhD & Postdoc Program.

I hold an MSc in Computer Science from the University of Oxford, where I worked in the Whiteson Research Lab, advised by Shimon Whiteson. Prior to that, I obtained Master’s and Bachelor’s degrees in Informatics and Applied Mathematics from Yerevan State University. I have previously held research and development engineering positions at Reddit, Mentor, Toptal, and USC Information Sciences Institute.

My research aims to train and evaluate robust AI agents capable of adapting to new environments and coordinating with others. To achieve this, I use techniques from reinforcement learning, open-endedness, and multi-agent learning. My most recent work has centered on using these methodologies to improve the robustness and safety of LLMs.

Contact: mikayel [at] samvelyan [dot] com

Selected Publications

See Google Scholar for more publications.

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
M Samvelyan*, S Raparthy*, A Lupu*, E Hambro, A Markosyan, M Bhatt, Y Mao, M Jiang, J Parker-Holder, J Foerster, T Rocktäschel, R Raileanu
arXiv, 2024

@misc{samvelyan2024rainbow,
   title={Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts}, 
   author={Mikayel Samvelyan and Sharath Chandra Raparthy and Andrei Lupu and Eric Hambro and Aram H. Markosyan and Manish Bhatt and Yuning Mao and Minqi Jiang and Jack Parker-Holder and Jakob Foerster and Tim Rocktäschel and Roberta Raileanu},
   year={2024},
   eprint={2402.16822},
   archivePrefix={arXiv},
   primaryClass={cs.CL}
}

Multi-Agent Diagnostics for Robustness via Illuminated Diversity
M Samvelyan*, D Paglieri*, M Jiang, J Parker-Holder, T Rocktäschel
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024

@misc{samvelyan2024multiagent,
   title={Multi-Agent Diagnostics for Robustness via Illuminated Diversity}, 
   author={Mikayel Samvelyan and Davide Paglieri and Minqi Jiang and Jack Parker-Holder and Tim Rocktäschel},
   year={2024},
   eprint={2401.13460},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}

Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
M Matthews, M Beukman, B Ellis, M Samvelyan, M Jackson, S Coward, J Foerster
arXiv, 2024

@article{matthews2024craftax,
   title={Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning},
   author={Michael Matthews and Michael Beukman and Benjamin Ellis and Mikayel Samvelyan and Matthew Jackson and Samuel Coward and Jakob Foerster},
   journal={arXiv preprint},
   year={2024},
}

JaxMARL: Multi-Agent RL Environments in JAX
A Rutherford, B Ellis, M Gallici, J Cook, A Lupu, G Ingvarsson, T Willi, A Khan, C Schroeder de Witt, A Souly, S Bandyopadhyay, M Samvelyan, M Jiang, R Lange, S Whiteson, B Lacerda, N Hawes, T Rocktäschel, C Lu, J Foerster
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024

@misc{rutherford2023jaxmarl,
   title={JaxMARL: Multi-Agent RL Environments in JAX}, 
   author={Alexander Rutherford and Benjamin Ellis and Matteo Gallici and Jonathan Cook and Andrei Lupu and Gardar Ingvarsson and Timon Willi and Akbir Khan and Christian Schroeder de Witt and Alexandra Souly and Saptarashmi Bandyopadhyay and Mikayel Samvelyan and Minqi Jiang and Robert Tjarko Lange and Shimon Whiteson and Bruno Lacerda and Nick Hawes and Tim Rockt{\"a}schel and Chris Lu and Jakob Nicolaus Foerster},
   year={2023},
   eprint={2311.10090},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning
B Ellis, J Cook, S Moalla, M Samvelyan, M Sun, A Mahajan, J Foerster, S Whiteson
Conference on Neural Information Processing Systems (NeurIPS), 2023

@inproceedings{ellis2023smacv2,
   title={{SMAC}v2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning},
   author={Benjamin Ellis and Jonathan Cook and Skander Moalla and Mikayel Samvelyan and Mingfei Sun and Anuj Mahajan and Jakob Nicolaus Foerster and Shimon Whiteson},
   booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
   year={2023},
   url={https://openreview.net/forum?id=5OjLGiJW3u}
}

Mix-ME: Quality-Diversity for Multi-Agent Learning
G Ingvarsson, M Samvelyan, M Flageat, B Lim, A Cully, T Rocktäschel
Agent Learning in Open-Endedness (ALOE) Workshop @ NeurIPS, 2023

@inproceedings{ingvarsson2023mixme,
   title={Mix-{ME}: Quality-Diversity for Multi-Agent Learning},
   author={Gar{\dh}ar Ingvarsson and Mikayel Samvelyan and Manon Flageat and Bryan Lim and Antoine Cully and Tim Rockt{\"a}schel},
   booktitle={Second Agent Learning in Open-Endedness Workshop},
   year={2023},
   url={https://openreview.net/forum?id=acD8BxMjwV}
}

MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
M Samvelyan, A Khan, M Dennis, M Jiang, J Parker-Holder, J Foerster, R Raileanu, T Rocktäschel
International Conference on Learning Representations (ICLR), 2023

@inproceedings{samvelyan2023maestro,
   title={{MAESTRO}: Open-Ended Environment Design for Multi-Agent Reinforcement Learning},
   author={Mikayel Samvelyan and Akbir Khan and Michael D Dennis and Minqi Jiang and Jack Parker-Holder and Jakob Nicolaus Foerster and Roberta Raileanu and Tim Rockt{\"a}schel},
   booktitle={International Conference on Learning Representations},
   year={2023},
   url={https://openreview.net/forum?id=sKWlRDzPfd7}
}

GriddlyJS: A Web IDE for Reinforcement Learning
C Bamford, M Jiang, M Samvelyan, T Rocktäschel
Conference on Neural Information Processing Systems (NeurIPS), 2022

@inproceedings{bamford2022griddlyjs,
   title={Griddly{JS}: A Web {IDE} for Reinforcement Learning},
   author={Christopher Bamford and Minqi Jiang and Mikayel Samvelyan and Tim Rockt{\"a}schel},
   booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
   year={2022},
   url={https://openreview.net/forum?id=YmacJv0i_UR}
}

Evolving Curricula with Regret-Based Environment Design
J Parker-Holder*, M Jiang*, M Dennis, M Samvelyan, J Foerster, E Grefenstette, T Rocktäschel
International Conference on Machine Learning (ICML), 2022

@article{parkerholder2022evolving,
   title={Evolving Curricula with Regret-Based Environment Design},
   author={Parker-Holder, Jack and Jiang, Minqi and Dennis, Michael D and Samvelyan, Mikayel and Foerster, Jakob Nicolaus and Grefenstette, Edward and Rockt{\"a}schel, Tim},
   journal={arXiv preprint arXiv:2203.01302},
   year={2022}
}

Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning
M Matthews, M Samvelyan, J Parker-Holder, E Grefenstette, T Rocktäschel
Conference on Lifelong Learning Agents (CoLLAs), 2022

@misc{matthews2022hierarchical,
   url = {https://arxiv.org/abs/2207.11584},
   author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rocktäschel, Tim},
   keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
   title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning},
   publisher = {arXiv},
   year = {2022},
}

Generalization in Cooperative Multi-Agent Systems
A Mahajan, M Samvelyan, T Gupta, B Ellis, M Sun, T Rocktäschel, S Whiteson
arXiv, 2022

@article{mahajan2022generalization, 
   title={Generalization in Cooperative Multi-Agent Systems}, 
   author={Mahajan, Anuj and Samvelyan, Mikayel and Gupta, Tarun and Ellis, Benjamin and Sun, Mingfei and Rockt{\"a}schel, Tim and Whiteson, Shimon}, 
   journal={arXiv preprint arXiv:2202.00104}, 
   year={2022},
}

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
M Samvelyan, R Kirk, V Kurin, J Parker-Holder, M Jiang, E Hambro, F Petroni, H Küttler, E Grefenstette, T Rocktäschel
Conference on Neural Information Processing Systems (NeurIPS), 2021

@inproceedings{samvelyan2021minihack,
   title={MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research},
   author={Mikayel Samvelyan and Robert Kirk and Vitaly Kurin and Jack Parker-Holder and Minqi Jiang and Eric Hambro and Fabio Petroni and Heinrich K{\"u}ttler and Edward Grefenstette and Tim Rockt{\"a}schel},
   booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
   year={2021},
   url={https://openreview.net/forum?id=skFwlyefkWJ}
}

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
A Mahajan, M Samvelyan, L Mao, V Makoviychuk, A Garg, J Kossaifi, S Whiteson, Y Zhu, A Anandkumar
International Conference on Machine Learning (ICML), 2021

@inproceedings{mahajan2021tesseract,
   title = {Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning},
   author = {Mahajan, Anuj and Samvelyan, Mikayel and Mao, Lei and Makoviychuk, Viktor and Garg, Animesh and Kossaifi, Jean and Whiteson, Shimon and Zhu, Yuke and Anandkumar, Animashree},
   booktitle = {Proceedings of the 38th International Conference on Machine Learning},
   publisher = {PMLR},
   volume = {139},
   pages = {7301--7312},
   year = {2021},
}

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
Journal of Machine Learning Research (JMLR), 2020

@article{rashid20monotonic,
   author  = {Tabish Rashid and Mikayel Samvelyan and Christian Schroeder de Witt and Gregory Farquhar and Jakob Foerster and Shimon Whiteson},
   title   = {Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning},
   journal = {Journal of Machine Learning Research},
   year    = {2020},
   volume  = {21},
   number  = {178},
   pages   = {1--51},
}

MAVEN: Multi-Agent Variational Exploration
A Mahajan, T Rashid, M Samvelyan, S Whiteson
Conference on Neural Information Processing Systems (NeurIPS), 2019

@incollection{mahajan2019maven,
   title = {{MAVEN}: {Multi}-{Agent} {Variational} {Exploration}},
   author = {Mahajan, Anuj and Rashid, Tabish and Samvelyan, Mikayel and Whiteson, Shimon},
   booktitle = {Advances in Neural Information Processing Systems 32},
   pages = {7611--7622},
   year = {2019},
}

The StarCraft Multi-Agent Challenge
M Samvelyan*, T Rashid*, C Schroeder de Witt, G Farquhar, N Nardelli, T Rudner, C Hung, P Torr, J Foerster, S Whiteson
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2019

@inproceedings{samvelyan2019smac,
   title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
   author = {Samvelyan, Mikayel and Rashid, Tabish and Schroeder de Witt, Christian and Farquhar, Gregory and Nardelli, Nantas and Rudner, Tim G. J. and Hung, Chia-Man and Torr, Philip H. S. and Foerster, Jakob and Whiteson, Shimon},
   booktitle = {Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems},
   pages = {2186--2188},
   year = {2019},
}

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
International Conference on Machine Learning (ICML), 2018

@inproceedings{rashid18qmix,
   title = {{QMIX}: {Monotonic} {Value} {Function} {Factorisation} {for} {Deep} {Multi}-{Agent} {Reinforcement} {Learning}},
   author = {Rashid, Tabish and Samvelyan, Mikayel and Schroeder de Witt, Christian and Farquhar, Gregory and Foerster, Jakob and Whiteson, Shimon},
   booktitle = {Proceedings of the 35th International Conference on Machine Learning},
   publisher = {PMLR},
   volume = {80},
   pages = {4295--4304},
   year = {2018},
}

Teaching

  • Spring 2020 - Data Structures (TA)
  • Fall 2019 - Machine Learning (Lecturer)
  • Fall 2018 - Machine Learning (Lecturer)
  • Fall 2018 - Operating Systems (TA)