A Meta-Learning Approach for Preserving and Transferring Beneficial Behaviors in Asynchronous Multi-Agent Reinforcement Learning

Title A Meta-Learning Approach for Preserving and Transferring Beneficial Behaviors in Asynchronous Multi-Agent Reinforcement Learning
Summary Develop a meta-learning system that preserves beneficial behaviors discovered by individual agents and adapts them for transfer across a population in asynchronous reinforcement learning.
Keywords Meta-Learning, Asynchronous Reinforcement Learning, Behavior Preservation, Adaptive Aggregation, Multi-Agent Learning
TimeFrame
References Mnih, V. et al. (2016). Asynchronous Methods for Deep Reinforcement Learning. ICML 2016.

Finn, C., Abbeel, P., & Levine, S. (2017). Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017.

Gupta, A. et al. (2018). Meta-Reinforcement Learning of Structured Exploration Strategies. NeurIPS 2018.

Qi, P. (2024). Model Aggregation Techniques in Federated Learning: A Comprehensive Survey. Future Generation Computer Systems, 139, 1-15.

Wu, H. et al. (2024). Adaptive Multi-Agent Reinforcement Learning for Flexible Resource Management. Applied Energy, 374, 121-135.

OpenAI Gym. https://www.gymlibrary.ml/
Ray RLLib. https://docs.ray.io/en/latest/rllib.html
PyTorch. https://pytorch.org/

Prerequisites Deep learning course; good programming knowledge in Python; good knowledge of ML, preferably including meta-learning/RL and OpenAI Gym.
Author
Supervisor Alexander Galozy
Level Master
Status Open


Asynchronous reinforcement learning (AsyncRL) allows multiple agents to explore environments in parallel, but standard aggregation methods risk diluting rare yet beneficial behaviors discovered by individual actors. This project proposes a meta-learning approach in which a meta-model dynamically adapts how much each actor’s updates influence the global policy, based on the actor’s state, the novelty of its behavior, and its potential usefulness to other agents. The meta-model aims to preserve advantageous behaviors quickly while evaluating their transferability, improving learning efficiency, stability, and knowledge propagation across the agent population. Experiments will be conducted in benchmark environments such as CartPole, LunarLander, and Pong, comparing the meta-learning approach against standard aggregation baselines. The project investigates how population-level adaptive weighting can balance exploration and exploitation, effectively generalizing classical single-agent meta-RL approaches such as First-Explore to multi-agent asynchronous learning. The outcomes include insights into behavior preservation, transferability, and scalable multi-agent learning.
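To make the proposed aggregation step concrete, the sketch below weights each actor’s parameter update with a small meta-network before applying it to the global policy. This is a minimal illustration only: the two input features (novelty and advantage), the network shape, and the names MetaWeigher and aggregate are assumptions for exposition, not the project’s fixed design.

```python
# Minimal sketch of meta-weighted aggregation of asynchronous actor
# updates. All names, feature choices, and shapes are illustrative
# assumptions, not a committed design.
import torch
import torch.nn as nn

class MetaWeigher(nn.Module):
    """Maps per-actor features (here: novelty, advantage) to an
    aggregation weight in (0, 1)."""
    def __init__(self, n_features: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16), nn.Tanh(),
            nn.Linear(16, 1), nn.Sigmoid(),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

def aggregate(global_params, actor_updates, actor_features, weigher, lr=1e-3):
    """Apply a weighted sum of per-actor updates to the global policy,
    in place of the uniform averaging of standard AsyncRL baselines."""
    weights = weigher(actor_features).squeeze(1)   # shape: (n_actors,)
    with torch.no_grad():
        for p_idx, p in enumerate(global_params):
            step = sum(w * upd[p_idx] for w, upd in zip(weights, actor_updates))
            p.add_(lr * step)
    return weights

# Toy usage: 3 actors, a 2-tensor "policy", random updates and features.
policy = [torch.zeros(4), torch.zeros(2)]
updates = [[torch.randn(4), torch.randn(2)] for _ in range(3)]
features = torch.randn(3, 2)   # placeholder [novelty, advantage] per actor
print(aggregate(policy, updates, features, MetaWeigher()))
```

A uniform average corresponds to the meta-network outputting equal weights; the hypothesis under study is that learned, feature-dependent weights can keep rare but useful updates from being averaged away.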

Research Questions:

How can a meta-model preserve beneficial behaviors discovered by individual agents in asynchronous RL?

How can the meta-model evaluate and transfer these behaviors to improve other agents’ learning? (One toy scoring scheme is sketched after these questions.)

Does meta-learning-based adaptive aggregation improve population-level learning efficiency, stability, and sample efficiency compared to standard methods?
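Relating to the first two questions, one simple way a meta-model could judge an actor’s trajectory is count-based novelty over discretized states combined with mean advantage as a usefulness proxy. The binning scheme, the Counter-based visitation statistics, and the function score_trajectory below are illustrative assumptions only.

```python
# One illustrative scoring of an actor's trajectory: count-based novelty
# over discretized states plus mean advantage as a usefulness proxy.
# The binning and feature definitions are assumptions, not a fixed design.
import numpy as np
from collections import Counter

state_counts: Counter = Counter()   # population-wide visitation counts

def score_trajectory(states: np.ndarray, advantages: np.ndarray,
                     n_bins: int = 10) -> dict:
    """Return novelty (mean inverse visitation count of discretized
    states) and usefulness (mean advantage) for one trajectory."""
    keys = [tuple(row) for row in np.floor(states * n_bins).astype(int)]
    state_counts.update(keys)
    novelty = float(np.mean([1.0 / state_counts[k] for k in keys]))
    return {"novelty": novelty, "usefulness": float(np.mean(advantages))}

# Toy usage: a 5-step trajectory with 2-dimensional states in [0, 1).
print(score_trajectory(np.random.rand(5, 2), np.random.randn(5)))
```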


Outcomes:

Functional meta-model for adaptive aggregation of actor updates, preservation of rare but beneficial behaviors in the global policy, and transfer of useful behaviors across agents

Benchmark evaluation on standard RL environments (CartPole, LunarLander, Pong)

Metrics: return, sample efficiency, stability, behavior retention, transfer success (candidate definitions for the last two are sketched after this list)

Open-source, reproducible codebase

Complete thesis documentation with methodology, experiments, and analysis
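Since behavior retention and transfer success are not standard off-the-shelf metrics, the sketch below gives one possible operationalization of each. The ratio-based definitions and function names are assumptions the thesis would need to refine, not established benchmark metrics.

```python
# Illustrative definitions of two of the listed metrics; the ratio-based
# formulas below are assumptions, not established benchmark metrics.
import numpy as np

def behavior_retention(returns_before: np.ndarray,
                       returns_after: np.ndarray) -> float:
    """Fraction of a discovered behavior's return that survives
    aggregation into the global policy (1.0 = fully preserved)."""
    return float(np.mean(returns_after) / (np.mean(returns_before) + 1e-8))

def transfer_success(recipient_returns: np.ndarray,
                     baseline_returns: np.ndarray) -> float:
    """Relative return improvement of recipient agents after a
    behavior is transferred, versus their pre-transfer baseline."""
    base = float(np.mean(baseline_returns))
    return (float(np.mean(recipient_returns)) - base) / (abs(base) + 1e-8)

# Toy usage with placeholder episode returns.
print(behavior_retention(np.array([200.0, 210.0]), np.array([190.0, 205.0])))
print(transfer_success(np.array([120.0, 130.0]), np.array([100.0, 105.0])))
```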