Combining tree-search, generative models, and Nash bargaining concepts in game-theoretic reinforcement learning
Multiagent reinforcement learning (MARL) has benefited significantly from population-based and game-theoretic training regimes. One approach, Policy-Space Response Oracles (PSRO), employs standard reinforcement learning to compute response policies via approximate best responses and combines them via meta-strategy selection. We augment PSRO by adding a novel search procedure with generative sampling of world states, and introduce two new meta-strategy solvers based on the Nash bargaining solution. We evaluate PSRO’s ability to compute approximate Nash equilibria and its performance in two negotiation games: Colored Trails and Deal or No Deal. We conduct behavioral studies where human participants negotiate with our agents (N=346). We find that search with generative modeling finds stronger policies during both training time and test time, enables online Bayesian co-player prediction, and can produce agents that, when negotiating with humans, achieve social welfare comparable to that of humans trading among themselves.
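For readers unfamiliar with the Nash bargaining solution mentioned above, here is a minimal sketch of the underlying idea: among a finite set of joint outcomes, pick the one that maximizes the product of each player's gain over a disagreement payoff. This is not the paper's meta-strategy solver, which is not described in the abstract. The function name `nash_bargaining_outcome`, the two-player restriction, and the payoff values are all assumptions chosen for illustration.

```python
from typing import Optional, Sequence, Tuple

Payoff = Tuple[float, float]


def nash_bargaining_outcome(
    payoffs: Sequence[Payoff],
    disagreement: Payoff = (0.0, 0.0),
) -> Optional[int]:
    """Return the index of the outcome with the largest Nash product.

    Hypothetical helper: considers only pure outcomes for two players.
    Outcomes that leave either player below the disagreement payoff are
    skipped; returns None if no outcome is acceptable to both players.
    """
    best_index: Optional[int] = None
    best_product = -1.0
    for index, (u1, u2) in enumerate(payoffs):
        gain1 = u1 - disagreement[0]
        gain2 = u2 - disagreement[1]
        if gain1 < 0 or gain2 < 0:
            continue  # not individually rational for at least one player
        product = gain1 * gain2
        if product > best_product:
            best_index, best_product = index, product
    return best_index


# Illustrative (made-up) payoffs for four possible deals.
example_payoffs = [(10.0, 0.0), (6.0, 5.0), (4.0, 6.0), (0.0, 9.0)]
chosen = nash_bargaining_outcome(example_payoffs)
print(chosen, example_payoffs[chosen])
```

Maximizing the product rather than the sum favors deals in which both players gain, which is why the lopsided outcomes at either end of the list are not selected.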
The widespread use of spreadsheet environments by billions of users presents a unique opportunity for formula-authoring assistance. Although large language models, such as Codex, can assist in general-purpose languages, they are expensive to train and challenging to deploy due to their large model sizes (up to billions of parameters). Moreover, they require hundreds of gigabytes […]
This space uses T2M-GPT models based on Vector Quantised-Variational AutoEncoder (VQ-VAE) and Generative Pre-trained Transformer (GPT) for human motion generation from textual descriptions.
Automated Program Repair (APR) can help developers automatically generate patches for bugs. Due to the impressive performance obtained using Large Pre-Trained Language Models (LLMs) on many code-related tasks, researchers have started to directly use LLMs for APR. However, prior approaches simply repeatedly sample the LLM given the same constructed input/prompt created from the original […]