Media Summary: Artificial Intelligence (AI), 20 May 2021. Speaker: Rémy Portelas, INRIA (collaboration with Pierre-Yves Oudeyer, INRIA and Katja ...). This video is a 15min presentation of a survey paper on ... In this AI Research Roundup episode, Alex discusses the paper: 'A Matter of TASTE: Improving Coverage and Difficulty of Agent ...

TeachMyAgent: A Benchmark for Automatic - Detailed Analysis & Overview

TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL

In this talk, I present

Teacher Algorithms for Deep Reinforcement Learning Students | JRC Workshop 2021

Artificial Intelligence (AI) 20 May 2021 Speaker: Rémy Portelas, INRIA (collaboration with Pierre-Yves Oudeyer, INRIA and Katja ...

Automatic Curriculum Learning for Deep RL: a Short Survey

This video is a 15min presentation of a survey paper on

TASTE: Better Benchmarks for LLM Agents

In this AI Research Roundup episode, Alex discusses the paper: 'A Matter of TASTE: Improving Coverage and Difficulty of Agent ...

Interactive web demo of generalization in Deep Reinforcement Learning

TeachMyAgent: A Benchmark for Automatic

AdaPlanBench: Benchmark for LLM Agent Planning

In this AI Research Roundup episode, Alex discusses the paper: 'AdaPlanBench: Evaluating Adaptive Planning in Large ...

ALE: New Benchmark for Computer-Use Agents

In this AI Research Roundup episode, Alex discusses the paper: 'Agents' Last Exam' While modern LLMs excel at standard ...

Agent Evaluation & Benchmarks - Agentic AI MOOC 2025 Lecture 4 Summary

This lecture discusses the critical shift from evaluating static LLMs to complex AI agents that take action. It explores the vital role of ...

17. How to Actually Evaluate & Benchmark AI Agents (Evaluate & Benchmark)

In this video, we break down the definitive framework for evaluating and

TARGET-SFT Explained: The AI Training Breakthrough That Beats Standard Fine-Tuning

Most people think Supervised Fine-Tuning (SFT) is simple: show an AI the correct answer and train it to copy it. But what if that ...

How to Benchmark LLM Skills with an LLM-as-Judge

Run configurable skill