Machine Learning

4 Articles

How Sensitive Is PPO to Reward Shaping?

Discover why your PPO agent’s impressive performance might be a fragile illusion of reward design. We show how small, seemingly innocuous changes to the reward function can dramatically alter both the learning curve and the final policy’s quality. Using a standard MuJoCo benchmark, we test PPO’s sensitivity by implementing three […]

Multi-Armed Bandit Problem: Your First Reinforcement Learning Agent in 100 Lines of C

Introduction: The landscape of modern machine learning education is dominated by Python. Its vast ecosystem of libraries like TensorFlow, PyTorch, and scikit-learn provides powerful abstractions that allow developers to build complex models with remarkable speed. While this is invaluable for productivity, it can also create a “black box” effect, where the fundamental mechanics of an […]

How Markov Decision Processes Power Reinforcement Learning

How Does an AI Learn to Make Smart Choices? Have you ever taught a pet a new trick? Imagine you’re training your dog to roll over. You give the command, and the dog, a bit confused at first, might just sit or lie down. You offer no treat. Then, it shuffles and accidentally rolls onto […]

Good Agents Gone Bad: The Dark Side of Reward Hacking in RL

Introduction: The Cheating Machines. Picture this: You’ve just built the perfect AI agent to play a boat racing game. Your reward function is crystal clear: finish the race as quickly as possible, with bonus points for hitting green power-up blocks along the way. You sit back, confident that your digital racer will soon be breaking […]