“`html
A new study explores the potential of a bio-plausible reinforcement learning (RL) approach in achieving performance comparable to more traditional methods like Proximal Policy Optimization (PPO).
- The research compared a fully custom, backprop-free agent based on Hebbian plasticity and predictive coding with PPO.
- Despite the agent’s bio-plausible architecture, it still struggled to match PPO’s performance of approximately 59% in playing Pong.
- The study highlighted a key challenge: the stability versus adaptability trade-off inherent in Hebbian plasticity when dealing with non-stationary environments like self-play scenarios.
-
• The research underscores the difficulty in achieving bio-plausible RL performance, even for simple tasks like Pong.
• It suggests that while backprop-free methods can be effective, they may struggle in dynamic or evolving environments.
• The study provides insights into how traditional RL techniques might continue to hold an advantage over biologically inspired approaches in certain contexts.
“`
Source Read original →
Stay ahead of AI. Get the most important stories delivered to your inbox — no spam, no noise.

![Backprop-free Pong: PC + distributional Hebbian plasticity vs. PPO: 57% vs. 59%, ~1500 lines from scratch [P]](https://ai-maestro.online/wp-content/uploads/2026/05/backprop-free-pong-pc-distributional-hebbian-plasticity-vs-p-1024x1024.jpg)


