Backprop-free Pong: PC + distributional Hebbian plasticity vs. PPO: 57% vs. 59%, ~1500 lines from scratch [P]

“`html A new study explores the potential of a bio-plausible reinforcement learning (RL) approach in achieving performance comparable to more traditional methods…

By AI Maestro May 19, 2026 1 min read
Backprop-free Pong: PC + distributional Hebbian plasticity vs. PPO: 57% vs. 59%, ~1500 lines from scratch [P]

“`html

A new study explores the potential of a bio-plausible reinforcement learning (RL) approach in achieving performance comparable to more traditional methods like Proximal Policy Optimization (PPO).

  • The research compared a fully custom, backprop-free agent based on Hebbian plasticity and predictive coding with PPO.
  • Despite the agent’s bio-plausible architecture, it still struggled to match PPO’s performance of approximately 59% in playing Pong.
  • The study highlighted a key challenge: the stability versus adaptability trade-off inherent in Hebbian plasticity when dealing with non-stationary environments like self-play scenarios.
    • The research underscores the difficulty in achieving bio-plausible RL performance, even for simple tasks like Pong.
    • It suggests that while backprop-free methods can be effective, they may struggle in dynamic or evolving environments.
    • The study provides insights into how traditional RL techniques might continue to hold an advantage over biologically inspired approaches in certain contexts.

“`

Stay ahead of AI. Get the most important stories delivered to your inbox — no spam, no noise.

Name
Scroll to Top