Backprop-free Pong: PC + distributional Hebbian plasticity vs. PPO: 57% vs. 59%, ~1500 lines from scratch [P]

A new study explores whether a biologically plausible reinforcement learning (RL) agent can reach performance comparable to a standard method, Proximal Policy Optimization (PPO).

The study compares a fully custom, backprop-free agent, built on predictive coding (PC) and distributional Hebbian plasticity and written from scratch in roughly 1,500 lines, against PPO on Pong.
The agent came close to PPO but did not match it: the reported scores are 57% for the backprop-free agent versus approximately 59% for PPO.
The study highlights a key challenge: Hebbian plasticity must trade stability against adaptability, and that balance is hard to strike in non-stationary settings such as self-play, where the opponent keeps changing as it learns.
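The stability versus adaptability trade-off can be illustrated with a toy example. The sketch below is not the project's agent: it assumes a single linear neuron trained with Oja's rule, a textbook normalized Hebbian update, on synthetic 2-D inputs whose dominant direction switches abruptly, loosely standing in for a changing opponent. The function names (`oja_step`, `sample`, `run`), the learning rates, and the step counts are all illustrative choices.

```python
import numpy as np

def oja_step(w, x, lr):
    """One local Hebbian update (Oja's rule); no gradients are backpropagated."""
    y = w @ x                        # post-synaptic activity
    return w + lr * y * (x - y * w)  # Hebbian term y*x, minus a decay that bounds |w|

def sample(direction, rng, noise=0.1):
    """Synthetic input dominated by one direction, plus small isotropic noise."""
    return rng.normal() * direction + noise * rng.normal(size=direction.shape)

def run(lr, seed=0, steps_a=10_000, steps_b=1_000, tail=2_000):
    """Train in regime A, switch to regime B, and report (jitter, alignment)."""
    rng = np.random.default_rng(seed)
    d_a, d_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    w = np.array([0.6, 0.8])         # arbitrary unit-norm starting weights
    history = []
    for _ in range(steps_a):         # regime A: stationary input statistics
        w = oja_step(w, sample(d_a, rng), lr)
        history.append(w.copy())
    # Stability proxy: how much the weights still wobble at the end of regime A.
    jitter = float(np.std(history[-tail:], axis=0).mean())
    for _ in range(steps_b):         # regime B: the dominant direction switches
        w = oja_step(w, sample(d_b, rng), lr)
    # Adaptability proxy: |cosine| between the weights and the new direction.
    alignment = float(abs(w @ d_b) / np.linalg.norm(w))
    return jitter, alignment

for lr in (0.001, 0.02):             # a "stable" and an "adaptive" setting
    jitter, alignment = run(lr)
    print(f"lr={lr}: jitter={jitter:.4f}, alignment_after_switch={alignment:.2f}")
```

Under these assumptions, the larger learning rate realigns with the new input direction within the short second phase but keeps noisier weights while the statistics are stationary. The smaller rate holds steadier weights and barely moves after the switch. A full agent faces the same tension across many synapses at once.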
