In a Training Loop 🔄


juiceb0xc0de

JuiceB0xC0de

AI & ML interests

destroying heuristic determination in 4 dimensions to flood the engines with diversity and a lot of swear words

Recent Activity

liked a Space about 14 hours ago

huggingface/paperswithcode

posted an update 1 day ago

Introducing the Gemma-4-E2B Brain Atlas, an interactive neural census of every layer, every head, and 16 behavior categories in Google's flagship 2B model. We ran 184,320 probe prompts across 35 layers × 8 components and mapped what came back.

The Brain Atlas is an interactive tool that lets you explore the internal behavior of Google's Gemma-4-E2B model layer by layer, head by head. Pick a behavior category, pick a layer, and see exactly which components light up and which go quiet. The dataset is fully queryable if you want to go deeper.

The mapping combines multiple single-direction techniques run in parallel across every layer and component: activation taxonomy (classifying each neuron by how broadly it fires across prompt categories), coactivation pair analysis (which neurons lock together and on what topics), F-stat behavioral separation (one-way ANOVA per feature across 16 behavior categories), per-head specificity scoring, and a full compliance probe pipeline using SVD, sparse decomposition, and variance analysis.

Here's what I found when I ran it.

The sharpest behavioral signal isn't at the output. It's Layer 0. Up projection hits F=22.7, nearly 2x anything in the final third of the network. The model does its behavioral sorting before it's barely started, then spends the next 34 layers… doing what exactly?

The gate has a lifecycle. 70% dormant at L1, highest in the model. Brutal sparsification at L23–26 (>58% silent). Then it reopens. The final five layers are the most alive gates anywhere. The model's last act is a gate flare.

Layer 4 routes 5 projections to dim 448. One layer. One dimension. That's a topology highway.

Zero specialist neurons. Not one. 1.2M neurons analyzed. None fires exclusively on a single category. This model distributes everything.

🧠 Space: https://huggingface.co/spaces/juiceb0xc0de/gemma-4-e2b-brain-atlas

📊 Dataset (1.3M rows, fully queryable): https://huggingface.co/datasets/juiceb0xc0de/gemma-4-e2b-atlas
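For readers unfamiliar with the F-stat behavioral separation mentioned in the post, here is a minimal sketch of a one-way ANOVA F-statistic computed for a single feature across prompt categories. It is not the Atlas pipeline's code: the function names `f_statistic` and `make_synthetic`, the synthetic activations, and the 40-prompts-per-category count are all assumptions made for illustration. Only the idea (one F value per feature, 16 behavior categories) comes from the post.

```python
import random
from collections import defaultdict


def f_statistic(values, labels):
    """One-way ANOVA F-statistic for a single feature.

    values: one activation per prompt (floats)
    labels: the behavior-category id of each prompt, same length
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n = len(values)   # total number of prompts
    k = len(groups)   # number of categories
    if k < 2 or n <= k:
        raise ValueError("need at least 2 categories and more prompts than categories")

    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}

    # Between-category variation: how far each category mean sits from the grand mean.
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    # Within-category variation: spread of prompts around their own category mean.
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)

    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def make_synthetic(seed=0, n_categories=16, per_category=40):
    """Hypothetical data: one feature that tracks the category, one that is pure noise."""
    rng = random.Random(seed)
    labels, separated, noise = [], [], []
    for cat in range(n_categories):
        for _ in range(per_category):
            labels.append(cat)
            separated.append(0.5 * cat + rng.gauss(0.0, 1.0))  # mean shifts with category
            noise.append(rng.gauss(0.0, 1.0))                  # no category signal
    return labels, separated, noise


if __name__ == "__main__":
    labels, separated, noise = make_synthetic()
    print("category-driven feature F:", round(f_statistic(separated, labels), 2))
    print("noise feature F:", round(f_statistic(noise, labels), 2))
```

A feature whose mean activation shifts with the category produces a large F, while a feature with no category signal stays near 1. Read that way, the F=22.7 reported for the Layer 0 up projection indicates between-category variance well above within-category variance for that component.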

updated a Space 1 day ago

juiceb0xc0de/gemma-4-e2b-brain-atlas


Organizations

upvoted 2 articles 17 days ago

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

not-lain

•

Jan 30, 2025

• 331

Article

Granite 4.1 LLMs: How They’re Built

ibm-granite

•

19 days ago

• 71

upvoted a collection 17 days ago

smol2operator Release

Collection

4 items • Updated Sep 23, 2025 • 26

upvoted an article 17 days ago

Article

AI evals are becoming the new compute bottleneck

evaleval

•

19 days ago

• 26

upvoted a collection 17 days ago

Favorite Models

Collection

Mostly uncensored models for low VRAM budget • 23 items • Updated 2 days ago • 5

upvoted a changelog about 1 month ago

Hugging Face Changelog

Introducing Kernels

Apr 15

• 182

upvoted 2 articles about 1 month ago

Article

lucky_pick_scheduler

juiceb0xc0de

•

Apr 15

• 1

Article

New Old Llamas

mike-ravkine

•

Jan 3

• 3

upvoted a collection about 1 month ago

LLM

Collection

10 items • Updated Apr 6 • 1

upvoted a paper about 1 month ago

Therefore I am. I Think

Paper • 2604.01202 • Published Apr 2 • 33

upvoted an article about 1 month ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift

•

Apr 2

• 895

upvoted a collection about 2 months ago

Yi-1.5 (2024/05)

Collection

10 items • Updated May 20, 2024 • 93

upvoted a changelog about 2 months ago

Hugging Face Changelog

Hugging Face Papers for AI Agents

Mar 18

• 140

upvoted a collection about 2 months ago

🍺 The Bartenders 🍺

Collection

This is a collection of models that I've trained on data collected through conversations with frontier models GPT, Claude, Perplexity and myself. • 9 items • Updated 3 days ago • 3

upvoted a collection 2 months ago

Olmo Hybrid

Collection

6 items • Updated Mar 5 • 27
