AI Researcher · World Models · Self-Supervised Learning · Time-Series

Toward world models that are provably safe, sample-efficient, and ready for the real world.

I'm Randall Balestriero — an AI researcher building the theoretical foundations of world models, self-supervised learning, and learnable signal processing for time-series. A decade of research (Rice · Meta FAIR · Citadel) has turned rigorous theory into systems that advance the state of the art in vision, NLP, geophysics, bioacoustics, medical signals, and quantitative finance.

100+
Publications
10y
In deep learning
FAIR
w/ Yann LeCun
NASA
Mars SEIS deployment
Research

From world models to deployed AI — bridged by first-principles theory.

AI is headed for every desk and every home. That puts intense pressure on practitioners to deliver empirical breakthroughs, and on regulators to safeguard users. I work across the full pipeline — data, architecture, loss, and time-series structure — so that the next generation of world models is self-contained, provably safe, and energy-efficient.

Time-Series & Learnable Signal Processing

A decade of learnable signal processing — parametrized wavelets, deep wavelet transforms, and structured operators for high-dimensional, non-stationary signals. Deployed in NASA's Mars SEIS, bioacoustics, geophysics, medical signals, and quantitative finance.

wavelets · forecasting · non-stationary · geophysics · finance

Self-Supervised Learning

Provable foundations for SSL — augmentation, contrastive and joint-embedding losses, and the geometry of learned representations. Co-author of Meta's Cookbook of Self-Supervised Learning with Yann LeCun.

VICReg · JEPA · contrastive · augmentation

Geometry & Spline Theory of Deep Nets

Reading deep networks as continuous piecewise-affine spline operators — turning geometry into practical wins on batch-norm, generative networks, inversion, and beyond.

CPA splines · manifolds · interpretability
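A minimal illustration of the CPA view (my own sketch, not code from the papers): the ReLU activation pattern at an input selects a spline region, and within that region the network is exactly an affine map whose parameters can be read off in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 4)), rng.normal(size=16)
W2, b2 = rng.normal(size=(3, 16)), rng.normal(size=3)

def relu_net(x):
    """A toy two-layer ReLU network."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def local_affine(x):
    """The exact affine map (A, b) the network computes on x's spline region."""
    d = (W1 @ x + b1 > 0).astype(float)  # activation pattern = region code
    A = W2 @ (d[:, None] * W1)           # A = W2 · diag(d) · W1
    b = W2 @ (d * b1) + b2               # b = W2 · diag(d) · b1 + b2
    return A, b

x = rng.normal(size=4)
A, b = local_affine(x)
# Within the region the network IS the affine map — the equality is exact,
# not a first-order approximation.
assert np.allclose(relu_net(x), A @ x + b)
```

Stacking more ReLU layers only refines the partition: the network stays continuous piecewise-affine, which is what makes the spline toolbox applicable.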

Safe, Fair & Regulator-Ready AI

Quantifying and removing dataset biases from trained generative networks — without retraining — and providing the theoretical answers regulators need as AI scales.

MaGNET · fairness · manifolds

Real-World Deployment

Practical theory for noisy, non-stationary domains: quantitative finance (Citadel · GQS), medical data, and large-scale generative AI.

forecasting · robustness · production
Bio

A decade of learnable signal processing.

Ten years connecting signal processing, geometry, and deep learning — across academia, industry, and a NASA mission.

2024 — present
Featured on MLST · World Models, JEPA, Specialist LLMs

Three back-to-back appearances on Machine Learning Street Talk covering JEPA-style world models, the geometry of neural networks, and how to build specialist LLMs without massive pretraining.

2023 — present
Quantitative Researcher · GQS, Citadel

Bringing first-principles representation learning and forecasting to one of the hardest real-world time-series domains: highly noisy, non-stationary financial signals. Industry exposure that sharpens the research agenda toward practical, deployable theory.

2021 — 2023
Postdoctoral Researcher · Meta AI / FAIR (with Yann LeCun)

Broadened my research to self-supervised learning and the biases that emerge from augmentation and regularization — leading to publications, an ICML tutorial, and Meta's SSL Cookbook.

2016 — 2021
PhD · Rice University (with Richard Baraniuk)

Developed the affine spline operator view of deep networks, then used it to revisit batch-normalization and generative networks — turning theory into empirical wins.

2013 — 2016
Learnable Signal Processing

Early work on learnable parametrized wavelets, later extended to deep wavelet transforms — deployed in NASA's Mars SEIS mission for marsquake detection.

Selected work

Selected publications.

A few highlights — for the full record see the live BibTeX index further down or Google Scholar.

NeurIPS 2025 Planning with latent dynamics models 27

Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models

JEPA-trained latent world models for planning — more data-efficient than model-free RL, with better generalization to unseen layouts. With Sobal, Zhang, Cho, Rudner, and LeCun.

arXiv 2026 LeWorldModel: end-to-end JEPA from pixels 11

LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

The first JEPA that trains stably end-to-end from raw pixels — collapsing six tunable loss terms down to one. With Maes, Le Lidec, Scieur, and LeCun.

arXiv 2026 Hierarchical planning with latent world models

Hierarchical Planning with Latent World Models

Latent world models learned at multiple temporal scales — enabling zero-shot, long-horizon robotic control (70% pick-and-place success vs 0% for single-scale baselines).

NeurIPS 2025 Joint-embedding vs reconstruction 19

Joint-Embedding vs Reconstruction: Provable Benefits of Latent Space Prediction for SSL

Closed-form analysis of when JEPA wins over reconstruction: latent prediction is strictly preferred when irrelevant features dominate the input signal.

arXiv 2026 Semantic tube prediction

Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA

Extends JEPA to language by constraining hidden trajectories to a tube around the geodesic — drastically reducing the data needed to fine-tune LLMs. With Huang and LeCun.

SSL Cookbook 705

A Cookbook of Self-Supervised Learning

The reference text on modern SSL — co-authored with Yann LeCun and the FAIR team. Distills years of theory and practice into a practitioner's guide.

arXiv 2024 Deep Networks Always Grok 68

Deep Networks Always Grok and Here Is Why

Grokking happens everywhere — even on CIFAR10 and Imagenette — driven by a phase transition in the network's linear regions. With Humayun and Baraniuk.

NeurIPS 2025 Curvature tuning model steering

Curvature Tuning: Provable Training-Free Model Steering from a Single Parameter

One scalar that shifts a trained network's decision boundary — no retraining, no backprop. Choose its value by cross-validation alone.

Nature Comm. 2020 Clustering earthquake signals 266

Clustering Earthquake Signals and Background Noise in Continuous Seismic Data with Unsupervised Deep Learning

Deep wavelet representations for time-series — deployed in NASA's Mars SEIS mission for marsquake detection. Demonstrates learnable signal processing at the scale of real geophysical data.

arXiv 2021 High-dimension extrapolation 185

Learning in High Dimension Always Amounts to Extrapolation

Why everything we call "interpolation" in modern deep learning is actually extrapolation — and what that means for generalization theory. With Pesenti and LeCun.
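A quick way to see the headline claim — my own illustrative sketch, not code from the paper — is to test convex-hull membership with a small linear program: a point interpolates the training set only if it is a convex combination of training samples.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(point, samples):
    """LP feasibility test: is there lam >= 0 with sum(lam) = 1
    and samples.T @ lam = point?"""
    n = samples.shape[0]
    A_eq = np.vstack([samples.T, np.ones((1, n))])
    b_eq = np.append(point, 1.0)
    return linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                   bounds=[(0, None)] * n).success

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 30))  # 100 samples in 30 dimensions

# An average of training points interpolates by construction...
assert in_convex_hull(train[:5].mean(axis=0), train)
# ...but a fresh sample from the SAME distribution lands outside the hull:
# in high dimension, new data is almost always extrapolation.
assert not in_convex_hull(rng.normal(size=30), train)
```

As the paper argues, keeping new samples inside the hull would require a training set growing exponentially with dimension — so generalization theory cannot lean on "interpolation" in any geometric sense.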

Published at
ICLR · NeurIPS · ICML · CVPR · ECCV · MSML · Nature Communications · IEEE · Springer
Show the full live publication index (BibBase)
Media & talks

Featured on Machine Learning Street Talk and beyond.

Long-form conversations on world models, self-supervised learning, the geometry of deep nets, and the future of LLMs — plus press coverage from Meta AI, Rice, NYU, and IEEE.

EP11 · JEPA
YouTube · World Models

JEPA with Randall Balestriero

Joint Embedding Predictive Architectures — the world-model paradigm Yann LeCun and I have been pushing forward at FAIR.

MLST · 2025
YouTube · MLST

Build Specialist LLMs Like It's 2019 — LLMs Without Pretraining and SSL

Counter-intuitive results: 7B-parameter LLMs can match pretrained baselines when trained from scratch on small task-specific corpora. A unified view of SSL and supervised learning.

MLST · 2025
YouTube · MLST

Neural Networks Are Elastic Origami!

The geometry of deep networks via spline theory — grokking, intrinsic dimensionality, toxicity detection, and what RLHF actually does to representations.

MLST · NeurIPS 2022
YouTube · MLST

#86 · Prof. Yann LeCun & Randall Balestriero — SSL, Data Augmentation, Reward Isn't Enough

A long-form NeurIPS 2022 conversation on the future of self-supervised learning, the limits of reward signals, and the path to autonomous AI.

MLST · 061
YouTube · MLST

061 · Interpolation, Extrapolation and Linearisation — with Yann LeCun

Why "interpolation" doesn't mean what you think in high dimensions — and what that implies for generalization in modern deep nets.

ICML 2023
ICML · Tutorial

Self-Supervised Learning in Vision: from Research Advances to Best Practices

The ICML 2023 tutorial on the practical and theoretical state of SSL — with the FAIR Cookbook as its companion text.

Press & coverage
Meta AI releases the Self-Supervised Learning Cookbook (co-authored with Yann LeCun)
ai.meta.com
AI discovers new events in geophysical data for earthquake detection
news.rice.edu
The NYU Center for Data Science at NeurIPS 2023
nyu.edu
How can we use signal processing tools to understand neural networks better?
IEEE Signal Processing Society
Deep wavelet transforms for time-series and geophysics (Nature Communications)
nature.com
Get in touch

Open to research collaborations & opportunities.

Interested in world models, SSL, or theory of deep learning — or hiring? I'd love to hear from you.