06 Sep 2025 1 min read GenAI

GenAI is the new Systems Design for TPMs

When I started out as a TPM, the must-have technical muscle was systems design. You couldn’t sit in a room with engineers and talk architecture without being able to map out load balancers, databases, queues, and failover strategies. That was the craft.

Fast forward to today, and the conversations are shifting. I’m finding that understanding large language models (their tokenization, attention mechanisms, and inference trade-offs) is becoming the new systems design skill set. You don’t have to build the models, but you do need to understand how they work well enough to guide integration, cost forecasting, and reliability planning.

Think about it: tokenization defines your cost model and latency profile, because hosted models are typically billed per token and response time grows with the number of tokens generated. Attention mechanisms (and the limits around context windows) influence product feasibility and user experience. These aren’t abstract research concepts anymore; they’re design constraints in the same way replication lag or cache invalidation once were.
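To make the cost and latency point concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the four-characters-per-token heuristic is only a coarse approximation for English text, and the `Pricing` fields, the rates you plug in, and the throughput figures are hypothetical placeholders rather than any vendor's real numbers. A real forecast would use the model's own tokenizer and published pricing.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-1,000-token rates; substitute your provider's real prices."""
    input_per_1k: float
    output_per_1k: float


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Coarse token estimate. Assumes ~4 characters per token (English-only heuristic)."""
    return math.ceil(len(text) / chars_per_token)


def estimate_request_cost(prompt_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost of one call: input and output tokens are priced separately."""
    return (prompt_tokens / 1000) * pricing.input_per_1k + (
        output_tokens / 1000
    ) * pricing.output_per_1k


def estimate_latency_s(
    output_tokens: int, time_to_first_token_s: float, tokens_per_second: float
) -> float:
    """Simple latency model: a fixed wait for the first token plus steady generation."""
    return time_to_first_token_s + output_tokens / tokens_per_second


if __name__ == "__main__":
    pricing = Pricing(input_per_1k=0.5, output_per_1k=1.5)  # made-up rates
    cost = estimate_request_cost(prompt_tokens=2000, output_tokens=500, pricing=pricing)
    latency = estimate_latency_s(500, time_to_first_token_s=0.5, tokens_per_second=50)
    print(f"per-request cost: {cost:.2f}, estimated latency: {latency:.1f}s")
```

Multiply the per-request figure by expected daily volume and you have the first draft of a cost forecast, which is the kind of number a TPM can bring to a planning conversation.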

So just like we used to sketch out sharded databases and traffic flows, TPMs today need to sketch out how prompts, embeddings, and model calls will behave under scale. It’s the same discipline of system-level thinking, just applied to a new substrate. The TPMs who lean into this will be the ones who stay credible and valuable in AI-driven roadmaps.
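As one way to picture that kind of sketch, the snippet below checks two things a TPM might ask about a retrieval-backed feature: does a typical call fit in the model's context window, and how many tokens per minute does peak traffic imply? The `CallProfile` fields, the window size, and the traffic numbers are all hypothetical values chosen for illustration, not figures from any particular model or provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallProfile:
    """Hypothetical token budget for one model call in a retrieval-backed feature."""
    system_tokens: int       # fixed instructions sent with every call
    retrieved_chunks: int    # number of chunks pulled in via embedding search
    tokens_per_chunk: int    # average size of each retrieved chunk
    user_tokens: int         # the user's own message
    max_output_tokens: int   # space reserved for the model's reply

    @property
    def input_tokens(self) -> int:
        return (
            self.system_tokens
            + self.retrieved_chunks * self.tokens_per_chunk
            + self.user_tokens
        )

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.max_output_tokens


def fits_context(profile: CallProfile, context_window: int) -> bool:
    """True if input plus reserved output stays within the context window."""
    return profile.total_tokens <= context_window


def peak_tokens_per_minute(profile: CallProfile, peak_requests_per_minute: int) -> int:
    """Worst-case token throughput to compare against a provider rate limit."""
    return profile.total_tokens * peak_requests_per_minute


if __name__ == "__main__":
    profile = CallProfile(
        system_tokens=500,
        retrieved_chunks=8,
        tokens_per_chunk=400,
        user_tokens=300,
        max_output_tokens=1000,
    )
    print("fits an 8,192-token window:", fits_context(profile, 8192))
    print("peak tokens/minute at 120 rpm:", peak_tokens_per_minute(profile, 120))
```

It plays the same role as the old capacity napkin math for queues and databases: if the profile does not fit the window, or peak throughput exceeds what your provider allows, the design needs to change before the roadmap does.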
