The Interview Gap

Four months ago I was laid off. One of the interviews I have had since, for an AI builder role focused on building products on top of LLMs, opened with 45 minutes of BGP routing and TLS handshake internals. Things I haven't touched in a decade, if ever. Things the job description, by its own account, would not require me to touch.

I keep coming back to that interview. Not because it stung, though it did, but because it surfaced something I had been quietly noticing while searching.

I am a senior product engineer with 14+ years of shipping various projects, including:

An internal deployment platform (similar to Vercel's UX, on top of Cloud Run) that fifty-plus teams adopted.
An AI-powered incident response platform that collected and analyzed thousands of incidents via Slack and Meet transcripts and produced structured reports for both engineers and leadership.
A migration of 200+ services from OpenShift to GKE, along with the standard CI/CD pipeline for the whole organization, where I advocated for adopting GitHub Actions and built the tooling around it.
Hookie, a CLI tool built in Go for receiving, inspecting, and replaying webhook events across developer and agent environments.
RSC Boundary, a fun project to help debug React Server Components.
Various other products, built solo or with friends, in search of interesting problems to solve that may or may not lead to a business.

Yet at most companies, that track record doesn't change the funnel.

This article is a checkpoint piece. Four months in, here is what I think is happening, and what I would change if I were running a hiring funnel for senior product engineers.

The asymmetry I keep noticing

Open any dev X feed. Half the content is about how AI agents are going to write most of the code. Loop architecture. Harness wars. "Design the system, the agent ships." The premise sits underneath every post: implementation is the part that is being automated.

Now look at how some companies hire senior engineers. The gate is still whether you can produce a graph traversal cold in a 45-minute speed run.

I cannot make both of these true at once, and that is what has been bothering me. Either implementation is the part agents are taking over, in which case the screen is filtering for the layer that is being automated. Or implementation is still the senior engineer's core job, in which case half the industry's discourse is theater. The truth is probably somewhere in the middle. Even at the middle, the math on the current default doesn't make sense to me.

Owning my side of it

Before I go further, the honest disclosure: leetcode under a clock is not where I am at my best.

I understand the concepts and I did practice. But the problems feel easier to me when I am at home, on my own time, not under a clock while someone watches me like a hawk.

And to be honest, in my career, I have rarely needed to reach for any of these. When I did, I knew which one to reach for, did some research if I had to, and built it. For example, RSC Boundary, the OSS tool I maintain, contains a depth-first search over the React component tree. I knew that was the right shape for the problem, researched it, and wrote it. It works. None of that required me to spit it out faster than the speed of light, with an audience.
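For readers who want the shape of that traversal, here is a minimal sketch of a depth-first walk over a component-like tree. It is not RSC Boundary's actual code: the `Node` class, the `walk_depth_first` function, and the sample tree are hypothetical names made up for illustration, and it is written in Python only to keep it short.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Hypothetical stand-in for a component in a rendered tree."""
    name: str
    children: list["Node"] = field(default_factory=list)


def walk_depth_first(root: Node) -> Iterator[tuple[int, Node]]:
    """Yield (depth, node) pairs in pre-order using an explicit stack."""
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, node
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(node.children):
            stack.append((depth + 1, child))


# Illustrative tree; the component names are invented.
tree = Node("App", [
    Node("Layout", [Node("Header"), Node("Sidebar")]),
    Node("Page", [Node("Chart")]),
])

visited = [(depth, node.name) for depth, node in walk_depth_first(tree)]
for depth, name in visited:
    print("  " * depth + name)
```

The explicit stack avoids recursion limits on deep trees. Knowing that a stack-based pre-order walk is the right shape is the part that matters; the dozen lines themselves are easy to look up.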

The obvious response is that I am making excuses because I can't do leetcode. Fair enough. But if years of shipping don't prepare a candidate to be "interview ready," then maybe the test has drifted from the work it was meant to measure. If I were hiring for a barista role, I wouldn't expect candidates to tear down the espresso machine and rebuild it from scratch in 35 minutes.

What senior product engineers actually do now

Last month I worked on hardening the security of an API. I shipped HMAC-SHA256 hashing so that keys are no longer stored in plain text in the DB, and kept an AES-encrypted version as well, because the product needed decryptable keys for backward compatibility and I had to reason about which primitives gave us that property without compromising key rotation and security. The same week I designed a scopes-and-capabilities model for fine-grained authorization on the same API. I wrote the spec; the agent wrote a lot of the code. The agent could not have made either of those calls without knowing the context of the product and the tradeoffs we were making.
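As a rough illustration of the hashing half of that decision, the sketch below stores an HMAC-SHA256 digest instead of the raw key and verifies a presented key in constant time. This is not the code I shipped. It assumes the stored values are API keys and that the HMAC secret lives outside the database; `SERVER_SECRET`, `digest_key`, and `verify_key` are hypothetical names. The AES-encrypted copy is left out because it would need a third-party library in Python.

```python
import hashlib
import hmac
import secrets

# Assumption: in a real service this secret comes from a secret manager.
# It is generated inline here only so the sketch runs on its own.
SERVER_SECRET = secrets.token_bytes(32)


def digest_key(raw_key: str, secret: bytes = SERVER_SECRET) -> str:
    """Return the hex HMAC-SHA256 digest to store instead of the raw key."""
    return hmac.new(secret, raw_key.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_key(presented_key: str, stored_digest: str,
               secret: bytes = SERVER_SECRET) -> bool:
    """Recompute the digest for the presented key and compare in constant time."""
    return hmac.compare_digest(digest_key(presented_key, secret), stored_digest)


raw_key = secrets.token_urlsafe(32)  # shown to the caller once
stored = digest_key(raw_key)         # what the DB row would hold

print(verify_key(raw_key, stored))        # True
print(verify_key("not-the-key", stored))  # False
```

A keyed digest like this can be looked up and verified but never reversed, which is exactly why a separate decryptable copy is needed when the product has to show or reuse the original key.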

Right now, I am building Thingly, a mobile app, in part because I needed deep practice with AI agent workflows for product engineering work and I am curious to see how AI agents handle mobile development. Most of my time on it is not typing implementation. It is deciding what the app should do, what good UX looks like, which tradeoffs the StoreKit plumbing forces, when to override the agent's first suggestion, and when to let it ride.

My reading list right now is the boar book, Designing Data-Intensive Applications, alongside Go in 24 Hours, because my recent work with Hookie has made me curious about Golang. Not Cracking the Coding Interview. The work that is left, the work that an agent cannot do for me, is reasoning about systems and tradeoffs and, more importantly, creativity. The agent can produce a working LRU cache faster than I can. It can't tell me whether caching is the right answer for this product at this scale, and it can't tell me whether the user experience will be improved by it.

This is the shape of senior product engineering when agents handle implementation. Picking the right primitive or experimenting with a couple, to see the tradeoffs in a real-world context. Catching a subtle correctness issue in code the agent confidently produced. Knowing when the agent's first answer is wrong and being able to steer it. Owning the architectural call that determines whether the next six months are smooth or painful. Communicating tradeoffs to humans who are not engineers.

None of that surfaces in a leetcode round.

Counterarguments, taken seriously

I have been on the other side of the table enough times to know the strongest defense of leetcode. It goes something like this:

"We do not actually care about the puzzle. We care that you can think clearly under pressure, communicate while solving, and reason about tradeoffs. At scale, with 200 applicants, we need an objective time-bounded filter. Takehomes get gamed by AI and architectural conversations are subjective. Leetcode is the least-bad calibrated signal we have."

Fair enough. But here is where I think it breaks down.

Thinking clearly under pressure

Clear thinking under pressure is the right goal, but a leetcode round applies the wrong kind of pressure. The pressure in real senior work is a deadline, a half-down production system, a customer escalation, an ambiguous spec, a stakeholder who needs to understand a tradeoff in plain language. Staying composed while being scrutinized for slipups is a real skill, but it is not the skill the job uses.

Flooded with applications

"At scale, with 200 applicants, we need an objective time-bounded filter."

That is probably the strongest point. 200 applicants do need a filter, but the key is to filter for the right signal. A demo round filters too. A short system design conversation filters too. Both can be gamed, just in different ways than leetcode can. None of them is perfect.

The question is: which imperfect filter selects for the work the role actually involves?

Gamed by AI and subjective architectural conversations

"Takehomes get gamed by AI and architectural conversations are subjective."

That's true. But it is not a reason to retreat to a format that was designed when the floor was somewhere else. If anything, subjective interviews are probably the least likely to be gamed by AI and the most likely to be a good signal of the work the role actually involves.

Why are we still testing for implementation skills, in a world where producing implementation is no longer the bottleneck?

You just suck at leetcode

This last one is personal: "You just suck at leetcode." Yeah, that may be true. But both can be true at once: the filter can be wrong and I can be bad at it, and the second does not invalidate the first. Years of shipping say weakness on a leetcode screen matters less than the screen pretends it does. If hiring managers disagree, that is the disagreement this post is about.

What can we do instead?

A small startup I interviewed with recently ran me through two rounds. First round, I demo'd a project of mine and answered questions about it. Second round, I did system design on a whiteboard, with a twist: one interviewer played the user, the other played a teammate. My job was to gather requirements, surface tradeoffs, and propose a solution.

That is the closest an interview process has come to resembling the actual work. The demo round forced me to show something real I had built and defend the decisions I made. The design round forced me to handle ambiguity, ask the right questions, communicate to two audiences with different concerns, and explain why I would do A over B knowing what each one costs.

Two interviews. Two signals. Both directly mapping to the work the role actually involves.

If I were running it, I would go one step further and let candidates pick a track:

A builder path with the format above, or
A specialist path with traditional algorithms and systems-heavy interviews.

Same company, same bar, different filter depending on what the role actually requires and what the candidate is strongest at. A startup hiring a senior product engineer to ship features against a real codebase and a big-tech company hiring a distributed systems engineer to design a new storage layer are not hiring for the same skill. Pretending the same filter works for both is the mistake.

Additionally, a paid trial as a third step is probably the best way to see whether the candidate fits the company and works well with the team, without the artificial pressure to perform or the incentive to memorize the Blind 75.

What I'm not saying

Fundamentals don't matter

Fundamentals do matter. Knowing which data structure fits a problem is a skill, but producing the implementation cold in 45 minutes with an audience is not a great judge of that skill.

AI will replace engineers

Maybe, maybe not. The argument works whether AI does 10% of the implementation or 70% of it. The skills that get more valuable as agents get better are all the skills leetcode doesn't measure.

Leetcode is stupid

I'm not saying that using leetcode is lazy or stupid. The filter was well-calibrated for a previous era and may still be suitable for some roles. The job has changed, the filter has not.

The gap

Four months in, this is the pattern I keep seeing. The funnel is filtering for the layer that is being automated and missing the layer that is not. Whatever anyone believes about how good agents are going to get, the screens running today over-weight the part of senior engineering that is depreciating the fastest and under-weight everything that is appreciating. The candidates who would be best at the work that is left are also the candidates most likely to fail the first round.

I don't know exactly what should replace the current default. I know what I would try if I were running it. Mostly I know the current default has stopped doing the job it was designed to do, and almost no one is saying it out loud.
