Netflix Coding Interview Questions: Senior Bar, Real Problems (2026)

If you came in expecting hard graph problems and bit tricks, the Netflix coding interview will catch you off guard. The questions themselves are mostly LeetCode medium, sometimes easier. What gets engineers a no-hire is not the algorithm choice. It is the senior bar applied to how the answer arrives: clean code on the first pass, deliberate edge-case handling, and the ability to talk about trade-offs without hedging.

This post walks through five problems that map to the kinds of questions Netflix candidates have reported on Glassdoor and Blind over the last 18 months, the patterns to study underneath them, and the judgment calls that separate a strong-hire from a no-hire when the code itself is correct.

How the Netflix coding interview is different

Three things make Netflix's coding loop distinct from Google, Meta, or Amazon:

1. Almost everyone is hired senior. Netflix targets L5 and above (their "Senior" level). New grad pipelines barely exist. If you're a junior engineer, your interviewer is mentally comparing you to a candidate who has shipped production systems for five-plus years - not to other juniors. This warps the bar even if the question is identical.

2. The volume is low, the bar is high. Netflix runs lean teams. They reportedly extend offers to roughly 1 in 60 candidates who reach onsite, compared to roughly 1 in 8 at Google. So when you sit down, the working assumption from the interviewer is "I should probably say no." You have to give them a reason to say yes.

3. Pattern recognition matters less than expression. At Google, you can pattern-match your way to an offer if your LeetCode patterns are sharp. At Netflix, two candidates can both implement the same solution and one passes while the other fails because of how they narrated their thinking, named their variables, and handled the moment when the interviewer pushed back. Netflix calls this "judgment." The keeper test, in interview form.

The interview process

Standard Netflix loop for a software engineering role:

Recruiter screen (~30 min) - tenure, comp expectations, what you're looking for. Heavy fit screening. Be honest; they cross-reference Blind reports.
Hiring manager screen (~45 min) - your background, recent work, why Netflix. Some teams ask one easy coding question here; most don't.
Technical phone screen (~60 min) - one coding problem, usually LeetCode medium, on CoderPad or similar. Expect follow-ups: "what if the input is too big to fit in memory" or "how do you handle concurrent access."
Onsite (virtual or in-person) - four to five 45-60 min rounds. Mix is typically two coding, one system design, one behavioral with the hiring manager, and one "bar raiser" style round with a senior engineer on another team.
Decision - usually within a week. Netflix is fast.

Problem 1: LRU cache (the classic)

This appears in nearly every Netflix loop. The framing varies (a session cache, a video metadata cache, a feature flag cache) but the underlying problem is identical.

Question: Design and implement a data structure for a Least Recently Used (LRU) cache. It should support get(key) and put(key, value), both in O(1) time on average.

The trap: Candidates jump straight to "use a hash map." Half the work is realizing you also need a doubly-linked list to maintain LRU order in O(1). The hash map alone gets you O(1) lookup but doesn't track ordering; the linked list alone gets you ordering but loses lookup speed. You need both.

class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}
        self.head = Node(0, 0)  # sentinel
        self.tail = Node(0, 0)  # sentinel
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key in self.cache:
            node = self.cache[key]
            self._remove(node)
            self._add_to_front(node)
            return node.value
        return -1

    def put(self, key, value):
        if key in self.cache:
            self._remove(self.cache[key])
        node = Node(key, value)
        self._add_to_front(node)
        self.cache[key] = node
        if len(self.cache) > self.capacity:
            lru = self.tail.prev
            self._remove(lru)
            del self.cache[lru.key]

What Netflix specifically grades: the sentinel head/tail pattern. Without sentinels, removing the first or last node and inserting into an empty list each need their own null checks, which signals junior code. With sentinels, every operation is a fixed number of pointer updates with no branching. The interviewer will not say "use sentinels" but they will count how many edge cases your code handles cleanly.
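If the interviewer allows standard-library containers, the same hash-map-plus-ordering idea can be expressed with collections.OrderedDict. This is a minimal sketch, not a replacement for the hand-rolled version: the class name OrderedDictLRUCache is made up for illustration, and it assumes the same get/put contract as above (get returns -1 on a miss).

```python
from collections import OrderedDict


class OrderedDictLRUCache:
    """LRU cache sketch built on OrderedDict (hypothetical class name)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key):
        if key not in self.items:
            return -1
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


cache = OrderedDictLRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used key
cache.put("c", 3)  # over capacity, so "b" is evicted
```

Worth saying out loud if you go this route: OrderedDict is itself a hash map backed by a doubly-linked list, so you are using the same design, just not writing the pointer code.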

Problem 2: Top K frequent elements (streaming variant)

Question: Given a non-empty array of integers, return the k most frequent elements. Then the follow-up: imagine the array is too large to fit in memory and arrives as a stream.

In-memory solution (heap, O(n log k)):

from collections import Counter
import heapq

def topKFrequent(nums, k):
    counts = Counter(nums)
    return heapq.nlargest(k, counts.keys(), key=counts.get)

The single line of heapq.nlargest is clean. Don't write your own heap unless asked. The interviewer wants to know you know the library.

The streaming follow-up is where senior judgment lives. A junior says "use a hash map and a heap." A senior says: "If we can't fit the counts in memory, we need approximate counts: a Count-Min Sketch feeding a small heap of candidates, or a heavy-hitters algorithm like Misra-Gries. Misra-Gries bounds memory at O(1/epsilon) counters, and each count is off by at most epsilon times the stream length. We lose exact counts. Worth it for billions of items per second; not worth it for a few million." (Don't reach for HyperLogLog here: it estimates how many distinct items a stream contains, not how often each one appears.) That kind of trade-off awareness is what gets you the strong-hire signal at Netflix even if you can't write Count-Min Sketch from memory.
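To make the Misra-Gries idea concrete, here is a minimal sketch. The function name misra_gries and the parameter num_counters are illustrative choices, not anything Netflix prescribes. It keeps at most num_counters counters regardless of stream length, and each reported count can undercount the true frequency by at most n / (num_counters + 1) for a stream of n items.

```python
def misra_gries(stream, num_counters):
    """Return candidate heavy hitters using at most num_counters counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < num_counters:
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


stream = ["a"] * 5 + ["b"] * 3 + ["c", "d", "e"]
candidates = misra_gries(stream, num_counters=2)
```

The surviving keys are candidates, not a verified answer. The counts are lower bounds, so an exact top-k still needs a second pass over the data or an accepted error margin.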

Problem 3: Simplify a Unix path

Question: Given an absolute Unix-style path, simplify it. Treat .. as "go up a directory," . as current directory, multiple slashes as one, and ignore trailing slashes.

Looks trivial. Becomes a senior-bar test because of edge cases.

def simplifyPath(path):
    stack = []
    for part in path.split("/"):
        if part == "" or part == ".":
            continue
        elif part == "..":
            if stack:
                stack.pop()
        else:
            stack.append(part)
    return "/" + "/".join(stack)

Edge cases Netflix interviewers ask after the first pass:

What if the path is /? (Empty stack, return /.)
What if there are more .. than directories? (Don't underflow; stop at root.)
What about a path like /...? (Three dots is a valid directory name, not a navigation command. The naive startswith("..") check would break.)
What if the input contains a null byte? (Sanitize input. Many candidates skip this and lose points.)
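Those follow-ups can be pinned down as assertions. The sketch below restates the function so the block runs on its own. One assumption is mine: rejecting a null byte with a ValueError is only one way to "sanitize", so check with the interviewer whether stripping or rejecting is the expected behavior.

```python
def simplify_path(path):
    # Assumption: reject null bytes outright rather than silently stripping them.
    if "\x00" in path:
        raise ValueError("path contains a null byte")
    stack = []
    for part in path.split("/"):
        if part in ("", "."):
            continue  # empty segments come from repeated or trailing slashes
        if part == "..":
            if stack:  # at root, ".." is a no-op
                stack.pop()
        else:
            stack.append(part)  # includes names like "..."
    return "/" + "/".join(stack)


assert simplify_path("/") == "/"
assert simplify_path("/../../a") == "/a"
assert simplify_path("/...") == "/..."
assert simplify_path("/a//b/./c/../") == "/a/b"
```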

The implementation is about ten lines. The conversation around it can fill twenty minutes. That ratio is the whole game at Netflix.

Problem 4: Median of a stream

Question: Design a data structure that, given a stream of integers arriving one at a time, can return the running median in O(log n) per insertion and O(1) per query.

Two-heap pattern: a max-heap of the smaller half, a min-heap of the larger half. Keep them balanced (sizes differ by at most 1). The median is either the top of whichever heap holds one more element, or the average of the two tops when the sizes are equal.

import heapq

class MedianFinder:
    def __init__(self):
        self.lower = []  # max-heap, store as negative
        self.upper = []  # min-heap

    def addNum(self, num):
        heapq.heappush(self.lower, -num)
        heapq.heappush(self.upper, -heapq.heappop(self.lower))
        if len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def findMedian(self):
        if len(self.lower) > len(self.upper):
            return -self.lower[0]
        return (-self.lower[0] + self.upper[0]) / 2

What signals senior: reasoning out loud about why the negation trick is necessary (Python's heapq is min-heap only), and naming the invariant ("lower contains the smaller half, upper contains the larger half, sizes balanced within 1") before writing any code. Most candidates jump in and fix the invariant in retrospect. The interviewer is grading whether you can state the invariant first.
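One way to show that you stated the invariant first is to write it down as a check and run it after every insert. The sketch below does that. The class name CheckedMedianFinder and the _check_invariant helper are made up for illustration, and raising on an empty stream is my assumption about the desired behavior.

```python
import heapq


class CheckedMedianFinder:
    """Two-heap running median that asserts its invariant after every insert."""

    def __init__(self):
        self.lower = []  # max-heap of the smaller half, stored negated
        self.upper = []  # min-heap of the larger half

    def _check_invariant(self):
        # lower holds as many items as upper, or exactly one more.
        assert 0 <= len(self.lower) - len(self.upper) <= 1
        # Every item in lower is <= every item in upper.
        if self.lower and self.upper:
            assert -self.lower[0] <= self.upper[0]

    def add_num(self, num):
        heapq.heappush(self.lower, -num)
        heapq.heappush(self.upper, -heapq.heappop(self.lower))
        if len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))
        self._check_invariant()

    def find_median(self):
        if not self.lower:
            raise ValueError("median of an empty stream is undefined")
        if len(self.lower) > len(self.upper):
            return -self.lower[0]
        return (-self.lower[0] + self.upper[0]) / 2


finder = CheckedMedianFinder()
medians = []
for value in [5, 2, 8, 1]:
    finder.add_num(value)
    medians.append(finder.find_median())
```

In a real interview you would state the two assertions aloud rather than ship them, but having them in the code makes a rebalancing bug fail on the insert that caused it.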

Problem 5: Implement a rate limiter

This is where Netflix's design DNA shows up clearly. Their core product runs at massive scale; rate limiting is everywhere in their infrastructure. Expect this or a close cousin.

Question: Implement a token-bucket rate limiter that allows up to N requests per second, refilling at a constant rate.

import time
import threading

class TokenBucket:
    def __init__(self, capacity, refill_rate_per_sec):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate_per_sec
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        added = elapsed * self.refill_rate
        if added > 0:
            self.tokens = min(self.capacity, self.tokens + added)
            self.last_refill = now

    def allow(self):
        with self.lock:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

The senior-bar moves here are:

Using time.monotonic() not time.time(). The wall clock can jump backward (an NTP correction, a manual change); the monotonic clock cannot. A senior engineer reaches for monotonic without prompting.
Threading lock around the entire critical section. Without it, two threads can both read 1 token, both pass the check, and both deduct, so a request that should have been throttled gets through.
Discussing distributed extensions: "for multi-instance rate limiting, this would need to back onto Redis with a Lua script for atomicity, or use a sliding-window counter in a centralized store."
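For the sliding-window counter mentioned above, a single-process sketch is enough to show the idea. Everything here is an illustrative assumption: the class name, the injectable clock parameter, and keeping the two counters in local fields. In a multi-instance deployment those counters would live in the shared store instead. The estimate weights the previous window's count by how much of it still overlaps the sliding window.

```python
import threading
import time


class SlidingWindowCounter:
    """In-process sliding-window counter (hypothetical stand-in for a shared store)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests can control time
        self.current_index = int(clock() // window_seconds)
        self.current_count = 0
        self.previous_count = 0
        self.lock = threading.Lock()

    def _roll(self, now):
        index = int(now // self.window)
        if index == self.current_index:
            return
        # Only the immediately preceding window still overlaps the sliding window.
        if index == self.current_index + 1:
            self.previous_count = self.current_count
        else:
            self.previous_count = 0
        self.current_count = 0
        self.current_index = index

    def allow(self):
        with self.lock:
            now = self.clock()
            self._roll(now)
            elapsed_fraction = (now % self.window) / self.window
            estimated = self.previous_count * (1 - elapsed_fraction) + self.current_count
            if estimated + 1 <= self.limit:
                self.current_count += 1
                return True
            return False


fake_now = [0.0]
limiter = SlidingWindowCounter(limit=2, window_seconds=1.0, clock=lambda: fake_now[0])
first_window = [limiter.allow() for _ in range(3)]
fake_now[0] = 1.5
second_window = [limiter.allow() for _ in range(2)]
```

Compared with the token bucket, this trades burst tolerance for a simpler state shape (two integers per key), which is part of why it maps well onto a centralized counter store.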

Patterns worth studying for Netflix

Based on observed question distribution across roughly 100 reported Netflix loops from 2024-2026, the high-frequency patterns are:

Pattern	How often it shows up	Why Netflix likes it
Hash map + linked list (LRU, LFU)	~30% of loops	Tests data-structure composition
Two heaps / priority queue	~20%	Tests invariant reasoning
Sliding window + counters	~15%	Maps directly to rate limiting and streaming work
Stack-based string parsing	~15%	Tests edge-case discipline
Graph traversal (BFS/DFS for cycles)	~10%	Tests systems thinking
Dynamic programming	~10%	Less common at Netflix than at Google

If you have 8-12 hours of prep time, drill these patterns in order: LRU cache, top-K, sliding window, median of stream, rate limiter. Together they cover the top three rows of the table, roughly 65% of what you'll see.

Common pitfalls that get strong candidates rejected

From engineers I've talked to who failed Netflix loops despite working at FAANG-tier companies, the recurring failure modes are:

Solving the problem silently then narrating after. Netflix wants to hear judgment as it forms. Talk first, code second.
Writing code in a single 30-line block with no incremental verification. A senior writes in small chunks and validates as they go. Write the function signature, sanity-check with the interviewer, write the simple case, test, then add complexity.
Defending the first solution when the interviewer pushes back. Pushback usually means "your solution is correct but I see a problem you didn't consider." Engage with the pushback; don't get defensive.
Naming a variable tmp or x. At Google this is forgiven; at Netflix it's a small visible signal that you're not in production-code mode. Use real names.
Missing the "what if input is too large" follow-up. Netflix lives at scale. If you're solving a problem and didn't think about distributed/streaming variants, the interviewer will ask. Have a sentence ready.

The judgment round: what they're really testing

Across the Netflix coding rounds, the implicit grading rubric splits roughly into:

Correctness (30%) - does the code work, edge cases handled.
Communication (30%) - did you articulate the trade-offs, reason about invariants, respond to pushback.
Code quality (20%) - clean structure, descriptive names, idiomatic to the language.
Judgment (20%) - did you ask the right clarifying questions, consider scale, anticipate the follow-ups.

Notice that correctness is only 30%. At Google it's closer to 50%. This is why candidates who pass at Google can still fail at Netflix: they nail correctness and skip everything else.

FAQ

How hard are Netflix coding interview questions compared to other FAANG?

The problems themselves are usually LeetCode medium and rarely cross into hard. What makes Netflix harder than its peers is the bar. They expect senior-level code on the first attempt: production-quality variable names, clear edge-case handling, no debugging detour to find an obvious bug. A Google L4 candidate who would pass at Google might get a no-hire at Netflix for the same solution if it has hesitation, dead code, or weak reasoning out loud.

Does Netflix hire new grads?

Almost never. Netflix's hiring philosophy targets senior engineers (their L5 and above). New grads are essentially redirected to other FAANGs by default. If you are a new grad with strong open-source work, you can occasionally get a contractor role first, but the standard pipeline is senior and above.

What programming language should I use?

Use the language you write production code in. Most candidates pick Java, Python, or Kotlin. Netflix's back-end stack is Java/Kotlin, so picking Java demonstrates ecosystem familiarity, but if Python is your daily driver, use Python. The interviewer cares about clarity and correctness, not whether you matched their stack.

Is the Netflix interview only coding, or is there system design and behavioral?

Full loop: typically a recruiter screen, a hiring-manager screen, one technical phone screen (coding, with scale or concurrency follow-ups), then a four-to-five-round onsite covering coding, system design, behavioral (heavy emphasis), and a round with a senior engineer from another team. For senior roles, the coding rounds are weighted slightly less than at Google or Meta; judgment and past experience matter more.

What is the Netflix keeper test and does it apply to interviews?

The keeper test is an internal performance lens: if your manager would not fight to keep you when you threaten to leave, you should be cut. It is not literally applied in interviews, but its spirit is: the interviewer is mentally answering "would I fight to put this person on my team" rather than "did they solve the problem." A clean solution to an easy problem can fail this test if you came across as forgettable.
