
Stateless Auth: Managing JWT Blacklisting at Scale

The truth about stateless JWTs. How to implement revocation lists using Redis without giving up your performance.

By Sachin Sarawgi, April 20, 2026 (4 min read)

Stateless Auth: JWT Revocation

JWTs are often marketed as "stateless authentication", but the first real security requirement (logout, account disable, token theft response) immediately introduces state.

If you do not design revocation explicitly, you either accept long exposure windows or degrade user experience with aggressive token expiry.

Why revocation matters

Common revocation scenarios:

  • user clicks logout on a shared or public device
  • account is disabled by admin
  • password reset after suspected compromise
  • refresh token family is invalidated due to theft detection

Without revocation, a valid unexpired JWT continues to grant access.

The core model: token identity + deny list

Issue access tokens with a unique jti (JWT ID). On each authenticated request:

  1. validate signature, issuer, audience, and expiry
  2. check whether the jti is revoked in a low-latency store (usually Redis)
  3. reject if revoked

This preserves scalable verification while enabling immediate invalidation.
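
The three steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, and it assumes HS256 (a shared secret) purely to stay self-contained; a production system would more likely use a maintained JWT library and asymmetric keys. The names `authenticate`, `sign_hs256`, and the `revoked_jtis` set (standing in for the Redis lookup) are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    # Helper used only to build tokens for this example.
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def authenticate(token, secret, issuer, audience, revoked_jtis, now=None):
    """Return the claims if the token is valid and not revoked, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(sig_b64)
    except (ValueError, TypeError):
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None

    # Step 1: signature, issuer, audience, expiry.
    if header.get("alg") != "HS256":
        return None
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if claims.get("iss") != issuer or audience not in audiences:
        return None
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None

    # Steps 2 and 3: look up the jti and reject if it is revoked.
    # A token without a jti cannot be checked, so it is rejected too.
    jti = claims.get("jti")
    if not isinstance(jti, str) or jti in revoked_jtis:
        return None
    return claims
```

The revocation lookup runs only after the cheap local checks pass, so malformed or expired tokens never reach the store.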

Data structures for revocation in Redis

A practical key model:

  • key: revoked:jti:<id>
  • value: reason or metadata
  • TTL: the token's remaining lifetime (exp - now)

Auto-expiry keeps memory bounded.
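
As a rough sketch of this key model, the snippet below uses a small in-memory class, `TtlStore`, as a stand-in for Redis so it runs without a server; its `set` and `get` loosely correspond to Redis `SET key value EX ttl` and `GET`. The class and the helpers `revoke_token` and `is_revoked` are illustrative names, not part of any library.

```python
import time


class TtlStore:
    """In-memory stand-in for a Redis-like store with per-key expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # drop the expired entry on access
            return None
        return value


def revoke_token(store, jti, exp, reason, now):
    """Store a revocation marker that lives only as long as the token does."""
    ttl = int(exp - now)
    if ttl <= 0:
        return False  # token already expired, nothing worth storing
    store.set(f"revoked:jti:{jti}", reason, ttl)
    return True


def is_revoked(store, jti):
    return store.get(f"revoked:jti:{jti}") is not None
```

Because the marker expires together with the token, the store never holds entries for tokens that would be rejected on expiry anyway.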

For account-wide invalidation:

  • key: user:revoked_after:<user_id>, value: cutoff timestamp
  • reject token if iat < revoked_after

This avoids writing millions of per-token entries in compromise events.
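
A minimal sketch of the cutoff rule, with a plain dict standing in for the store. The function names are hypothetical, and timestamps are assumed to be Unix seconds, matching the iat claim.

```python
def set_revoked_after(cutoffs: dict, user_id: str, timestamp: int) -> None:
    """Record an account-wide cutoff for one user."""
    key = f"user:revoked_after:{user_id}"
    # Never move the cutoff backwards, so a late or duplicate write
    # cannot re-enable tokens that were already invalidated.
    cutoffs[key] = max(timestamp, cutoffs.get(key, 0))


def is_token_revoked_for_user(cutoffs: dict, user_id: str, iat: int) -> bool:
    """Reject any token issued before the user's cutoff."""
    revoked_after = cutoffs.get(f"user:revoked_after:{user_id}")
    return revoked_after is not None and iat < revoked_after
```

With the strict `iat < revoked_after` comparison, a token issued in the same second as the cutoff is still accepted. Whether that boundary is acceptable depends on the timestamp resolution and how tokens are reissued after a reset.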

Choosing token lifetime strategy

Security and performance trade off here:

  • short-lived access token (5-15 min) reduces blast radius
  • refresh token rotation handles session continuity
  • revocation checks are still needed for immediate logout and high-risk events

Long-lived access tokens without revocation checks are operationally risky.

Gateway integration pattern

Place the revocation check in the API gateway or in shared auth middleware so that individual services do not each reimplement it.

Flow:

  1. parse and verify JWT
  2. check local cache for recently seen JTIs (revoked or not revoked)
  3. fall back to a Redis lookup on a cache miss
  4. reject if revoked; otherwise attach auth context to the upstream request

Use tiny in-process caches to reduce Redis read pressure on hot tokens.
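
One way to sketch such a cache is shown below. `CachedRevocationChecker` is a hypothetical name, and `remote_lookup` is any callable that answers "is this jti revoked?", for example a function wrapping the Redis lookup. The TTL and size limit are illustrative values.

```python
import time


class CachedRevocationChecker:
    """Small in-process cache in front of a slower revocation lookup."""

    def __init__(self, remote_lookup, ttl_seconds=5.0, max_entries=10_000,
                 clock=time.monotonic):
        self._remote_lookup = remote_lookup  # callable: jti -> bool
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._cache = {}  # jti -> (revoked, cached_at)

    def is_revoked(self, jti: str) -> bool:
        now = self._clock()
        hit = self._cache.get(jti)
        if hit is not None:
            revoked, cached_at = hit
            if now - cached_at < self._ttl:
                return revoked  # fresh entry, skip the remote call
        revoked = self._remote_lookup(jti)
        if len(self._cache) >= self._max:
            self._cache.clear()  # crude size bound; an LRU would be gentler
        self._cache[jti] = (revoked, now)
        return revoked
```

The cache TTL is also the longest time a freshly revoked token can still be accepted by this process, so it should stay within whatever revocation delay the system can tolerate.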

Preventing Redis from becoming a bottleneck

At high RPS, naive per-request Redis lookups can be expensive.

Mitigations:

  • cache revocation check results briefly (a few seconds); a cached "not revoked" result delays revocation by at most the cache TTL
  • use Redis cluster with key hashing by jti
  • pipeline lookups where middleware handles batched traffic
  • monitor miss ratio and p99 lookup latency

Fail-open vs fail-closed behavior must be explicit and risk-based.

Fail-open vs fail-closed policy

If Redis is unavailable:

  • fail-closed is safer (deny requests), but risks broad outage
  • fail-open keeps availability, but allows potentially revoked tokens

A common approach:

  • fail-closed for admin/financial scopes
  • bounded fail-open for low-risk read-only scopes with alerting

Make this a policy decision, not accidental behavior.
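
A policy like this can be written down as a small, testable function rather than left to exception handling defaults. The sketch below is one possible shape: the scope names, the 60-second bound, the `StoreUnavailable` exception, and the `decide` function are all assumptions made for illustration.

```python
FAIL_OPEN_SCOPES = {"catalog:read", "profile:read"}  # hypothetical low-risk scopes
MAX_FAIL_OPEN_SECONDS = 60.0  # illustrative bound on how long to fail open


class StoreUnavailable(Exception):
    """Raised by the lookup when the revocation store cannot be reached."""


def decide(scopes, jti, lookup, outage_seconds=0.0, alert=None):
    """Return "allow" or "deny" for a token that already passed signature checks."""
    try:
        return "deny" if lookup(jti) else "allow"
    except StoreUnavailable:
        # Fail open only when every scope is on the low-risk allowlist
        # and the outage is still within the bound; otherwise fail closed.
        low_risk = bool(scopes) and set(scopes) <= FAIL_OPEN_SCOPES
        if low_risk and outage_seconds <= MAX_FAIL_OPEN_SECONDS:
            if alert is not None:
                alert(f"fail-open for jti={jti}")
            return "allow"
        return "deny"
```

Using an allowlist means that any scope nobody has classified yet fails closed by default.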

Event-driven revocation propagation

In microservices, revocation events should be published:

  • TOKEN_REVOKED
  • USER_REVOKED_AFTER_UPDATED

Consumers can warm local caches proactively. This reduces consistency lag during incidents and supports edge deployments.
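
A consumer for these two events might look roughly like the sketch below. The event names come from the list above, but the JSON payload fields (`jti`, `exp`, `user_id`, `revoked_after`) and the `LocalRevocationView` class are assumptions; the actual schema depends on the event bus contract.

```python
import json


class LocalRevocationView:
    """Per-process view of revocations, kept warm from published events."""

    def __init__(self):
        self.revoked_jtis = {}   # jti -> token exp (Unix seconds)
        self.revoked_after = {}  # user_id -> cutoff timestamp

    def handle_event(self, raw: str) -> bool:
        """Apply one event; returns False for event types it does not know."""
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "TOKEN_REVOKED":
            self.revoked_jtis[event["jti"]] = event["exp"]
            return True
        if kind == "USER_REVOKED_AFTER_UPDATED":
            uid = event["user_id"]
            # max() keeps the handler safe under duplicate or reordered delivery.
            self.revoked_after[uid] = max(event["revoked_after"],
                                          self.revoked_after.get(uid, 0))
            return True
        return False

    def is_revoked(self, jti, user_id, iat, now) -> bool:
        exp = self.revoked_jtis.get(jti)
        if exp is not None:
            if now < exp:
                return True
            del self.revoked_jtis[jti]  # token has expired, marker not needed
        cutoff = self.revoked_after.get(user_id)
        return cutoff is not None and iat < cutoff
```

Both handlers are idempotent, which matters because most event buses can deliver a message more than once.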

Handling logout correctly

Logout should invalidate:

  • current access token (jti)
  • optionally current refresh token
  • optionally all session family refresh tokens (for "logout all devices")

Do not rely on client-side token deletion only.
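
The server-side part of logout can be sketched as below. `RevocationState` is an in-memory stand-in for whatever store holds the markers, and the identifiers `refresh_id` and `family_id` are hypothetical names for the refresh token and its session family.

```python
from dataclasses import dataclass, field


@dataclass
class RevocationState:
    revoked_jtis: dict = field(default_factory=dict)         # jti -> exp
    revoked_refresh_ids: set = field(default_factory=set)
    revoked_families: set = field(default_factory=set)


def logout(state, access_jti, access_exp, refresh_id=None, family_id=None,
           all_devices=False):
    # Always revoke the access token that made the logout request.
    state.revoked_jtis[access_jti] = access_exp
    # Optionally revoke the refresh token of this session.
    if refresh_id is not None:
        state.revoked_refresh_ids.add(refresh_id)
    # "Logout all devices": revoke the whole refresh token family.
    if all_devices and family_id is not None:
        state.revoked_families.add(family_id)


def can_refresh(state, refresh_id, family_id) -> bool:
    return (refresh_id not in state.revoked_refresh_ids
            and family_id not in state.revoked_families)
```

In this sketch, revoking a family only blocks future refreshes. Access tokens already issued to other devices stay valid until they expire unless an account-wide cutoff is set as well.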

Security hardening essentials

  • sign with strong asymmetric keys where possible
  • enforce aud, iss, and clock skew checks
  • rotate signing keys with kid
  • store minimal claims (avoid sensitive data in JWT payload)
  • detect replay with jti + device/session context when needed

A JWT is a transport format, not a complete security architecture.

Observability and operations

Track:

  • revocation check latency (p50, p95, p99)
  • revoked-token access attempts
  • Redis availability and timeout rates
  • auth decision split (valid/revoked/invalid/expired)
  • false rejection incidents

Security controls without observability are hard to trust.

Common implementation mistakes

  • no jti claim, making selective revocation hard
  • storing revocations forever (unbounded memory growth)
  • not syncing clock assumptions (iat/exp) across services
  • revoking only refresh tokens but not active access tokens
  • missing admin emergency "revoke all for tenant/user" path

Reference architecture

  • Identity service issues signed JWTs with jti
  • API gateway verifies and checks revocation
  • Redis stores revocation markers with TTL
  • Event bus propagates revocation updates
  • Auth dashboard allows targeted and bulk revocation operations

This architecture keeps the request path fast while giving security teams immediate control.

Final takeaway

Stateless auth is a scalability optimization, not a security exemption. At scale, robust JWT systems combine short-lived tokens, centralized revocation checks, and operational observability to keep both performance and incident response strong.


Written by Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.
