
Conversation

@johnpanos johnpanos commented Jan 22, 2026

Description

Adds Redis-based response caching to reduce latency and costs for repeated LLM requests. Our NLP team needed configurable caching with standard HTTP semantics for cache control.

Features:

  • Shared Redis cache across all ext_proc instances
  • HTTP Cache-Control support (no-cache, no-store, private, max-age); see the directive-handling sketch below
  • Per-route TTL configuration (default: 1 hour)
  • x-aigw-cache: hit/miss response header for observability
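How those directives interact with caching can be pictured with a minimal sketch, assuming a plain header parser: no-store and private suppress writes to the shared cache, no-cache bypasses lookup, and max-age can only tighten the route's TTL. The names (Directives, ParseCacheControl, EffectiveTTL) are illustrative, not the actual internal/cache API.

```go
package cache

import (
	"strconv"
	"strings"
	"time"
)

// Directives captures the Cache-Control subset listed above.
type Directives struct {
	NoCache   bool          // bypass the cached copy and go to the upstream
	NoStore   bool          // never write this response to the cache
	Private   bool          // user-specific response; skip the shared Redis cache
	MaxAge    time.Duration // caller-supplied freshness bound
	HasMaxAge bool
}

// ParseCacheControl splits a Cache-Control header value into the directives above.
func ParseCacheControl(header string) Directives {
	var d Directives
	for _, part := range strings.Split(header, ",") {
		part = strings.TrimSpace(strings.ToLower(part))
		switch {
		case part == "no-cache":
			d.NoCache = true
		case part == "no-store":
			d.NoStore = true
		case part == "private":
			d.Private = true
		case strings.HasPrefix(part, "max-age="):
			if secs, err := strconv.Atoi(strings.TrimPrefix(part, "max-age=")); err == nil {
				d.MaxAge = time.Duration(secs) * time.Second
				d.HasMaxAge = true
			}
		}
	}
	return d
}

// EffectiveTTL clamps the route's configured TTL (default 1h) by max-age, if set.
func EffectiveTTL(routeTTL time.Duration, d Directives) time.Duration {
	if d.HasMaxAge && d.MaxAge < routeTTL {
		return d.MaxAge
	}
	return routeTTL
}
```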

Configuration:

  • responseCache field on AIGatewayRoute (sketched below)
  • extProc.redis.addr in Helm values (or secretRef for production)
  • Controller flags for Redis connection
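For orientation, the route-level API surface could look roughly like the sketch below. Only the responseCache field name and the one-hour default come from this PR; the Go type and json tag are assumptions rather than the shipped api/v1alpha1 definitions.

```go
// Sketch only; the real definition lives in api/v1alpha1/ai_gateway_route.go.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// ResponseCacheConfig enables Redis-backed response caching for a route.
type ResponseCacheConfig struct {
	// TTL bounds how long a cached response is served before it expires.
	// When unset, the default described in this PR is one hour.
	TTL *metav1.Duration `json:"ttl,omitempty"`
}
```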

Related Issues/PRs

N/A

Notes for reviewers

Main changes:

  • internal/cache/ - Cache interface, Redis client, Cache-Control parsing
  • internal/extproc/processor_impl.go - Cache lookup/store logic (see the flow sketch below)
  • api/v1alpha1/ai_gateway_route.go - ResponseCacheConfig API
  • examples/response-cache/ - Example manifests
  • site/docs/capabilities/traffic/response-caching.md - Documentation
  • tests/e2e/response_cache_test.go - E2E tests
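To orient the review of processor_impl.go, the lookup/store flow can be summarized roughly as below. This assumes go-redis and a SHA-256 of the request body as the cache key; the actual key scheme, method names, and x-aigw-cache wiring in the PR may differ.

```go
package extproc

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"time"

	"github.com/redis/go-redis/v9"
)

// responseCache wraps the shared Redis client with the route's TTL.
type responseCache struct {
	rdb *redis.Client
	ttl time.Duration // per-route TTL, defaulting to one hour
}

// key hashes the request body so identical LLM requests map to the same entry.
func (c *responseCache) key(body []byte) string {
	sum := sha256.Sum256(body)
	return "aigw:response-cache:" + hex.EncodeToString(sum[:])
}

// Lookup returns the cached response and whether it was a hit; the processor
// can surface the result via the x-aigw-cache: hit/miss header.
func (c *responseCache) Lookup(ctx context.Context, reqBody []byte) ([]byte, bool, error) {
	val, err := c.rdb.Get(ctx, c.key(reqBody)).Bytes()
	if err == redis.Nil {
		return nil, false, nil
	}
	if err != nil {
		return nil, false, err
	}
	return val, true, nil
}

// Store writes the upstream response under the request's key with the route TTL.
func (c *responseCache) Store(ctx context.Context, reqBody, respBody []byte) error {
	return c.rdb.Set(ctx, c.key(reqBody), respBody, c.ttl).Err()
}
```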

@johnpanos johnpanos requested a review from a team as a code owner January 22, 2026 03:13
@dosubot dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files) on Jan 22, 2026
Signed-off-by: John Panos <[email protected]>
@johnpanos johnpanos force-pushed the add-response-caching branch from a33f7f5 to 4839284 on January 22, 2026 03:13
@johnpanos johnpanos changed the title from "Add response caching" to "feat: add redis response caching" on Jan 22, 2026
codecov-commenter commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 95.58011% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.20%. Comparing base (fefb039) to head (d6357f5).

Files with missing lines              Patch %   Lines
internal/extproc/processor_impl.go    93.26%    3 Missing and 4 partials ⚠️
internal/cache/cache.go               96.96%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1803      +/-   ##
==========================================
+ Coverage   84.04%   84.20%   +0.15%     
==========================================
  Files         117      120       +3     
  Lines       12990    13171     +181     
==========================================
+ Hits        10917    11090     +173     
- Misses       1418     1422       +4     
- Partials      655      659       +4     

☔ View full report in Codecov by Sentry.

@johnpanos johnpanos force-pushed the add-response-caching branch 3 times, most recently from 443a99d to 166d190 on January 22, 2026 06:24
@johnpanos johnpanos force-pushed the add-response-caching branch from 166d190 to d6357f5 on January 22, 2026 06:39
johnpanos commented Jan 23, 2026

[image]

We've been running this in prod for a few days now, and so far it's been holding up really well.

@missBerg
Contributor

Might be too much of a stretch for the initial implementation, but I'm wondering if this could be done in a way that works for pure Envoy Gateway 🤔 so that AIGW simply utilizes it with a different cache key pattern.

That way people needing caching for non-inference traffic could use it too.

Appreciate this may be too much for the initial implementation, but it would be sooo cool!
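Not something this PR attempts; purely to illustrate the idea, a pluggable key scheme (all names hypothetical) could let the Redis/TTL machinery be shared while AIGW and plain Envoy Gateway differ only in how keys are derived:

```go
// Hypothetical sketch, not part of this PR: decouple key derivation from storage.
package cache

import "context"

// KeyFunc derives a cache key from a request; ok=false means "do not cache".
type KeyFunc func(ctx context.Context, method, path string, body []byte) (key string, ok bool)

// An inference-aware KeyFunc could hash the normalized LLM request body, while
// a generic Envoy Gateway deployment could key on method, path, and body.
```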
