Spring AI Advisors API in Java: Build Middleware for AI Interactions
Spring AI Advisors API Explained
Series: Spring AI Complete Course — Lecture 4 of 12
Reading Time: 8 minutes
Level: Intermediate
Most developers stop at ChatClient.
That is enough for demos. It is not enough for production.
The moment an AI feature becomes real, you need behavior around the model call. You need logging. You need request filtering. You need latency tracking. You may need to attach tenant context, inject retrieval metadata, sanitize prompts, or block unsafe content before it ever reaches the model.
If you place all of that inside controllers and services, the codebase turns into a junk drawer. Business logic gets mixed with prompt plumbing. The same pre-processing and post-processing logic gets duplicated across every AI flow.
This is exactly where the Spring AI Advisors API comes in.
Advisors are Spring AI's middleware layer for LLM calls. If you already understand servlet filters, Spring MVC interceptors, or WebFlux filters, you already understand the core idea. Advisors let you intercept and shape requests before they hit the model, and inspect or transform responses before they return to your application.
In this article, we will break down:
- CallAdvisor vs StreamAdvisor
- ChatClientRequest and ChatClientResponse
- Advisor ordering with getOrder()
- Building a custom logging advisor
- Building a content filtering advisor
- Composing advisors with ChatClient, globally and per-request
- Common mistakes to avoid
Let's get into it.
Why Advisors Exist
A simple Spring AI application often starts like this:
    @RestController
    @RequiredArgsConstructor
    public class ChatController {

        private final ChatClient chatClient;

        @GetMapping("/ask")
        public String ask(@RequestParam String message) {
            // Send the user's message to the model and return the reply text.
            return chatClient.prompt()
                    .user(message)
                    .call()
                    .content();
        }
    }
This works beautifully in the early stage.
Then reality arrives.
Now you need to answer questions like:
- How do I log prompts and responses for debugging?
- How do I block prohibited content before it reaches the model?
- How do I inject metadata for observability or multi-tenant routing?
- How do I attach retrieval context for RAG?
- How do I measure model latency without scattering timers everywhere?
You could do all of this manually in service methods.
You shouldn't.
The better pattern is to centralize these cross-cutting concerns in advisors so your application code stays focused on business logic.
Core Types in the Advisors API
Spring AI defines different advisor types for synchronous and streaming scenarios.
| Interface | Use Case |
|---|---|
| CallAdvisor | Blocking (synchronous) calls |
| StreamAdvisor | Streaming (reactive) calls |
CallAdvisor
Use this for normal, non-streaming interactions.
A CallAdvisor surrounds a standard ChatClient call, letting you inspect or modify the request, call the next step in the chain, then inspect or modify the response on the way back.
StreamAdvisor
Use this for streaming scenarios.
When your UI is receiving tokens progressively, the execution model changes. StreamAdvisor is designed for that case, but the mental model stays the same: intercept, enrich, observe, or block the flow.
ChatClientRequest
This object represents the request moving through the advisor chain. It carries:
- The prompt (user message, system instructions)
- Model options (temperature, max tokens, etc.)
- Metadata and shared advisor context
Advisors can inspect it, enrich it, or pass it along unchanged.
ChatClientResponse
This object represents the final model response as it flows back through the advisor chain. It carries:
- The generated response content
- Token usage statistics
- Response metadata and shared advisor context
The shared context is important. It lets one advisor place data into the chain and another advisor read it later — across the full lifecycle of a single call.
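To make the shared-context idea concrete, here is a minimal plain-Java sketch. It deliberately does not use Spring AI's classes: Request, Response, the two step methods, and the tenantId key are hypothetical stand-ins invented for illustration. An outer step writes a value into the context, and an inner step reads it without knowing who put it there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins for the request/response objects; not Spring AI types.
record Request(String prompt, Map<String, Object> context) {}
record Response(String text, Map<String, Object> context) {}

class SharedContextDemo {

    static final List<String> auditLog = new ArrayList<>();

    // Outer step: writes a value into the shared context before delegating.
    static Response tenantStep(Request request, Function<Request, Response> next) {
        Map<String, Object> ctx = new HashMap<>(request.context());
        ctx.put("tenantId", "acme"); // illustrative value
        return next.apply(new Request(request.prompt(), ctx));
    }

    // Inner step: reads what the outer step stored, then delegates.
    static Response auditStep(Request request, Function<Request, Response> next) {
        auditLog.add("tenant=" + request.context().get("tenantId"));
        return next.apply(request);
    }

    // Stand-in for the model call; the context travels back with the response.
    static Response model(Request request) {
        return new Response("echo: " + request.prompt(), request.context());
    }

    // Wires the steps together: tenantStep wraps auditStep, which wraps the model.
    static Response run(String prompt) {
        Request initial = new Request(prompt, Map.of());
        return tenantStep(initial, req -> auditStep(req, SharedContextDemo::model));
    }
}
```

The copy-then-put in tenantStep is a design choice of this sketch: each step hands a fresh map to the next one instead of mutating the map it received.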
Think of Advisors as a Stack
This is the most important mental model.
When Spring AI invokes a ChatClient, it builds an advisor chain. Each advisor gets the request in sequence, and each one has the option to:
- inspect the request
- mutate the request
- add context
- call the next advisor
- short-circuit the chain and return a response directly
- inspect or mutate the response
- throw an exception if processing should fail
Eventually, the final framework-provided advisor sends the request to the LLM.
Then the response comes back through the same chain in reverse order.
Request Path: Advisor A (order 100) → Advisor B (order 200) → Model
↓
Response Path: Advisor A (order 100) ← Advisor B (order 200) ← Response
That means the first advisor to process the request becomes the last advisor to process the response.
This stack-like behavior is the key to understanding advisor ordering.
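The nesting can be simulated in a few lines of plain Java. This is only a sketch of the idea, not Spring AI's implementation: the Step interface and the tracing helper are hypothetical names, and the "model" is a lambda that returns a canned reply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class AdvisorStackDemo {

    // Hypothetical stand-in for an advisor: it wraps the rest of the chain.
    interface Step {
        String around(String request, UnaryOperator<String> next);
    }

    // Builds a step that records when it sees the request and the response.
    static Step tracing(String name, List<String> trace) {
        return (request, next) -> {
            trace.add(name + " request");
            String response = next.apply(request);
            trace.add(name + " response");
            return response;
        };
    }

    // Runs the named steps, in list order, around a terminal "model" call.
    static List<String> run(List<String> names) {
        List<String> trace = new ArrayList<>();
        UnaryOperator<String> chain = request -> {
            trace.add("model");
            return "reply to " + request;
        };
        // Wrap from the innermost step outward so the first name ends up outermost.
        for (int i = names.size() - 1; i >= 0; i--) {
            Step step = tracing(names.get(i), trace);
            UnaryOperator<String> next = chain;
            chain = request -> step.around(request, next);
        }
        chain.apply("hello");
        return trace;
    }
}
```

Calling AdvisorStackDemo.run(List.of("A", "B")) records the sequence A request, B request, model, B response, A response, which is the same shape as the diagram above.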
Advisor Ordering with getOrder()
Spring AI uses the standard Spring ordering model.
Each advisor exposes getOrder(), and lower values execute first.
But here's the catch: lower values execute first on the request path and last on the response path.
That means:
- Lower order value → earlier for request, later for response
- Higher order value → later for request, earlier for response
This is why many developers get confused the first time they build more than one advisor.
Practical interpretation
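If your advisor does request validation or security checks, give it high precedence (a low order value) so it sees the request before any other advisor.
If your advisor does response formatting or output shaping, decide which view of the response it needs. A high order value places it close to the model, so it runs late on the request path and sees the raw response first. A low order value places it outermost, so it sees the response last, after every other advisor has processed it.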
Also note:
- Ordered.HIGHEST_PRECEDENCE means the smallest possible order value
- Ordered.LOWEST_PRECEDENCE means the largest possible order value
- if two advisors share the same order value, execution order is not guaranteed
In production systems, do not leave this vague. Make ordering explicit.
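The sketch below illustrates the ordering rules with plain Java. It assumes only what is stated above plus one fact about Spring's Ordered interface: HIGHEST_PRECEDENCE is Integer.MIN_VALUE and LOWEST_PRECEDENCE is Integer.MAX_VALUE. The Entry record and the advisor names are hypothetical, and the constants are local copies so the example has no dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class OrderingDemo {

    // Local copies of the values behind Spring's Ordered constants.
    static final int HIGHEST_PRECEDENCE = Integer.MIN_VALUE;
    static final int LOWEST_PRECEDENCE = Integer.MAX_VALUE;

    // Hypothetical advisor descriptor: just a name and an order value.
    record Entry(String name, int order) {}

    // Request-path order: ascending by order value, whatever the registration order.
    static List<String> requestOrder(List<Entry> registered) {
        List<Entry> sorted = new ArrayList<>(registered);
        sorted.sort(Comparator.comparingInt(Entry::order));
        return sorted.stream().map(Entry::name).toList();
    }

    // Response-path order is the request-path order reversed.
    static List<String> responseOrder(List<Entry> registered) {
        List<String> names = new ArrayList<>(requestOrder(registered));
        Collections.reverse(names);
        return names;
    }
}
```

One practical consequence of those constant values: offsets should move toward zero. HIGHEST_PRECEDENCE + 10 is a valid "early" slot, while HIGHEST_PRECEDENCE - 1 overflows to Integer.MAX_VALUE and silently becomes the lowest precedence.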
Build a Custom Logging Advisor
Let's create a simple advisor that logs the incoming prompt and the outgoing model response.
    package com.example.ai.advisors;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClientResponse;
    import org.springframework.ai.chat.client.ChatClientRequest;
    import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
    import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
    import org.springframework.core.Ordered;
    import org.springframework.stereotype.Component;

    @Component
    public class LoggingAdvisor implements CallAdvisor {

        private static final Logger log = LoggerFactory.getLogger(LoggingAdvisor.class);

        @Override
        public String getName() {
            return "logging-advisor";
        }

        @Override
        public int getOrder() {
            // Early in the chain, but after anything registered at HIGHEST_PRECEDENCE.
            return Ordered.HIGHEST_PRECEDENCE + 10;
        }

        @Override
        public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
            // Request path: log the prompt before delegating.
            log.info("Prompt user text: {}", request.prompt().getUserMessage().getText());

            // Hand off to the next advisor (eventually the model call).
            ChatClientResponse response = chain.nextCall(request);

            // Response path: log the generated text on the way back.
            log.info("Model response: {}", response.chatResponse().getResult().getOutput().getText());
            return response;
        }
    }
What this advisor is doing
- It reads the user prompt from ChatClientRequest
- It forwards the request to the next element in the chain using chain.nextCall(request)
- It receives the ChatClientResponse
- It logs the final output text before returning the response
This is the canonical advisor structure. It looks very similar to a servlet filter or a Spring AOP around advice because conceptually, that is exactly what it is.
Build a Content Filtering Advisor
Now let's create something more interesting: a policy guardrail.
This advisor checks the user prompt for prohibited phrases and blocks execution if the content is unsafe.
    package com.example.ai.advisors;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClientResponse;
    import org.springframework.ai.chat.client.ChatClientRequest;
    import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
    import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
    import org.springframework.core.Ordered;
    import org.springframework.stereotype.Component;

    import java.util.Locale;
    import java.util.Set;

    @Component
    public class ContentFilterAdvisor implements CallAdvisor {

        private static final Logger log = LoggerFactory.getLogger(ContentFilterAdvisor.class);

        // Lowercase phrases; the prompt is lowercased before matching.
        private static final Set<String> BLOCKED_TERMS = Set.of(
                "password dump",
                "credit card scrape"
        );

        @Override
        public String getName() {
            return "content-filter-advisor";
        }

        @Override
        public int getOrder() {
            // First on the request path, so nothing else runs on a blocked prompt.
            return Ordered.HIGHEST_PRECEDENCE;
        }

        @Override
        public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
            // Locale.ROOT keeps lowercasing independent of the server's default locale.
            String userText = request.prompt().getUserMessage().getText().toLowerCase(Locale.ROOT);
            boolean blocked = BLOCKED_TERMS.stream().anyMatch(userText::contains);

            if (blocked) {
                // Fail before the chain continues; the model is never called.
                log.warn("Blocked request containing prohibited content");
                throw new IllegalArgumentException("Prompt blocked by content policy");
            }
            return chain.nextCall(request);
        }
    }
Why this matters
This keeps policy enforcement out of your controller and service layer.
More importantly, it gives you a reusable place to apply consistent rules across all AI endpoints.
Today it's a simple keyword blocklist. Tomorrow it could be:
- PII detection
- regex-based sanitization
- tenant-specific policy controls
- prompt risk scoring
- role-based access enforcement
That is the real value of advisors. They create clean extension points for production concerns.
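As an illustration of the regex-based sanitization item, here is a small dependency-free sketch of the kind of helper such an advisor could call before passing the request on. PromptSanitizer, its two patterns, and the placeholder strings are assumptions made for this example; the patterns are intentionally simple and would miss many real-world cases.

```java
import java.util.regex.Pattern;

class PromptSanitizer {

    // Deliberately simple, illustrative patterns; real PII detection needs far more care.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // 13 to 16 digits, optionally separated by single spaces or hyphens.
    private static final Pattern CARD_LIKE =
            Pattern.compile("\\b\\d(?:[ -]?\\d){12,15}\\b");

    // Replaces matches with placeholders so the raw values never reach the model.
    static String redact(String prompt) {
        String withoutEmails = EMAIL.matcher(prompt).replaceAll("[EMAIL]");
        return CARD_LIKE.matcher(withoutEmails).replaceAll("[CARD]");
    }
}
```

Inside an advisor, the redacted text would replace the user message before the request is forwarded with chain.nextCall. Unlike the blocklist above, this approach rewrites the prompt instead of rejecting it.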
Composing Advisors with ChatClient
Global wiring via the builder
Once the advisors exist, attach them to your ChatClient so every call gets the treatment automatically.
    package com.example.ai.config;

    import com.example.ai.advisors.ContentFilterAdvisor;
    import com.example.ai.advisors.LoggingAdvisor;
    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.ai.chat.model.ChatModel;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ChatConfig {

        @Bean
        public ChatClient chatClient(ChatModel chatModel,
                                     LoggingAdvisor loggingAdvisor,
                                     ContentFilterAdvisor contentFilterAdvisor) {
            // Default advisors apply to every call made through this ChatClient.
            return ChatClient.builder(chatModel)
                    .defaultAdvisors(contentFilterAdvisor, loggingAdvisor)
                    .build();
        }
    }
Because ContentFilterAdvisor has HIGHEST_PRECEDENCE, it runs first on the request path. The chain works like this:
- content filter examines the request
- logging advisor logs the prompt
- framework sends the request to the model
- response comes back
- logging advisor logs the response
- control returns to the content filter, which passes the response back to the caller unchanged
This is why ordering is not just a number. It defines the behavior of your full middleware pipeline.
Per-request wiring
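You can also attach advisors to individual calls when you need selective behavior. The example below assumes the injected ChatClient does not already register these advisors as defaults; otherwise each one would likely appear in the chain twice: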
    @Service
    public class AiService {

        private final ChatClient chatClient;
        private final LoggingAdvisor loggingAdvisor;
        private final ContentFilterAdvisor filterAdvisor;

        public AiService(ChatClient chatClient,
                         LoggingAdvisor loggingAdvisor,
                         ContentFilterAdvisor filterAdvisor) {
            this.chatClient = chatClient;
            this.loggingAdvisor = loggingAdvisor;
            this.filterAdvisor = filterAdvisor;
        }

        public String chatWithGuardrails(String userMessage) {
            // These advisors apply to this call only.
            return chatClient.prompt()
                    .user(userMessage)
                    .advisors(loggingAdvisor, filterAdvisor)
                    .call()
                    .content();
        }
    }
When to Use StreamAdvisor
If you are using synchronous calls with .call(), CallAdvisor is the right abstraction.
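If you are returning streamed responses with .stream(), use StreamAdvisor. The pattern is nearly identical, but the advisor returns a Flux<ChatClientResponse> instead of a single response, so any response-side logic has to be expressed as reactive operators.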
    package com.example.ai.advisors;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClientResponse;
    import org.springframework.ai.chat.client.ChatClientRequest;
    import org.springframework.ai.chat.client.advisor.api.StreamAdvisor;
    import org.springframework.ai.chat.client.advisor.api.StreamAdvisorChain;
    import org.springframework.stereotype.Component;
    import reactor.core.publisher.Flux;

    @Component
    public class StreamingLoggingAdvisor implements StreamAdvisor {

        private static final Logger log = LoggerFactory.getLogger(StreamingLoggingAdvisor.class);

        @Override
        public Flux<ChatClientResponse> adviseStream(ChatClientRequest request, StreamAdvisorChain chain) {
            log.info("Streaming prompt: {}", request.prompt().getUserMessage().getText());

            // doOnNext observes each chunk as it passes through without altering it.
            return chain.nextStream(request)
                    .doOnNext(response -> log.debug("Stream chunk received"));
        }

        @Override
        public int getOrder() {
            return 100;
        }

        @Override
        public String getName() {
            return "streaming-logging-advisor";
        }
    }
Streaming advisors are especially useful in scenarios like:
- live chat interfaces
- SSE endpoints
- token-by-token observability
- real-time content moderation
- progressive UI rendering
Real Production Use Cases for Advisors
Once you understand the pattern, a lot of Spring AI features make more sense.
Advisors are the right place for:
- Prompt logging for debugging and traceability
- Guardrails for content filtering and safety policies
- RAG context injection before the model call
- Caching — check a cache in your advisor, return early if the result exists
- Rate limiting — throttle requests based on user tiers or global limits
- Observability hooks for metrics and latency tracking
- Tenant metadata propagation in multi-tenant apps
- Response post-processing for cleanup or formatting
This is why I describe Advisors as middleware for AI. They sit around the model call and handle the non-business concerns that every serious AI application eventually needs.
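The caching use case is a good example of short-circuiting, so here is a minimal sketch of the control flow in plain Java. CachingStepDemo is a hypothetical stand-in, not a Spring AI class: the next field plays the role of the rest of the chain, and the cache key is simply the raw prompt text, which is an assumption that a real system would need to revisit (model options, tenant, and system prompt usually belong in the key too).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

class CachingStepDemo {

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Counts how often the pretend model is actually invoked.
    final AtomicInteger modelCalls = new AtomicInteger();

    // Stand-in for the rest of the chain, ending in the model call.
    private final UnaryOperator<String> next = prompt -> {
        modelCalls.incrementAndGet();
        return "answer to: " + prompt;
    };

    // Around-style step: return a cached reply if present, otherwise delegate and store.
    String advise(String prompt) {
        String cached = cache.get(prompt);
        if (cached != null) {
            return cached; // short-circuit: the model is never called
        }
        String response = next.apply(prompt);
        cache.put(prompt, response);
        return response;
    }
}
```

Asking the same question twice through advise reaches the pretend model only once. In a real advisor, the early return would produce a ChatClientResponse and the delegation would be chain.nextCall(request).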
Common Mistakes to Avoid
1. Putting advisor logic in controllers
If your controllers are doing policy checks, prompt shaping, and response logging, the design is already drifting.
2. Ignoring order semantics
If you have multiple advisors and no explicit ordering strategy, bugs will appear in weird places. Make ordering explicit — always.
3. Using one advisor for unrelated concerns
Keep advisors focused. Logging, policy enforcement, retrieval context, and response shaping should usually be separate units.
4. Forgetting streaming support
If your app supports both standard and streaming interactions, make sure you are using the correct advisor type for each path.
Key Takeaways
- CallAdvisor is for normal request-response model calls; StreamAdvisor is for streaming
- ChatClientRequest and ChatClientResponse carry data, including shared context, through the chain
- Advisors can inspect, enrich, mutate, block, or observe the call
- getOrder() determines execution order, but the response path unwinds in reverse
- Attach advisors globally via the ChatClient builder or selectively per request
- Custom advisors give you a clean way to add logging, filtering, RAG, observability, and guardrails without polluting your business logic
If ChatClient is how you talk to the model, Advisors are how you control the conversation pipeline.
That distinction is what takes you from toy AI integrations to production-ready AI systems.
Resources
- Spring AI Advisors API Reference: https://docs.spring.io/spring-ai/reference/api/advisors.html
- Spring AI ChatClient Reference: https://docs.spring.io/spring-ai/reference/api/chatclient.html
- Spring AI Project Docs: https://docs.spring.io/spring-ai/reference/
Final Thought
A lot of AI tutorials focus on prompting.
Very few focus on architecture.
But production systems are not defined by how cleverly you ask the model a question. They are defined by how cleanly you manage everything around that question.
That is exactly why the Advisors API matters.
If you're building serious Spring AI applications, this is one of the features worth mastering early.
In the next lecture, we move into Tool Calling, where the model stops being just a text engine and starts invoking real Java methods.
That is where the fun begins.

