Spring AI Advisors API in Java: Build Middleware for AI Interactions
Series: Spring AI Complete Course — Lecture 4 of 12
Reading Time: 8 minutes
Level: Intermediate
Introduction
When your Spring AI application starts growing, you'll inevitably face a familiar problem: cross-cutting concerns. Logging user prompts for compliance, filtering inappropriate content before it reaches the model, caching frequent queries, or injecting context from a vector database — these aren't business logic, but they need to happen on every AI interaction.
If you're manually adding this code to every ChatClient call, you're doing it wrong. Your business logic gets buried under infrastructure noise, and consistency becomes a nightmare. This is exactly the problem that middleware solves in web frameworks, and Spring AI brings the same pattern to AI interactions through the Advisors API.
In this post, we'll explore Spring AI's Advisors API — what it is, why it matters, and how to build production-ready custom advisors in Java.
What Are Advisors in Spring AI?
Advisors are Spring AI's middleware layer. Think of them as interceptors that sit between your application code and the AI model. They can:
- Inspect requests before they reach the model
- Transform prompts or responses
- Short-circuit the chain and return responses directly
- Add metadata or context to interactions
The Advisors API is built around two core interfaces:
| Interface | Use Case |
| --- | --- |
| CallAdvisor | Blocking (synchronous) calls |
| StreamAdvisor | Streaming (reactive) calls |
Each advisor implements a simple method that receives a ChatClientRequest and either modifies it, transforms the response, or returns a response directly.
The Advisor Chain and Ordering
Advisors execute in a chain pattern. When you attach multiple advisors to a ChatClient, they execute in order based on their getOrder() method value. The flow looks like this:
```
Request Path:  Advisor A (order 100) → Advisor B (order 200) → Model
                                                                 ↓
Response Path: Advisor A (order 100) ← Advisor B (order 200) ← Response
```
Key rules:
- Lower `getOrder()` values execute first on the request path
- The chain unwinds in reverse on the response path
- The last advisor to touch the request is the first to process the response
This mirrors how Servlet Filters and Spring's HandlerInterceptor work.
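The ordering rules are easy to see in a dependency-free sketch. The `Advisor` and `Chain` types below are simplified stand-ins invented for illustration, not the real Spring AI interfaces, but the in-order request path and reverse unwinding work the same way:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ChainOrderDemo {

    interface Advisor {
        String advise(String request, Chain chain);
        int getOrder();
    }

    // A minimal chain: each call hands the request to the next advisor,
    // or to the "model" once the advisor list is exhausted.
    static class Chain {
        private final List<Advisor> advisors; // pre-sorted by order
        private int index = 0;
        Chain(List<Advisor> advisors) { this.advisors = advisors; }
        String next(String request) {
            if (index < advisors.size()) {
                return advisors.get(index++).advise(request, this);
            }
            return "model(" + request + ")"; // terminal "model call"
        }
    }

    static Advisor named(String name, int order) {
        return new Advisor() {
            public String advise(String request, Chain chain) {
                // Tag the request on the way in, wrap the response on the way out
                return name + "<" + chain.next(request + ">" + name) + ">";
            }
            public int getOrder() { return order; }
        };
    }

    public static String run() {
        List<Advisor> advisors = new ArrayList<>(List.of(named("B", 200), named("A", 100)));
        // Lower getOrder() first on the request path, exactly as in Spring AI
        advisors.sort(Comparator.comparingInt(Advisor::getOrder));
        return new Chain(advisors).next("req");
    }

    public static void main(String[] args) {
        // A touches the request first and the response last (outermost wrapper)
        System.out.println(run()); // A<B<model(req>A>B)>>
    }
}
```

The output shows A's tag applied first on the way in and A's wrapper outermost on the way back, which is exactly the "last in, first out" response behavior described above.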
Core Objects: ChatClientRequest and ChatClientResponse
The objects you work with are ChatClientRequest and ChatClientResponse. These encapsulate everything about the interaction:
ChatClientRequest
- The prompt (user message, system instructions)
- Model options (temperature, max tokens, etc.)
- Metadata and custom attributes
ChatClientResponse
- The generated response content
- Token usage statistics
- Response metadata
An advisor can read these, modify them, or return a ChatClientResponse immediately without ever calling the model. That's how you implement things like caching — check the cache in your advisor, return the cached response if it exists, otherwise proceed down the chain.
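That short-circuit pattern is worth seeing in miniature. Below is a plain-Java sketch of the caching idea; the `advise` method and the `Function` standing in for "the rest of the chain plus the model" are invented for illustration, while a real advisor would work with `ChatClientRequest`/`ChatClientResponse` and a proper cache:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class CachingShortCircuitDemo {

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // "chain" stands in for the downstream advisors plus the model call.
    // On a cache hit we return immediately; the chain never runs.
    public String advise(String prompt, Function<String, String> chain) {
        String cached = cache.get(prompt);
        if (cached != null) {
            return cached; // short-circuit: model is never called
        }
        String response = chain.apply(prompt);
        cache.put(prompt, response);
        return response;
    }

    public static void main(String[] args) {
        CachingShortCircuitDemo advisor = new CachingShortCircuitDemo();
        int[] modelCalls = {0};
        Function<String, String> model = p -> {
            modelCalls[0]++;
            return "answer:" + p;
        };
        advisor.advise("What is Java?", model);
        advisor.advise("What is Java?", model); // served from cache
        System.out.println("model calls: " + modelCalls[0]); // model calls: 1
    }
}
```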
Lab: Building Custom Advisors in Java
Let's build two practical custom advisors: one for logging, and one for content filtering. Then we'll wire them together.
Advisor 1: Logging Advisor
This advisor logs every request before it hits the model and logs every response that comes back:
```java
package com.example.ai.advisors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.stereotype.Component;

@Component
public class LoggingAdvisor implements CallAdvisor {

    private static final Logger logger = LoggerFactory.getLogger(LoggingAdvisor.class);

    @Override
    public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
        // Log the incoming request
        logger.info("[ADVISOR] Sending prompt: {}", request.prompt().getContents());

        // Continue down the chain
        ChatClientResponse response = chain.nextCall(request);

        // Log the response
        logger.info("[ADVISOR] Received response: {}",
                response.chatResponse().getResult().getOutput().getText());

        return response;
    }

    @Override
    public int getOrder() {
        return 100; // Execute early in the chain
    }

    @Override
    public String getName() {
        return "LoggingAdvisor";
    }
}
```
Key points:
- We log the request, call `chain.nextCall(request)` to proceed, then log the response
- An order value of 100 means this runs before advisors with higher order values
- The `getName()` method helps with debugging and monitoring
Advisor 2: Content Filtering Advisor
This advisor intercepts requests and checks for prohibited keywords. If found, it returns an error response immediately without calling the model:
```java
package com.example.ai.advisors;

import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.Generation;
import org.springframework.stereotype.Component;

@Component
public class ContentFilterAdvisor implements CallAdvisor {

    private static final Logger logger = LoggerFactory.getLogger(ContentFilterAdvisor.class);

    private static final List<String> BLOCKED_KEYWORDS = List.of(
            "harmful", "illegal", "prohibited", "malicious"
    );

    @Override
    public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
        String userInput = request.prompt().getContents().toLowerCase();

        boolean containsBlocked = BLOCKED_KEYWORDS.stream()
                .anyMatch(userInput::contains);

        if (containsBlocked) {
            logger.warn("[ADVISOR] Blocked request containing prohibited content");

            // Return a synthetic response without calling the model
            return ChatClientResponse.builder()
                    .chatResponse(ChatResponse.builder()
                            .generations(List.of(new Generation(new AssistantMessage(
                                    "Request blocked by content policy."))))
                            .build())
                    .build();
        }

        // Content is clean, proceed down the chain
        return chain.nextCall(request);
    }

    @Override
    public int getOrder() {
        return 200; // Execute after logging (100)
    }

    @Override
    public String getName() {
        return "ContentFilterAdvisor";
    }
}
```
Key points:
- Runs after the logging advisor because its order (200) is higher than 100
- If blocked content is detected, it returns a synthetic `ChatClientResponse`
- The model is never called when content is blocked
- This is a powerful guardrail pattern for production AI applications
Wiring Advisors to ChatClient
Now we attach these advisors to our ChatClient. You can do this globally in the builder:
```java
package com.example.ai.config;

import com.example.ai.advisors.ContentFilterAdvisor;
import com.example.ai.advisors.LoggingAdvisor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatConfig {

    @Bean
    public ChatClient chatClient(ChatModel chatModel,
                                 LoggingAdvisor loggingAdvisor,
                                 ContentFilterAdvisor filterAdvisor) {
        return ChatClient.builder(chatModel)
                .defaultAdvisors(loggingAdvisor, filterAdvisor)
                .build();
    }
}
```
Every call through this ChatClient now gets logged and filtered automatically.
Alternative: Per-request advisors
You can also attach advisors to individual requests:
```java
@Service
public class AiService {

    private final ChatClient chatClient;
    private final LoggingAdvisor loggingAdvisor;
    private final ContentFilterAdvisor filterAdvisor;

    public AiService(ChatClient chatClient,
                     LoggingAdvisor loggingAdvisor,
                     ContentFilterAdvisor filterAdvisor) {
        this.chatClient = chatClient;
        this.loggingAdvisor = loggingAdvisor;
        this.filterAdvisor = filterAdvisor;
    }

    public String chatWithGuardrails(String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .advisors(loggingAdvisor, filterAdvisor) // Per-request
                .call()
                .content();
    }
}
```
Working with Streaming Responses
If you're using streaming responses, implement StreamAdvisor instead. The pattern is nearly identical, but you work with Flux streams:
```java
package com.example.ai.advisors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.advisor.api.StreamAdvisor;
import org.springframework.ai.chat.client.advisor.api.StreamAdvisorChain;
import org.springframework.stereotype.Component;

import reactor.core.publisher.Flux;

@Component
public class StreamingLoggingAdvisor implements StreamAdvisor {

    private static final Logger logger =
            LoggerFactory.getLogger(StreamingLoggingAdvisor.class);

    @Override
    public Flux<ChatClientResponse> adviseStream(
            ChatClientRequest request, StreamAdvisorChain chain) {

        logger.info("[ADVISOR] Streaming prompt: {}",
                request.prompt().getContents());

        return chain.nextStream(request)
                .doOnNext(response ->
                        logger.info("[ADVISOR] Stream chunk received"));
    }

    @Override
    public int getOrder() {
        return 100;
    }

    @Override
    public String getName() {
        return "StreamingLoggingAdvisor";
    }
}
```
The chain mechanism works identically — you process the stream, pass it along, and process it on the way back.
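To see the pass-through observation style without pulling in Reactor, `java.util.stream` offers a rough analogy: `peek` plays the role that `doOnNext` plays on a Flux. This is an analogy only (a Flux is push-based and asynchronous, a Stream is not), with names invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamObserverDemo {

    // Pass-through observation: log each chunk as it flows by,
    // without consuming or altering the stream itself.
    public static List<String> observe(Stream<String> chunks, List<String> log) {
        return chunks
                .peek(chunk -> log.add("chunk: " + chunk)) // side effect per chunk
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<String> out = observe(Stream.of("Hel", "lo", "!"), log);
        System.out.println(String.join("", out)); // Hello!
        System.out.println(log.size());           // 3
    }
}
```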
Real-World Use Cases for Advisors
Here are practical patterns you can build with the Advisors API:
1. Request/Response Logging
Audit trails for compliance, debugging, or analytics.
2. Content Filtering
Block harmful prompts or filter sensitive responses before they reach users.
3. Caching Layer
Cache frequent queries and return cached responses to save on API costs and latency.
4. Rate Limiting
Throttle requests based on user tiers or global limits.
5. Context Injection (RAG)
Retrieve relevant documents from a vector store and inject them into prompts. Spring AI ships built-in advisors for this, such as QuestionAnswerAdvisor and RetrievalAugmentationAdvisor.
6. Prompt Enrichment
Automatically add system instructions, examples, or dynamic context based on user type.
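Most of these patterns reduce to a small decision made before calling `chain.nextCall`. As one example, the rate-limiting check can be as simple as a per-user counter. This is a fixed-window sketch with invented names; a production advisor would reset windows over time and would more likely delegate to a library such as Resilience4j or Bucket4j:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class RateLimitCheckDemo {

    private final int limitPerWindow;
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    public RateLimitCheckDemo(int limitPerWindow) {
        this.limitPerWindow = limitPerWindow;
    }

    // Returns true if the request may proceed down the chain.
    // A real advisor would return a synthetic "rate limited"
    // response instead of calling the model when this is false.
    public boolean tryAcquire(String userId) {
        return counts.computeIfAbsent(userId, k -> new AtomicInteger())
                     .incrementAndGet() <= limitPerWindow;
    }

    public static void main(String[] args) {
        RateLimitCheckDemo limiter = new RateLimitCheckDemo(2);
        System.out.println(limiter.tryAcquire("alice")); // true
        System.out.println(limiter.tryAcquire("alice")); // true
        System.out.println(limiter.tryAcquire("alice")); // false, over the limit
    }
}
```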
Key Takeaways
- Advisors are Spring AI's middleware pattern for intercepting AI interactions
- Use `CallAdvisor` for blocking calls and `StreamAdvisor` for streaming
- The `getOrder()` method controls execution order (lower = earlier on requests)
- Advisors receive a `ChatClientRequest` and can return a `ChatClientResponse`
- Attach advisors once to your `ChatClient` builder, and every prompt gets the treatment automatically
- Keep business logic in services and infrastructure concerns in advisors
Conclusion
The Advisors API brings clean architecture principles to Spring AI applications. By extracting cross-cutting concerns into reusable, chainable components, you keep your service layer focused on what matters: your business logic.
Whether you need logging, filtering, caching, or RAG context injection, advisors provide a consistent, extensible pattern. And because they integrate seamlessly with Spring's dependency injection and bean lifecycle, you get all the power of the Spring ecosystem alongside your AI interactions.
What's Next?
In Lecture 5, we'll dive deep into Spring AI's built-in retrieval advisors, such as QuestionAnswerAdvisor and RetrievalAugmentationAdvisor, for RAG systems. You'll learn how to wire a vector database into your advisor chain and automatically inject relevant context into every prompt. It's the foundation for building production-grade retrieval-augmented generation applications.
Tags
#spring-ai #java #spring-boot #ai #advisors #middleware #llm #chatgpt #claude #rag #vector-database #interceptor-pattern #cross-cutting-concerns
Follow the Series: Spring AI Complete Course — All Lectures


