Spring AI Advisors API in Java: Build Middleware for AI Interactions

I am a developer who loves Java, Spring, Quarkus, Micronaut, Open source, Microservices, Cloud

Spring AI Advisors API Explained

Series: Spring AI Complete Course — Lecture 4 of 12
Reading Time: 8 minutes
Level: Intermediate


Most developers stop at ChatClient.

That is enough for demos. It is not enough for production.

The moment an AI feature becomes real, you need behavior around the model call. You need logging. You need request filtering. You need latency tracking. You may need to attach tenant context, inject retrieval metadata, sanitize prompts, or block unsafe content before it ever reaches the model.

If you place all of that inside controllers and services, the codebase turns into a junk drawer. Business logic gets mixed with prompt plumbing. The same pre-processing and post-processing logic gets duplicated across every AI flow.

This is exactly where the Spring AI Advisors API comes in.

Advisors are Spring AI's middleware layer for LLM calls. If you already understand servlet filters, Spring MVC interceptors, or WebFlux filters, you already understand the core idea. Advisors let you intercept and shape requests before they hit the model, and inspect or transform responses before they return to your application.

In this article, we will break down:

  • CallAdvisor vs StreamAdvisor
  • ChatClientRequest and ChatClientResponse
  • Advisor ordering with getOrder()
  • Building a custom logging advisor
  • Building a content filtering advisor
  • Composing advisors with ChatClient — globally and per-request
  • Common mistakes to avoid

Let's get into it.


Why Advisors Exist

A simple Spring AI application often starts like this:

@RestController
@RequiredArgsConstructor
public class ChatController {

    private final ChatClient chatClient;

    @GetMapping("/ask")
    public String ask(@RequestParam String message) {
        return chatClient.prompt()
                .user(message)
                .call()
                .content();
    }
}

This works beautifully in the early stage.

Then reality arrives.

Now you need to answer questions like:

  • How do I log prompts and responses for debugging?
  • How do I block prohibited content before it reaches the model?
  • How do I inject metadata for observability or multi-tenant routing?
  • How do I attach retrieval context for RAG?
  • How do I measure model latency without scattering timers everywhere?

You could do all of this manually in service methods.

You shouldn't.

The better pattern is to centralize these cross-cutting concerns in advisors so your application code stays focused on business logic.


Core Types in the Advisors API

Spring AI defines different advisor types for synchronous and streaming scenarios.

Interface       Use Case
CallAdvisor     Blocking (synchronous) calls
StreamAdvisor   Streaming (reactive) calls

CallAdvisor

Use this for normal, non-streaming interactions.

A CallAdvisor surrounds a standard ChatClient call, letting you inspect or modify the request, call the next step in the chain, then inspect or modify the response on the way back.

StreamAdvisor

Use this for streaming scenarios.

When your UI is receiving tokens progressively, the execution model changes. StreamAdvisor is designed for that case, but the mental model stays the same: intercept, enrich, observe, or block the flow.

ChatClientRequest

This object represents the request moving through the advisor chain. It carries:

  • The prompt (user message, system instructions)
  • Model options (temperature, max tokens, etc.)
  • Metadata and shared advisor context

Advisors can inspect it, enrich it, or pass it along unchanged.

ChatClientResponse

This object represents the final model response as it flows back through the advisor chain. It carries:

  • The generated response content
  • Token usage statistics
  • Response metadata and shared advisor context

The shared context is important. It lets one advisor place data into the chain and another advisor read it later — across the full lifecycle of a single call.
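
To make the shared-context idea concrete, here is a minimal sketch in plain Java. The Request and Response records, the two step methods, and the "tenantId" key are hypothetical stand-ins invented for this illustration; they are not Spring AI types. The sketch only assumes what the section above describes: a context map that travels with the request and comes back with the response.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in types for illustration only; these are NOT Spring AI classes.
class SharedContextSketch {

    record Request(String text, Map<String, Object> context) {}
    record Response(String text, Map<String, Object> context) {}

    /** Adds a tenant id to the shared context on the way in. */
    static Response tenantStep(Request request, Function<Request, Response> next) {
        Map<String, Object> context = new HashMap<>(request.context());
        context.put("tenantId", "acme"); // hypothetical key and value
        return next.apply(new Request(request.text(), Map.copyOf(context)));
    }

    /** Reads the tenant id on the way back and tags the response text with it. */
    static Response auditStep(Request request, Function<Request, Response> next) {
        Response response = next.apply(request);
        Object tenant = response.context().getOrDefault("tenantId", "unknown");
        return new Response("[" + tenant + "] " + response.text(), response.context());
    }

    /** Stand-in for the model call; it carries the request context over to the response. */
    static Response model(Request request) {
        return new Response("echo: " + request.text(), request.context());
    }

    static String demo() {
        Request request = new Request("hi", Map.of());
        Response response =
                tenantStep(request, enriched -> auditStep(enriched, SharedContextSketch::model));
        return response.text();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[acme] echo: hi"
    }
}
```

The step that writes the value and the step that reads it never reference each other. The context map is the only coupling between them.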


Think of Advisors as a Stack

This is the most important mental model.

When you execute a ChatClient call, Spring AI builds an advisor chain. Each advisor receives the request in order, and each one can:

  • inspect the request
  • mutate the request
  • add context
  • call the next advisor
  • short-circuit the chain and return a response directly
  • inspect or mutate the response
  • throw an exception if processing should fail

Eventually, the final framework-provided advisor sends the request to the LLM.

Then the response comes back through the same chain in reverse order.

Request Path:   Advisor A (order 100) → Advisor B (order 200) → Model
                                                              ↓
Response Path:  Advisor A (order 100) ← Advisor B (order 200) ← Response

That means the first advisor to process the request becomes the last advisor to process the response.

This stack-like behavior is the key to understanding advisor ordering.
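
If the diagram feels abstract, the following self-contained sketch simulates it. Step and Chain are hypothetical stand-ins written for this example, not the Spring AI interfaces, and the "model" is just an echo. The sketch sorts the steps by order, sends a request through them, and records who saw what and when.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stand-in types for illustration only; these are NOT Spring AI classes.
class AdvisorStackSketch {

    /** Minimal stand-in for an advisor: it has an order and wraps the next step. */
    interface Step {
        int order();
        String advise(String request, Chain chain);
    }

    /** Stand-in for the chain: hands the request to whatever comes next. */
    interface Chain {
        String next(String request);
    }

    /** Builds a step that records when it sees the request and the response. */
    static Step tracing(String name, int order, List<String> trace) {
        return new Step() {
            public int order() { return order; }
            public String advise(String request, Chain chain) {
                trace.add(name + " request");
                String response = chain.next(request);
                trace.add(name + " response");
                return response;
            }
        };
    }

    /** Sorts steps by order (lowest first) and runs the request through them. */
    static String run(List<Step> steps, String request, List<String> trace) {
        List<Step> sorted = new ArrayList<>(steps);
        sorted.sort(Comparator.comparingInt(Step::order));
        return invoke(sorted, 0, request, trace);
    }

    private static String invoke(List<Step> sorted, int index, String request, List<String> trace) {
        if (index == sorted.size()) {
            trace.add("model"); // end of the chain: the fake model call
            return "echo: " + request;
        }
        return sorted.get(index).advise(request, next -> invoke(sorted, index + 1, next, trace));
    }

    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        // Registered out of order on purpose; the order value decides the sequence.
        run(List.of(tracing("B", 200, trace), tracing("A", 100, trace)), "hi", trace);
        return trace;
    }

    public static void main(String[] args) {
        // prints [A request, B request, model, B response, A response]
        System.out.println(demo());
    }
}
```

The trace shows the unwind directly: A is first in and last out.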


Advisor Ordering with getOrder()

Spring AI uses the standard Spring ordering model.

Each advisor exposes getOrder(), and lower values execute first.

But here's the catch: lower values execute first on the request path and last on the response path.

That means:

  • Lower order value → earlier for request, later for response
  • Higher order value → later for request, earlier for response

This is why many developers get confused the first time they build more than one advisor.

Practical interpretation

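If your advisor is doing request validation or security checks, give it high precedence (a low order value) so it sees the request early.

If your advisor does final output shaping, it also belongs at a low order value: the first advisor on the request path is the last one to touch the response, so it sees the output after every other advisor has finished with it. If it instead needs the raw model output before other advisors change it, give it a high order value so it sits closest to the model.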

Also note:

  • Ordered.HIGHEST_PRECEDENCE means the smallest possible order value
  • Ordered.LOWEST_PRECEDENCE means the largest possible order value
  • if two advisors share the same order value, execution order is not guaranteed

In production systems, do not leave this vague. Make ordering explicit.
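
One way to make ordering explicit is to keep every order value in a single place. The sketch below is plain Java; the slot names (CONTENT_FILTER, LOGGING, RETRIEVAL) and the offset helper are hypothetical and exist only for this example. The two precedence constants mirror org.springframework.core.Ordered, where HIGHEST_PRECEDENCE is Integer.MIN_VALUE and LOWEST_PRECEDENCE is Integer.MAX_VALUE.

```java
// Assumption: the two constants below mirror org.springframework.core.Ordered,
// where HIGHEST_PRECEDENCE is Integer.MIN_VALUE and LOWEST_PRECEDENCE is Integer.MAX_VALUE.
class AdvisorOrderSketch {

    static final int HIGHEST_PRECEDENCE = Integer.MIN_VALUE;
    static final int LOWEST_PRECEDENCE = Integer.MAX_VALUE;

    // Hypothetical project-wide slots, kept together so the pipeline is readable at a glance.
    static final int CONTENT_FILTER = HIGHEST_PRECEDENCE;       // sees the request first
    static final int LOGGING = offset(HIGHEST_PRECEDENCE, 10);  // runs right after the filter
    static final int RETRIEVAL = 0;                             // somewhere in the middle

    /** Adds an offset to a base order and fails fast instead of silently wrapping around. */
    static int offset(int base, int delta) {
        return Math.addExact(base, delta);
    }

    public static void main(String[] args) {
        System.out.println(CONTENT_FILTER < LOGGING && LOGGING < RETRIEVAL); // prints true
        // Plain int arithmetic wraps: "one before the highest precedence" becomes the lowest.
        System.out.println(HIGHEST_PRECEDENCE - 1 == LOWEST_PRECEDENCE);     // prints true
    }
}
```

The wrap-around case is worth remembering. Subtracting from HIGHEST_PRECEDENCE or adding to LOWEST_PRECEDENCE overflows and quietly moves an advisor to the opposite end of the chain.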


Build a Custom Logging Advisor

Let's create a simple advisor that logs the incoming prompt and the outgoing model response.

package com.example.ai.advisors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.core.Ordered;
import org.springframework.stereotype.Component;

@Component
public class LoggingAdvisor implements CallAdvisor {

    private static final Logger log = LoggerFactory.getLogger(LoggingAdvisor.class);

    @Override
    public String getName() {
        return "logging-advisor";
    }

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE + 10;
    }

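    @Override
    public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
        // Request side: runs before the rest of the chain (and the model) sees the prompt.
        log.info("Prompt user text: {}", request.prompt().getUserMessage().getText());

        // Hand off to the next advisor; the last element in the chain calls the model.
        ChatClientResponse response = chain.nextCall(request);

        // Response side: chatResponse() and getResult() may be null, so guard before logging.
        var chatResponse = response.chatResponse();
        if (chatResponse != null && chatResponse.getResult() != null) {
            log.info("Model response: {}", chatResponse.getResult().getOutput().getText());
        }
        return response;
    }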
}

What this advisor is doing

  1. It reads the user prompt from ChatClientRequest
  2. It forwards the request to the next element in the chain using chain.nextCall(request)
  3. It receives the ChatClientResponse
  4. It logs the final output text before returning the response

This is the canonical advisor structure. It looks very similar to a servlet filter or a Spring AOP around advice because conceptually, that is exactly what it is.
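
The same around-style shape covers latency tracking, one of the concerns listed earlier. The sketch below uses plain Java with hypothetical stand-in types (TimingStep and a UnaryOperator in place of the chain); it is not the Spring AI API. The clock is injected so the example is deterministic. In real code you would likely pass System::nanoTime and a metrics recorder.

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;
import java.util.function.UnaryOperator;

// Stand-in types for illustration only; these are NOT Spring AI classes.
class TimingSketch {

    /** Around-style step: read the clock, delegate to the next step, then report. */
    static final class TimingStep {
        private final LongSupplier nanoClock; // System::nanoTime in real code
        private final LongConsumer onElapsed; // e.g. a metrics recorder

        TimingStep(LongSupplier nanoClock, LongConsumer onElapsed) {
            this.nanoClock = nanoClock;
            this.onElapsed = onElapsed;
        }

        String advise(String request, UnaryOperator<String> next) {
            long start = nanoClock.getAsLong();
            try {
                return next.apply(request);
            } finally {
                // Runs even when the downstream call throws, so failures are timed too.
                onElapsed.accept(nanoClock.getAsLong() - start);
            }
        }
    }

    static long demo() {
        long[] ticks = {1_000L, 4_500L}; // fake clock readings: start, then end
        int[] cursor = {0};
        long[] recorded = {-1L};
        TimingStep step = new TimingStep(() -> ticks[cursor[0]++], nanos -> recorded[0] = nanos);
        step.advise("hi", request -> "echo: " + request);
        return recorded[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3500
    }
}
```

Reporting through a callback instead of storing the duration in a field keeps the step stateless. That matters when a single advisor instance serves concurrent requests.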


Build a Content Filtering Advisor

Now let's create something more interesting: a policy guardrail.

This advisor checks the user prompt for prohibited phrases and blocks execution if the content is unsafe.

package com.example.ai.advisors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.core.Ordered;
import org.springframework.stereotype.Component;

import java.util.Set;

@Component
public class ContentFilterAdvisor implements CallAdvisor {

    private static final Logger log = LoggerFactory.getLogger(ContentFilterAdvisor.class);

    private static final Set<String> BLOCKED_TERMS = Set.of(
            "password dump",
            "credit card scrape"
    );

    @Override
    public String getName() {
        return "content-filter-advisor";
    }

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE;
    }

    @Override
    public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
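        // Lower-case with a fixed locale so matching does not depend on the server's default locale.
        String userText = request.prompt().getUserMessage().getText().toLowerCase(java.util.Locale.ROOT);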

        boolean blocked = BLOCKED_TERMS.stream().anyMatch(userText::contains);
        if (blocked) {
            log.warn("Blocked request containing prohibited content");
            throw new IllegalArgumentException("Prompt blocked by content policy");
        }

        return chain.nextCall(request);
    }
}

Why this matters

This keeps policy enforcement out of your controller and service layer.

More importantly, it gives you a reusable place to apply consistent rules across all AI endpoints.

Today it's a simple keyword blocklist. Tomorrow it could be:

  • PII detection
  • regex-based sanitization
  • tenant-specific policy controls
  • prompt risk scoring
  • role-based access enforcement

That is the real value of advisors. They create clean extension points for production concerns.
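
Before growing the blocklist into any of those, it helps to pull the matching rule into a small pure function that can be unit tested without a model or a Spring context. The sketch below is one possible shape. PromptPolicySketch and its methods are hypothetical names, and the normalization rules (fixed-locale lower-casing, collapsed whitespace) are assumptions about what a reasonable keyword filter needs, not something Spring AI provides.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; the advisor would call isBlocked(...) on the user text.
class PromptPolicySketch {

    private static final Set<String> BLOCKED_TERMS = Set.of(
            "password dump",
            "credit card scrape"
    );

    /** Lower-cases with a fixed locale and collapses runs of whitespace to single spaces. */
    static String normalize(String text) {
        if (text == null) {
            return "";
        }
        return text.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ").trim();
    }

    /** Returns true when the normalized prompt contains any blocked term. */
    static boolean isBlocked(String prompt) {
        String normalized = normalize(prompt);
        return BLOCKED_TERMS.stream().anyMatch(normalized::contains);
    }

    public static void main(String[] args) {
        System.out.println(isBlocked("Give me a PASSWORD   dump"));   // prints true
        System.out.println(isBlocked("What is a password manager?")); // prints false
    }
}
```

Without the whitespace collapse, a prompt with two spaces between "password" and "dump" would slip past a plain contains check.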


Composing Advisors with ChatClient

Global wiring via the builder

Once the advisors exist, attach them to your ChatClient so every call gets the treatment automatically.

package com.example.ai.config;

import com.example.ai.advisors.ContentFilterAdvisor;
import com.example.ai.advisors.LoggingAdvisor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatConfig {

    @Bean
    public ChatClient chatClient(ChatModel chatModel,
                                 LoggingAdvisor loggingAdvisor,
                                 ContentFilterAdvisor contentFilterAdvisor) {
        return ChatClient.builder(chatModel)
                .defaultAdvisors(contentFilterAdvisor, loggingAdvisor)
                .build();
    }
}

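Because ContentFilterAdvisor returns Ordered.HIGHEST_PRECEDENCE and LoggingAdvisor returns Ordered.HIGHEST_PRECEDENCE + 10, the filter runs first on the request path. The order values decide this, not the position of the arguments passed to defaultAdvisors(). The chain works like this:

  1. content filter examines the request
  2. logging advisor logs the prompt
  3. framework sends the request to the model
  4. response comes back
  5. logging advisor logs the response
  6. content filter's adviseCall returns last, completing the unwind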

This is why ordering is not just a number. It defines the behavior of your full middleware pipeline.

Per-request wiring

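You can also attach advisors to individual calls when you need selective behavior. This example assumes the injected ChatClient was built without these two advisors as defaults; if they are already registered globally, adding them again per request may apply each of them twice.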

@Service
public class AiService {

    private final ChatClient chatClient;
    private final LoggingAdvisor loggingAdvisor;
    private final ContentFilterAdvisor filterAdvisor;

    public AiService(ChatClient chatClient,
                     LoggingAdvisor loggingAdvisor,
                     ContentFilterAdvisor filterAdvisor) {
        this.chatClient = chatClient;
        this.loggingAdvisor = loggingAdvisor;
        this.filterAdvisor = filterAdvisor;
    }

    public String chatWithGuardrails(String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .advisors(loggingAdvisor, filterAdvisor)
                .call()
                .content();
    }
}

When to Use StreamAdvisor

If you are using synchronous calls with .call(), CallAdvisor is the right abstraction.

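If you are returning streamed responses with .stream(), use StreamAdvisor. The pattern is nearly identical, but the chain returns a Flux<ChatClientResponse> instead of a single response, so response-side logic goes into reactive operators such as doOnNext.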

package com.example.ai.advisors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.advisor.api.StreamAdvisor;
import org.springframework.ai.chat.client.advisor.api.StreamAdvisorChain;
import org.springframework.stereotype.Component;

import reactor.core.publisher.Flux;

@Component
public class StreamingLoggingAdvisor implements StreamAdvisor {

    private static final Logger log = LoggerFactory.getLogger(StreamingLoggingAdvisor.class);

    @Override
    public Flux<ChatClientResponse> adviseStream(ChatClientRequest request, StreamAdvisorChain chain) {
        log.info("Streaming prompt: {}", request.prompt().getUserMessage().getText());

        return chain.nextStream(request)
                .doOnNext(response -> log.debug("Stream chunk received"));
    }

    @Override
    public int getOrder() {
        return 100;
    }

    @Override
    public String getName() {
        return "streaming-logging-advisor";
    }
}

Streaming advisors are especially useful in scenarios like:

  • live chat interfaces
  • SSE endpoints
  • token-by-token observability
  • real-time content moderation
  • progressive UI rendering

Real Production Use Cases for Advisors

Once you understand the pattern, a lot of Spring AI features make more sense.

Advisors are the right place for:

  • Prompt logging for debugging and traceability
  • Guardrails for content filtering and safety policies
  • RAG context injection before the model call
  • Caching — check a cache in your advisor, return early if the result exists
  • Rate limiting — throttle requests based on user tiers or global limits
  • Observability hooks for metrics and latency tracking
  • Tenant metadata propagation in multi-tenant apps
  • Response post-processing for cleanup or formatting

This is why I describe Advisors as middleware for AI. They sit around the model call and handle the non-business concerns that every serious AI application eventually needs.
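
The caching bullet is a good example of short-circuiting, so here is a minimal sketch of it. As with the other sketches, CachingStep and the UnaryOperator standing in for the chain are hypothetical and are not Spring AI types. Keying the cache on the raw request text is an assumption made for brevity; a real cache key would probably include the model options and tenant.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Stand-in types for illustration only; these are NOT Spring AI classes.
class CachingSketch {

    /** Returns a cached answer when one exists; otherwise delegates and stores the result. */
    static final class CachingStep {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        String advise(String request, UnaryOperator<String> next) {
            String cached = cache.get(request);
            if (cached != null) {
                return cached; // short-circuit: the rest of the chain never runs
            }
            String response = next.apply(request);
            cache.put(request, response);
            return response;
        }
    }

    /** Sends the same request twice and reports how often the fake model was called. */
    static int demo() {
        AtomicInteger modelCalls = new AtomicInteger();
        UnaryOperator<String> model = request -> {
            modelCalls.incrementAndGet();
            return "echo: " + request;
        };
        CachingStep step = new CachingStep();
        step.advise("hi", model);
        step.advise("hi", model);
        return modelCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1
    }
}
```

Where this step sits in the chain matters. Advisors that run before it on the request path still execute on a cache hit, while advisors after it are skipped entirely.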


Common Mistakes to Avoid

1. Putting advisor logic in controllers

If your controllers are doing policy checks, prompt shaping, and response logging, the design is already drifting.

2. Ignoring order semantics

If you have multiple advisors and no explicit ordering strategy, bugs will appear in weird places. Make ordering explicit — always.

3. Using one advisor for unrelated concerns

Keep advisors focused. Logging, policy enforcement, retrieval context, and response shaping should usually be separate units.

4. Forgetting streaming support

If your app supports both standard and streaming interactions, make sure you are using the correct advisor type for each path.


Key Takeaways

  • CallAdvisor is for normal request-response model calls; StreamAdvisor is for streaming
  • ChatClientRequest and ChatClientResponse carry data — and shared context — through the chain
  • Advisors can inspect, enrich, mutate, block, or observe the call
  • getOrder() determines execution order, but the response path unwinds in reverse
  • Attach advisors globally via the ChatClient builder or selectively per request
  • Custom advisors give you a clean way to add logging, filtering, RAG, observability, and guardrails without polluting your business logic

If ChatClient is how you talk to the model, Advisors are how you control the conversation pipeline.

That distinction is what takes you from toy AI integrations to production-ready AI systems.


Final Thought

A lot of AI tutorials focus on prompting.

Very few focus on architecture.

But production systems are not defined by how cleverly you ask the model a question. They are defined by how cleanly you manage everything around that question.

That is exactly why the Advisors API matters.

If you're building serious Spring AI applications, this is one of the features worth mastering early.

In the next lecture, we move into Tool Calling, where the model stops being just a text engine and starts invoking real Java methods.

That is where the fun begins.

I am Kumar Pallav, a passionate programmer. I love Java, open source, and microservices. I create simple, short tutorials. Follow me at https://twitter.com/kumar_pallav