Spring Boot AI Integration: Complete Guide to Adding LLM Capabilities to Your Application

Adding AI capabilities to your Spring Boot application doesn’t require rebuilding from scratch. In this comprehensive guide, you’ll learn how to enhance an existing Spring Boot application with LLM-powered features using LangChain4j.

What You’ll Build

A customer support assistant that can:

Answer questions about your product using documentation (RAG)
Process natural language commands via tool calling
Maintain conversation context across requests

Step 1: Add Dependencies

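<dependencies>
    <!-- LangChain4j with OpenAI -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
        <version>0.36.2</version>
    </dependency>
    
    <!-- Declarative AI services (@AiService) -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-spring-boot-starter</artifactId>
        <version>0.36.2</version>
    </dependency>
    
    <!-- For RAG with PgVector -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-pgvector</artifactId>
        <version>0.36.2</version>
    </dependency>
</dependencies>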

Step 2: Configure LLM Connection

# application.yml
langchain4j:
  open-ai:
    chat-model:
      api-key: ${OPENAI_API_KEY}  # read from the environment; never commit the key
      model-name: gpt-4o-mini
      temperature: 0.3
      max-tokens: 2000
      timeout: 30s
      log-requests: true
      log-responses: true

For cost optimization, use gpt-4o-mini for most tasks and gpt-4o only for complex reasoning.

Step 3: Create Your First AI Service

@AiService
public interface CustomerSupportAssistant {
    
    @SystemMessage("""
        You are a customer support assistant for our SaaS product.
        Be helpful, concise, and professional.
        If you don't know the answer, say so honestly.
        Always provide specific steps when giving instructions.
        """)
    String chat(@UserMessage String userMessage,
                @MemoryId String sessionId);
}

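The @MemoryId parameter keys conversation memory by session, so each session ID gets its own history. For this to work, the application context also needs a ChatMemoryProvider bean that supplies a separate ChatMemory for each ID.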
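To make the idea of per-session memory concrete, here is a minimal plain-Java sketch of a sliding message window keyed by session ID. It is an illustration only, not how LangChain4j implements memory: the SessionMemory class and its methods are hypothetical names invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of per-session memory; not a LangChain4j class.
class SessionMemory {

    private final int maxMessages;
    private final Map<String, Deque<String>> messagesBySession = new ConcurrentHashMap<>();

    SessionMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message to the session's window, evicting the oldest when full.
    void add(String sessionId, String message) {
        Deque<String> window =
            messagesBySession.computeIfAbsent(sessionId, id -> new ArrayDeque<>());
        synchronized (window) {
            if (window.size() == maxMessages) {
                window.removeFirst();
            }
            window.addLast(message);
        }
    }

    // Returns a snapshot of the messages currently remembered for the session.
    List<String> history(String sessionId) {
        Deque<String> window = messagesBySession.get(sessionId);
        if (window == null) {
            return List.of();
        }
        synchronized (window) {
            return List.copyOf(window);
        }
    }
}
```

Each session only ever sees its own window, which is the behavior the session ID is meant to give you. Capping the window keeps the prompt, and therefore the token bill, from growing without bound.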

Step 4: Add RAG (Knowledge Base)

Configure the embedding model, vector store, and retriever that back the knowledge base:

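import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RagConfig {
    
    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("text-embedding-3-small")
            .build();
    }
    
    @Bean
    public EmbeddingStore<TextSegment> embeddingStore(DataSource dataSource) {
        // datasourceBuilder() reuses the application's DataSource;
        // builder() expects host, port, and credentials instead
        return PgVectorEmbeddingStore.datasourceBuilder()
            .datasource(dataSource)
            .table("document_embeddings")
            .dimension(1536)  // output size of text-embedding-3-small
            .build();
    }
    
    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> store,
            EmbeddingModel model) {
        return EmbeddingStoreContentRetriever.builder()
            .embeddingStore(store)
            .embeddingModel(model)
            .maxResults(5)   // return at most five segments per query
            .minScore(0.7)   // drop segments below this relevance score
            .build();
    }
}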
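What maxResults and minScore do can be sketched in plain Java. This is a conceptual illustration under stated assumptions, not the library's code: RetrievalSketch is a hypothetical class, the vectors are toy values, and the score is raw cosine similarity, which the library's own relevance score may scale differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of top-k retrieval with a score cutoff.
class RetrievalSketch {

    // Cosine similarity of two equal-length, non-zero vectors, in [-1, 1].
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keeps texts scoring at least minScore, best first, capped at maxResults.
    static List<String> retrieve(double[] query, List<String> texts,
                                 List<double[]> vectors,
                                 int maxResults, double minScore) {
        double[] scores = new double[texts.size()];
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            scores[i] = cosine(query, vectors.get(i));
            if (scores[i] >= minScore) {
                hits.add(i);
            }
        }
        // Sort the surviving indices by descending score
        hits.sort((x, y) -> Double.compare(scores[y], scores[x]));

        List<String> results = new ArrayList<>();
        for (int index : hits.subList(0, Math.min(maxResults, hits.size()))) {
            results.add(texts.get(index));
        }
        return results;
    }
}
```

A higher minScore trades recall for precision: fewer irrelevant segments reach the prompt, but borderline matches are lost. It is worth tuning against real questions rather than treating 0.7 as universal.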

Update your AI service to use RAG:

// With the Spring Boot starter, the ContentRetriever bean from RagConfig is
// wired into the AI service automatically, so no extra method parameter is needed.
@AiService
public interface CustomerSupportAssistant {
    
    @SystemMessage("""
        Answer questions based on the provided documentation.
        If the documentation doesn't contain the answer, say so.
        Always cite the source document name in your response.
        """)
    String chat(@UserMessage String question,
                @MemoryId String sessionId);
}

Step 5: Index Your Documents

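import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentByParagraphSplitter;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import java.nio.file.Path;
import java.util.List;
import org.springframework.stereotype.Service;

@Service
public class DocumentIndexer {
    
    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;
    
    public DocumentIndexer(EmbeddingModel embeddingModel,
                           EmbeddingStore<TextSegment> embeddingStore) {
        this.embeddingModel = embeddingModel;
        this.embeddingStore = embeddingStore;
    }
    
    public void indexDocuments(Path docsDir) {
        // Load every file in the directory as a Document
        List<Document> documents = FileSystemDocumentLoader.loadDocuments(docsDir);
        
        // Split into chunks of at most 500 characters with up to 100 of overlap
        DocumentSplitter splitter = new DocumentByParagraphSplitter(500, 100);
        List<TextSegment> segments = splitter.splitAll(documents);
        
        // Embed all segments in one call, then store each embedding with its segment
        List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
        embeddingStore.addAll(embeddings, segments);
    }
}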
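If the size and overlap settings feel abstract, this plain-Java sketch shows the basic mechanics of overlapping chunks. It is a simplification under stated assumptions: ChunkingSketch is a hypothetical class that cuts on fixed character offsets, whereas a paragraph-based splitter also tries to respect paragraph boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fixed-size chunking with overlap.
class ChunkingSketch {

    // Splits text into windows of at most maxSize characters; each window
    // starts (maxSize - overlap) characters after the previous one.
    static List<String> chunk(String text, int maxSize, int overlap) {
        if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("require 0 <= overlap < maxSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

For example, chunk("abcdefghij", 4, 1) yields "abcd", "defg", and "ghij": each chunk repeats the last character of the previous one. The overlap is what lets a sentence that straddles a boundary still appear whole in at least one chunk.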

Step 6: Add Tool Calling

Let your AI assistant perform actions:

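@Component  // registered as a bean so @AiService can discover its @Tool methods
public class SupportTools {
    
    private final OrderService orderService;
    private final TicketService ticketService;
    // StatusService is an assumed application service, like the two above
    private final StatusService statusService;
    
    public SupportTools(OrderService orderService,
                        TicketService ticketService,
                        StatusService statusService) {
        this.orderService = orderService;
        this.ticketService = ticketService;
        this.statusService = statusService;
    }
    
    @Tool("Look up an order by order ID. Returns order status and details.")
    public OrderInfo lookupOrder(@P("orderId") String orderId) {
        return orderService.getOrderInfo(orderId);
    }
    
    @Tool("Create a support ticket with the given description.")
    public String createTicket(@P("subject") String subject,
                               @P("description") String description,
                               @P("priority") String priority) {
        return ticketService.create(subject, description, priority);
    }
    
    @Tool("Check the current system status and any ongoing incidents.")
    public SystemStatus checkSystemStatus() {
        return statusService.getCurrentStatus();
    }
}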

Wire tools into your assistant:

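// SupportTools must be a Spring bean (for example, annotated with @Component).
// In the default automatic wiring mode, @AiService picks up every bean that has
// @Tool methods, so the interface itself takes no tool parameter.
@AiService
public interface CustomerSupportAssistant {
    
    String chat(@UserMessage String question,
                @MemoryId String sessionId);
}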
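It can help to see what tool calling boils down to: the model replies with a tool name and arguments, and the application runs the matching code and feeds the result back. The sketch below shows only that dispatch step in plain Java. ToolDispatchSketch and its methods are hypothetical names for illustration, not part of LangChain4j, which handles this for you through @Tool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of tool dispatch; not LangChain4j's implementation.
class ToolDispatchSketch {

    private final Map<String, Function<Map<String, String>, String>> tools = new HashMap<>();

    // Registers a tool under the name the model will use to request it.
    void register(String name, Function<Map<String, String>, String> tool) {
        tools.put(name, tool);
    }

    // Runs the requested tool and returns its result as text for the model.
    // Unknown names produce a message rather than an exception, so the model
    // can recover in its next turn.
    String execute(String name, Map<String, String> arguments) {
        Function<Map<String, String>, String> tool = tools.get(name);
        if (tool == null) {
            return "Unknown tool: " + name;
        }
        return tool.apply(arguments);
    }
}
```

Because the model chooses which tool to call and with what arguments, treat those arguments as untrusted input: validate order IDs and check that the current user is allowed to see the result before returning it.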

Step 7: REST API

@RestController
@RequestMapping("/api/chat")
public class ChatController {
    
    private final CustomerSupportAssistant assistant;
    
    @PostMapping
    public ChatResponse chat(@RequestBody ChatRequest request) {
        String response = assistant.chat(
            request.getMessage(),
            request.getSessionId()
        );
        return new ChatResponse(response);
    }
    
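    @PostMapping("/stream")
    public Flux<String> chatStream(@RequestBody ChatRequest request) {
        // Not token-by-token streaming: the blocking call runs on a worker
        // thread and the full reply is emitted as a single element. True
        // streaming needs a streaming chat model and a streaming return type.
        return Mono.fromCallable(() -> assistant.chat(
                request.getMessage(),
                request.getSessionId()))
            .subscribeOn(Schedulers.boundedElastic())
            .flux();
    }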
}

Production Best Practices

1. Prompt Versioning

@AiService
public interface CustomerSupportAssistant {
    
    // Use externalized prompts
    @SystemMessage(fromResource = "prompts/support-v2.txt")
    String chat(@UserMessage String question,
                @MemoryId String sessionId);
}

2. Cost Monitoring

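@Aspect
@Component
public class LlmCostMonitor {
    
    private static final Logger log = LoggerFactory.getLogger(LlmCostMonitor.class);
    
    // MetricsService is an assumed application-level metrics facade
    private final MetricsService metricsService;
    
    public LlmCostMonitor(MetricsService metricsService) {
        this.metricsService = metricsService;
    }
    
    // @AiService annotates the interface, not its methods, so an
    // @annotation(...) pointcut never matches; match the methods by type instead.
    @Around("execution(* *..CustomerSupportAssistant.*(..))")
    public Object monitorCost(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.currentTimeMillis();
        Object result = joinPoint.proceed();
        long duration = System.currentTimeMillis() - start;
        
        // Latency is only a rough proxy for cost; token usage drives the bill
        if (duration > 5000) {
            log.warn("Slow LLM call: {}ms", duration);
        }
        
        metricsService.recordLlmCall(duration);
        return result;
    }
}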
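Timing calls tells you about latency, but spend is driven by token counts, which providers typically report with each response. As a rough sketch of turning those counts into a cost figure, assuming you can obtain input and output token counts per call: LlmCostEstimator is a hypothetical class, and the prices are caller-supplied placeholders, not real rates.

```java
// Hypothetical cost estimator; the prices are caller-supplied placeholders,
// not actual provider rates.
class LlmCostEstimator {

    private final double inputPricePerMillion;
    private final double outputPricePerMillion;

    LlmCostEstimator(double inputPricePerMillion, double outputPricePerMillion) {
        this.inputPricePerMillion = inputPricePerMillion;
        this.outputPricePerMillion = outputPricePerMillion;
    }

    // Cost of one call, in the same currency unit as the configured prices.
    double estimate(long inputTokens, long outputTokens) {
        return inputTokens / 1_000_000.0 * inputPricePerMillion
             + outputTokens / 1_000_000.0 * outputPricePerMillion;
    }
}
```

With placeholder prices of 2.0 and 8.0 per million tokens, a call with 500,000 input tokens and 250,000 output tokens costs 0.5 × 2.0 + 0.25 × 8.0 = 3.0. Recording this figure per session or per endpoint makes it much easier to spot which feature is responsible for a cost spike.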

3. Caching

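// Note: CachingChatLanguageModel is a placeholder, not a class that ships with
// LangChain4j. Treat it as a decorator you write yourself: it implements
// ChatLanguageModel, forwards cache misses to the delegate, and expires entries.
@Bean
public ChatLanguageModel cachedModel(ChatLanguageModel delegate) {
    return CachingChatLanguageModel.builder()
        .delegate(delegate)
        .expireAfterWrite(Duration.ofMinutes(10))
        .maximumSize(1000)
        .build();
}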
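The core of such a cache fits in a few lines of plain Java. The following is a minimal sketch under stated assumptions: CachedResponder is a hypothetical class, it wraps any prompt-to-reply function rather than a real model, it matches prompts exactly, and it leaves out the size bound for brevity.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical prompt cache; wraps any prompt -> reply function.
class CachedResponder {

    private static final class Entry {
        final String reply;
        final long expiresAtMillis;

        Entry(String reply, long expiresAtMillis) {
            this.reply = reply;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Function<String, String> delegate;
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    CachedResponder(Function<String, String> delegate, Duration ttl, LongSupplier clock) {
        this.delegate = delegate;
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    // Returns the cached reply while it is fresh; otherwise calls the delegate.
    // Two threads missing at the same moment may both call the delegate.
    String reply(String prompt) {
        long now = clock.getAsLong();
        Entry entry = cache.get(prompt);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry(delegate.apply(prompt), now + ttlMillis);
            cache.put(prompt, entry);
        }
        return entry.reply;
    }
}
```

In production you would pass System::currentTimeMillis as the clock and add a size limit. Keep in mind that caching by prompt alone ignores conversation history, so it suits stateless questions better than multi-turn sessions.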

4. Graceful Degradation

@Service
public class SmartAssistant {
    
    private final CustomerSupportAssistant aiAssistant;
    private final FaqService faqService;
    
    public String respond(String question, String sessionId) {
        try {
            return aiAssistant.chat(question, sessionId);
        } catch (Exception e) {
            log.warn("AI service unavailable, falling back to FAQ", e);
            return faqService.findBestMatch(question)
                .orElse("I'm currently unable to process your request. " +
                       "A human agent will be with you shortly.");
        }
    }
}

Performance Tips

Use smaller models for simple tasks: gpt-4o-mini handles most customer queries fine
Cache frequent answers: Questions like “What are your hours?” shouldn’t hit the API every time
Async processing: Use CompletableFuture for non-interactive AI tasks
Batch embeddings: When indexing documents, batch your embedding API calls
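The last two tips, batching and async processing, combine naturally. Here is a minimal plain-Java sketch, assuming the expensive call accepts a list of inputs and returns one result per input: BatchingSketch and the batchCall parameter are hypothetical names standing in for something like an embedding request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical illustration of batched, asynchronous processing.
class BatchingSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    // Submits each batch asynchronously and gathers results in input order.
    static <T, R> CompletableFuture<List<R>> processAsync(
            List<T> items, int batchSize, Function<List<T>, List<R>> batchCall) {
        List<CompletableFuture<List<R>>> futures = new ArrayList<>();
        for (List<T> batch : partition(items, batchSize)) {
            futures.add(CompletableFuture.supplyAsync(() -> batchCall.apply(batch)));
        }
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .thenApply(done -> {
                List<R> results = new ArrayList<>();
                for (CompletableFuture<List<R>> future : futures) {
                    results.addAll(future.join()); // already complete here
                }
                return results;
            });
    }
}
```

This version runs on the common fork-join pool to stay short. For real API calls, pass a dedicated executor to supplyAsync and limit how many batches are in flight, so you stay within the provider's rate limits.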

What’s Next?

Add evaluation metrics (answer quality, latency)
Implement A/B testing for prompts
Add support for multiple languages
Build an admin dashboard for monitoring

Conclusion

Adding AI to a Spring Boot application is straightforward with LangChain4j. The key is to start simple with a basic chat endpoint, then iteratively add RAG, tools, and production hardening.

The complete working example is available on GitHub.

Questions? Drop them in the comments!