Securing the Future of AI: Authorization for Java RAG Systems using LangChain4j and OpenFGA

Deepu Sasidharan
In this post, we explore how to build a robust Java-based RAG system by integrating LangChain4j with OpenFGA for fine-grained, relationship-based access control. Learn how to tackle the unique security challenges of RAG applications—from dynamic context and complex document relationships to real-time authorization checks—and follow step-by-step examples that show you how to implement a secure system.

Artificial Intelligence is evolving at a breathtaking speed. AI agents will soon become the standard for building applications and they are going to be the new Web 2.0. With all that comes the critical need for robust security measures and checks. Retrieval-Augmented Generation (RAG) systems have become a cornerstone in building AI agents and applications. Agents that can access and utilize domain-specific knowledge are more accurate and more useful. However, as these systems become more prevalent, ensuring they only access and expose authorized information has become a critical challenge.

Sensitive Information Disclosure is a common issue that plagues RAG-based systems. We don’t want the Large Language Model (LLM) to accidentally access or expose sensitive data from a database. Traditional Role-Based Access Control (RBAC) systems are not enough to secure RAG applications and agents. This is where Fine-Grained Authorization (FGA) or Relationship-Based Access Control (ReBAC) shines as an Authorization solution for RAG.

The Security Challenge in RAG Systems

RAG systems present some unique security challenges that traditional Role-Based Access Control (RBAC) systems struggle to address:

  1. Dynamic Context: RAG systems need to make real-time decisions about which documents and tools can be used for context enhancement.
  2. Complex Relationships: Documents often have intricate relationships with users, teams, and projects, forming a complex graph that simple role-based systems can’t model effectively.
  3. Granular Control: Different parts of documents may have different sensitivity levels requiring fine-grained control.
  4. Performance Requirements: Authorization checks must be fast to maintain the AI system’s responsiveness.

This is where FGA shines. It allows us to:

  • Model complex relationships between users, groups, and documents
  • Handle hierarchical and transitive permissions naturally
  • Scale authorization rules efficiently
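
To make "transitive permissions" concrete, here is a minimal in-memory sketch of a relationship check in which a user gets access to a document through group membership. It is only an illustration of the idea, not how OpenFGA is implemented: the class `RelationshipGraphSketch` and its methods are hypothetical, and only the `group:finance#member` userset notation is borrowed from OpenFGA.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative in-memory relationship store (hypothetical, not OpenFGA's engine). */
class RelationshipGraphSketch {
    /** A relationship tuple: subject has relation on object. */
    record Tuple(String subject, String relation, String object) {}

    private final Set<Tuple> tuples = new HashSet<>();

    void write(String subject, String relation, String object) {
        tuples.add(new Tuple(subject, relation, object));
    }

    /** True if the user holds the relation directly or through a group. */
    boolean check(String user, String relation, String object) {
        return check(user, relation, object, new HashSet<>());
    }

    private boolean check(String user, String relation, String object, Set<String> visited) {
        // Guard against cycles such as groups that contain each other.
        if (!visited.add(relation + "@" + object)) {
            return false;
        }
        for (Tuple t : tuples) {
            if (!t.relation().equals(relation) || !t.object().equals(object)) {
                continue;
            }
            if (t.subject().equals(user)) {
                return true; // direct relationship
            }
            // A subject like "group:finance#member" grants the relation
            // to everyone who is a member of group:finance.
            int hash = t.subject().indexOf('#');
            if (hash > 0) {
                String groupObject = t.subject().substring(0, hash);
                String groupRelation = t.subject().substring(hash + 1);
                if (check(user, groupRelation, groupObject, visited)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With the tuples `user:anne` is `member` of `group:finance` and `group:finance#member` is `viewer` of `doc:forecast`, a check for `user:anne` as `viewer` of `doc:forecast` succeeds even though no tuple links Anne to the document directly.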

Let’s look at some real-world scenarios where sensitive data could be at risk:

  • A knowledge base assistant exposing confidential documents to employees who are not authorized to see them
  • A support AI revealing sensitive user information
  • An AI assistant leaking proprietary information about a company
  • A chatbot exposing private data to unauthorized users

Why OpenFGA?

OpenFGA is an open-source Fine-Grained Authorization system, created by Auth0, that lets you define complex access control policies and enforce them in your applications. It’s currently a CNCF Sandbox Project. It is a good fit for securing RAG applications because it addresses most of the security challenges in RAG systems, including the ones mentioned above. You can also use a hosted version of OpenFGA from Auth0 if you don’t want to self-host.

If you’re new to RAG and access control, I recommend starting with the introductory post “RAG and Access Control: Where Do You Start?”.

Why Java and LangChain4j?

Java isn’t far behind when it comes to building AI applications. It has a large developer community and a rich ecosystem of open-source frameworks, and enterprises are adopting these frameworks.

LangChain4j is one of the popular Java AI frameworks. It is a powerful open-source framework that provides capabilities similar to LangChain and LlamaIndex used in the Python/JS ecosystem. It is becoming the Spring Boot of the Java AI ecosystem. The framework offers a comprehensive set of features for building AI applications. Some of the key features include:

  • Easy integration with major commercial and open-source LLMs
  • Unified API for different model providers
  • Tools like prompt templating, chat memory management, and function calling
  • High-level patterns, abstractions, and implementations for building Agents and RAG applications

Building a Secure RAG System

LangChain4j makes it simple to get started with its “Easy RAG” feature while providing advanced capabilities for complex use cases. In this guide, we will build a secure Java RAG system using LangChain4j and OpenFGA. We’ll focus on Relationship-Based Access Control (ReBAC), which prevents unauthorized access to sensitive information while maintaining the flexibility and power of RAG systems.

Prerequisites

This guide was created with the following tools and services:

Setting up the project

To get started, clone the auth0-ai-samples repository from GitHub:

git clone https://github.com/oktadev/auth0-ai-samples.git
cd auth0-ai-samples/authorization-for-rag/langchain4j-java

The application uses Gradle as the build system and is structured as follows:

  • src/main/java/rag/RagApplication.java: The main entry point of the application that defines the RAG pipeline.
  • src/main/java/rag/FGARetriever.java: The FGA-aware retriever that retrieves documents from the vector store based on the user’s permissions.
  • src/main/java/rag/FGAInit.java: Script to create the FGA store and models.
  • src/main/resources/docs/*.md: Sample markdown documents that will be used as context for the LLM. We have public and private documents for demonstration purposes.
  • build.gradle: The Gradle build file that defines the dependencies and tasks.

Let us look at the important bits and pieces before we run the application.

RAG pipeline

The RAG pipeline configures the underlying LLM and defines the retrievers for data. The pipeline is defined as follows:

/** src/main/java/rag/RagApplication.java **/
final ChatLanguageModel CHAT_MODEL_OPENAI = OpenAiChatModel
                .builder()
                .apiKey(System.getProperty("OPENAI_API_KEY"))
                .modelName(GPT_4_O_MINI)
                .build();

var user = "user2";
// 1. Read and load documents from the docs folder
var documents = FileSystemDocumentLoader
                  .loadDocuments("src/main/resources/docs");
// 2. Create an in-memory vector store from the documents.
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
// 3. Create a base retriever
ContentRetriever baseRetriever = EmbeddingStoreContentRetriever.from(embeddingStore);
// 4. Create the FGA retriever that wraps the base retriever
ContentRetriever fgaRetriever = FGARetriever.create(
        baseRetriever,
        // FGA tuple to query for the user's permissions
        content -> new ClientBatchCheckItem()
                .user("user:" + user)
                .relation("viewer")
                ._object(
                  "doc:" + 
                  content.textSegment().metadata()
                  .getString("file_name").split("\\.")[0]
                )
        );
// 5. Create an assistant with the FGA retriever
var assistant = AiServices.builder(Assistant.class)
        .chatLanguageModel(CHAT_MODEL_OPENAI)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(fgaRetriever)
        .build();
// 6. Query the retrieval chain with a prompt
var query = "Show me forecast for ZEKO?";
var answer = assistant.chat(query);
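
The lambda in step 4 derives the FGA object ID from the document's file name, so `private-doc.md` is checked as `doc:private-doc`. The snippet below is a small sketch of that mapping pulled into a helper. `FgaObjectIds` and `docObject` are hypothetical names, not part of the sample repository, and this variant strips only the last extension, whereas the pipeline's `split("\\.")[0]` keeps everything before the first dot.

```java
/** Hypothetical helper that maps a file name to an FGA object ID. */
class FgaObjectIds {
    /** Builds "doc:<name>" by dropping only the last extension of the file name. */
    static String docObject(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot > 0 keeps names like ".hidden" or "README" unchanged.
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        return "doc:" + base;
    }
}
```

Both approaches give the same result for the sample documents. They differ only for names that contain more than one dot, so whichever rule you pick must match the object IDs used in your tuples.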

If you want to use Ollama running locally, uncomment the CHAT_MODEL_OLLAMA variable and comment out the CHAT_MODEL_OPENAI line. Be sure to update the base URL and model name to match your local Ollama instance.

Here is a visual representation of the RAG architecture

FGA retriever

The FGARetriever class implements ContentRetriever and filters documents based on the authorization model defined in the FGA store. This retriever is ideal for scenarios where you already have documents in a vector store and want to filter the vector store results based on the user’s permissions. Assuming the vector store already narrows down the documents to a few, the FGA retriever will further narrow down the documents to only the ones to which the user has access.

Here is the important part of the FGARetriever class simplified for readability:

/** src/main/java/rag/FGARetriever.java **/
// Create a client to interact with the FGA store
var fgaClient = new OpenFgaClient(new ClientConfiguration()
        .apiUrl(/*...*/)
        .storeId(/*...*/)
        // credentials only required if using Auth0 FGA,
        // comment out for local OpenFGA
        .credentials(/*...*/)
);

// Check permissions for the user and object
private Map<String, Boolean> checkPermissions(List<ClientBatchCheckItem> checks) {
    var options = //...
    var request = //...
    var response = fgaClient.batchCheck(request, options).get();

    Map<String, Boolean> permissionMap = new HashMap<>();
    for (var result : response.getResult()) {
        var checkKey = getCheckKey(result.getRequest());
        permissionMap.put(checkKey, result.isAllowed());
    }
    return permissionMap;
}

// retrieve the documents and filter them using checkPermissions function
public List<Content> retrieve(Query query) {
    // First, get relevant documents from the base retriever
    List<Content> relevantContent = baseRetriever.retrieve(query);

    // Create data structures to track checks and document mappings
    // var checks, documentToObject, seenChecks = ...

    // Process each document to build checks
    for (Content doc : relevantContent) {
        ClientBatchCheckItem check = buildQuery.apply(doc);
        var checkKey = getCheckKey(check);
        documentToObject.put(doc, checkKey);

        // Skip duplicate checks for same user, object, and relation
        if (!seenChecks.contains(checkKey)) {
            /*...*/
        }
    }

    var permissionsMap = checkPermissions(checks);
    // filter based on permission
    return relevantContent.stream()
            // Fail closed: a document with no check result is treated as denied
            .filter(doc -> permissionsMap.getOrDefault(documentToObject.get(doc), false))
            .collect(Collectors.toList());
}
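
Because the listing above is abbreviated, here is a self-contained sketch of the same post-filtering idea using only the JDK. It is not the sample's actual code: `PostFilterSketch`, `BatchChecker`, and `filterAllowed` are hypothetical names, and `BatchChecker` merely stands in for a batch permission call such as an OpenFGA batch check.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class PostFilterSketch {
    /** Hypothetical stand-in for a batch permission check. */
    interface BatchChecker {
        Map<String, Boolean> check(Set<String> objectIds);
    }

    /** Keeps only the retrieved items whose object ID the checker explicitly allows. */
    static <T> List<T> filterAllowed(List<T> retrieved,
                                     Function<T, String> toObjectId,
                                     BatchChecker checker) {
        // Deduplicate first: several chunks often come from the same document,
        // so one check per distinct object is enough.
        Set<String> objectIds = new LinkedHashSet<>();
        for (T item : retrieved) {
            objectIds.add(toObjectId.apply(item));
        }

        Map<String, Boolean> allowed = checker.check(objectIds);

        List<T> result = new ArrayList<>();
        for (T item : retrieved) {
            // Fail closed: a missing or null answer counts as "denied".
            if (Boolean.TRUE.equals(allowed.get(toObjectId.apply(item)))) {
                result.add(item);
            }
        }
        return result;
    }
}
```

The two choices worth noting are the deduplication, which keeps the number of checks proportional to the number of documents rather than chunks, and the fail-closed filter, which drops any item the authorization service did not explicitly allow.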

Set up an FGA instance

Run OpenFGA locally with Docker using the following command:

docker pull openfga/openfga && \
docker run -p 8080:8080 -p 8081:8081 -p 3000:3000 openfga/openfga run

If you are using Auth0 FGA, visit the dashboard, navigate to Settings, and in the Authorized Clients section, click + Create Client. Give your client a name, mark all three client permissions, and then click Create. Once your client is created, you’ll see a modal containing Store ID, Client ID, and Client Secret. Click Continue to see the FGA_API_URL and FGA_API_AUDIENCE.

Add a .env file with the following content to the root of the project:

# OpenAI key. Not required if using Ollama
OPENAI_API_KEY=<your-openai-api-key>

# OpenFGA
FGA_STORE_ID=<your-fga-store-id>
# Required only for Auth0 FGA
FGA_CLIENT_ID=<your-fga-store-client-id>
FGA_CLIENT_SECRET=<your-fga-store-client-secret>
FGA_API_URL=https://api.xxx.fga.dev
FGA_API_AUDIENCE=https://api.xxx.fga.dev/

See OpenAI’s documentation for instructions on finding your OpenAI API key.

Create the FGA models and tuples

We will use a simple model that defines just an owner and viewer relation for docs:

model
  schema 1.1

type user

type doc
  relations
    define owner: [user]
    define viewer: [user, user:*]

Check out the OpenFGA documentation to learn more about creating an authorization model in FGA.

To give every user access to the public information, we need the following tuple in FGA:

  • User: user:*
  • Object: select doc and add public-doc in the ID field
  • Relation: viewer

A tuple signifies a user’s relation to a given object. For example, the above tuple implies that all users can view the public-doc object.

To give user1 access to the private information, we need the following tuple in FGA:

  • User: user:user1
  • Object: select doc and add private-doc in the ID field
  • Relation: viewer
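
To see what these two tuples mean for a check, the following sketch evaluates the viewer relation in plain Java. It is only a mental model under simplifying assumptions: real checks are answered by the FGA server, and `WildcardTupleSketch` and `isViewer` are names invented for this illustration.

```java
import java.util.Set;

class WildcardTupleSketch {
    /** A relationship tuple as described in this section. */
    record Tuple(String user, String relation, String object) {}

    // The two tuples created in this section.
    static final Set<Tuple> TUPLES = Set.of(
            new Tuple("user:*", "viewer", "doc:public-doc"),
            new Tuple("user:user1", "viewer", "doc:private-doc"));

    /** True if a tuple names the user directly or grants viewer to every user via "user:*". */
    static boolean isViewer(String user, String object) {
        return TUPLES.contains(new Tuple(user, "viewer", object))
                || TUPLES.contains(new Tuple("user:*", "viewer", object));
    }
}
```

Under this model, `user:user2` is a viewer of `doc:public-doc` through the wildcard but not of `doc:private-doc`, while `user:user1` can view both.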

For OpenFGA, the store, models, and tuples can be created programmatically using the OpenFGA Java SDK. This is defined in src/main/java/rag/FGAInit.java and can be run using the following command:

gradle runFGAInit

Once done, copy the store ID from the console and update the .env file with the store ID.

If you are using Auth0 FGA, navigate to Model Explorer and update the model with the schema above. Next, navigate to the Tuple Management section, click + Add Tuple, and fill in the details for both tuples described above.

Test the application

Now that you have set up the application and the FGA store, you can run the application using the following command:

./gradlew run

The application starts with the query “Show me forecast for ZEKO?”. This information is in a private document that user2 cannot access, so the FGA retriever filters the private document out of the vector store results and the application prints output similar to the following:

The provided context does not include specific forecasts or projections for Zeko Advanced Systems Inc. ...

If you change the query to something that is available in the public document, the application will be able to retrieve the information.

Now change the user to user1 and run the application again:

./gradlew run

This time, you should see a response containing the forecast information since we have a tuple that defines the viewer relation for user1 to the private-doc object.

Congratulations! You have run a simple RAG application using LangChain4j and secured it using OpenFGA.

Implementation Patterns for Securing RAG Systems

Here are some common patterns for securing RAG systems using FGA:

  1. Pre-filtering Pattern: Check permissions before searching the vector database. This can be done using the OpenFGA list objects request and metadata filtering feature of supported vector databases.
  2. Post-filtering Pattern: Perform a vector database search first and then filter the results by permissions. This can be done using the OpenFGA batch check request and custom retrievers or re-rankers. This is what we just did in the example above.
  3. Hybrid Pattern: Combine pre-filtering and post-filtering patterns to achieve the best of both worlds.
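
As a counterpart to the post-filtering example in this post, here is a minimal sketch of the pre-filtering pattern using only the JDK. It assumes the set of viewable document IDs has already been fetched (for example with a list objects request) and that each chunk carries a precomputed similarity score. `PreFilterSketch`, `Chunk`, and `search` are hypothetical names, and a real vector database would apply the restriction through its own metadata filter instead of a stream.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;

class PreFilterSketch {
    /** A stored chunk, the document it came from, and its similarity to the query. */
    record Chunk(String docId, String text, double score) {}

    /**
     * Restricts the search to documents the user may view, then ranks by score.
     * allowedDocIds stands in for the result of a "list objects" style query.
     */
    static List<Chunk> search(List<Chunk> store, Set<String> allowedDocIds, int maxResults) {
        return store.stream()
                // Permission filter runs before ranking and truncation.
                .filter(c -> allowedDocIds.contains(c.docId()))
                .sorted(Comparator.comparingDouble(Chunk::score).reversed())
                .limit(maxResults)
                .toList();
    }
}
```

The practical difference from post-filtering is that the result limit is applied after the permission filter, so unauthorized chunks never take up slots in the top results. The trade-off is that the list of viewable objects can grow large for users with broad access.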

Learn more about OpenFGA

In this post, you learned how to secure a LangChain4j-based RAG application using OpenFGA. We invite you to check out the OpenFGA code on GitHub. Visit Auth for GenAI to learn more about how Auth0 can help you secure your GenAI applications.
