Bridging Java and Python for AI/ML in Production: The Case for GraalPy on GraalVM

Java with a dash of Python

Abstract

In the Java stack, tapping into Python’s powerhouse of NLP and AI/ML libraries has often meant messy inter-process plumbing, until GraalPy changed the game. With GraalVM’s Python runtime embedded directly in your JVM, you can import and run Python libraries like TextBlob for sentiment analysis straight from Java. This approach offers a viable alternative to more complex solutions such as HTTP, gRPC, or subprocesses.

In this article, we’ll explore a proof-of-concept project built with Spring Boot that bundles Python sentiment scoring into a Java API. We’ll show you how the setup is quick, the code stays clean, and the performance is surprisingly good, all while keeping the entire process in-memory.

Source project: https://github.com/vshanbha/graalpy-sentiment


Introduction

I am a career Java programmer, and until AI/ML entered everyday conversations, I was happy using Java for almost all development. I have even used Java (through GWT) to create web-based user interfaces that are ultimately deployed as HTML + CSS + JavaScript. That was before the arrival of Angular and React. We could, of course, have used plain HTML + JavaScript, but in those days, web development with compiled code wasn’t really available any other way. Java + GWT offered an elegant (albeit slightly slower) solution for a Java-first approach.

I digress, though, just to make a point. The point being that I like to create software so that full-stack developers do not have to learn too many different languages just to maintain Production code. That said, at present, frontend technologies have changed a lot, and GWT is now legacy. Full-stack programmers have to learn at least HTML + CSS + TypeScript (or JavaScript). That’s just bread and butter now, and at least up until recently, it was sufficient.

The AI Era

For much of my career, I wasn’t doing any projects that involved AI integration. That changed in the last 5-6 years, and more and more projects were using AI/ML for something or other. Back then, I worked with AI/ML specialists who only spoke one language. Words like Python, Anaconda, and Pandas suddenly meant that I wasn’t watching Animal Planet or National Geographic, but that I was talking to my AI/ML colleagues.

At one point, I remember doing a Google search involving the words Machine Learning and Java, but Google decided that I should learn Machine Learning using Python and gave me Getting Started results accordingly. It was like Java didn’t exist. While that isn’t true anymore, every now and then one comes across a problem involving AI/ML/NLP where the solution lies in using one or another Python library.

Suddenly, I came to the realisation that as a full-stack or backend engineer, being a Java pro wasn’t enough. You also need to delve a little into Python because of your AI/ML specialist colleagues, who, by the way, don’t speak Java.

That leads me to this article. The friction between enterprise-grade JVM-based production environments and Python’s vast ML/NLP ecosystem is a common challenge for many developers. Historically, integrating the two meant complex and often slow inter-process communication (IPC).

In this article, we demonstrate an alternative: seamlessly embedding Python code directly within a Java + Spring Boot API using GraalPy. We’ll also set up the project as a Spring Boot microservice so that it can easily run in any containerised deployment.


Why Combine Java and Python?

Java is an undisputed champion for building large-scale, resilient, and high-performance enterprise applications, thanks to its robust type system, mature tooling, and strong community. Python, on the other hand, excels in scientific computing, data analysis, and machine learning due to its simple syntax and a rich ecosystem of libraries.

Combining these two languages allows you to leverage the best of both worlds. Imagine a Java microservice that needs to perform quick, on-the-fly sentiment analysis on user feedback in real-time. Rather than setting up a separate Python service and dealing with HTTP, gRPC or another form of interprocess connection, you can now use a library like TextBlob directly from your Java code. Common use cases for this polyglot approach include sentiment scoring, rapid experimentation with machine learning models, and building efficient, self-contained microservices. This is where GraalPy, Oracle’s GraalVM Python runtime, comes in.


Overview of the PoC Project

The proof-of-concept (PoC) project is a simple Java Spring Boot REST API that calls Python sentiment analysis logic. The key is that, thanks to GraalPy, there’s no need for subprocesses or HTTP bridging. The Python code runs purely in-memory, embedded within the same Java Virtual Machine.

Here’s a high-level look at the architecture: a REST controller receives a string of text and passes it to a Java service. This service then uses the GraalPy runtime to invoke a Python function, which uses TextBlob to calculate the sentiment polarity and subjectivity. The results are then returned to the Java service and sent back as a JSON response.

The magic happens with the Polyglot API, which allows Java and Python to communicate as if they were one language. A simple snippet of this “glue code” looks like this:

// Java code to invoke the Python function
try (Context context = GraalPyResources.createContext()) {
    context.eval(Source.newBuilder("python", new File("sentiment_analyzer.py")).build());
    Value sentimentFunction = context.getBindings("python").getMember("analyze_sentiment");
    Value result = sentimentFunction.execute(text);
    // Process the result
}

Developer Setup and Project Structure

Setting up the project is quick.

  1. Clone the repository: git clone https://github.com/vshanbha/graalpy-sentiment
  2. Java + Maven config: The pom.xml file is configured to include the necessary GraalPy dependencies. It pulls in the org.graalvm.polyglot and org.graalvm.python libraries, which are essential for the interop classes.
  3. Python virtual environment setup: The Python libraries (like TextBlob) are installed within a virtual environment, which keeps the project’s dependencies isolated and manageable.
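
For reference, the relevant pom.xml entries typically look something like this. This is a sketch based on the GraalVM embedding documentation, not a copy from the repository; the version number is illustrative and should match your GraalVM/GraalPy release:

```xml
<!-- Polyglot API used for the Java-to-Python interop classes -->
<dependency>
    <groupId>org.graalvm.polyglot</groupId>
    <artifactId>polyglot</artifactId>
    <version>24.2.0</version>
</dependency>
<!-- Pulls in the GraalPy language runtime -->
<dependency>
    <groupId>org.graalvm.polyglot</groupId>
    <artifactId>python</artifactId>
    <version>24.2.0</version>
    <type>pom</type>
</dependency>
```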

The project structure is a standard Spring Boot layout, with a key addition: the Python source files reside within the project’s resource directory. The heart of the application is a Spring Boot REST controller that calls a service layer. This service layer is where the Java-to-Python bridging occurs, encapsulating the GraalPy logic and keeping the controller clean.
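
A simplified view of that layout (illustrative; exact package and file names may differ in the repository):

```
graalpy-sentiment/
├── pom.xml                          # Maven build with GraalPy dependencies
└── src/main/
    ├── java/...                     # Spring Boot controller + GraalPy service layer
    └── resources/
        └── sentiment_analyzer.py    # Python sentiment logic (TextBlob)
```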


Core Code Snippets

Let’s look at the two key parts of the application.

First, the Python side, which is simple and clean:

# sentiment_analyzer.py
import json
import time
from textblob import TextBlob

def analyze_sentiment(text):
    start = time.time()
    
    analysis = TextBlob(text)
    sentiment_score = analysis.sentiment.polarity
    sentiment_classification = (
        "positive" if sentiment_score > 0.1 else
        "negative" if sentiment_score < -0.1 else
        "neutral"
    )
    
    end = time.time()
    print(f"[Sentiment] Took {(end - start) * 1000:.2f} ms")

    return json.dumps({
        "score": sentiment_score,
        "classification": sentiment_classification
    })

All we do here is use the TextBlob library: pass it the text and get back the sentiment. We have also added a timing print statement to get a sense of the time spent in the actual Python processing.
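
The classification thresholds can be exercised in isolation with a minimal stand-in (no TextBlob needed; `classify` is just an illustrative helper that mirrors the logic above):

```python
def classify(score):
    """Mirror the thresholds used in analyze_sentiment."""
    if score > 0.1:
        return "positive"
    if score < -0.1:
        return "negative"
    return "neutral"

print(classify(0.8), classify(-0.8), classify(0.05))  # positive negative neutral
```

Anything within 0.1 of zero is treated as neutral, which avoids flip-flopping on near-zero polarity scores.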

Next, the Java side, which uses the GraalPy runtime to load and invoke this Python code:

// SentimentAnalysisController.java
@SpringBootApplication
@RestController
@RequestMapping("/api/sentiment")
public class SentimentAnalysisController {

    private final Context context;

    private Value sentimentFunction;

    public SentimentAnalysisController() {
        this.context = GraalPyResources.createContext();
    }

    @PostConstruct
    public void initialize() {
        // Load and execute the Python script from the classpath
        try (Reader script = new InputStreamReader(
                getClass().getResourceAsStream("/sentiment_analyzer.py"))) {
            context.eval(Source.newBuilder("python", script, "sentiment_analyzer.py").build());
            // Keep a handle to the Python function for later calls
            sentimentFunction = context.getBindings("python").getMember("analyze_sentiment");
        } catch (IOException e) {
            // Handle error
            e.printStackTrace();
        }
    }

    @PostMapping
    public ResponseEntity<String> analyzeText(@RequestBody String text) {
        if (sentimentFunction == null) {
            return ResponseEntity
                    .status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("Python function not initialized.");
        }
        try {
            Value result = sentimentFunction.execute(text);
            return ResponseEntity.ok(result.asString());
        } catch (Exception e) {
            System.err.println("Error analyzing sentiment: " + e.getMessage());
            e.printStackTrace();
            return ResponseEntity
                    .status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("An error occurred while analyzing sentiment.");
        }
    }
}

GraalPy’s magic is in its type marshalling, which handles the conversion of data types between Java and Python seamlessly. In this code, however, we sidestep most of it by exchanging JSON strings, since the goal is to return the HTTP response as JSON anyway.

Broadly, the above snippet has two important pieces:

  • An initialize function that looks up the relevant Python code file and loads it using GraalPy. This allows us to interact with the Python function as if it were a Java function.
  • An analyzeText function that delegates the actual analysis to the Python function that was initialised at boot time.

GraalPy also leverages the JVM’s JIT (Just-In-Time) compiler, which can optimise the combined code for surprisingly good performance. More on that later.


Running the App & API Demo

After building the Spring Boot application, you can run it with mvn spring-boot:run. The API will be available at http://localhost:8080/api/sentiment.

You can test it with a simple curl command:

curl -X POST \
  http://localhost:8080/api/sentiment \
  -H 'Content-Type: text/plain' \
  -d 'This is an amazing and fantastic project.'

The response is a JSON object showing the calculated score and classification. And with that, we are ready to use it in Production. Are we?

The Proof Is in the Performance (or Was It the Pudding?)

The key takeaway here is performance. Because the Python code runs in-process, there is minimal latency, unlike a microservices approach that would involve network overhead. And that’s not all: per the GraalPy documentation, the code can even run faster than CPython.

The GraalPy documentation qualifies that claim: it holds for the standard Python benchmark tests, and when using GraalPy with JIT compilation. However, I believe that the proof of the pudding is in the eating, or the proof of the performance is in the testing.

The test machine has the following configuration:

  • Hardware Model: Apple Inc. Macmini6,2
  • Memory: 16.0 GiB
  • Processor: Intel® Core™ i7-3615QM × 8
  • Graphics: Intel® HD Graphics 4000 (IVB GT2)
  • Firmware Version: 429.0.0.0.0
  • OS Name: Kali GNU/Linux Rolling
  • OS Type: 64-bit
  • Kernel Version: Linux 6.12.38+kali-amd64

Yeah, it’s a really old Mac Mini with a Linux install. The hardware still works after 10+ years, so I decided that it could serve as a Linux test machine for such side projects.

Python Performance

So let’s put this to the test and compare it with an equivalent pure Python API. We can use a simple Python Flask router to wrap the sentiment analysis function and expose the output as an API endpoint, similar to what we have already done in Java:

@app.route('/analyze', methods=['GET', 'POST'])
def analyze():
    if request.method == 'GET':
        text = request.args.get('text', '')
    elif request.method == 'POST':
        text = request.get_data(as_text=True)
    sentiment_json = analyze_sentiment(text)
    response = json.loads(sentiment_json)
    return jsonify(response)

Now that we have an equivalent API in both Java and Python, it’s time to do a comparison of pure Python vs Java + GraalPy + Python. I fully expect that the complexity in the GraalPy approach will make things significantly slower.

We have designed a simple JMeter configuration that does the following:

  • Loads a CSV containing a bunch of different texts.
  • Hits the Sentiment API multiple times based on a configured number of users (Threads)
  • Each user runs calls to the same API a set number of times (loop count)
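
The shape of that test can be sketched in plain Python to make the setup concrete. This is a stand-in for JMeter, not a replacement; `run_load` and the `call` parameter are illustrative names, and in a real run `call` would issue the HTTP POST to the sentiment endpoint:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(call, users, loops):
    """Run `call` `loops` times on each of `users` threads; return all latencies in ms."""
    def worker(_):
        latencies = []
        for _ in range(loops):
            start = time.perf_counter()
            call()  # e.g. an HTTP POST to http://localhost:8080/api/sentiment
            latencies.append((time.perf_counter() - start) * 1000)
        return latencies
    # One pool worker per simulated user, mirroring JMeter's thread group
    with ThreadPoolExecutor(max_workers=users) as pool:
        return [ms for batch in pool.map(worker, range(users)) for ms in batch]
```

With `users=1000` and `loops=100`, this produces the same 100,000-sample load profile as the JMeter plan described above.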

Since Python is the natural home for an API that consumes Python libraries, we’ll use the Python readings as a baseline and as the numbers to beat.

Given that we are using Flask, we could simply execute flask run, but Flask’s built-in server is advertised as development-only. We’ll instead use gunicorn to spin up a production equivalent:
gunicorn -b 0.0.0.0:8080 sentiment_analysis_api:app

Then we start the JMeter test for 10 users with 10 loops, and we observe the logs:

[2025-09-12 20:46:28 +0200] [10915] [INFO] Starting gunicorn 23.0.0
[2025-09-12 20:46:28 +0200] [10915] [INFO] Listening at: http://0.0.0.0:8080 (10915)
[2025-09-12 20:46:28 +0200] [10915] [INFO] Using worker: sync
[2025-09-12 20:46:28 +0200] [10916] [INFO] Booting worker with pid: 10916
[Sentiment] Took 35.19 ms
[Sentiment] Took 0.45 ms
[Sentiment] Took 0.39 ms
[Sentiment] Took 0.29 ms
[Sentiment] Took 0.39 ms

Note that the first hit takes a lot longer, but after that, the function runs really fast. However, when we run the same test with 1000 users and 100 loops, the situation is slightly different.

The sentiment analysis function still takes under 1ms, but the overall JMeter test yields a different result.

Across a load of 1000 users and 100 samples per user, we get an average response time of 793 ms and a 99% Line at 1012 ms.

So our target performance metric to beat under load is 1 second.
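
For reference, the two JMeter metrics quoted here (the average and the 99% Line) can be reproduced from a raw latency sample with a few lines of stdlib Python (a sketch; `summarise` is an illustrative name):

```python
import statistics

def summarise(latencies_ms):
    """Return (average, 99th percentile), like JMeter's Average and 99% Line columns."""
    # quantiles(n=100) yields the 1st..99th percentile cut points; index 98 is the 99th
    p99 = statistics.quantiles(latencies_ms, n=100)[98]
    return statistics.mean(latencies_ms), p99
```

Feeding it the per-request latencies from a load run gives the same headline numbers the JMeter summary report shows.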

Java + GraalPy + Python Performance

We can get an executable JAR file by running mvn clean install. We start the server using:
java -jar target/graalpy-sentiment-v0.3.3.jar

After a few seconds, the following log line indicates that the server has started:

2025-09-12T21:09:11.864+02:00 INFO 14014 --- [ main] c.example.SentimentAnalysisController : Started SentimentAnalysisController in 23.72 seconds (process running for 24.408)

Almost 24 seconds for startup is likely not good if we want to run this in serverless mode, but for most Java use cases, I would argue that the warmup time is acceptable.

Then we run a test of 10 users to check this. The test reveals some serious performance issues: every thread takes a while to warm up before performance improves.

[Sentiment] Took 12181.00 ms
[Sentiment] Took 13175.00 ms
[Sentiment] Took 14254.00 ms
[Sentiment] Took 13542.00 ms
[Sentiment] Took 13545.00 ms
[Sentiment] Took 15283.00 ms
[Sentiment] Took 16405.00 ms
[Sentiment] Took 16673.00 ms
[Sentiment] Took 16442.00 ms
[Sentiment] Took 16794.00 ms
[Sentiment] Took 1680.00 ms
[Sentiment] Took 1177.00 ms
[Sentiment] Took 806.00 ms
[Sentiment] Took 62.00 ms
[Sentiment] Took 394.00 ms

This is clearly an issue, because at scale (1000+ users) the warm-up time for each thread will make for a very bad experience. However, that’s not the only problem: if we repeat the test, the warm-up issues show up again. The overall JMeter results are simply too poor.

While trying a run with 1000 users and 100 samples per user, the average time taken kept hovering around the 8 to 10 second mark. That seemed wrong, and so contradictory to Oracle’s claim that a GraalPy run can potentially be faster.

Further investigation was certainly needed. The startup log actually had a warning:

[engine] WARNING: The polyglot engine uses a fallback runtime that does not support runtime compilation to native code.
Execution without runtime compilation will negatively impact the guest application performance.
The following cause was found: JVMCI is not enabled for this JVM. Enable JVMCI using -XX:+EnableJVMCI.
For more information see: https://www.graalvm.org/latest/reference-manual/embed-languages/#runtime-optimization-support.

Who looks at warnings anyway?! We only care about errors, don’t we? It turns out this warning was important and was in fact the hint to the performance issues. Just like Java, GraalPy relies on JIT compilation, and JVMCI wasn’t enabled in my JVM. For this test, I was still using an older JVM (JDK 21), more out of backwards-compatibility interest than anything else.

openjdk 21.0.8 2025-07-15
OpenJDK Runtime Environment (build 21.0.8+9-Debian-1)
OpenJDK 64-Bit Server VM (build 21.0.8+9-Debian-1, mixed mode, sharing)

Adding the -XX:+EnableJVMCI flag didn’t really make a difference, and the warning persisted. Further reading about JVMCI suggested that this could indeed be the culprit. My somewhat trusty AI friend suggested downloading GraalVM and using it instead.

So I installed GraalVM and used it to run the same code, and the warm-up run (1 user, 10 samples) immediately showed a difference.

[Sentiment] Took 4173.00 ms
[Sentiment] Took 27.00 ms
[Sentiment] Took 33.00 ms
[Sentiment] Took 19.00 ms

The initial warm-up is slow, but the runs after it are significantly faster, so the change of JVM clearly helps. The warm-up time is still significant, though, so a multi-user run still needs a warm-up phase. We can then run a larger test of 1000 users and 100 samples.

Post warm-up, with a load of 1000 users and 100 samples per user, we get an average response time of 1411 ms. That is certainly way faster than before. It is still slower than pure Python, but I would say it is certainly acceptable.

There may be further tuning possible using JVM flags. I have also not focused on many other things, such as CPU and memory usage, or observability and logging in general. For use in Production at scale, there is much more to do, but I leave that to the curious reader.

In my test, the Java + Python polyglot service was slower than a pure Python solution. So if we already have services built fully in Python, I wouldn’t rush to replace them with Java. However, when building new services, or enhancing existing services with Python libraries, this polyglot option is certainly worth considering.

For now, it is sufficient to say that Java + GraalPy does offer a viable alternative to Python code integration into Java-based services. This concept is especially interesting for teams that have a large amount of Java code and need to embed specific libraries from Python into their code for some speciality use cases while avoiding the overhead of learning Python or maintaining significant portions of code that are built only in Python.


Lessons Learned & Challenges

This approach offers significant advantages: it’s lightweight, eliminates IPC (Inter-Process Communication), and allows you to leverage powerful JVM tooling for monitoring, debugging, and profiling the entire application.

However, GraalPy is still a maturing technology. One of the main challenges is dealing with Python packages that rely on native C extensions, as they may require specific configurations or may not be fully supported yet.

Another important point to note is the use of GraalVM as the JVM. Using GraalVM ensures that GraalPy is effective and performant; without it, performance is drastically slower.

As of the latest release, v0.3.3 (July 28, 2025), the project is stable for use as a starter project for any Java + Python use case. It also includes a few more production-relevant features, such as authentication, logging support, and port configuration.


Production Considerations

When moving this concept to a production environment, you must consider several factors.

  • Packaging: Creating a GraalVM native image can drastically reduce startup time and memory footprint, making it ideal for containerised deployments. You’ll also need to consider how to package the necessary Python dependencies. In my tests, native-image startup times were roughly a quarter of those with JAR packaging.
  • Performance Monitoring: While in-process execution is fast, it’s crucial to monitor memory and CPU usage. The shared memory space means that a memory leak on the Python side can affect the entire JVM.

The decision to choose embedding vs. a separate microservice (using Docker, REST, or gRPC) depends on your specific needs. Embedding is best for low-latency, tightly coupled functionality, while a microservice approach offers more scalability and fault isolation. In this article, we have made the case for the embedding approach.


Conclusion

We’ve demonstrated how GraalPy provides a simple and effective solution for integrating Python’s powerful NLP libraries directly into Java applications. By seamlessly embedding Python code, we can bypass the complexities of traditional IPC methods, resulting in a cleaner, faster, and more unified architecture.

This proof-of-concept shows that combining the enterprise strength of Java with the analytical power of Python is not only possible but can also be effortless. I invite you to explore the source code, try it out for yourself, and contribute to the project. Your feedback and contributions can help mature this powerful polyglot technology.

Try the project and contribute on GitHub: https://github.com/vshanbha/graalpy-sentiment


Featured in the special edition
JAVAPRO – Java 25 (Part 2)

Explore in-depth coverage of Java 25 and related topics in the full issue.
