Introduction to Java Agents

As promised recently, I will soon publish an article about JVMTI. To understand it, however, we first need to cover the basics of (standard) Java Agents. This article introduces Java Agents and their capabilities, along with some examples.

What is a Java Agent?

The official Java documentation describes the java.lang.instrument package as follows: “Provides services that allow Java programming language agents to instrument programs running on the JVM.” That is a super simple definition, yet it explains the whole topic pretty well. Some people might wonder what “instrumentation” means. It turns out that the Oracle docs explain that too: “Instrumentation is the addition of byte-codes to methods for the purpose of gathering data to be utilized by tools”.

So in essence, what are Java Agents?

They allow us to do “different things” within the JVM while being completely separated from the main application (let's say, our business-critical REST API deployed on Tomcat). This is how tools like NewRelic and Plumbr work: they provide separate Java agents that can be added to the JVM at startup. Usually, these agents do two things: instrument code and collect all kinds of metrics, statistics and information (properties, flags, used libraries, etc.).

Also:

  • They are ordinary Java code and .jar files written and built using the same techniques as most of our beloved applications/libraries.
  • They are executed before any other “application” code.
  • They can be loaded by adding -javaagent:agent-path.jar to the JVM start parameters.
  • Instead of the classic “main” method, their entry point is public static void premain(String agentArgs, Instrumentation inst).
  • They are quite powerful yet might be insecure – after all, they mess with the bytecode of our application.
  • Compared to JVMTI they are quite limited.
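
To make the points above concrete, here is a minimal sketch of an agent that uses only the standard java.lang.instrument API. The names MinimalAgent, LoggingTransformer, jvmFlags and SEEN are hypothetical and not taken from any real tool. The agent prints the JVM start-up flags and logs every class as it is loaded, without changing any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.management.ManagementFactory;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical minimal agent. The agent jar's MANIFEST.MF must contain
// "Premain-Class: MinimalAgent" so that -javaagent can find this class.
public class MinimalAgent {

    // Counts how many classes passed through the transformer.
    static final AtomicInteger SEEN = new AtomicInteger();

    // Called by the JVM before the application's main method.
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("Agent args: " + agentArgs);
        System.out.println("JVM flags: " + jvmFlags());
        inst.addTransformer(new LoggingTransformer());
    }

    // Start-up flags of the current JVM (for example -Xmx or -javaagent itself).
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // A transformer that only observes: returning null tells the JVM
    // to keep the class bytes unchanged.
    static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            SEEN.incrementAndGet();
            System.out.println("Loading class: " + className);
            return null;
        }
    }
}
```

With the manifest entry in place, such an agent would be attached with the -javaagent parameter mentioned in the list above. A transformer that actually rewrites bytecode would return the modified bytes instead of null.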

Talk is cheap, show me your code.

Let’s write a VERY simplified version of our tracing app. The story is simple: we are the founders of a super-hot startup that will crush NewRelic's offerings by implementing an even better Java agent. Our first task is to set up the project and write a Java Agent capable of tracing method calls, but only calls to methods annotated with @Trace. For now, “tracing” means just writing to the console every time a traced method is called – enough for a demo.

I have prepared an example project here.

It consists of three modules:

  • tracer-api – a module shared between the final app (e.g. our Spring API) and the Java Agent. It doesn’t contain any logic – just one shared annotation that acts as a pointcut. NewRelic works the same way – you have to import their library if you want to add tracing to (annotate) your methods.
  • tracer – a module that holds all logic related to our Java Agent: its build steps and source code.
  • business-app – this is the traced module. In real-life scenarios, this would be a Java app serving our business.
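
The shared annotation in tracer-api can be as small as a marker with no members. The sketch below is an assumption about its shape rather than a copy of the project's file: the retention and target settings are my choice, and TraceApiSketch, Service and isTraced are hypothetical names used only to show the marker being detected.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class TraceApiSketch {

    // Marker annotation: no members, it only flags methods to be traced.
    // RUNTIME retention is an assumption made here so that the marker is
    // kept in the class file and is visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Trace {
    }

    // Stand-in for application code that uses the annotation.
    static class Service {
        @Trace
        void traced() {
        }

        void untraced() {
        }
    }

    // True when the named no-argument method of Service carries @Trace.
    static boolean isTraced(String methodName) {
        try {
            Method method = Service.class.getDeclaredMethod(methodName);
            return method.isAnnotationPresent(Trace.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Unknown method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("traced:   " + isTraced("traced"));
        System.out.println("untraced: " + isTraced("untraced"));
    }
}
```

Keeping the annotation in its own tiny module means the business app depends only on the marker, never on the agent's internals.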

Tracer module logic

The first and only class in the tracer module:

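import java.lang.instrument.Instrumentation;

import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.implementation.MethodCall;
import net.bytebuddy.implementation.SuperMethodCall;
import net.bytebuddy.matcher.ElementMatchers;
// Trace comes from the tracer-api module; import it from its package.

public class TracerAgent {
    public static void premain(String args, Instrumentation instrumentation) {
        System.out.println("Starting TracerAgent");

        new AgentBuilder.Default()
                // Consider every loaded type...
                .type(ElementMatchers.any())
                .transform((builder, typeDescription, classLoader, module) -> builder
                        // ...but only rewrite methods annotated with @Trace.
                        .method(ElementMatchers.isAnnotatedWith(Trace.class))
                        // Run the callback first, then the original method.
                        .intercept(MethodCall.call(() -> {
                            System.out.println("Supervised app called method annotated with @Trace");
                            return null;
                        }).andThen(SuperMethodCall.INSTANCE)))
                .installOn(instrumentation);
    }
}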

It matches any type, selects the methods annotated with @Trace and rewrites them so that our lambda is called before the real method body is executed. The code is fairly self-explanatory, so I hope no further details are needed.
It’s worth pointing out that I used ByteBuddy – it provides a neat way to define such transformations. The same transform logic would be possible using the java.lang.instrument package alone, but I value your time – I don’t want you to spend it reading hundreds of lines of code when we can do it in less than 10 lines.
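
Conceptually, once the agent has rewritten a class, a @Trace method behaves as if the tracing line had been placed in front of its original body. The sketch below is hand-written plain Java that imitates this effect. It is not the bytecode ByteBuddy actually generates, and the names ConceptualTrace, original, instrumented and EVENTS are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ConceptualTrace {

    // Records what happened, in order, so the effect is easy to inspect.
    static final List<String> EVENTS = new ArrayList<>();

    // Stands in for the method body the application author wrote.
    static void original() {
        EVENTS.add("original body");
    }

    // Stands in for the method after instrumentation:
    // the agent's callback runs first, then the original code.
    static void instrumented() {
        EVENTS.add("Supervised app called method annotated with @Trace");
        original();
    }

    public static void main(String[] args) {
        instrumented();
        EVENTS.forEach(System.out::println);
    }
}
```

The important difference in the real agent is that the application source never changes: the extra call is woven in when the class is loaded.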

Business App module logic

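import java.util.concurrent.TimeUnit;
// Trace comes from the tracer-api module; import it from its package.

public class Main {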
    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        while (true) {
            System.out.println("I am simulating ling running process eg Tomcat. Iteration " + i);

            callSupervisedMethod();
            callNonSupervisedMethod();

            i++;
            Thread.sleep(TimeUnit.SECONDS.toMillis(5));
        }
    }
    
    private static void callNonSupervisedMethod() {
    }

    @Trace
    private static void callSupervisedMethod() {
    }
}

There isn’t anything complicated here – we just call two methods every five seconds in a loop. The first method is annotated with @Trace, so it should be intercepted by Tracer. The second is just an ordinary method that does nothing – it shouldn’t be handled by Tracer as it has no annotations on it.

Running it

What is the result when we run our business app with our agent?

Starting TracerAgent
I am simulating ling running process eg Tomcat. Iteration 0
Supervised app called method annotated with @Trace

We can clearly see that our agent intercepted only the call to the method annotated with @Trace 🙂
Please check the repository mentioned above to see how to run the example.

Wrapping up

That would be it – a bit of theory proved in practice. From today, all of you know how all these magic tools like NewRelic, Plumbr and YourKit work – they instrument your classes at runtime and collect all kinds of statistics. As this topic is super interesting and particularly exciting to me, after JVMTI we will focus a bit on how the NewRelic and Plumbr agents are implemented. What to expect? Reverse-engineering, hacking, debugging and more!
