Eventstore Observability with Micrometer, Prometheus and Grafana

EventStore Observability

Understanding how your EventStore deployment performs in production is critical for maintaining a healthy event-sourced system. Observability enables you to:

  • Track operational health: Monitor append and query rates to detect unusual activity patterns
  • Identify performance bottlenecks: Measure operation durations to find slow queries or contention
  • Optimize resource usage: Understand which event streams are most active and resource-intensive
  • Debug production issues: Correlate metrics with application behavior during incident investigation
  • Plan capacity: Use historical metrics to predict growth and plan infrastructure scaling

Micrometer Integration

EventStore uses Micrometer as its metrics collection framework. Micrometer provides a vendor-neutral facade similar to SLF4J for logging, allowing you to emit metrics once and send them to various monitoring backends (Prometheus, Grafana Cloud, Datadog, etc.).

When creating an EventStore instance, either rely on the global registry or pass your own MeterRegistry to control where metrics are collected:

// Option 1: Use the global registry (simplest approach)
EventStorage storage = PostgresEventStorage.newBuilder().build();
EventStore store = EventStoreFactory.get().eventStore(storage);

// Option 2 (alternative to option 1): Provide a custom registry with specific configuration
MeterRegistry registry = new SimpleMeterRegistry();
EventStore storeWithRegistry = EventStoreFactory.get().eventStore(storage, registry);

Adding Custom Tags for Drill-Down Analysis

To enable drill-down analysis by deployment context, add common tags to your registry:

MeterRegistry registry = new SimpleMeterRegistry();

// Add tags for deployment context
registry.config().commonTags(
    "instance", "eventstore-01",           // Instance identifier
    "deployment", "production-eu-west",    // Deployment unit/region
    "app.version", "1.2.3",                // Application version
    "environment", "production"             // Environment name
);

EventStore store = EventStoreFactory.get().eventStore(storage, registry);

These tags are automatically applied to all metrics, enabling you to:

  • Compare performance across different instances
  • Identify version-specific issues after deployments
  • Separate production from staging metrics
  • Analyze regional performance differences

Available Metrics

EventStore exposes the following metrics through Micrometer. All metrics include these automatic tags:

| Tag | Description | Example Values |
| --- | --- | --- |
| context | Event stream context | "customer", "order", "" (empty for any-context) |
| purpose | Event stream purpose | "123", "aggregate-id", "" (empty for any-purpose) |
| typed | Whether stream uses typed or raw events | "true", "false" |
| storage | Storage backend name | "postgres", "inmemory" |

Counters

| Metric Name | Description | Unit |
| --- | --- | --- |
| sliceworkz.eventstore.stream.create | Number of event stream objects created | count |
| sliceworkz.eventstore.append | Number of successful append operations | count |
| sliceworkz.eventstore.append.event | Total number of events appended | count |
| sliceworkz.eventstore.append.optimisticlock | Number of append operations rejected due to optimistic locking conflicts | count |
| sliceworkz.eventstore.query | Number of query operations executed | count |
| sliceworkz.eventstore.query.event | Total number of events retrieved by queries | count |
| sliceworkz.eventstore.get.event | Number of individual event lookups by ID | count |
| sliceworkz.eventstore.bookmark.place | Number of bookmark updates | count |
| sliceworkz.eventstore.bookmark.get | Number of bookmark retrievals | count |

Timers

| Metric Name | Description | Unit |
| --- | --- | --- |
| sliceworkz.eventstore.append.duration | Time taken to append events (including optimistic locking check) | milliseconds |
| sliceworkz.eventstore.query.duration | Time taken to execute queries | milliseconds |

Gauges

| Metric Name | Description | Unit |
| --- | --- | --- |
| sliceworkz.eventstore.append.position | Highest event position appended to this stream | position |

Example Configuration: Prometheus

To expose metrics to Prometheus, add the Prometheus Micrometer registry dependency and configure an HTTP endpoint.

Maven Dependencies

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.16.0</version>
</dependency>
<dependency>
    <groupId>io.javalin</groupId>
    <artifactId>javalin</artifactId>
    <version>6.4.0</version>
</dependency>

Java Configuration with Javalin

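import io.javalin.Javalin;
// Since Micrometer 1.13, micrometer-registry-prometheus uses the
// io.micrometer.prometheusmetrics package (previously io.micrometer.prometheus).
import io.micrometer.prometheusmetrics.PrometheusConfig;
import io.micrometer.prometheusmetrics.PrometheusMeterRegistry;
// Also import EventStore, EventStorage, EventStoreFactory and PostgresEventStorage
// from the EventStore library.

public class EventStoreApp {
    public static void main(String[] args) {
        // Create Prometheus registry
        PrometheusMeterRegistry prometheusRegistry =
            new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

        // HOSTNAME may be unset (for example outside a container) and Micrometer
        // rejects null tag values, so fall back to a placeholder.
        String hostname = System.getenv().getOrDefault("HOSTNAME", "unknown");

        // Add common tags for drill-down. Prometheus attaches its own "instance"
        // label to every scraped target, so this tag is stored as
        // "exported_instance" unless honor_labels is enabled in the scrape config.
        prometheusRegistry.config().commonTags(
            "instance", hostname,
            "app.version", "1.2.3"
        );

        // Create EventStore with Prometheus metrics
        EventStorage storage = PostgresEventStorage.newBuilder().build();
        EventStore eventStore = EventStoreFactory.get()
            .eventStore(storage, prometheusRegistry);

        // Expose metrics endpoint via Javalin
        Javalin app = Javalin.create().start(8080);

        app.get("/metrics", ctx -> {
            // Prometheus text exposition format
            ctx.contentType("text/plain; version=0.0.4");
            ctx.result(prometheusRegistry.scrape());
        });

        // Your application logic here...
    }
}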

Prometheus Scrape Configuration

Add this job to your prometheus.yml:

scrape_configs:
  - job_name: 'eventstore'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/metrics'
    scrape_interval: 15s

Example Reporting: Grafana

Grafana provides powerful visualization and alerting capabilities for EventStore metrics using Prometheus as a datasource.

Setting Up Grafana with Prometheus

  1. Add a Prometheus data source in Grafana:
    • Navigate to Configuration → Data Sources (Connections → Data sources in recent Grafana versions)
    • Select “Prometheus”
    • Set URL to your Prometheus instance (e.g., http://localhost:9090)
    • Click “Save & Test”
  2. Create an EventStore dashboard with useful panels:

Panel: Append Rate by Stream Context

sum by (context) (rate(sliceworkz_eventstore_append_total[5m]))

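Panel: Query Duration (95th Percentile)

histogram_quantile(0.95,
  rate(sliceworkz_eventstore_query_duration_seconds_bucket[5m])
)

Micrometer's Prometheus registry exports timer values in seconds, so set this panel's unit to seconds. The query returns data only if the timer publishes histogram buckets (the _bucket series), which in Micrometer requires percentile histograms to be enabled for that timer.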
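To make the p95 panel easier to interpret, here is a minimal sketch of how a quantile is estimated from cumulative histogram buckets. It is a simplified version of the linear interpolation that Prometheus documents for classic histograms, not the actual implementation. The class name `BucketQuantile` and the bucket values in the usage note are hypothetical.

```java
/** Simplified sketch of quantile estimation from cumulative ("le") histogram buckets. */
final class BucketQuantile {

    private BucketQuantile() {}

    /**
     * @param q                quantile in (0, 1], for example 0.95
     * @param upperBounds      ascending bucket upper bounds; the last one must be +Infinity
     * @param cumulativeCounts observations with value <= the matching upper bound
     * @return estimated quantile, or NaN if there is too little data
     */
    static double estimate(double q, double[] upperBounds, double[] cumulativeCounts) {
        if (upperBounds.length != cumulativeCounts.length || upperBounds.length < 2) {
            return Double.NaN;
        }
        int last = upperBounds.length - 1;
        double total = cumulativeCounts[last];
        if (total == 0) {
            return Double.NaN;
        }
        // Rank of the requested observation among all observations.
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < last && cumulativeCounts[i] < rank) {
            i++;
        }
        if (i == last) {
            // The rank falls into the +Inf bucket: report the highest finite bound.
            return upperBounds[last - 1];
        }
        double lowerBound = (i == 0) ? 0.0 : upperBounds[i - 1];
        double countBelow = (i == 0) ? 0.0 : cumulativeCounts[i - 1];
        double inBucket = cumulativeCounts[i] - countBelow;
        if (inBucket == 0) {
            return lowerBound;
        }
        // Assume observations are spread evenly inside the bucket.
        return lowerBound + (upperBounds[i] - lowerBound) * (rank - countBelow) / inBucket;
    }
}
```

With made-up bounds of 0.01, 0.05, 0.1, 0.5 and +Inf seconds and cumulative counts of 50, 80, 95, 99 and 100, `estimate(0.95, ...)` lands on the upper edge of the 0.05 to 0.1 bucket. The estimate can never be more precise than the bucket boundaries, so a coarse bucket layout yields a coarse p95.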

Panel: Optimistic Locking Conflict Rate

rate(sliceworkz_eventstore_append_optimisticlock_total[5m])

Panel: Events Appended per Second

rate(sliceworkz_eventstore_append_event_total[5m])

Panel: Highest Event Position by Stream

sliceworkz_eventstore_append_position
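The queries in these panels use Prometheus series names, which differ from the Micrometer names listed under Available Metrics. The sketch below approximates the renaming for the three meter types EventStore exposes. It is a simplification of Micrometer's Prometheus naming convention (which also sanitizes other characters), and the class `PrometheusNames` is hypothetical.

```java
/** Simplified sketch of how Micrometer meter names appear as Prometheus series names. */
final class PrometheusNames {

    private PrometheusNames() {}

    /** Dots become underscores; a gauge without a base unit is exported under this name. */
    static String gauge(String meterName) {
        return meterName.replace('.', '_');
    }

    /** Counters additionally receive a "_total" suffix. */
    static String counter(String meterName) {
        String name = gauge(meterName);
        return name.endsWith("_total") ? name : name + "_total";
    }

    /**
     * Timers are exported in seconds. The series is "count", "sum", "max",
     * or "bucket" (the last only when histogram buckets are published).
     */
    static String timer(String meterName, String series) {
        String name = gauge(meterName);
        String withUnit = name.endsWith("_seconds") ? name : name + "_seconds";
        return withUnit + "_" + series;
    }
}
```

For example, `PrometheusNames.counter("sliceworkz.eventstore.append.optimisticlock")` yields `sliceworkz_eventstore_append_optimisticlock_total`, the series used in the conflict-rate panel. Tag names are converted the same way, so a common tag such as `app.version` becomes the label `app_version`.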

Key Metrics to Monitor

  • High optimistic locking conflicts: May indicate contention on specific aggregates requiring architectural review
  • Slow query durations: Could signal missing indexes, inefficient queries, or database resource constraints
  • Append rate spikes: Unusual activity patterns that might indicate bugs or attacks
  • Event position growth: Helps predict storage requirements and identify most active streams

With Grafana, you can set up alerts on these metrics to proactively detect issues before they impact users.
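An absolute conflict rate is hard to judge without knowing the append volume, so a ratio is often a better alert input. Because the append counter records only successful appends and the optimistic-lock counter records rejected ones, the number of attempts is their sum. The sketch below shows the arithmetic. The class `ConflictRatio` and the 5% threshold in the usage note are illustrative assumptions, not recommendations from EventStore.

```java
/** Sketch: share of append attempts rejected by optimistic locking over some interval. */
final class ConflictRatio {

    private ConflictRatio() {}

    /**
     * @param successfulAppends increase of the append counter over the interval
     * @param rejectedAppends   increase of the optimistic-lock counter over the interval
     * @return rejected / (successful + rejected), or 0 when nothing was attempted
     */
    static double of(double successfulAppends, double rejectedAppends) {
        if (successfulAppends < 0 || rejectedAppends < 0) {
            throw new IllegalArgumentException("counter increases must not be negative");
        }
        double attempts = successfulAppends + rejectedAppends;
        return attempts == 0 ? 0.0 : rejectedAppends / attempts;
    }

    /** True when the conflict ratio is above the given threshold (for example 0.05). */
    static boolean exceeds(double successfulAppends, double rejectedAppends, double threshold) {
        return of(successfulAppends, rejectedAppends) > threshold;
    }
}
```

With 90 successful and 10 rejected appends in a window, the ratio is 0.1, so `exceeds(90, 10, 0.05)` is true. The same ratio can be expressed in PromQL by dividing the conflict rate by the sum of the append rate and the conflict rate.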

This post is licensed under CC BY 4.0 by the author.