Backstage with Lakebase

Part 3: The one-query FinOps solution

Cameron Casher, Shanil Anushka Fernando and Kevin Hartman

Published: June 05, 2026

In the first part of this series, running Backstage on Databricks Lakebase gave us one-second database branching. In part two, Unity Catalog absorbed that operational database into the enterprise governance plane.

But here’s the payoff that actually changes the org chart.

In a normal stack, answering ‘who owns the infrastructure running up our cloud spend, and what did it cost?’ crosses two boundaries. The ownership graph lives in Backstage (owned by platform engineering), while the cost data lives in a data warehouse (governed by the data team). Answering the question requires an ETL pipeline, a Jira ticket or a Slack thread.

Separated compute makes sharing viable

The reason a FinOps analyst can run massive analytical queries against the exact same underlying storage without impacting the live portal is that Lakebase isolates compute per workload.

Backstage gets its own isolated, autoscaling compute envelope. During normal portal use, catalog queries ran at 55–65 ms end-to-end, and searches took two to four milliseconds. Because your web application and your analytical workloads aren't contending for the same compute cluster, they can finally share the same data substrate safely.

The workaround: Lakehouse federation auth

To join the live Postgres data to our analytical billing data, we use Databricks Lakehouse Federation. However, Lakehouse Federation’s Postgres connector currently supports only static username/password credentials. Because Lakebase authenticates app identities via OAuth JWTs, the federation engine needs a parallel auth path.

The workaround is creating a native Postgres role with SCRAM-SHA-256 auth, wired to federation separately from the OAuth identity the app uses:

-- Run in Lakebase (Postgres): a read-only role for federation, separate from the app's OAuth identity.
CREATE ROLE feduser WITH LOGIN PASSWORD '###PASSWORD###'
  NOSUPERUSER NOCREATEDB NOCREATEROLE;

GRANT USAGE ON SCHEMA public TO feduser;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO feduser;

-- Run in Databricks SQL: register the connection, then expose the database as a foreign catalog.
CREATE CONNECTION lakebase_backstage TYPE postgresql
OPTIONS (host '...', port '5432', user 'feduser', password '###PASSWORD###');

CREATE FOREIGN CATALOG lakebase_bs
USING CONNECTION lakebase_backstage
OPTIONS (database 'backstage_plugin_catalog');

You’re now managing two auth paths for the same database.

The FinOps join

With the foreign catalog live, a FinOps analyst can write a single query that pulls the Backstage resource name directly from the operational Postgres table, and joins it to Lakebase's own billing rows in system.billing.usage:

SELECT
  get_json_object(b.final_entity, '$.metadata.name') AS backstage_resource_name,
  get_json_object(b.final_entity, '$.metadata.annotations["databricks/project-id"]') AS lakebase_project_id,
  DATE(u.usage_start_time) AS usage_date,
  ROUND(SUM(u.usage_quantity), 4) AS total_dbus
FROM lakebase_bs.public.final_entities b
JOIN system.billing.usage u
  ON get_json_object(b.final_entity, '$.metadata.annotations["databricks/project-id"]') = u.usage_metadata.project_id
WHERE u.billing_origin_product = 'LAKEBASE'
GROUP BY 1, 2, 3
ORDER BY usage_date DESC, total_dbus DESC;

The real result:

backstage_resource_name   lakebase_project_id                   usage_date   total_dbus
-----------------------   ------------------------------------  ----------   ----------
lakebase-operational-db   a856ea10-2261-4df4-9e37-1f4e7d127b23  2026-04-08   39.8667
lakebase-operational-db   a856ea10-2261-4df4-9e37-1f4e7d127b23  2026-04-07   43.6231

The left two columns of each row come directly from inside the live Backstage Postgres catalog; the right two come from a Unity Catalog system billing table. Those two things have historically never been in the same SQL engine, and now they join with zero data movement.
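For readers who want to trace what the query does row by row, here is a minimal Python sketch of the same logic: pull the name and the project-id annotation out of the entity JSON, match on project ID, filter to Lakebase usage, and sum per day. The entity is trimmed to the fields the query reads, and the usage rows (including their quantities and the `JOBS` row) are made-up stand-ins, not data from the POC. The function name `finops_join` is ours.

```python
import json
from collections import defaultdict

PROJECT_ID = "a856ea10-2261-4df4-9e37-1f4e7d127b23"

# Stand-in for lakebase_bs.public.final_entities (entity trimmed to the fields the query reads).
final_entities = [
    {"final_entity": json.dumps({
        "metadata": {
            "name": "lakebase-operational-db",
            "annotations": {"databricks/project-id": PROJECT_ID},
        }
    })},
]

# Stand-in for system.billing.usage; every quantity here is invented for illustration.
usage = [
    {"billing_origin_product": "LAKEBASE", "usage_start_time": "2026-04-08T03:00:00",
     "usage_quantity": 1.5, "usage_metadata": {"project_id": PROJECT_ID}},
    {"billing_origin_product": "LAKEBASE", "usage_start_time": "2026-04-08T09:00:00",
     "usage_quantity": 2.25, "usage_metadata": {"project_id": PROJECT_ID}},
    {"billing_origin_product": "LAKEBASE", "usage_start_time": "2026-04-07T10:00:00",
     "usage_quantity": 4.0, "usage_metadata": {"project_id": PROJECT_ID}},
    # Dropped by the product filter.
    {"billing_origin_product": "JOBS", "usage_start_time": "2026-04-08T11:00:00",
     "usage_quantity": 9.0, "usage_metadata": {"project_id": PROJECT_ID}},
    # Dropped by the join: no Backstage entity carries this project ID.
    {"billing_origin_product": "LAKEBASE", "usage_start_time": "2026-04-08T12:00:00",
     "usage_quantity": 1.0, "usage_metadata": {"project_id": "unknown-project"}},
]


def finops_join(entities, usage_rows):
    # Index entity names by the project-id annotation, which is the join key.
    # Simplification: assumes at most one entity per project ID.
    name_by_project = {}
    for row in entities:
        metadata = json.loads(row["final_entity"])["metadata"]
        project_id = metadata.get("annotations", {}).get("databricks/project-id")
        if project_id is not None:
            name_by_project[project_id] = metadata["name"]

    # Inner join + WHERE + GROUP BY (name, project_id, usage_date).
    totals = defaultdict(float)
    for u in usage_rows:
        project_id = u["usage_metadata"].get("project_id")
        if u["billing_origin_product"] == "LAKEBASE" and project_id in name_by_project:
            usage_date = u["usage_start_time"][:10]  # ISO timestamp -> date
            totals[(name_by_project[project_id], project_id, usage_date)] += u["usage_quantity"]

    # ORDER BY usage_date DESC, total_dbus DESC
    rows = [(name, pid, day, round(total, 4)) for (name, pid, day), total in totals.items()]
    return sorted(rows, key=lambda r: (r[2], r[3]), reverse=True)


for row in finops_join(final_entities, usage):
    print(row)
```

The point of the federated query is that none of this code has to exist: the engine does the extraction and the join in place.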

Why not just use ETL?

A skeptic might ask why we don't just use a Python script to sync an RDS instance to a Delta table once an hour.

The answer is branching. When a developer creates an ephemeral, one-second database clone to test a PR, an ETL approach would have to dynamically provision a new pipeline just to get cost visibility into that temporary test environment. With Lakebase, the moment the branch is created, its billing and ownership data are queryable. (In this POC, the dropped test branch was automatically and independently attributed 0.0107 DBU.)

Operationalizing convergence

This three-part series started with a 1-second database branch, moved through unified governance and landed here — a single SQL query that joins operational ownership data to cloud billing data with zero pipelines between them. That's the proof that convergence works technically. The question practitioners will ask next is: what does it take to operationalize this?

Two things stood out from this POC that are worth calling out for teams planning to follow this path.

The federation auth gap

The Lakehouse Federation workaround we described — a native Postgres role with static credentials wired separately from the OAuth identity the app uses — is the right approach today. Every team that wants to join their Lakebase operational data with analytical tables in Unity Catalog will need to set up this parallel auth path. Federation probably shouldn't run as your application user anyway, so the separation has a security upside, but password rotation is on you. For teams adopting this pattern, the steps can be packaged into a repeatable script: generate a secure password, create the role with read-only grants, wire the connection, create the foreign catalog. One-time setup, minutes once you know the pattern. Natively supporting OAuth JWTs in Federation would eliminate this workaround entirely.
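As a rough illustration of what that repeatable script could look like, the sketch below generates a password and renders the same statements shown earlier. It only builds SQL strings; how you execute them (a Postgres client for the first batch, Databricks SQL for the second) and where you store the password are left to your tooling. The function name `federation_setup_sql`, its parameters, and the example host are hypothetical.

```python
import re
import secrets

_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def federation_setup_sql(role, host, database, connection, catalog, schema="public"):
    """Return (password, postgres_sql, databricks_sql) for a read-only federation path."""
    # Names are interpolated into SQL, so accept only plain lowercase identifiers.
    for name in (role, database, connection, catalog, schema):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if "'" in host:
        raise ValueError("host must not contain a single quote")

    # token_urlsafe emits only letters, digits, '-' and '_', so the value is
    # safe inside a single-quoted SQL string literal.
    password = secrets.token_urlsafe(32)

    # Batch 1: run against Lakebase (Postgres).
    postgres_sql = [
        f"CREATE ROLE {role} WITH LOGIN PASSWORD '{password}' "
        "NOSUPERUSER NOCREATEDB NOCREATEROLE;",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]
    # Batch 2: run in Databricks SQL.
    databricks_sql = [
        f"CREATE CONNECTION {connection} TYPE postgresql "
        f"OPTIONS (host '{host}', port '5432', user '{role}', password '{password}');",
        f"CREATE FOREIGN CATALOG {catalog} USING CONNECTION {connection} "
        f"OPTIONS (database '{database}');",
    ]
    return password, postgres_sql, databricks_sql


password, pg_statements, dbx_statements = federation_setup_sql(
    role="feduser",
    host="lakebase.example.invalid",  # placeholder host, not a real endpoint
    database="backstage_plugin_catalog",
    connection="lakebase_backstage",
    catalog="lakebase_bs",
)
print(len(pg_statements), "Postgres statements,", len(dbx_statements), "Databricks statements")
```

Rotation follows the same shape: generate a new password, update it on the role, then update it on the connection, in that order or with a short overlap your team is comfortable with.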

Branch cost visibility for dev teams

The FinOps join answers the platform question: what does this infrastructure cost and who owns it? But the same billing data tells a second story that matters to engineering managers: what does the development process itself cost?

In the branching workflow from part one, every pull request creates an ephemeral CI branch and every developer has their own feature branch. These show up as independent line items in system.billing.usage, broken down by branch_id and endpoint_id. An engineering manager can see exactly how much compute their team's dev/test branching consumed in a sprint versus production – and make informed decisions about branch lifecycle policies.
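To make the sprint-versus-production comparison concrete, here is a small sketch of the rollup an engineering manager might run once the billing rows are exported. It assumes each row carries a `branch_id`, a usage date and a DBU quantity; the row shape, the branch names and all quantities below are invented for illustration, and `dbus_by_branch_kind` is a hypothetical helper.

```python
from collections import defaultdict


def dbus_by_branch_kind(usage_rows, production_branch_ids, sprint_start, sprint_end):
    """Sum DBUs inside the sprint window, split into production vs dev/test branches."""
    totals = defaultdict(float)
    for row in usage_rows:
        # ISO date strings compare correctly as plain strings.
        if not (sprint_start <= row["usage_date"] <= sprint_end):
            continue
        kind = "production" if row["branch_id"] in production_branch_ids else "dev_test"
        totals[kind] += row["usage_quantity"]
    return dict(totals)


# Hypothetical rows; none of these numbers come from the POC.
rows = [
    {"branch_id": "main", "usage_date": "2026-04-07", "usage_quantity": 40.0},
    {"branch_id": "main", "usage_date": "2026-04-08", "usage_quantity": 42.5},
    {"branch_id": "ci-pr-101", "usage_date": "2026-04-07", "usage_quantity": 0.25},
    {"branch_id": "feature-login", "usage_date": "2026-04-08", "usage_quantity": 0.5},
    {"branch_id": "ci-pr-102", "usage_date": "2026-04-08", "usage_quantity": 0.125},
    {"branch_id": "ci-pr-090", "usage_date": "2026-03-20", "usage_quantity": 3.0},  # outside sprint
]

summary = dbus_by_branch_kind(rows, {"main"}, "2026-04-01", "2026-04-14")
print(summary)
```

The same grouping could be done per developer or per pull request if branch names follow a convention that encodes them.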

The key is that ephemeral branches should be treated as ephemeral in the billing data too. CI branches created with a short TTL auto-expire if cleanup fails for any reason — a direct push to main, a workflow error, a missed event. Without lifecycle controls, orphaned branches can accumulate quietly, each one with an active compute endpoint billing against the project. The test branch cost 0.0107 DBU. That's trivial. Thirty orphaned branches running for a month are not.
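The TTL idea is simple enough to show in a few lines. The sketch below is not how Lakebase implements expiry; it is a hypothetical sweep over branch records (with made-up `branch_id`, `created_at` and `ttl` fields) that flags any branch whose TTL has elapsed, which is the check a lifecycle policy or an audit job would apply.

```python
from datetime import datetime, timedelta, timezone


def expired_branches(branches, now):
    """Return ids of branches whose TTL has elapsed; branches without a TTL are never flagged."""
    return [
        b["branch_id"]
        for b in branches
        if b["ttl"] is not None and now - b["created_at"] > b["ttl"]
    ]


now = datetime(2026, 4, 8, 12, 0, tzinfo=timezone.utc)

# Hypothetical branch inventory.
branches = [
    # Long-lived branch: no TTL, never expires.
    {"branch_id": "main", "created_at": datetime(2026, 1, 1, tzinfo=timezone.utc), "ttl": None},
    # CI branch whose cleanup was missed: 30 hours old against a 24-hour TTL.
    {"branch_id": "ci-pr-101", "created_at": now - timedelta(hours=30), "ttl": timedelta(hours=24)},
    # CI branch still inside its TTL.
    {"branch_id": "ci-pr-102", "created_at": now - timedelta(hours=2), "ttl": timedelta(hours=24)},
]

print(expired_branches(branches, now))
```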

The point isn't that branching is expensive; it's that the cost has to be weighed against the productivity gain. When a team eliminates two days of environment wait time per sprint and stops maintaining 20-30% of their codebase in mock objects, the 0.0107 DBU per branch isn't a line item to manage. It's the cheapest productivity investment the team has ever made. And unlike most productivity investments, this one is measurable: the infrastructure tells you exactly what it cost, per branch, per developer, per sprint. That's a conversation most engineering teams have never been able to have with their database.

What comes next

Before we wrap, there's one more point to the FinOps story that should be called out. Lakebase endpoints scale to zero. When a branch isn't being queried, its compute suspends and the bill stops. The 0.0107 DBU figure is the cost of a branch that ran, not the cost of a branch that exists; a fleet of ephemeral branches sitting between test runs contributes nothing.
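The difference between a branch that runs and a branch that merely exists is easy to see with back-of-the-envelope arithmetic. The sketch below assumes cost scales linearly with active compute hours and uses a purely hypothetical rate of 0.5 DBU per active hour; neither the rate nor the hour counts come from the POC.

```python
def fleet_dbus(branch_count, active_hours_each, dbu_per_hour):
    """DBUs for a fleet of branches, assuming cost is linear in active compute hours."""
    return branch_count * active_hours_each * dbu_per_hour


HYPOTHETICAL_DBU_PER_HOUR = 0.5  # illustrative only, not a published or measured rate

# Thirty branches whose endpoints stay active for a 30-day month.
always_on = fleet_dbus(30, 24 * 30, HYPOTHETICAL_DBU_PER_HOUR)

# The same thirty branches, each active for two hours of test runs and suspended otherwise.
scaled_to_zero = fleet_dbus(30, 2, HYPOTHETICAL_DBU_PER_HOUR)

print(always_on, scaled_to_zero)
```

Whatever the real rate is, the ratio between the two cases depends only on active hours, which is why suspended branches drop out of the bill.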

Across this series, we proved the infrastructure works — real app, real benchmarks, real governance, real cost data. From our side, Databricks and Thoughtworks are working together to take this from POC to practice: real development teams, real sprints, real velocity measurements. The constraint that kept operational and analytical data in separate worlds for thirty years is dissolving.

There's a Monday morning takeaway for every piece of this series:

Branch your next migration on a real schema.
Rewrite one mock-heavy suite against a branch.
Join your billing data to your ownership graph.

The teams that move first will define what comes next.
