The Funnel Block: Multi-Step Conversion Analysis in a Journey Analytics Tool

This is part of a series where we're building a DIY journey analytics platform from scratch. If you're just joining, check out the earlier posts to catch up!

In the last post, we introduced the notebook-and-block model and walked through the table builder. The teaser at the bottom of that post promised that the next deep-dive would be on the funnel block, so here we are.

Funnels feel like a solved problem at first glance. Step 1, step 2, step 3, count how many people made it to each step, draw a sloping bar chart, done. Every analytics tool on earth has one. So why does the funnel in a journey analytics tool deserve its own post?

Mostly because of how nicely it falls out of everything we've already built. Earlier posts in this series have done the heavy lifting - stitching identities across devices, sessionizing events, unifying data sources, and getting everything cleanly sorted by timestamp. With that foundation in place, the funnel block itself turns out to be pretty straightforward. The hard parts of journey analytics funnels were mostly solved upstream, and this post is really about how cleanly the funnel layer drops on top of them.

A walkthrough of the funnel block - configuring steps, scope, conversion windows, and side-by-side segment comparisons.

Where the Upstream Work Pays Off

Most funnels you've seen in traditional digital analytics tools have a hidden assumption baked in: the user is on a single device, in a single session, doing things roughly in order, within a short time window. Dropping those assumptions would normally making things quite challenging, but luckily the layers we've already built handle them for us:

Cross-device journeys. Thanks to identity stitching, a user who browses on their phone and converts on a laptop is one person_id, not two. Person-scoped funnels naturally span devices.
Out-of-order events. People bounce back to earlier steps, open multiple tabs, and revisit products. Events come pre-sorted by timestamp from the data group layer, so the funnel can just walk through them in order.
Mixed-source events. Thanks to the data group layer, a funnel can mix events from web, mobile, in-store POS, and email engagement systems. The funnel engine doesn't care where each event came from.

What Users See: The Block Config

Like every other block, the funnel block is a small JSON config that the React component knows how to render and the API knows how to execute. The user-facing configuration is:

Steps - 2 to 10 ordered steps, each with one or more conditions. A condition can reference a dimension (e.g. page_name = 'cart') or a metric (e.g. revenue > 0). Multiple conditions inside a step can be combined with AND or OR.
Scope - Either person or session. We'll cover this in more detail below.
Conversion window - The maximum time allowed between the first and last step. Default is 24 hours, configurable in minutes, hours, or days. It can also be disabled entirely.
Comparisons - A drag-and-drop zone where users can drop segments or specific dimension values to render side-by-side comparison funnels. "Everyone" is always there as a baseline.
Percentage display - Whether step-over-step rates are shown relative to step 1 or relative to the previous step.
Date range - Same shared concept every block uses: absolute dates or a rolling preset.

Person vs Session Scope

This toggle defines what your funnel actually means.

A session-scoped funnel asks: "How many sessions saw step 1, and of those, how many also saw step 2 within the same session?" This is the classic web analytics funnel - the right answer when you care about single-session conversion behavior.

A person-scoped funnel asks the same question across all of a user's sessions, even ones that span devices and weeks. This is the right question for journeys with long consideration cycles, or for products where users genuinely switch devices mid-journey.

Both are useful, and the right answer depends entirely on the question you're asking. The underlying query is almost identical - the only thing that changes is the GROUP BY key (person ID alone, vs person ID plus session ID).

The Heart of It: ClickHouse's `windowFunnel`

Here's the part that I think is genuinely cool, and it's mostly thanks to ClickHouse rather than anything we did: there's a built-in aggregate function called windowFunnel that does basically all of the heavy lifting for us.

The signature looks like this:

windowFunnel(window_seconds)(
  timestamp,
  step_1_condition,
  step_2_condition,
  ...
  step_N_condition
)

For each group (person, or person + session), it scans the events in timestamp order and returns the maximum number of consecutive steps that were completed within the time window. So if a user did step 1 and step 2 but not step 3, it returns 2. If they did all four steps in order, it returns 4.

The whole funnel report comes down to:

Build per-event boolean columns for each step's conditions.
Group by the chosen scope (person, or person + session).
Run windowFunnel(window) over the timestamp and the step booleans.
Bucket the results: how many users got to step 1, step 2, step 3, etc.

The actual SQL we generate looks roughly like this (simplified):

WITH funnel_cte AS (
  SELECT
    person_id,
    _session_id,
    event_timestamp,
    -- One boolean column per step
    ifNull((page_name = 'home'), 0) AS step_1,
    ifNull((event_type = 'product_view'), 0) AS step_2,
    ifNull((revenue > 0), 0) AS step_3
  FROM unified_events
  WHERE event_timestamp BETWEEN ... AND ...
)
SELECT
  windowFunnel(86400)(
    event_timestamp, step_1, step_2, step_3
  ) AS steps_reached,
  count() AS cnt
FROM funnel_cte
GROUP BY person_id  -- or person_id, _session_id for session scope

That SQL looks satisfyingly simple, and it kind of is, but only because it's sitting on top of every layer we've already built. unified_events is shorthand: the funnel doesn't run against the raw events table directly. It runs through the same 6-layer query pipeline the table builder uses, so by the time windowFunnel sees a row, sessionization has already happened, array fields have been flattened, and any segments referenced in the funnel have been evaluated as boolean columns.

The step conditions themselves are leaning on upstream work too. Expressions like page_name = 'home' or revenue > 0 aren't pointing at raw warehouse columns. They resolve through the metrics and dimensions semantic layer, which means a user can drop any defined metric or dimension - including computed ones - into a step condition, and the funnel block doesn't have to reimplement any of that logic. It just composes step booleans out of expressions the semantic layer already knows how to resolve.

This is the part I want to underline: the funnel SQL builder gets to be genuinely small because every layer underneath it is doing real work. The funnel block itself just assembles step conditions, picks the right GROUP BY key, and calls windowFunnel. Sessionization, identity stitching, the semantic layer, segment evaluation - all of it comes for free from the layers we built in earlier posts. If we'd tried to stand up a parallel funnel-only SQL builder, we would have ended up reimplementing huge swaths of that logic and inevitably letting it drift out of sync with the table builder. Layering everything on top of one shared pipeline is the thing that makes the funnel block feel cheap rather than expensive.

A small thing that took longer to figure out than I'd like to admit: those ifNull(..., 0) wrappers around each step condition. ClickHouse's windowFunnel requires non-nullable UInt8 inputs, and metric expressions like revenue > 0 happily return NULL when revenue is NULL. The first time I ran a funnel without the wrappers, the query just silently returned zero for users who had any null metric values along the way. Wrapping every step condition in ifNull(condition, 0) fixes it cleanly: if the underlying data is null, the step didn't match, end of story.

A Separate API, Not an Overloaded One

The SQL pipeline is shared, but the API endpoint for funnels is intentionally its own thing rather than a mode flag on the table builder API. We could have added a type: 'funnel' parameter to the existing /createtable endpoint and let one route handle both, but it didn't feel like the right call.

The request and response shapes for a funnel are genuinely different from a table report. Steps and conversion windows don't map cleanly onto dimensions and metrics, and the response is shaped like a funnel chart rather than a flat result set. One endpoint trying to serve both would have meant a sprawling request schema with a lot of "only required if type = X" caveats and a response that's essentially a discriminated union. Two clean endpoints with their own typed contracts is much easier to consume on the React side.

The AI side benefits even more. The Ask Trevor assistant has a much easier time picking between a few well-named tools than reasoning about flags on a single mega-tool. "Run a funnel" and "build a table" are different verbs, and giving the AI two distinct tools (each with its own description and examples) results in noticeably better tool selection than asking it to pick between modes of the same tool. The pipeline underneath is shared; the API surface is deliberately not.

Same Block, Two Surfaces

Just like the table builder, the funnel block doesn't care whether it's rendering inside a notebook or inside an Ask Trevor chat message. Same React component, same config shape, same API endpoint. Ask Trevor can drop a funnel into a chat or write one straight into a notebook, and you can move it between the two without re-querying anything.

What's Next

The funnel block is in good shape, so I'm moving on to the next block type. The one I'm working on now is focused on trends and date-over-date comparisons - the kind of "how did this week look vs last week" or "how is this metric trending over the last 90 days" view that every analytics tool needs to have. More on that in a future post once it's further along.

If you want to follow along, come find me on LinkedIn. I'd love to hear what kinds of funnels you'd want to build first - especially the cross-device ones that traditional funnel tools struggle with. 🙌

Trevor Paulsen is a data product professional at UKG and a former product leader of Adobe's Customer Journey Analytics. All views expressed are his own.