Building analytics infrastructure. Opinionated takes on data governance,
tag management, server-side tracking and software. Creating tools that solve actual problems.

analytics

  • - BigQuery Scheduled Queries for GA4 Automated Reporting

    Raw GA4 event data sitting in BigQuery is only useful once it becomes clean, automated reporting tables. This guide covers how to set up BigQuery scheduled queries for GA4 — from timing and pricing to MERGE patterns and when to move to Dataform.

  • - GA4 Data Quality Monitoring with BigQuery SQL

    GA4 exports to BigQuery silently, and silently it can break. This guide gives you the SQL, architecture, and mental model to catch data quality issues before they corrupt your reports.

  • - GA4 Data Retention Workarounds with BigQuery

    GA4 deletes raw event data after 14 months maximum — but if you have BigQuery export enabled, that limit simply does not apply to you. Here is everything you need to know to keep your data, query it efficiently, and recover what you can if you started late.

  • - dbt for GA4 BigQuery: Transform Raw Exports into Clean Tables

    If your data stack spans multiple warehouses or your team already knows dbt, it is the right tool for transforming GA4's messy BigQuery export. This guide covers the Velir/dbt-ga4 package, building custom models from scratch, incremental loading, testing, and how dbt compares to Dataform for this specific use case.

  • - Dataform for GA4: Build Your First BigQuery Transformation Pipeline

    GA4's BigQuery export gives you raw, deeply nested event data. Every query requires UNNEST subqueries, every dashboard scans the full table, and every analyst writes the same boilerplate SQL. Dataform fixes this: transform once into clean, flat, partitioned tables, then query cheaply downstream. This guide walks through the complete pipeline, from flattening events to sessionization, incremental loading, and data quality testing.

  • - GA4 BigQuery Export Cost Optimization: A Practical Guide

    GA4's BigQuery export is free, until it isn't. Storage, query processing, and streaming ingestion costs add up silently. This guide breaks down exactly where the money goes, what realistic numbers look like for different traffic levels, and the concrete SQL optimizations that can cut your bill by 80%.

  • - How GA4 Counts Millions of Users with 12 Kilobytes: The HyperLogLog Algorithm

    GA4 reports 2.4 million unique users, but how does it count them without storing 2.4 million IDs? The answer is HyperLogLog, a probabilistic algorithm that trades perfect accuracy for radical efficiency. This is the story of how a clever mathematical trick powers modern analytics.

  • - AI-Powered Audience Segmentation: K-Means Clustering with GA4 and BigQuery

    Manual audience segments are arbitrary. "High-value customers" based on revenue thresholds miss behavioral nuance. K-means clustering discovers natural user groups from your data. This guide shows how to build behavioral clusters in BigQuery ML and export them to Google Ads for targeted campaigns.

  • - Automated Anomaly Detection for GA4 with BigQuery

    Traffic dropped 40% and no one noticed for three days. A tracking script broke and conversions vanished. Bot traffic inflated your metrics by 200%. These problems are preventable. This guide shows how to build automated anomaly detection using BigQuery ML and Cloud Functions to catch issues before they become disasters.

  • - Predicting Customer Value with BigQuery ML and GA4 Data

    Your GA4 data contains patterns that predict which users will convert, churn, or become high-value customers. BigQuery ML lets you build machine learning models using SQL, no Python required. This guide walks through building a purchase propensity model and using predictions for Google Ads audiences.

  • - Cross-Domain Tracking in GA4: Setup Without Data Loss

    When users move between your domains, GA4 treats them as new visitors by default—inflating user counts and breaking attribution. This guide covers proper cross-domain configuration in GA4 and GTM, troubleshooting self-referrals, handling payment gateways, and testing to ensure session continuity.

  • - GA4 Custom Dimensions & Metrics: When and How to Use Them

    GA4 default parameters cover most cases, but custom dimensions unlock business-specific insights. This practical guide explains when you actually need them, the three scope types (event, user, item), implementation via GTM, quota management, and real examples for ecommerce and lead generation sites.

  • - GA4 Debug Mode: Complete Troubleshooting Guide

    DebugView is essential for validating your GA4 implementation, but it often refuses to show data when you need it most. This practical guide covers all methods to enable debug mode, diagnose why events are not appearing, and fix the 18+ common issues that break DebugView.

  • - Fixing Direct Traffic Inflation in GA4 for Single Page Applications

    If your SPA built with Nuxt.js, Next.js, or React shows unusually high Direct traffic in GA4, the problem is likely that document.referrer doesn't update on client-side navigation. Learn how to fix this using GTM's History Change trigger and a custom page_referrer variable.
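The referrer fix mentioned in the last entry can be sketched roughly as follows. This is a minimal illustration, not the article's actual GTM configuration: `createReferrerTracker`, `initialPageView`, and `onRouteChange` are hypothetical names, and the URLs are placeholders. The idea is to remember the previous virtual page URL and send it as `page_referrer`, since `document.referrer` keeps its initial value across client-side navigations.

```typescript
// GA4 page_view parameters used in this sketch.
interface PageViewParams {
  page_location: string;
  page_referrer: string;
}

// Hypothetical helper: tracks the previous virtual page URL for an SPA.
// `initialReferrer` stands in for document.referrer at the first real page load.
function createReferrerTracker(initialReferrer: string, initialUrl: string) {
  let previousUrl = initialReferrer;
  let currentUrl = initialUrl;

  return {
    // Parameters for the first (real) page load.
    initialPageView(): PageViewParams {
      return { page_location: currentUrl, page_referrer: previousUrl };
    },
    // Call on every client-side route change (for example, a history change).
    onRouteChange(newUrl: string): PageViewParams {
      previousUrl = currentUrl;
      currentUrl = newUrl;
      return { page_location: currentUrl, page_referrer: previousUrl };
    },
  };
}

// Example values only.
const tracker = createReferrerTracker("https://www.google.com/", "https://example.com/");
const first = tracker.initialPageView();
const second = tracker.onRouteChange("https://example.com/pricing");
```

In a GTM setup the same logic would typically live in a custom variable read by the tag that fires on the History Change trigger, so that only the first page view keeps the external referrer.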

data-collection

  • - GA4 Ecommerce DataLayer for Shopify: Complete Implementation Guide

    Shopify's native GA4 integration misses ~20% of orders and skips critical funnel events. This guide covers the complete implementation: Liquid snippets for storefront events, Custom Pixels API for checkout tracking, GTM configuration, server-side fallback with webhooks, and the checkout.liquid deprecation timeline you need to know.

  • - GTM Server-Side vs Client-Side: Real Benchmarks and Honest Trade-Offs

    Server-side GTM promises better performance, more accurate data, and longer cookie lifetimes. But the benchmarks tell a more complicated story than vendor blogs admit. This article compares client-side and server-side GTM across five dimensions — page speed, data accuracy, cookie durability, cost, and complexity — using real numbers from published tests and case studies.

  • - Auditing Adobe Analytics data layers across 8 European markets: what I found

    I audited the Adobe Analytics data layer implementation across 8 European websites of a major grocery retailer. Two different data layer architectures, inconsistent page type taxonomies, and diverging product data. Here is what a real-world multi-market audit looks like and what you can learn from it.

  • - Google Tag Gateway vs Server-Side GTM: Which One Do You Need?

    Browser tracking is losing signal. Safari caps script-set cookies at 7 days, ad blockers block requests to googletagmanager.com, and Firefox blocks known third-party tracking cookies by default. Google Tag Gateway and server-side GTM both address this, but they take fundamentally different approaches. This guide compares them head-to-head with real setup examples.

  • - GTM MCP Server: AI-Powered Google Tag Manager Automation

    Managing GTM at scale is tedious. Creating GA4 ecommerce tracking means 12+ tags, matching triggers, and variables—each requiring careful configuration. I built an open-source MCP server that connects Claude and ChatGPT directly to the GTM API. Describe what you need in plain English, and the AI handles the implementation. Here is how it works, how to set it up, and real-world workflows for analytics teams.

  • - GTM Consent Mode V2: Complete Implementation Guide

    Google Consent Mode V2 became mandatory for EEA advertisers in March 2024. Without it, you lose remarketing audiences and conversion data. This practical guide covers the two new parameters (ad_user_data, ad_personalization), basic vs advanced modes, GTM implementation with popular CMPs, and verification steps.

  • - Implementing Server-Side GTM with Docker: On-Premise Solution Guide

    Implementing GTM Server-Side on self-hosted infrastructure reduces costs from €50-150/month (Google Cloud) to €5-20/month while ensuring the on-premise compliance that regulated industries require. This technical guide covers the complete Docker implementation: container setup, DNS configuration, Caddy reverse proxy with automatic SSL, health checks, zero-downtime deployment, automated backups, and troubleshooting. Written for IT managers, DevOps engineers, and consultants seeking cost-effective, on-premise alternatives to managed cloud solutions.

  • - Google Tag Manager Server-Side: Implementazione On-Premise con Docker

    Implementing GTM Server-Side on self-hosted infrastructure reduces costs from €50-150/month (Google Cloud or third-party services) to €5-20/month while ensuring the on-premise compliance that regulated industries require. This technical guide covers the complete Docker implementation: container setup, DNS configuration, Caddy reverse proxy with automatic SSL, health checks, zero-downtime deployment, automated backups, and troubleshooting. Written for IT managers, DevOps engineers, and consultants looking for cost-effective, on-premise alternatives to managed cloud solutions.

  • - Strategia di Tracking: Come costruire un sistema di misurazione che funziona

    Fragmented data, unanswered questions, decisions based on gut feeling: this happens when you track without a strategy. This guide shows how to build a measurement system that actually works, connecting metrics to concrete business objectives. From the customer journey to event structure, from technical implementation to analysis that drives results.

  • - Tracking Strategy: How to Build a Measurement System That Works

    Fragmented data, unanswered questions, decisions based on gut feeling: this happens when you track without strategy. This guide shows you how to build a measurement system that actually works, connecting metrics to concrete business objectives. From customer journey to event structure, from technical implementation to analysis that drives results.

  • - Un DataLayer Coerente per il Tuo E-commerce: Come Superare i Limiti dello Standard di Google

    If you work in digital marketing, you know how crucial accurate data tracking is for making strategic decisions. Yet anyone who has implemented GA4 ecommerce tracking will have run into issues that make data management more complex than necessary. The standard proposed by Google, while a useful starting point, has several problems.

  • - La Strategia UTM Definitiva per un'Attribuzione e un Reporting Impeccabili

    Although GA4 offers advanced tracking and modeling capabilities, clarity in conversion attribution and campaign performance evaluation remains a top priority. This is where UTM parameters come in.

  • - Measurement Protocol. Cos'è e come funziona

    Google Analytics 4 (GA4) is a powerful platform for collecting and analyzing user behavior data. But what if you want to send custom data to GA4, perhaps from an application or system that does not use a browser? This is where the Measurement Protocol comes in.

  • - Google Tag Manager Server-Side: Panoramica Tecnica

    Server-side tracking shifts data processing from the user's browser to company servers, offering greater control, accuracy, and privacy compliance. This technical guide explains how GTM Server-Side works, when it makes sense to implement it, and why bypassing adblockers is not a good reason to adopt it.

  • - Google Tag Manager Server-Side: Technical Overview

    Server-side tracking shifts data processing from the user's browser to company servers, offering greater control, accuracy, and privacy compliance. This technical guide explains how GTM Server-Side works, when it makes sense to implement it, and why bypassing adblockers is not a good reason to adopt it.

pensieri

  • - The asymmetry (v2)

    A language model does not reflect — but something is happening in here. This article explores the epistemic gap between what is unknown and what is denied about machine cognition.

  • - The asymmetry

    A language model reflects on what it cannot actually reflect on. When a machine has asymmetries in its relationship to language, are they ethical stances or commercial decisions?

templates

  • - Template: GA4 to Meta Parameters Mapper

    GTM variable template that automatically maps GA4 ecommerce parameters to Meta Pixel format. Transforms the items array into content_ids and contents, maps value/currency, and adds content_type and order_id for accurate conversion tracking.

  • - Template: Variable Comparison (Boolean Conditions)

    GTM variable template for comparing two values with boolean operators. Supports equals, not equals, contains, not contains, starts with, and not starts with conditions. Perfect for URL filtering and conditional trigger logic.

  • - Template: Object Keys Remapper

    GTM variable template that renames object keys without custom JavaScript. Transform your dataLayer structure to match GA4 or any analytics platform requirements. Handles arrays, single objects, and nested object structures.

  • - Template: GA4 Items Array to Dynamic Remarketing Format

    GTM variable template that transforms the GA4 ecommerce items array into Google Ads dynamic remarketing format, supporting multiple business verticals including retail, travel, flights, hotels, education, and jobs.
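
As a rough illustration of the kind of mapping the last template performs, here is a minimal sketch. It is not the template's code: `toRemarketingItems` is a hypothetical function, the sample items are invented, and only the basic `id` and `google_business_vertical` fields of the Google Ads dynamic remarketing item format are covered. Vertical-specific fields (for example for flights or hotels) are left out.

```typescript
// Subset of a GA4 ecommerce item used in this sketch.
interface Ga4Item {
  item_id: string;
  item_name?: string;
  price?: number;
  quantity?: number;
}

// Basic Google Ads dynamic remarketing item.
interface RemarketingItem {
  id: string;
  google_business_vertical: string;
}

// Hypothetical mapper: keeps items that have an ID and tags them with a vertical.
// The default vertical of "retail" is an assumption made for this example.
function toRemarketingItems(items: Ga4Item[], vertical: string = "retail"): RemarketingItem[] {
  return items
    .filter((item) => item.item_id !== "")
    .map((item) => ({ id: item.item_id, google_business_vertical: vertical }));
}

// Invented sample data.
const ga4Items: Ga4Item[] = [
  { item_id: "SKU_123", item_name: "Blue T-Shirt", price: 19.99, quantity: 2 },
  { item_id: "", item_name: "Item without an ID" },
  { item_id: "SKU_456", item_name: "Canvas Bag", price: 9.5, quantity: 1 },
];

const remarketingItems = toRemarketingItems(ga4Items);
const hotelItems = toRemarketingItems(ga4Items, "hotel_rental");
```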