
The Deploy Pipeline

18 GitHub Actions workflows, three environments per product, Pulumi IaC, and two releases per day.

By Alexey Suvorov · 6 min read

177 version tags in 88 days. That’s roughly two releases per day, every day, for the entire life of Dashboard v2. Not scheduled releases. Not batched deployments. Continuous delivery in the literal sense – when a feature is ready, it ships.

This cadence isn’t the result of recklessness. It’s the result of a deployment pipeline that makes shipping safe enough to do constantly. 18+ GitHub Actions workflows, three environments per product, Pulumi infrastructure-as-code, Docker multi-stage builds, and Google Kubernetes Engine orchestration. Here’s how it all fits together.

The versioning scheme

Our version tags follow the pattern v{YYYY}.{M}.{N}. The current release might be v2026.2.83 – the 83rd release of February 2026. There’s no semantic versioning, no major/minor/patch distinction. Just a monotonically increasing counter within each month that resets when the calendar turns.
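The tag logic is simple enough to fit in a few lines. A hypothetical sketch (not our actual release tooling) of how the next tag derives from the latest one:

```typescript
// Compute the next v{YYYY}.{M}.{N} tag. The counter resets when the
// calendar month turns; otherwise it just increments.
function nextTag(latest: string, now: Date): string {
  const year = now.getFullYear();
  const month = now.getMonth() + 1; // getMonth() is 0-based
  const match = /^v(\d{4})\.(\d{1,2})\.(\d+)$/.exec(latest);
  if (!match) throw new Error(`unrecognized tag: ${latest}`);
  const [, tagYear, tagMonth, n] = match;
  const sameMonth = Number(tagYear) === year && Number(tagMonth) === month;
  const next = sameMonth ? Number(n) + 1 : 1;
  return `v${year}.${month}.${next}`;
}
```

Tagging v2026.2.83 in mid-February yields v2026.2.84; the same call on March 1st yields v2026.3.1.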

This scheme reflects how we think about releases. There are no “major versions” because there are no breaking changes for end users. Every release is an increment. Some add features. Some fix bugs. Some refactor internals. The version number tells you when it shipped and where it falls in the sequence. Nothing more.

177 tags in 88 days means N was climbing by two on most days. Some days had three releases. Some had one. The average held steady at roughly two per day.

Three environments, two continents

Every product deploys to three environments:

Staging is where code goes first. Automated tests run here against real infrastructure – real databases, real Redis, real Kubernetes. Staging isn’t a simulation. It’s a production mirror with synthetic data.

Production-global serves users worldwide from our primary GKE cluster. This is the main deployment. When we say “shipped,” we mean it’s running here.

Production-Russia serves Russian users from infrastructure physically located in Russia. Data residency regulations require that certain user data – account information, usage logs, stored content – resides on servers within Russian borders. This isn’t a preference. It’s a legal requirement.

The Russia deployment doubles our infrastructure footprint. Every database, every Kubernetes cluster, every Redis instance, every configuration secret exists in two copies: global and Russia. Pulumi manages both through separate configuration stacks – dev, staging, prod, prod-ru – each with its own secrets, its own resource definitions, and its own state file.

Multi-region deployment is one of those things that sounds straightforward until you actually do it. The databases don’t sync between regions. The Kubernetes clusters are independent. Deployments to Russia happen separately from global deployments, which means every release is actually two releases. The pipeline handles this automatically, but the operational surface area is significant.

Pulumi over Terraform

We use Pulumi for infrastructure-as-code. The decision was pragmatic: Pulumi lets us write infrastructure definitions in TypeScript.

Our team writes TypeScript all day. The backend is TypeScript. The frontend is TypeScript. When the infrastructure code is also TypeScript, there’s no context switch. Conditional deployment logic – “create this resource only in production,” “use a larger instance class for the global cluster” – uses standard if statements instead of Terraform’s count and for_each workarounds.

The Pulumi stacks map directly to our environments. Each stack has its own configuration:

  • dev: Minimal resources, single replicas, small instance types
  • staging: Production-like resources, but smaller scale
  • prod: Full production resources, auto-scaling, redundancy
  • prod-ru: Mirror of prod, different region, different secrets
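In code, those per-stack differences reduce to ordinary control flow. A simplified TypeScript sketch with illustrative names and instance types – not our actual Pulumi program:

```typescript
// Per-stack resource sizing with plain TypeScript conditionals -- the
// kind of logic Terraform pushes into count/for_each expressions.
// Stack names match ours; machine types here are illustrative.
type Stack = "dev" | "staging" | "prod" | "prod-ru";

interface ClusterSpec {
  replicas: number;
  machineType: string;
  autoscaling: boolean;
}

function clusterSpec(stack: Stack): ClusterSpec {
  if (stack === "prod" || stack === "prod-ru") {
    return { replicas: 3, machineType: "e2-standard-4", autoscaling: true };
  }
  if (stack === "staging") {
    return { replicas: 2, machineType: "e2-standard-2", autoscaling: false };
  }
  // dev: minimal footprint
  return { replicas: 1, machineType: "e2-small", autoscaling: false };
}
```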

Secrets management is per-stack. Database passwords, API keys, Stripe secrets, and service account credentials are encrypted in Pulumi’s state and decrypted at deployment time. The prod and prod-ru stacks have entirely different credentials, even for the same services.

Docker multi-stage builds

Every service deploys as a Docker container. The Dockerfiles follow a multi-stage pattern:

Stage 1: Dependencies. Install npm packages using a lockfile. This layer caches aggressively – it only rebuilds when package-lock.json changes.

Stage 2: Build. Compile TypeScript, bundle assets, run any build-time transformations. This layer changes on every commit but starts from the cached dependency layer.

Stage 3: Runtime. Copy only the compiled output and production dependencies into a minimal base image. No dev dependencies, no source code, no build tools in the final image.

The multi-stage approach keeps image sizes small and build times fast. A typical backend image is under 200MB. Frontend images are smaller because they’re just static assets served by nginx.

Build context optimization was an early investment. Docker sends the entire build context to the daemon before building, so a repository with a large node_modules or test fixtures can spend minutes just on context transfer. Our .dockerignore files are aggressive – they exclude everything that isn’t needed for the build.

GitHub Actions: 18 workflows and counting

The 18+ GitHub Actions workflows break down into categories:

CI workflows run on every push and pull request. They lint, type-check, run unit tests, and build Docker images. If any step fails, the commit is marked as failing and won’t merge.

CD workflows trigger on merges to specific branches. Staging deploys automatically when code merges to the develop branch. Production deploys require a manual trigger or a version tag. Russia deploys happen separately, triggered by the same tag but targeting different infrastructure.
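The branching behind those triggers is straightforward. A hypothetical sketch of how a Git ref maps to deploy targets – our real workflows express this as YAML conditions:

```typescript
// Map a Git ref to deployment targets: develop -> staging,
// version tags -> both production environments. Illustrative only.
function deployTargets(ref: string): string[] {
  if (ref === "refs/heads/develop") return ["staging"];
  if (/^refs\/tags\/v\d{4}\.\d{1,2}\.\d+$/.test(ref)) {
    return ["prod", "prod-ru"]; // one tag, two separate deployments
  }
  return []; // feature branches run CI only
}
```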

Automation workflows handle tasks that used to be manual:

  • Replicate plugin updates: A weekly workflow that queries Replicate’s API for model catalog changes, generates updated plugin definitions, and opens a pull request. This keeps Autopilot’s 25+ Replicate model categories current without human intervention.
  • Model pricing updates: Similar automation for keeping LLM model pricing data accurate across the platform.
  • Release notes automation: When a version tag is created, a workflow generates release notes from the commit log and publishes them.
  • Eval reports pipeline: Dashboard v2’s evaluation framework runs LLM-as-judge assessments and publishes the results to GitHub Pages as an automated report.
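The release-notes step, for instance, is mostly grouping the commit log. A hedged sketch assuming conventional-commit prefixes – the actual format and fetching of commits (e.g. via git log between tags) are left out:

```typescript
// Group commit subjects into release-note sections by conventional
// prefix (feat/fix). Everything else lands in "Other".
function releaseNotes(version: string, commits: string[]): string {
  const sections: Record<string, string[]> = { Features: [], Fixes: [], Other: [] };
  for (const subject of commits) {
    if (subject.startsWith("feat")) sections.Features.push(subject);
    else if (subject.startsWith("fix")) sections.Fixes.push(subject);
    else sections.Other.push(subject);
  }
  const body = Object.entries(sections)
    .filter(([, items]) => items.length > 0)
    .map(([title, items]) => `## ${title}\n${items.map((s) => `- ${s}`).join("\n")}`)
    .join("\n\n");
  return `# ${version}\n\n${body}`;
}
```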

Review workflows briefly included Claude Code Review on pull requests, added in January 2026 and removed in February 2026. The experiment lasted about a month. AI code review was helpful for catching obvious issues but generated enough noise on large PRs that the signal-to-noise ratio didn’t justify the workflow run time.

Kubernetes on GKE

Google Kubernetes Engine orchestrates all containers. Each product has its own namespace. Each environment has its own cluster (with staging and dev sharing a smaller cluster to save costs).

Resource requests and limits required iterative tuning – early commit messages reference repeated "kuber limits/requests" adjustments as we learned actual consumption under load. Too low and pods get OOM-killed. Too high and you’re paying for idle capacity.

Rolling deployments are the default. Kubernetes creates new pods, waits for health checks, then terminates old pods. If new pods fail, the deployment stops and old pods keep running. Zero downtime on success. Automatic rollback on failure.
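The arithmetic behind the pod overlap is worth spelling out: during a rollout, Kubernetes runs up to maxSurge extra pods on top of the desired replica count (the default is 25%, with percentages rounded up). A small sketch of that peak:

```typescript
// Peak pod count during a rolling update: desired replicas plus
// maxSurge extra pods. Kubernetes rounds maxSurge percentages up,
// which is why even small deployments briefly need extra capacity.
function peakPods(replicas: number, maxSurgePercent = 25): number {
  const surge = Math.ceil((replicas * maxSurgePercent) / 100);
  return replicas + surge;
}
```

A 4-replica service briefly runs 5 pods; an 8-replica one runs 10. Summed across every service deploying that day, this is the extra capacity that triggers the node scaling described below.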

The release velocity equation

Two releases per day sounds fast. For a team our size, it’s sustainable only because the pipeline does the heavy lifting.

A typical release cycle: push to feature branch (CI runs in 3-5 minutes), merge to develop (staging deploys automatically in 5-7 minutes), validate staging, create a version tag (production deploys in 5-7 minutes), Russia deploys separately from the same tag.

Total time from merge to global deployment: 15-20 minutes. A bug fix committed in the morning is live before lunch. If the release has a problem, ship another one. v2026.2.83 becomes v2026.2.84 within the hour.

What breaks

The pipeline isn’t perfect. Here’s what fails and how often:

Docker build cache misses. When a base image updates or a lockfile changes, the dependency layer rebuilds. Happens 2-3 times per week, adds 5-10 minutes.

GKE node scaling. High-deployment days sometimes require cluster scale-up to accommodate overlapping pods during rolling deployments. Node provisioning takes 2-3 minutes.

Russia latency. Docker image pushes to the Russian container registry take longer. Deployments that finish in 5 minutes globally take 8-10 for Russia.

Secret rotation. Changing an API key means updating Pulumi config across four stacks. Manual process, infrequent enough that automation isn’t worth the risk.

The lesson of two per day

The deploy pipeline didn’t start at 18 workflows and three environments. It started with a single GitHub Actions file that built a Docker image and pushed it to a registry. Every addition – staging, production-Russia, Pulumi, automated model updates, eval reports – was a response to a specific problem.

We added staging because a broken production deploy cost us four hours of debugging. We added Russia because a customer needed data residency. We added Pulumi because managing Kubernetes manifests by hand didn’t scale past two environments. We automated Replicate updates because a developer was spending two hours a week on a task a script could do in seconds.

177 version tags in 88 days isn’t a flex. It’s evidence that the pipeline is working. When shipping is cheap, you ship small. When you ship small, each release is easy to understand, easy to test, and easy to roll back. The cost of deployment infrastructure is high. The cost of slow deployment is higher.

Two releases per day. Every day. Because the pipeline makes it boring.

Alexey Suvorov

CTO, AIWAYZ

10+ years in software engineering. CTO at Bewize and Fulldive. Master's in IT Security from ITMO University. Builds AI systems that run 100+ microservices with small teams.
