Async-First Remote Teams: Leading 35 Engineers Across Time Zones

Synchronous meetings don’t scale across six countries and four time zones. Someone is always on a call at 2am or missing context from the 9am standup they couldn’t attend.

Async-first communication solves this. Write decisions down. Document context. Use tools like Clarity for status visibility. Make meetings the exception, not the default. Result: 97% retention over 4 years at Digital Turbine leading 35 remote engineers across 6 countries.

Write Everything Down

Meetings generate decisions. Conversations surface insights. Neither persists unless written.

Document decisions immediately:

# Decision: Use PostgreSQL for Feature Store

**Date**: 2024-12-15
**Participants**: @alice, @bob, @carol
**Context**: Need persistent storage for ML features

**Decision**: PostgreSQL 14+ with jsonb columns

**Rationale**:
- Team has 5 years of PostgreSQL experience
- Handles 10K writes/sec (current: 2K)
- JSON support for flexible schemas
- Lower cost than DynamoDB at our scale

**Consequences**:
- Need to manage backups ourselves
- Requires database expertise on team
- Cannot use serverless patterns easily

This document answers “why PostgreSQL?” for the next three years. No need to ask the person who made the decision. No need to schedule a meeting to explain it.

Async Standups with Clarity

Daily standups via Clarity instead of synchronous meetings:

  1. Nightly: AI synthesizes yesterday’s activity (commits, PRs, Jira updates) into a status report
  2. Morning: Each engineer reviews their pre-generated report, adds context
  3. Anytime: Management reads all reports, asks questions async via text
  4. Zero meetings: Everyone gets status visibility without scheduling

# Daily Status - Alice - Dec 23, 2025

## Completed
- Merged PR #234: Rate limiter implementation
- Reviewed Bob's authentication refactor
- Fixed production bug: memory leak in worker

## In Progress
- API redesign for model serving (60% complete)
- Performance testing on GPU cluster

## Blocked
- Waiting on DevOps for Kubernetes upgrade

## Notes
- Pair programmed with Charlie on testing strategy
- Updated API documentation

Engineers write once. Stakeholders read anytime. No meetings. No interruptions.
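
To make the nightly drafting step concrete, here is a minimal sketch that groups activity events into the report sections shown above. It is not how Clarity works internally: it swaps the AI synthesis for a plain rule-based grouping, and the `draft_report` function, the status labels, and the event tuples are all hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from an activity status to a report section.
SECTION_FOR_STATUS = {
    "merged": "Completed",
    "reviewed": "Completed",
    "open": "In Progress",
    "blocked": "Blocked",
}
SECTION_ORDER = ["Completed", "In Progress", "Blocked"]


def draft_report(engineer, date, events):
    """Build a Markdown status draft from (status, summary) pairs."""
    grouped = defaultdict(list)
    for status, summary in events:
        grouped[SECTION_FOR_STATUS[status]].append(summary)

    lines = [f"# Daily Status - {engineer} - {date}", ""]
    for section in SECTION_ORDER:
        lines.append(f"## {section}")
        # Empty sections stay visible so the engineer can fill them in.
        items = grouped[section] or ["(none)"]
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    # The Notes section is left for the engineer to write by hand.
    lines += ["## Notes", "(engineer adds context here)"]
    return "\n".join(lines)


report = draft_report(
    "Alice",
    "Dec 23, 2025",
    [
        ("merged", "Merged PR #234: Rate limiter implementation"),
        ("blocked", "Waiting on DevOps for Kubernetes upgrade"),
    ],
)
print(report)
```

The draft is only a starting point. The engineer still edits it in the morning, which is where the context that a tool cannot infer gets added.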

Code Review Across Time Zones

Traditional code review: tag someone, wait for them to wake up, wait for review, respond to comments, wait for approval. 24+ hour cycles.

Async code review: write detailed PR descriptions, self-review first, provide context.

PR Template:

## What Changed
Brief summary of the change.

## Why
Link to issue/ticket. Explain the problem being solved.

## How
Key technical decisions. Non-obvious approaches explained.

## Testing
What tests were added/modified. How to verify manually.

## Risks
What could go wrong. What to watch in production.

## Screenshots
For UI changes, before/after screenshots.

## Deployment Notes
Migration scripts, config changes, rollback procedure.

Reviewers have full context. Can review thoroughly without asking questions. Faster approval cycles.
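
A template only helps if it gets filled in. As a rough sketch, a check like the one below could run before a PR is sent for review and list the sections that are missing or empty. The `missing_sections` function is hypothetical, and treating Screenshots and Deployment Notes as optional is an assumption, not part of the template above.

```python
import re

# Assumption: these sections are mandatory; the others are optional.
REQUIRED_SECTIONS = ["What Changed", "Why", "How", "Testing", "Risks"]


def missing_sections(description, required=REQUIRED_SECTIONS):
    """Return required sections that are absent or have no body text."""
    sections = {}
    current = None
    for line in description.splitlines():
        # A level-2 Markdown heading such as "## Why" starts a section.
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return [
        name
        for name in required
        if not "".join(sections.get(name, [])).strip()
    ]


draft = """## What Changed
Add rate limiter to the public API.

## Why

## Testing
Unit tests for burst and steady-state traffic.
"""

print(missing_sections(draft))  # ['Why', 'How', 'Risks']
```

The check catches empty headings as well as missing ones, because a heading with nothing under it gives the reviewer no more context than no heading at all.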

Decision Making Without Meetings

Use RFC (Request for Comments) process:

  1. Author writes proposal in Markdown with context, options, recommendation
  2. Post in shared channel (Slack, Discord, email thread)
  3. Set deadline (e.g., “comments by Friday EOD”)
  4. Team comments async whenever convenient
  5. Author synthesizes feedback, makes final decision
  6. Document decision with rationale

No meeting scheduled. No calendars coordinated. Decision made with full team input.

Example RFC:

# RFC-042: Adopt Playwright for E2E Testing

## Problem
Current E2E tests use Selenium. Flaky. Slow. Hard to debug.

## Proposal
Migrate to Playwright for all E2E tests.

## Options Considered
1. Fix Selenium tests (est: 2 weeks)
2. Migrate to Playwright (est: 1 week)
3. Migrate to Cypress (est: 1.5 weeks)

## Recommendation
Option 2: Playwright

**Pros:**
- Faster than Selenium
- Better debugging tools
- Built-in retry logic
- Works with our MCP integration

**Cons:**
- Learning curve for team
- Need to rewrite existing tests

## Timeline
Week 1: Migrate critical user flows
Week 2: Migrate remaining tests
Week 3: Remove Selenium

## Feedback Requested By
December 20, 2025 EOD

## Comments
(Team adds comments here)

Time Zone Coordination

Core hours: Everyone overlaps 2-4 hours per day. Use for:

  • Critical incidents
  • Architecture discussions that need real-time debate
  • Team bonding

Everything else: Async.

Calendar transparency:

# Team Availability (all times UTC)

Alice (US East): 13:00-21:00
Bob (Europe): 08:00-16:00
Carol (Asia): 23:00-07:00

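Overlap (Alice and Bob): 13:00-16:00 UTC (3 hours)
Carol: no overlap with Alice or Bob as listed; a shared window requires someone to shift hours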

Schedule critical discussions during overlap. Everything else happens async.
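
Overlap is easy to get wrong by eye, especially when a working day wraps past midnight UTC. The sketch below computes it from the availability listed above. The helper names (`working_minutes`, `overlap_hours`) are hypothetical, and the approach is a simple minute-by-minute set intersection rather than anything optimized.

```python
from itertools import combinations

MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm):
    """Convert "HH:MM" to minutes since 00:00 UTC."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def working_minutes(start, end):
    """Set of UTC minutes-of-day in a window; handles wrap past midnight."""
    s, e = to_minutes(start), to_minutes(end)
    length = (e - s) % MINUTES_PER_DAY
    return {(s + i) % MINUTES_PER_DAY for i in range(length)}


def overlap_hours(*windows):
    """Hours per day shared by every (start, end) window given."""
    common = set.intersection(*(working_minutes(s, e) for s, e in windows))
    return len(common) / 60


team = {
    "Alice": ("13:00", "21:00"),
    "Bob": ("08:00", "16:00"),
    "Carol": ("23:00", "07:00"),
}

for a, b in combinations(team, 2):
    print(f"{a} / {b}: {overlap_hours(team[a], team[b])} hours")
print(f"All three: {overlap_hours(*team.values())} hours")
```

With the hours as listed, this reports 3 hours shared by Alice and Bob and none involving Carol. Running a check like this when someone joins or changes hours shows whether a real core window exists or whether someone has to shift their day to create one.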

Building Trust Remotely

Visibility through transparency:

Clarity shows what everyone is working on. No need to ask. No need to report in meetings. Trust builds when work is visible.

Written communication shows thought:

Meetings hide who’s contributing. Text makes it obvious. Thoughtful comments, detailed PRs, helpful documentation - all visible. Trust follows contribution.

Bias for action:

Don’t wait for approval. Ship. Document decisions. If someone disagrees, they can comment async. Better to move fast and adjust than wait for consensus.

Onboarding Remote Engineers

Documentation over meetings:

New hire reads:

  • Architecture decisions (why things are the way they are)
  • Team workflows (how we work)
  • Code review standards (what we care about)
  • Production runbooks (how things break and how we fix them)

Can read anytime. Can reference later. Scales to any number of new hires.

Buddy system:

Pair new hire with experienced engineer. Not for meetings. For async questions.

New hire: "Why do we use PostgreSQL instead of DynamoDB here?"
Buddy: "See ADR-023. TLDR: cost at our scale. Questions on the details?"

Measuring Remote Team Health

Response time metrics:

  • PR review: Target <24 hours
  • Slack questions: Target <4 hours during work hours
  • RFC feedback: Target 3 days for proposals

Track these. Intervene when patterns emerge.
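
One way to track these targets is to compare each request's first-response time against its target and list the ones that ran over. The sketch below does that for PR reviews and RFC feedback. The event tuples, IDs, and the `breaches` function are hypothetical, and the Slack target is left out because "during work hours" needs a business-hours calculation that this example does not attempt.

```python
from datetime import datetime, timedelta

# Targets taken from the list above.
TARGETS = {
    "pr_review": timedelta(hours=24),
    "rfc_feedback": timedelta(days=3),
}


def breaches(events, targets=TARGETS):
    """Return (kind, item_id, elapsed) for responses slower than target.

    Each event is (kind, item_id, requested_at, first_response_at).
    """
    late = []
    for kind, item_id, requested_at, responded_at in events:
        elapsed = responded_at - requested_at
        if elapsed > targets[kind]:
            late.append((kind, item_id, elapsed))
    return late


# Illustrative data only, not real measurements.
sample = [
    ("pr_review", "PR-1", datetime(2025, 12, 1, 9), datetime(2025, 12, 1, 20)),
    ("pr_review", "PR-2", datetime(2025, 12, 1, 9), datetime(2025, 12, 2, 15)),
    ("rfc_feedback", "RFC-A", datetime(2025, 12, 1, 9), datetime(2025, 12, 3, 9)),
]

for kind, item_id, elapsed in breaches(sample):
    print(f"{kind} {item_id}: waited {elapsed}")
```

A single late review is noise. The same reviewer or the same time-zone pair showing up repeatedly is the pattern worth acting on.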

Retention as proxy:

97% retention over 4 years suggests async-first is working. People leave when communication is frustrating. They stay when it’s effective.

Written output as indicator:

Teams that write well communicate well. Track:

  • Decision documents created
  • PR description quality
  • Confluence pages updated

More writing = better communication = healthier team.

Tools That Enable Async

Clarity: Team status visibility without meetings

GitHub/GitLab: Code review, issues, PRs - all async

Confluence/Notion: Shared knowledge base

Twist: Async-first communication by Doist. Threaded by default. Designed to reduce real-time pressure. Alternative: Slack/Discord configured for async use.

Loom: Video for complex explanations

Miro/Figma: Async collaboration on designs

All text-first or recording-based. No synchronous requirement.

When to Meet Synchronously

Architecture debates: Real-time discussion for complex tradeoffs

Incidents: Production down, need immediate coordination

Team bonding: Social connection matters, schedule it explicitly

Onboarding: First week, some real-time for relationship building

Conflict resolution: Text escalates conflict, voice de-escalates

Everything else: async.

High Retention Through Async-First

Autonomy: Engineers make decisions without waiting for approval

Visibility: Clarity shows everyone’s work without status reports

Documentation: Written communication creates shared context

Time zone respect: No meetings outside core overlap hours

Bias for action: Ship first, adjust based on feedback

Trust through transparency: Work is visible, trust follows

Async-first isn’t about avoiding meetings. It’s about respecting people’s time and enabling global collaboration.