Sync vs. Async: How Services Talk to Each Other
Every feature you build in a distributed system requires services to communicate. An API server calls a database. A checkout service notifies an email service. An AI agent calls an inference endpoint. The architectural choice you make — synchronous or asynchronous — determines how tightly coupled those services are, how they behave under failure, and how well they scale.
This section covers the two fundamental communication models and the three protocols you'll encounter in practice:
- Synchronous: REST and gRPC — the caller sends a request and waits for a response
- Asynchronous: Message Queues — the caller publishes a message and continues immediately; a separate consumer processes it later
In practice, most production systems use both — synchronous where an immediate response is required, and asynchronous where it is not. The key skill is knowing which to reach for.
The Core Distinction
Synchronous communication means the caller blocks — it waits for the response before doing anything else. The two services must both be available at the same time. If the downstream service is slow, the caller is slow. If it crashes, the caller gets an error.
Asynchronous communication means the caller publishes a message to a queue or broker, then immediately continues with its next task. The message sits in the queue until a consumer picks it up and processes it — possibly milliseconds later, possibly minutes later. The caller and consumer never need to be available simultaneously.
A common source of confusion: using `async/await` in your code does not make your architecture asynchronous. When you write `await fetch(url)`, you are writing non-blocking code (your thread is released while waiting), but the system is still synchronous: the HTTP request is still sent, and your function still waits for the response before continuing. From the perspective of the overall system, the caller is blocked until the downstream service responds. True asynchronous architecture requires a message broker to fully decouple the sender from the receiver: the sender publishes a message and returns immediately, with no knowledge of when or whether the receiver has processed it.
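The difference can be sketched in Python, where `await` behaves like the JavaScript `await` mentioned above. This is an illustration under stated assumptions, not a production pattern: `downstream`, `handle_with_await`, `handle_with_queue`, and `consumer` are hypothetical names, `asyncio.sleep` stands in for a slow HTTP call, and an in-process `asyncio.Queue` stands in for a real message broker.

```python
import asyncio


async def downstream(order_id: str) -> str:
    """Hypothetical slow service; the sleep stands in for network latency."""
    await asyncio.sleep(0.05)
    return f"email sent for {order_id}"


async def handle_with_await(order_id: str, log: list) -> str:
    # Non-blocking for the event loop, yet this handler cannot return
    # until the downstream call has completed: still synchronous.
    log.append(await downstream(order_id))
    return "order confirmed"


async def handle_with_queue(order_id: str, queue: asyncio.Queue) -> str:
    # Publish and return. Nothing here waits for the consumer.
    queue.put_nowait(order_id)
    return "order confirmed"


async def consumer(queue: asyncio.Queue, log: list) -> None:
    # Runs independently and processes messages at its own pace.
    while True:
        order_id = await queue.get()
        log.append(await downstream(order_id))
        queue.task_done()


async def demo() -> tuple:
    awaited_log, queued_log = [], []

    await handle_with_await("A1", awaited_log)
    done_when_await_returned = len(awaited_log)

    queue = asyncio.Queue()
    worker = asyncio.create_task(consumer(queue, queued_log))
    await handle_with_queue("A2", queue)
    done_when_publish_returned = len(queued_log)

    await queue.join()  # only so the demo can inspect the final state
    worker.cancel()
    return done_when_await_returned, done_when_publish_returned, queued_log
```

Running `asyncio.run(demo())` shows that the awaited handler returns only after the downstream work is finished, while the publishing handler returns before the consumer has processed anything.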
Synchronous Communication
In synchronous communication, the caller and the downstream service are temporally coupled — they must both be online at the same time, and the response time of the downstream service directly affects the caller's response time. This makes synchronous communication simple to reason about but fragile under failure.
REST: The Universal Standard
REST (Representational State Transfer) is the most widely used communication style in web systems. It uses HTTP methods (GET, POST, PUT, DELETE) to operate on resources identified by URLs, typically exchanging JSON.
REST is stateless: every request carries all the information needed to process it. The server holds no session state between requests.
REST: Request-Response Over HTTP
REST is the default for any public-facing API and for internal services where simplicity matters more than raw performance. Its text-based JSON payloads are human-readable and easy to debug, but larger and slower than binary alternatives.
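As a rough sketch of what "HTTP method plus resource URL plus JSON" looks like from the caller's side, the snippet below builds two requests with Python's standard library. The base URL, the `/orders` paths, the token, and the `build_request` helper are all hypothetical; the requests are constructed but never sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # hypothetical API root


def build_request(
    method: str, path: str, token: str, body: Optional[dict] = None
) -> urllib.request.Request:
    """Build a self-contained REST request: verb + resource URL + JSON body.

    The token travels with every request because the server keeps no
    session state between calls.
    """
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# GET reads a resource; POST creates one. Both carry their own credentials.
read_order = build_request("GET", "/orders/42", token="demo-token")
create_order = build_request(
    "POST", "/orders", token="demo-token", body={"sku": "A-1", "qty": 2}
)
# urllib.request.urlopen(read_order) would send the request; it is omitted
# here because api.example.com is only a placeholder.
```

Because each request carries its own credentials and payload, any server instance can handle it without consulting earlier requests, which is what statelessness means in practice.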
gRPC: High-Performance Internal Communication
gRPC is a Remote Procedure Call (RPC) framework built by Google, designed for efficient service-to-service communication. Instead of JSON over HTTP/1.1, it uses Protocol Buffers (a compact binary format) over HTTP/2 (which multiplexes multiple requests over a single connection).
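Benchmarks commonly report roughly 3–10× better throughput and lower latency than REST with JSON for the same workload, with the gap widening at high concurrency and with larger payloads. The actual gain depends on payload shape, network conditions, and implementation, and it comes at the cost of human-readability and more complex tooling.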
gRPC: Binary RPC Over HTTP/2
gRPC compiles your API contract from a .proto schema file into type-safe client and server code for each supported language. HTTP/2 multiplexing lets many in-flight requests share one TCP connection, avoiding the extra connections (or the queuing behind earlier requests) that HTTP/1.1 needs for concurrent calls.
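To see why Protocol Buffers payloads are smaller than JSON, the sketch below hand-encodes a tiny message using the protobuf wire format (varints and field tags). It is for illustration only: real gRPC code uses classes generated from the .proto file rather than a hand-written encoder, and the `Order` message and helper names here are hypothetical.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def int_field(field_number: int, value: int) -> bytes:
    # Wire type 0 = varint. The tag packs the field number and wire type.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 = length-delimited: tag, byte length, then UTF-8 bytes.
    raw = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(raw)) + raw


# Hypothetical schema: message Order { int32 id = 1; string sku = 2; }
binary_payload = int_field(1, 150) + string_field(2, "A-1")
json_payload = json.dumps({"id": 150, "sku": "A-1"}).encode("utf-8")
```

The binary form omits field names and punctuation entirely (only field numbers travel on the wire), so here it comes to 8 bytes against 25 for the JSON text. The trade-off is that the bytes are unreadable without the schema.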
Asynchronous Communication: Message Queues
Asynchronous communication removes the temporal coupling entirely. The producer (the service that generates work) publishes a message to a broker and immediately continues. The consumer (the service that does the work) reads from the broker and processes the message independently — on its own schedule, at its own pace.
The broker holds the message durably until it is processed. If the consumer is offline, messages accumulate and are delivered when it comes back. If traffic spikes, messages buffer in the queue and consumers drain the backlog gradually — instead of crashing under load.
Message Queue: Decoupled Producer and Consumer
A message queue decouples when work is submitted from when it is executed. The producer is never blocked waiting for processing to complete. The consumer scales independently, processes messages at its own rate, and can rely on the broker's retry and dead-letter handling if processing fails (most brokers offer both, though they usually need to be configured).
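The retry and dead-letter behavior can be sketched with an in-memory queue. This is a simplified model, assuming a fixed attempt limit and immediate redelivery; a real broker tracks delivery attempts itself and typically adds delays between retries. The `drain` function and the message names are hypothetical.

```python
from collections import deque
from typing import Callable


def drain(messages: deque, handler: Callable[[str], None], max_attempts: int = 3):
    """Process queued messages, retrying failures and dead-lettering the rest.

    Each queue entry is (message, attempts_so_far). The deque is only a
    stand-in for a broker.
    """
    processed, dead_letters = [], []
    while messages:
        message, attempts = messages.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception:
            attempts += 1
            if attempts < max_attempts:
                messages.append((message, attempts))  # redeliver later
            else:
                dead_letters.append(message)  # park it for inspection
    return processed, dead_letters


# Producer side: publishing is just an append, so it never waits on the handler.
backlog = deque((msg, 0) for msg in ["welcome:ann", "corrupt", "welcome:bob"])

calls = []


def send_email(message: str) -> None:
    calls.append(message)
    if message == "corrupt":
        raise ValueError("cannot parse message")


processed, dead_letters = drain(backlog, send_email)
```

The unprocessable message is attempted three times and then set aside, while the healthy messages around it are still delivered. One bad message does not block the rest of the queue.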
Queue vs. Event Stream: Two Flavors of Async
Not all message brokers work the same way. There are two distinct models:
| | Traditional Queue (RabbitMQ, AWS SQS) | Event Stream (Apache Kafka, AWS Kinesis) |
|---|---|---|
| Model | Point-to-point: each message is delivered to one consumer and deleted after acknowledgment | Pub/Sub log: messages are appended to a persistent log and multiple consumer groups can read the same message independently |
| Retention | Messages are removed once consumed | Messages are retained for a configurable period (hours, days, forever) regardless of consumption |
| Use case | Task queues: send one email, process one payment, run one job | Event-driven architectures: audit logs, real-time analytics, event sourcing, feeding multiple independent consumers |
| Ordering | Standard queues (SQS Standard, RabbitMQ with multiple consumers) offer no ordering guarantee. FIFO queues (SQS FIFO, RabbitMQ with a single consumer) provide strict ordering at lower throughput; in SQS FIFO, ordering applies within each message group. | Strict ordering within a partition (a shard of the topic); no ordering guarantee across partitions |
| Replay | Not supported — once consumed, the message is gone | Supported — consumers can re-read from any point in the log, enabling historical replay and backfill |
| Complexity | Lower — simpler to set up and operate | Higher — Kafka requires more infrastructure and operational expertise |
Rule of thumb: Use a traditional queue (SQS, RabbitMQ) when you need to distribute tasks to workers — one message, one processor, then deleted. Use an event stream (Kafka) when multiple independent systems need to react to the same event, or when you need a persistent, replayable log of what happened. Note that Kafka can also distribute work across workers via consumer groups, but its operational overhead is significantly higher — favor a traditional queue for simple task distribution unless you specifically need event streaming capabilities. Event streams are covered in depth in the next section.
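The two models can be contrasted with a toy in-memory version of each. This sketch deliberately leaves out acknowledgments, partitions, and retention limits; `TaskQueue`, `EventLog`, and their methods are hypothetical names, not the API of RabbitMQ, SQS, or Kafka.

```python
from collections import deque


class TaskQueue:
    """Point-to-point: a message goes to one consumer, then it is gone."""

    def __init__(self) -> None:
        self._messages = deque()

    def publish(self, message: str) -> None:
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Append-only log: each consumer group tracks its own read offset."""

    def __init__(self) -> None:
        self._events = []
        self._offsets = {}

    def publish(self, event: str) -> None:
        self._events.append(event)

    def poll(self, group: str) -> list:
        start = self._offsets.get(group, 0)
        self._offsets[group] = len(self._events)
        return self._events[start:]

    def seek(self, group: str, offset: int) -> None:
        self._offsets[group] = offset  # rewind to replay history


tasks = TaskQueue()
log = EventLog()
for name in ["order-1", "order-2"]:
    tasks.publish(name)
    log.publish(name)

# Queue: two workers split the messages; nothing remains afterwards.
worker_a = tasks.receive()
worker_b = tasks.receive()
nothing_left = tasks.receive()

# Log: both groups see every event, and a group can rewind and re-read.
email_view = log.poll("email")
analytics_view = log.poll("analytics")
log.seek("email", 0)
email_replay = log.poll("email")
```

In the queue, each message is handed to exactly one worker. In the log, the events stay put and only the per-group offsets move, which is what makes independent consumers and replay possible.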
Cascading Failure: The Risk of Synchronous Chains
The most dangerous failure mode in synchronous systems is the cascading failure: a slow downstream service makes every upstream service slow — even services that have nothing wrong with them.
Synchronous Chain: How One Slow Service Brings Down Everything
In a synchronous call chain, each service waits for the next. If Service C slows down, Service B's threads fill up waiting. Then Service A's threads fill up waiting on B. The entire chain becomes unresponsive — even though Services A and B have no problem of their own.
Asynchronous communication is the structural remedy for this risk: move anything that does not need to complete before the user sees a response into a message queue. The checkout service no longer holds a thread open waiting for email and analytics — it publishes a message and returns immediately, regardless of what those downstream services are doing.
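Some calls have to stay synchronous. For those, the usual guard is a timeout on every call plus a circuit breaker that fails fast once a dependency keeps failing, so that waiting threads do not pile up. The following is a minimal sketch, assuming a simple consecutive-failure policy: `CircuitBreaker`, `CircuitOpenError`, and the parameter names are hypothetical rather than any library's API, and the wrapped function is expected to enforce its own timeout.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self.clock = clock              # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of tying up a thread on a sick service.
                raise CircuitOpenError("dependency unavailable, failing fast")
            # Otherwise half-open: let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller wraps each downstream call, for example `breaker.call(fetch_inventory, sku)` where `fetch_inventory` is a hypothetical client function, and treats `CircuitOpenError` as a signal to return a fallback or an error right away.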
Decision Framework: Sync vs. Async
| Scenario | Recommended Approach | Why |
|---|---|---|
| User submits a search query and waits for results | Synchronous (REST or gRPC) | The result is needed before anything else can happen — there is nothing to defer |
| User completes checkout; confirmation email must be sent | Asynchronous (Queue) | The email does not need to complete before the user sees the 'Order confirmed' page |
| Internal service calls another at 50,000 req/s | Synchronous (gRPC) | Low latency, high throughput, same datacenter — the overhead of a broker adds latency without benefit |
| Checkout event triggers email + inventory + fraud check + analytics | Asynchronous (Queue/Fan-out) | Multiple independent consumers need the same event; none of them should block checkout |
| AI batch inference for 1M records overnight | Asynchronous (Queue) | Processing time is long; no user is waiting; consumers can be throttled to stay within API rate limits |
| Public API called by third-party developers | Synchronous (REST) | Third parties expect standard HTTP request-response; introducing a queue requires them to poll for results |
| Real-time dashboard updated from many data sources | Asynchronous (Event Stream / Kafka) | Events flow continuously; multiple dashboard components consume the same stream independently |
| Payment processing (must complete before order is confirmed) | Synchronous (REST or gRPC) | Payment status is required immediately — the order cannot be confirmed until payment is known |
What AI Agents Get Wrong
AI Agents and Communication Pattern Defaults
AI coding agents tend to default to synchronous REST for every service-to-service interaction, regardless of whether the operation needs to be synchronous. This creates unnecessarily slow critical paths and brittle systems that fail when any downstream service becomes slow.
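One way to counter that default is to make the split explicit in the handler itself. The sketch below keeps payment on the synchronous path and publishes everything else. `place_order`, the topic names, and the injected `charge` and `publish` callables are hypothetical stand-ins for a payment client and a broker client.

```python
from typing import Callable


def place_order(
    order_id: str,
    amount_cents: int,
    charge: Callable[[str, int], bool],
    publish: Callable[[str, dict], None],
) -> dict:
    """Keep payment on the critical path; defer everything else."""
    # Synchronous: the order cannot be confirmed until payment is known.
    if not charge(order_id, amount_cents):
        return {"order_id": order_id, "status": "payment_failed"}

    # Asynchronous: publish and move on. Consumers pick these up later.
    for topic in ("email.confirmation", "inventory.reserve", "analytics.order"):
        publish(topic, {"order_id": order_id})
    return {"order_id": order_id, "status": "confirmed"}


published = []
result = place_order(
    "o-1",
    4999,
    charge=lambda order_id, cents: True,  # stand-in for a payment call
    publish=lambda topic, payload: published.append((topic, payload)),
)
```

Passing the payment and publish functions in as parameters also makes the boundary visible in code review: anything called directly is on the critical path, and anything published is not.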
Protocol Comparison at a Glance
| | REST | gRPC | Message Queue |
|---|---|---|---|
| Communication style | Synchronous | Synchronous | Asynchronous |
| Protocol | HTTP/1.1 or HTTP/2 | HTTP/2 | AMQP, proprietary |
| Payload format | JSON (text) | Protocol Buffers (binary) | JSON, binary, or custom |
| Typical latency (illustrative; varies with network and payload) | 50–200ms | 5–20ms | Milliseconds to minutes (by design) |
| Coupling | Temporal (caller waits) | Temporal (caller waits) | No temporal coupling (producer and consumer need not be online together) |
| Failure handling | Caller receives error immediately | Caller receives error immediately | Broker retries; DLQ for unprocessable messages |
| Best for | Public APIs, CRUD, simple request-response | High-throughput internal calls, streaming | Background jobs, fan-out, long-running tasks |
| Browser support | Full | Requires gRPC-Web proxy | Not applicable |
| Debugging | Easy — curl, Postman, browser DevTools | Needs grpcurl / Postman gRPC | Needs broker dashboard (RabbitMQ UI, AWS Console) |
Summary
| Concept | Key Takeaway |
|---|---|
| Synchronous | Caller blocks and waits for a response. Simple to reason about but creates temporal coupling — a slow downstream service makes every caller slow. |
| Asynchronous | Caller publishes a message and immediately continues. Decoupled but eventually consistent — the work may happen milliseconds or minutes later. |
| REST | HTTP + JSON. Universal and simple. Best for public APIs, CRUD, and any case where simplicity beats raw performance. |
| gRPC | HTTP/2 + Protocol Buffers. Often benchmarked at roughly 3–10× the throughput of REST with JSON, depending on workload. Best for high-throughput internal service-to-service calls where performance matters more than simplicity. |
| Message Queue | Broker-mediated. Best for background work, fan-out to multiple consumers, rate-limited processing, and anything that does not need to complete before the user sees a response. |
| Cascading failure | The key risk of synchronous chains — a slow downstream service blocks every upstream service. Add timeouts and circuit breakers to every synchronous dependency. |
| Critical path rule | Ask: 'Does the user need this result before they see a response?' If yes → synchronous. If no → asynchronous. Most systems need both. |
| AI agent default | AI coding agents tend to generate synchronous REST for everything. Always specify which operations are critical path (synchronous) and which should be moved to a queue (asynchronous). |
A practical starting rule: use REST by default, then move an operation to a message queue when you can answer yes to any of these questions:
- Does it take more than a few hundred milliseconds to complete?
- Does it need to fan out to multiple independent services?
- Does it depend on an external system with rate limits?
- Can it fail without affecting what the user sees right now?
If yes to any of these, it belongs in a queue, not on the synchronous critical path.
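The starting rule can be written down as a small function. This is only a sketch of the checklist, with one assumption made explicit: the critical path question ("does the user need this result before they see a response?") is checked first and overrides the others. The `Operation` fields and `choose_style` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    user_needs_result: bool  # critical path rule, checked first (assumption)
    takes_over_few_hundred_ms: bool = False
    fans_out: bool = False
    hits_rate_limited_system: bool = False
    failure_invisible_to_user: bool = False


def choose_style(op: Operation) -> str:
    if op.user_needs_result:
        return "synchronous"
    if any(
        [
            op.takes_over_few_hundred_ms,
            op.fans_out,
            op.hits_rate_limited_system,
            op.failure_invisible_to_user,
        ]
    ):
        return "asynchronous"
    return "synchronous"  # REST by default


search = Operation(user_needs_result=True)
confirmation_email = Operation(user_needs_result=False, failure_invisible_to_user=True)
payment = Operation(
    user_needs_result=True,
    takes_over_few_hundred_ms=True,
    hits_rate_limited_system=True,
)
```

Payment shows why the ordering matters: it is slow and depends on a rate-limited external system, yet it stays synchronous because the order cannot be confirmed without its result.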