Serverless SMS Delivery Verification: Webhook Patterns for Sinch and Lambda
Current Situation Analysis
In serverless communication architectures, a common anti-pattern is treating the HTTP 200 OK response from an SMS provider as proof of delivery. When you invoke a messages:send endpoint, the provider acknowledges that it received your request, not that the message reached the recipient's device. This creates a "black box" scenario where critical notifications (MFA codes, password resets, fraud alerts) may silently fail due to carrier filtering, disconnected numbers, or network outages.
This gap is frequently overlooked because initial development focuses on the happy path. However, in production, the inability to verify delivery leads to:
- Operational Blindness: Support teams cannot distinguish between "user didn't receive" and "user ignored."
- Compliance Risks: Financial and healthcare sectors often require audit trails proving a message reached the intended party.
- Inefficient Retries: Without status feedback, systems cannot automatically retry failed messages or fall back to alternative channels like email.
Data from the Sinch Conversation API indicates that a single outbound message generates multiple state transitions. Relying solely on the send acknowledgment ignores the DELIVERED, FAILED, and READ states that occur asynchronously. Implementing a webhook-based receipt system transforms SMS from a fire-and-forget mechanism into a verifiable communication channel.
WOW Moment: Key Findings
The shift from send-only to receipt-driven architectures introduces measurable improvements in reliability and operational control. The following comparison highlights the impact of implementing delivery receipts.
| Strategy | Delivery Visibility | Failure Recovery | Compliance Audit | Invocation Overhead |
|---|---|---|---|---|
| Send-Only | None | Manual investigation required | No proof of receipt | Low (1 invocation per message) |
| Receipt-Driven | Full state tracking | Automated retry/fallback | Yes (status logs) | Moderate (2–3 invocations per message) |
Why this matters: Invocations rise from 1 to 2 to 3 per message (an increase of 100 to 200%) because Sinch emits a callback for each state change, typically QUEUED_ON_CHANNEL followed by DELIVERED or FAILED. However, this cost is negligible compared to the business impact of undelivered transactional messages. The receipt-driven approach enables idempotent state updates, automated alerting on failure codes, and precise correlation of messages to business entities.
Core Solution
The architecture leverages AWS Lambda Function URLs for direct HTTPS ingress, eliminating the need for API Gateway overhead. The solution prioritizes security via HMAC validation and reliability via asynchronous processing.
Architecture Decisions
- Lambda Function URL: Provides a native HTTPS endpoint for the webhook. This reduces latency and cost compared to API Gateway while maintaining serverless scalability.
- HMAC-SHA256 Validation: Sinch signs every webhook payload. The Lambda must verify the signature to prevent spoofing. The signature covers the body, a nonce, and a timestamp to prevent replay attacks.
- SQS Decoupling: Webhook handlers must return 200 OK rapidly. Sinch expects low latency and retries on 5xx or 429 responses. Heavy processing (database writes, external API calls) should be offloaded to an Amazon SQS queue to ensure the callback is acknowledged immediately.
- SSM Parameter Store: The webhook secret is stored as a SecureString in AWS Systems Manager Parameter Store. The Lambda caches the secret in the execution environment to minimize SSM API calls.
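A valid signature alone does not stop an attacker from replaying a captured request; the receiver also has to bound how old the signed timestamp may be. The sketch below shows one way to do that. It assumes the timestamp header carries Unix epoch seconds and uses a 5-minute tolerance; both are assumptions to confirm against the Sinch documentation, and `isFreshTimestamp` is a hypothetical helper, not part of any SDK.

```typescript
// Hypothetical helper, not part of the Sinch SDK.
// Assumes the timestamp header carries Unix epoch seconds.
const DEFAULT_TOLERANCE_SECONDS = 300; // assumed 5-minute window

function isFreshTimestamp(
  timestampHeader: string,
  nowMs: number = Date.now(),
  toleranceSeconds: number = DEFAULT_TOLERANCE_SECONDS
): boolean {
  // Reject anything that is not a plain non-negative integer string.
  if (!/^\d+$/.test(timestampHeader)) return false;
  const sentAtSeconds = Number(timestampHeader);
  const nowSeconds = Math.floor(nowMs / 1000);
  // Allow skew in both directions so small clock drift does not cause rejections.
  return Math.abs(nowSeconds - sentAtSeconds) <= toleranceSeconds;
}
```

A handler would call this after the HMAC check passes and reject stale requests with a 4xx response. Tracking nonces that have already been seen would tighten this further, but that needs a shared store because Lambda execution environments do not share memory.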
Implementation
The following TypeScript implementation demonstrates a robust handler with signature verification, payload parsing, and SQS offloading.
import { createHmac, timingSafeEqual } from 'crypto';
import { SSMClient, GetParameterCommand } from '@aws-sdk/client-ssm';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';
// Lambda Function URLs deliver events in the API Gateway payload format 2.0
import { APIGatewayProxyEventV2, APIGatewayProxyResultV2 } from 'aws-lambda';
// Interfaces for type safety
interface SinchHeaders {
  'x-sinch-webhook-signature': string;
  'x-sinch-webhook-signature-nonce': string;
  'x-sinch-webhook-signature-timestamp': string;
}
interface DeliveryReport {
  message_id: string;
  status: 'QUEUED_ON_CHANNEL' | 'DELIVERED' | 'FAILED' | 'READ';
  reason?: {
    code: string;
    description: string;
    sub_code?: string;
    channel_code?: string;
  };
  channel_identity?: {
    channel: string;
    identity: string;
  };
}
interface WebhookPayload {
  message_delivery_report: DeliveryReport;
  app_id: string;
  project_id: string;
  event_time: string;
}
// Singleton clients for reuse across invocations
const ssmClient = new SSMClient({});
const sqsClient = new SQSClient({});
// Cached per execution environment; a rotated secret is picked up on the next cold start
let cachedSecret: string | null = null;
async function retrieveWebhookSecret(paramName: string): Promise<string> {
  if (cachedSecret) return cachedSecret;
  const response = await ssmClient.send(
    new GetParameterCommand({
      Name: paramName,
      WithDecryption: true,
    })
  );
  if (!response.Parameter?.Value) {
    throw new Error('Webhook secret not found in SSM');
  }
  cachedSecret = response.Parameter.Value;
  return cachedSecret;
}
function verifyIntegrity(
  body: string,
  signature: string,
  nonce: string,
  timestamp: string,
  secret: string
): boolean {
  const payload = `${body}.${nonce}.${timestamp}`;
  const expectedSignature = createHmac('sha256', secret)
    .update(payload)
    .digest('base64');
  // Compare byte lengths, not string lengths: timingSafeEqual throws on unequal buffers
  const received = Buffer.from(signature);
  const expected = Buffer.from(expectedSignature);
  // Timing-safe comparison to prevent timing attacks
  return received.length === expected.length && timingSafeEqual(received, expected);
}
export const handler = async (
  event: APIGatewayProxyEventV2
): Promise<APIGatewayProxyResultV2> => {
  try {
    // Payload format 2.0 lowercases header names, so lowercase keys are safe here
    const headers = event.headers as unknown as Partial<SinchHeaders>;
    // Verify against the body as sent; decode first if Lambda base64-encoded it
    const body = event.isBase64Encoded
      ? Buffer.from(event.body ?? '', 'base64').toString('utf8')
      : event.body ?? '';
    // 1. Validate required headers
    const signature = headers['x-sinch-webhook-signature'];
    const nonce = headers['x-sinch-webhook-signature-nonce'];
    const timestamp = headers['x-sinch-webhook-signature-timestamp'];
    if (!signature || !nonce || !timestamp) {
      return { statusCode: 400, body: 'Missing signature headers' };
    }
    // 2. Retrieve secret and verify HMAC
    const secretParam = process.env.WEBHOOK_SECRET_PARAM!;
    const secret = await retrieveWebhookSecret(secretParam);
    const isValid = verifyIntegrity(body, signature, nonce, timestamp, secret);
    if (!isValid) {
      console.error('HMAC verification failed');
      return { statusCode: 401, body: 'Invalid signature' };
    }
    // 3. Parse and validate payload (a JSON syntax error falls through to the catch block)
    const payload: WebhookPayload = JSON.parse(body);
    if (!payload.message_delivery_report) {
      return { statusCode: 400, body: 'Invalid payload structure' };
    }
    // 4. Offload to SQS for async processing
    const queueUrl = process.env.DELIVERY_QUEUE_URL!;
    await sqsClient.send(
      new SendMessageCommand({
        QueueUrl: queueUrl,
        MessageBody: JSON.stringify(payload),
        // Only meaningful for FIFO ordering; the SAM template in this article creates a
        // standard queue. A FIFO queue also needs a deduplication ID or content-based dedup.
        MessageGroupId: payload.message_delivery_report.message_id,
      })
    );
    // 5. Acknowledge immediately
    return { statusCode: 200, body: 'OK' };
  } catch (error) {
    console.error('Webhook processing error:', error);
    // Return 500 to trigger Sinch retry on transient errors
    return { statusCode: 500, body: 'Internal error' };
  }
};
Rationale:
- Timing-Safe Comparison: timingSafeEqual prevents timing side-channel attacks during signature verification.
- SQS Integration: Decoupling ensures the Lambda returns 200 within milliseconds, satisfying Sinch's latency requirements and preventing unnecessary retries.
- Error Handling: Returning 500 on internal errors triggers Sinch's exponential backoff retry mechanism. Returning 400 or 401 indicates a permanent failure, stopping retries.
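One consequence of the error-handling rule is that a body which can never parse should not end up in the catch-all that returns 500, or Sinch will keep retrying it. A minimal sketch of a non-throwing parser follows; `parseDeliveryReport`, `ParseResult`, and `MinimalReport` are hypothetical names introduced for illustration, and the sketch checks only the two fields it needs.

```typescript
// Minimal shape needed by this sketch; the full interfaces live in the handler.
interface MinimalReport {
  message_id: string;
  status: string;
}

type ParseResult =
  | { ok: true; report: MinimalReport }
  | { ok: false; reason: string };

// Hypothetical helper: parse without throwing so the caller can pick a status code.
function parseDeliveryReport(body: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return { ok: false, reason: 'body is not valid JSON' };
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { ok: false, reason: 'body is not a JSON object' };
  }
  const report = (parsed as Record<string, unknown>).message_delivery_report;
  if (typeof report !== 'object' || report === null) {
    return { ok: false, reason: 'message_delivery_report is missing' };
  }
  const { message_id, status } = report as Record<string, unknown>;
  if (typeof message_id !== 'string' || typeof status !== 'string') {
    return { ok: false, reason: 'message_id or status is missing' };
  }
  return { ok: true, report: { message_id, status } };
}
```

With this in place, the handler can log `reason` and return a non-retryable response for malformed input, reserving 500 for failures that a retry might actually fix, such as an SSM or SQS outage.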
Pitfall Guide
- Treating the First Callback as Final
  - Explanation: Sinch sends multiple callbacks per message. The first is often QUEUED_ON_CHANNEL. Assuming this is the final status leads to incorrect reporting.
  - Fix: Implement a state machine or update logic that tracks the latest status. Only treat DELIVERED or FAILED as terminal states.
- Blocking the Callback Handler
  - Explanation: Performing database writes or external API calls inside the Lambda handler increases latency. If the handler times out or responds slowly, Sinch may retry the callback, causing duplicates.
  - Fix: Always return 200 OK immediately after validation. Push processing to SQS or EventBridge.
- Assuming Sequential Callback Order
  - Explanation: Network conditions can cause callbacks to arrive out of order. A DELIVERED event might arrive before QUEUED_ON_CHANNEL.
  - Fix: Design idempotent updaters that rely on the message_id and event_time. Do not assume callbacks arrive in chronological order.
- Missing Deduplication Logic
  - Explanation: Sinch may deliver duplicate callbacks due to network glitches or retry mechanisms.
  - Fix: Use message_id combined with status as a deduplication key. Check if the status has already been processed before updating downstream systems.
- Misinterpreting 4xx Responses
  - Explanation: Sinch retries on 5xx and 429 but treats 4xx (except 429) as permanent failures. If your handler returns 400 due to a parsing error, Sinch stops retrying, and you lose the receipt.
  - Fix: Ensure robust parsing. If a payload is malformed but the signature is valid, log the error and return 200 to acknowledge receipt, or return 500 to trigger a retry if the error is transient.
- Hardcoding Secrets
  - Explanation: Embedding the webhook secret in code or environment variables without encryption risks exposure.
  - Fix: Store secrets in AWS SSM Parameter Store or Secrets Manager. Use SecureString types and IAM roles to restrict access.
- Ignoring Failure Reason Codes
  - Explanation: A FAILED status includes a reason object with codes like RECIPIENT_NOT_REACHABLE or CHANNEL_FAILURE. Ignoring these prevents targeted error handling.
  - Fix: Parse the reason.code to implement specific logic. For example, RECIPIENT_INVALID_CHANNEL_IDENTITY might indicate a typo in the user's phone number, triggering a data correction workflow.
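Several of these pitfalls (first callback treated as final, out-of-order arrival, duplicates) come down to how a stored status is updated. The following is a minimal in-memory sketch of such an updater, keyed on message_id plus status and ordered by event_time as the fixes suggest. `applyEvent`, `dedupKey`, and `MessageState` are hypothetical names, and the sketch assumes event_time is an ISO 8601 string that `Date.parse` can read.

```typescript
type Status = 'QUEUED_ON_CHANNEL' | 'DELIVERED' | 'FAILED' | 'READ';

interface StatusEvent {
  message_id: string;
  status: Status;
  event_time: string; // assumed ISO 8601, e.g. 2024-01-01T00:00:05Z
}

interface MessageState {
  status: Status;
  event_time: string;
  seenKeys: string[]; // dedup keys of every event already applied
}

// Deduplication key suggested in the guide: message_id combined with status.
function dedupKey(event: StatusEvent): string {
  return `${event.message_id}#${event.status}`;
}

// Returns the next state; returns the same object when the event is a duplicate.
function applyEvent(current: MessageState | undefined, event: StatusEvent): MessageState {
  const key = dedupKey(event);
  if (!current) {
    return { status: event.status, event_time: event.event_time, seenKeys: [key] };
  }
  if (current.seenKeys.includes(key)) return current; // duplicate callback
  const seenKeys = [...current.seenKeys, key];
  const isOlder = Date.parse(event.event_time) < Date.parse(current.event_time);
  // A late QUEUED_ON_CHANNEL must never overwrite a status that already moved on.
  if (isOlder || event.status === 'QUEUED_ON_CHANNEL') {
    return { ...current, seenKeys };
  }
  return { status: event.status, event_time: event.event_time, seenKeys };
}
```

In a real deployment the same rules would more likely be enforced by the datastore, for example with a conditional write, so that concurrent processors cannot race each other; the in-memory version only illustrates the decision logic.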
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|---|
| High Volume (>10k msgs/day) | SQS Decoupling + FIFO Queue | Prevents Lambda throttling, ensures ordering, handles bursts | SQS pricing + Lambda invocations |
| Low Volume (<1k msgs/day) | Inline Processing | Simpler architecture, lower latency, easier debugging | Lambda invocations only |
| Strict Compliance Requirements | DynamoDB with Audit Table | Immutable logs, point-in-time recovery, query flexibility | DynamoDB storage + write capacity |
| Complex Routing Logic | EventBridge + Multiple Targets | Decouples routing from ingestion, supports fan-out patterns | EventBridge rules + target costs |
| Advanced Security Needs | API Gateway + WAF | IP filtering, rate limiting, request validation | API Gateway + WAF costs |
Configuration Template
AWS SAM template snippet for deploying the webhook infrastructure.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Parameters:
  WebhookSecretParam:
    Type: String
    Description: SSM Parameter name for the webhook secret
    Default: /sinch/webhook-secret
Resources:
  DeliveryReceiptFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: webhook.handler
      Runtime: nodejs20.x
      Environment:
        Variables:
          WEBHOOK_SECRET_PARAM: !Ref WebhookSecretParam
          DELIVERY_QUEUE_URL: !Ref DeliveryQueue
      Policies:
        # SAM builds the resource ARN as .../parameter/<ParameterName>; a name with a
        # leading slash may not match, so check the generated policy after deploying.
        - SSMParameterReadPolicy:
            ParameterName: !Ref WebhookSecretParam
        - SQSSendMessagePolicy:
            QueueName: !GetAtt DeliveryQueue.QueueName
      FunctionUrlConfig:
        AuthType: NONE
        Cors:
          AllowOrigins:
            - "*"
  DeliveryQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 30
      MessageRetentionPeriod: 1209600 # 14 days
  DeliveryProcessor:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: processor.handler
      Runtime: nodejs20.x
      Events:
        QueueEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt DeliveryQueue.Arn
            BatchSize: 10
Outputs:
  FunctionUrl:
    Description: "Lambda Function URL for Sinch Webhook"
    Value: !GetAtt DeliveryReceiptFunctionUrl.FunctionUrl
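The template wires the DeliveryProcessor function to processor.handler, which is not shown here. The sketch below is one possible shape for it, under stated assumptions: the event types are local stand-ins for the SQS event that Lambda passes in, the mapping from reason code to follow-up action is invented for illustration, and logging stands in for a database write or notification.

```typescript
// Local stand-ins for the parts of the Lambda SQS event this sketch reads.
interface QueueRecord {
  messageId: string;
  body: string;
}
interface QueueEvent {
  Records: QueueRecord[];
}

type FollowUp = 'none' | 'fix_contact_data' | 'fallback_channel';

interface PlannedAction {
  message_id: string;
  status: string;
  action: FollowUp;
}

// Assumed mapping: only the reason code singled out in this article gets special handling.
function followUpFor(status: string, reasonCode?: string): FollowUp {
  if (status !== 'FAILED') return 'none';
  if (reasonCode === 'RECIPIENT_INVALID_CHANNEL_IDENTITY') return 'fix_contact_data';
  return 'fallback_channel';
}

// Pure step: turn a batch of queued webhook payloads into follow-up decisions.
function planFollowUps(event: QueueEvent): PlannedAction[] {
  return event.Records.map((record) => {
    const report = JSON.parse(record.body).message_delivery_report;
    return {
      message_id: report.message_id,
      status: report.status,
      action: followUpFor(report.status, report.reason?.code),
    };
  });
}

// Entry point referenced as processor.handler in the template.
export const handler = async (event: QueueEvent): Promise<void> => {
  for (const planned of planFollowUps(event)) {
    // Placeholder side effect: swap in a database write or notifier call.
    console.log(JSON.stringify(planned));
  }
};
```

If the handler throws, Lambda leaves the whole batch on the queue and the messages become visible again after the visibility timeout, so the processing step should be idempotent.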
Quick Start Guide
- Deploy Infrastructure: Run sam build and sam deploy to provision the Lambda, SQS queue, and Function URL. Note the output URL.
- Store Secret: Use the AWS CLI to create the SSM parameter:
  aws ssm put-parameter \
    --name /sinch/webhook-secret \
    --value "your-generated-secret" \
    --type SecureString
- Register Webhook: In the Sinch Dashboard, navigate to Conversation API > Apps > Webhooks. Add a new webhook with the Function URL, paste the secret, and select MESSAGE_DELIVERY as the trigger.
- Send Test Message: Trigger an SMS via your application or the Sinch dashboard.
- Verify Receipts: Check CloudWatch Logs for the Lambda function to confirm 200 responses. Inspect the SQS queue or downstream database to verify the delivery status was processed correctly.
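Before relying on real Sinch traffic, it can help to send the deployed endpoint a request you signed yourself. The sketch below builds the three signature headers using the same `${body}.${nonce}.${timestamp}` layout as verifyIntegrity in this article. `buildSignedHeaders` is a hypothetical helper, and the epoch-seconds timestamp format is an assumption.

```typescript
import { createHmac, randomUUID } from 'node:crypto';

// Hypothetical test helper: signs a body the way verifyIntegrity expects.
function buildSignedHeaders(
  body: string,
  secret: string,
  nowMs: number = Date.now(),
  nonce: string = randomUUID()
): Record<string, string> {
  // Assumed format: Unix epoch seconds as a decimal string.
  const timestamp = String(Math.floor(nowMs / 1000));
  const signature = createHmac('sha256', secret)
    .update(`${body}.${nonce}.${timestamp}`)
    .digest('base64');
  return {
    'content-type': 'application/json',
    'x-sinch-webhook-signature': signature,
    'x-sinch-webhook-signature-nonce': nonce,
    'x-sinch-webhook-signature-timestamp': timestamp,
  };
}

// Usage sketch (not executed here): POST a sample report to the Function URL.
// const body = JSON.stringify({ message_delivery_report: { message_id: 'test-1', status: 'DELIVERED' } });
// await fetch(functionUrl, { method: 'POST', headers: buildSignedHeaders(body, secret), body });
```

A 200 response confirms the secret and signing layout match; a 401 with the same body usually points to a secret mismatch between SSM and the caller.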