Why Kairon runs a separate gRPC authorization service
Architecting Fail-Closed Authorization for Autonomous Trading Systems
Current Situation Analysis
The transition from human-driven interfaces to autonomous agent execution fundamentally changes how authorization must be engineered. In traditional web applications, authentication and capability checks live inside middleware layers or route guards. When a check fails, the user sees a 403 error, refreshes, or logs in again. The failure is contained, idempotent, and financially neutral.
Autonomous trading agents operate under entirely different constraints. They execute continuously, interact with live market data, and deploy real capital without human intervention. When authorization logic is coupled directly to the execution runtime, failure modes shift from UX friction to financial incidents. A stale session cache might permit an expired strategy to trade. A race condition between concurrent agent workers could bypass quota limits. An unhandled exception in a middleware chain might inadvertently default to allow. In a UI context, these are bugs. In an agent runtime, they are compliance violations and capital losses.
The industry routinely underestimates this distinction because authorization is treated as a cross-cutting concern rather than a critical control plane. Teams assume that adding a capability flag to a procedure or wrapping a route in a guard is sufficient. This assumption breaks down when execution paths are non-deterministic, highly concurrent, and financially irreversible. The core oversight is treating authorization as a fast-path optimization instead of a deterministic gate.
In production agent deployments, inline auth checks tend to produce unpredictable failure semantics. Network partitions, memory pressure on the API server, and cache invalidation delays create windows where unauthorized executions slip through. The solution requires decoupling authorization from execution, enforcing explicit failure semantics, and maintaining an immutable audit trail. The latency cost of a separate authorization service is negligible compared to the financial and regulatory risk of silent authorization bypasses.
WOW Moment: Key Findings
Decoupling authorization into an isolated service fundamentally changes failure semantics, auditability, and operational resilience. The following comparison illustrates why inline checks fail under autonomous workloads while isolated gateways succeed.
| Approach | Failure Mode | Audit Granularity | Latency Impact |
|---|---|---|---|
| Inline Middleware | Default-allow on exception | Request-level logs | <0.5ms |
| Isolated gRPC Gateway | Fail-closed on timeout | Execution-level records | ~2.0ms |
The isolated gateway trades approximately 1.5ms of additional latency for deterministic failure behavior. In trading systems, a 2ms authorization round-trip is operationally irrelevant compared to market execution latency, which typically ranges from 50ms to 500ms depending on exchange topology. What this trade enables is a single, auditable decision point that cannot be bypassed by runtime exceptions, cache staleness, or memory pressure. Every authorization attempt becomes a structured record rather than an unstructured log line, enabling compliance replay, incident forensics, and automated policy enforcement.
Core Solution
Building a fail-closed authorization gateway requires strict separation of concerns, explicit failure semantics, and deterministic audit logging. The implementation below demonstrates a production-ready TypeScript gRPC authorization service designed for autonomous agent execution.
Step 1: Define the gRPC Contract
The service contract must enforce explicit capability and quota validation before permitting execution. Using Protocol Buffers ensures strict typing, backward compatibility, and low serialization overhead.
syntax = "proto3";

package authgate.v1;

service ExecutionGate {
  rpc VerifyRun(ExecutionRequest) returns (ExecutionResponse);
  rpc CheckPermissions(PermissionRequest) returns (PermissionResponse);
  rpc ValidateLimits(LimitRequest) returns (LimitResponse);
}

message ExecutionRequest {
  string agent_id = 1;
  string tenant_id = 2;
  string strategy_hash = 3;
  int64 requested_at = 4;
}

message ExecutionResponse {
  bool authorized = 1;
  string reason = 2;
  string audit_id = 3;
}

message PermissionRequest {
  string agent_id = 1;
  string tenant_id = 2;
  repeated string required_capabilities = 3;
}

message PermissionResponse {
  bool granted = 1;
  string missing_capability = 2;
}

message LimitRequest {
  string tenant_id = 1;
  string limit_type = 2;
  int32 requested_amount = 3;
}

message LimitResponse {
  bool within_limit = 1;
  int32 remaining = 2;
}
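The listings in this article load the contract dynamically and cast to any, so the compiler cannot catch a misspelled field. As a minimal sketch, the messages can be mirrored by hand in TypeScript and checked at runtime before any policy logic runs. The interface names, the isExecutionRequest guard, and the deny helper are illustrative assumptions, not generated code, and the snake_case field names assume the loader is configured to preserve them.

```typescript
// Hand-written mirrors of the proto messages (illustrative, not generated code).
// Field names assume the proto loader preserves snake_case.
interface ExecutionRequest {
  agent_id: string;
  tenant_id: string;
  strategy_hash: string;
  requested_at: number; // int64 in the proto; a JS number is safe for millisecond epochs
}

interface ExecutionResponse {
  authorized: boolean;
  reason: string;
  audit_id: string;
}

// Runtime guard: rejects payloads with missing or mistyped fields.
function isExecutionRequest(value: unknown): value is ExecutionRequest {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.agent_id === 'string' && v.agent_id.length > 0 &&
    typeof v.tenant_id === 'string' && v.tenant_id.length > 0 &&
    typeof v.strategy_hash === 'string' && v.strategy_hash.length > 0 &&
    typeof v.requested_at === 'number' && Number.isFinite(v.requested_at)
  );
}

// Builds an explicit denial so every rejection path has the same shape.
function deny(reason: string, auditId = ''): ExecutionResponse {
  return { authorized: false, reason, audit_id: auditId };
}
```

A handler could call isExecutionRequest first and return deny('malformed_request') for anything that fails the guard, which keeps malformed input on an explicit rejection path.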
Step 2: Implement the Fail-Closed Server
The server must reject requests when dependencies are unavailable, enforce atomic quota checks, and write audit records before responding.
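import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';
import { v4 as uuidv4 } from 'uuid';
import { createPool, Pool } from 'mysql2/promise';

const PROTO_PATH = './authgate.proto';
// keepCase preserves snake_case field names (the loader camel-cases them by default);
// defaults fills in false/empty fields; longs: Number keeps int64 timestamps as numbers.
const packageDefinition = protoLoader.loadSync(PROTO_PATH, {
  keepCase: true,
  longs: Number,
  defaults: true,
});
const grpcObject = grpc.loadPackageDefinition(packageDefinition);
const { ExecutionGate } = (grpcObject as any).authgate.v1;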
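const dbPool: Pool = createPool({
  host: process.env.DB_HOST || 'localhost',
  user: process.env.DB_USER || 'auth_user',
  password: process.env.DB_PASS || '',
  database: process.env.DB_NAME || 'auth_audit',
  waitForConnections: true,
  connectionLimit: 10,
  queueLimit: 0,
});

// Returns true only if the agent holds every capability in the list.
async function checkPermissions(req: any): Promise<boolean> {
  // Deduplicate so a repeated capability cannot break the count comparison.
  const required: string[] = Array.from(new Set<string>(req.required_capabilities ?? []));
  // An empty list would produce invalid SQL (IN ()) and proves nothing, so deny.
  if (required.length === 0) return false;
  const [rows] = await dbPool.query(
    'SELECT COUNT(DISTINCT capability) AS granted FROM agent_capabilities WHERE agent_id = ? AND tenant_id = ? AND capability IN (?)',
    [req.agent_id, req.tenant_id, required]
  );
  return Number((rows as any[])[0].granted) === required.length;
}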
// Checks and decrements the tenant quota inside one transaction.
async function validateLimits(req: any): Promise<{ withinLimit: boolean; remaining: number }> {
  const connection = await dbPool.getConnection();
  try {
    await connection.beginTransaction();
    // FOR UPDATE locks the row so concurrent runs for the same tenant are serialized.
    const [rows] = await connection.query(
      'SELECT remaining FROM tenant_limits WHERE tenant_id = ? AND limit_type = ? FOR UPDATE',
      [req.tenant_id, req.limit_type]
    );
    // A missing row is treated as zero remaining quota.
    const current = (rows as any[])[0]?.remaining ?? 0;
    if (current < req.requested_amount) {
      await connection.rollback();
      return { withinLimit: false, remaining: current };
    }
    await connection.query(
      'UPDATE tenant_limits SET remaining = remaining - ? WHERE tenant_id = ? AND limit_type = ?',
      [req.requested_amount, req.tenant_id, req.limit_type]
    );
    await connection.commit();
    return { withinLimit: true, remaining: current - req.requested_amount };
  } catch (err) {
    await connection.rollback();
    throw err;
  } finally {
    connection.release();
  }
}

// Persists one structured record per authorization decision and returns its ID.
async function writeAuditRecord(req: any, authorized: boolean, reason: string): Promise<string> {
  const auditId = uuidv4();
  await dbPool.query(
    'INSERT INTO execution_audit (audit_id, agent_id, tenant_id, strategy_hash, authorized, reason, recorded_at) VALUES (?, ?, ?, ?, ?, ?, NOW())',
    [auditId, req.agent_id, req.tenant_id, req.strategy_hash, authorized, reason]
  );
  return auditId;
}
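const server = new grpc.Server();
// addService expects the service definition, exposed as .service on the loaded constructor.
server.addService(ExecutionGate.service, {
  verifyRun: async (call: any, callback: any) => {
    const req = call.request;
    // ExecutionRequest carries no capability or quota fields, so this sketch assumes one
    // required capability and one run-count unit per execution. Both values are placeholders.
    const permissionReq = { ...req, required_capabilities: ['strategy:execute'] };
    const limitReq = { tenant_id: req.tenant_id, limit_type: 'runs', requested_amount: 1 };
    try {
      const hasPermissions = await checkPermissions(permissionReq);
      if (!hasPermissions) {
        const auditId = await writeAuditRecord(req, false, 'missing_capabilities');
        return callback(null, { authorized: false, reason: 'missing_capabilities', audit_id: auditId });
      }
      const limitCheck = await validateLimits(limitReq);
      if (!limitCheck.withinLimit) {
        const auditId = await writeAuditRecord(req, false, 'quota_exceeded');
        return callback(null, { authorized: false, reason: 'quota_exceeded', audit_id: auditId });
      }
      const auditId = await writeAuditRecord(req, true, 'authorized');
      callback(null, { authorized: true, reason: 'success', audit_id: auditId });
    } catch (err) {
      // Fail closed. The audit write can fail too (for example, during a database outage),
      // so guard it: the caller must always receive an explicit denial.
      let auditId = '';
      try {
        auditId = await writeAuditRecord(req, false, 'service_unavailable');
      } catch (auditErr) {
        console.error('Audit write failed during denial:', auditErr);
      }
      callback(null, { authorized: false, reason: 'service_unavailable', audit_id: auditId });
    }
  },
});

server.bindAsync('0.0.0.0:50052', grpc.ServerCredentials.createInsecure(), (err, port) => {
  if (err) {
    console.error('Failed to start ExecutionGate:', err);
    process.exit(1);
  }
  console.log(`ExecutionGate listening on :${port}`);
});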
Step 3: Client-Side Integration with Explicit Failure Handling
Agent runtimes must treat authorization as a hard dependency. The client wrapper enforces timeouts, circuit breaking, and explicit rejection on failure.
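import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';
// opossum exposes the CircuitBreaker class as its default export (requires esModuleInterop).
import CircuitBreaker from 'opossum';

const PROTO_PATH = './authgate.proto';
// keepCase preserves the snake_case field names used in the request and response objects below.
const packageDefinition = protoLoader.loadSync(PROTO_PATH, {
  keepCase: true,
  longs: Number,
  defaults: true,
});
const grpcObject = grpc.loadPackageDefinition(packageDefinition);
const { ExecutionGate } = (grpcObject as any).authgate.v1;

const client = new ExecutionGate(
  'localhost:50052',
  grpc.credentials.createInsecure()
);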
// The action rejects on transport errors and deadline expiry; those failures count toward opening the breaker.
const authBreaker = new CircuitBreaker(
  (req: any) => new Promise((resolve, reject) => {
    // gRPC deadline: the call is cancelled if no response arrives within 500 ms.
    client.verifyRun(req, { deadline: Date.now() + 500 }, (err: any, res: any) => {
      if (err) return reject(err);
      resolve(res);
    });
  }),
  { timeout: 500, errorThresholdPercentage: 50, resetTimeout: 10000 }
);

export async function authorizeAgentRun(agentId: string, tenantId: string, strategyHash: string) {
  const request = { agent_id: agentId, tenant_id: tenantId, strategy_hash: strategyHash, requested_at: Date.now() };
  let result: any;
  try {
    result = await authBreaker.fire(request);
  } catch (err: any) {
    // Transport errors, deadline expiry, and an open breaker all land here: fail closed.
    throw new Error(`Authorization gateway unreachable or failed: ${err.message}`);
  }
  // A policy denial is a valid response, so report it separately from a gateway failure.
  if (!result || result.authorized !== true) {
    throw new Error(`Authorization rejected: ${result?.reason} [${result?.audit_id}]`);
  }
  return { authorized: true, auditId: result.audit_id };
}
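To make the hard dependency concrete, here is a minimal sketch of how an agent runtime might gate a trade on the authorization result. The runWithAuthorization helper, the Authorizer type, and the execute callback are hypothetical names introduced for illustration; the authorizer is injected so the wrapper can be exercised without a running gateway.

```typescript
// Shape of an authorizer such as authorizeAgentRun: resolves on approval, rejects otherwise.
type Authorizer = (
  agentId: string,
  tenantId: string,
  strategyHash: string,
) => Promise<{ authorized: boolean; auditId: string }>;

interface GatedResult<T> {
  auditId: string;
  output: T;
}

// Runs `execute` only after `authorize` resolves with an approval.
// Any rejection propagates to the caller and the trade never starts.
async function runWithAuthorization<T>(
  authorize: Authorizer,
  run: { agentId: string; tenantId: string; strategyHash: string },
  execute: (auditId: string) => Promise<T>,
): Promise<GatedResult<T>> {
  const decision = await authorize(run.agentId, run.tenantId, run.strategyHash);
  // Defensive re-check: only a literal true counts as approval.
  if (decision.authorized !== true) {
    throw new Error('Authorization rejected');
  }
  const output = await execute(decision.auditId);
  return { auditId: decision.auditId, output };
}
```

Passing the audit ID into execute lets the trade record reference the authorization decision that permitted it, which is one way to support the replay and forensics use cases described earlier.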
Architecture Decisions and Rationale
Why gRPC over REST? Protocol Buffers define the contract in a schema, which removes the ambiguity of free-form JSON payloads. With generated stubs the compiler checks field names and types; the dynamic loading used in the listings above validates at runtime only. Binary serialization can reduce payload size by 60-80% compared to JSON, depending on message shape, which matters when authorization calls happen thousands of times per minute. gRPC also supports bidirectional streaming, enabling future quota streaming or real-time policy updates without protocol changes.
Why fail-closed semantics? The catch block in the server explicitly returns authorized: false with reason: service_unavailable. This prevents silent pass-through during network partitions or database outages. Autonomous systems must assume denial when control plane state is uncertain.
Why separate process isolation? Running authorization on a dedicated port isolates it from API server memory pressure, garbage collection pauses, and request queue saturation. The audit table has exactly one writer with one responsibility, eliminating lock contention with business logic queries. Horizontal scaling becomes predictable because auth throughput depends only on capability lookups and quota decrements, not trade execution complexity.
Why atomic quota transactions? The FOR UPDATE lock ensures concurrent agent runs cannot bypass limits through race conditions. Rolling back on failure prevents partial state corruption. This is critical when multiple workers process the same tenant simultaneously.
Pitfall Guide
1. Silent Fallbacks (Default-Allow)
Explanation: Catch blocks that return true or skip authorization when the service is unreachable. This creates windows where unauthorized trades execute during outages.
Fix: Always return authorized: false on exception. Implement explicit rejection paths and never assume success when control plane state is uncertain.
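The fix can be sketched as a small wrapper that turns every exception and every malformed result into an explicit denial. The failClosed name and the Decision shape are assumptions made for this example, not part of the service above.

```typescript
interface Decision {
  authorized: boolean;
  reason: string;
}

// Wraps any async policy check so that thrown errors and malformed results
// become explicit denials instead of falling through to an allow.
async function failClosed(check: () => Promise<Decision>): Promise<Decision> {
  try {
    const decision = await check();
    // Only a literal true counts as approval; undefined or partial objects are denied.
    if (decision && decision.authorized === true) return decision;
    return { authorized: false, reason: decision?.reason || 'denied' };
  } catch {
    // Uncertain control plane state is treated as a denial.
    return { authorized: false, reason: 'service_unavailable' };
  }
}
```

The design choice is that approval requires positive evidence: there is no code path in which an error, a timeout surfaced as an exception, or an empty response yields authorized: true.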
2. Race Conditions in Quota Validation
Explanation: Reading quota limits without locking allows concurrent requests to pass validation before decrementing, effectively bypassing limits.
Fix: Use database-level row locking (FOR UPDATE) or distributed locks (Redis SETNX) to serialize quota checks. Validate and decrement in a single atomic transaction.
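The effect of serializing check and decrement can be illustrated without a database. The sketch below is an in-memory stand-in, assuming a hypothetical QuotaLedger class; it uses a per-tenant promise chain as the lock, which plays the role that FOR UPDATE plays in the server code. It is not a substitute for a database or distributed lock across processes.

```typescript
// In-memory stand-in for a quota table (hypothetical; single process only).
class QuotaLedger {
  private remaining = new Map<string, number>();
  private tails = new Map<string, Promise<unknown>>();

  constructor(initial: Record<string, number>) {
    for (const tenant of Object.keys(initial)) {
      this.remaining.set(tenant, initial[tenant]);
    }
  }

  // Serializes all operations for one tenant by chaining them onto that tenant's tail promise.
  private withLock<T>(tenant: string, fn: () => Promise<T>): Promise<T> {
    const prev: Promise<unknown> = this.tails.get(tenant) ?? Promise.resolve();
    const next = prev.then(fn);
    // Store a tail that never rejects so one failure does not poison later callers.
    this.tails.set(tenant, next.catch(() => undefined));
    return next;
  }

  // Check and decrement happen inside the same critical section.
  reserve(tenant: string, amount: number): Promise<boolean> {
    return this.withLock(tenant, async () => {
      const current = this.remaining.get(tenant) ?? 0;
      // Simulated I/O gap between read and write, where an unlocked version would race.
      await new Promise((resolve) => setTimeout(resolve, 1));
      if (current < amount) return false;
      this.remaining.set(tenant, current - amount);
      return true;
    });
  }

  balance(tenant: string): number {
    return this.remaining.get(tenant) ?? 0;
  }
}
```

With a quota of 3 and ten concurrent reserve calls for the same tenant, only three are granted. Removing withLock from reserve would let every caller read the same starting balance during the simulated gap, which is the bypass this pitfall describes.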
3. Coupling Auth to Business Logic
Explanation: Embedding capability checks inside trade execution functions creates tight coupling. Business logic becomes responsible for authorization state, making testing and auditing difficult.
Fix: Enforce strict separation. The execution layer only receives a boolean or structured response from the auth gateway. All policy evaluation lives exclusively in the control plane.
4. Inadequate Timeout Configuration
Explanation: Waiting indefinitely for authorization responses blocks agent workers and causes cascading failures during auth service degradation.
Fix: Set hard client-side timeouts (e.g., 500ms). Pair with circuit breakers that open after repeated failures, failing fast until the service recovers.
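For calls that do not accept a deadline natively, a hard timeout can be approximated with Promise.race. The withDeadline helper below is an illustrative sketch, not part of any library used in this article.

```typescript
// Rejects with a timeout error if `work` has not settled within `ms` milliseconds.
function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`deadline of ${ms}ms exceeded`)), ms);
  });
  // Clear the timer once either side settles so it does not keep the event loop alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Note that this only stops the caller from waiting; the underlying work keeps running. A gRPC deadline, as used in the client above, is preferable where available because it also cancels the call itself.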
5. Non-Idempotent Audit Writes
Explanation: Retrying authorization requests without idempotency keys creates duplicate audit records, corrupting compliance reports and skewing quota metrics.
Fix: Generate deterministic audit IDs based on request hashes (agent ID + tenant ID + strategy hash + timestamp). Use INSERT IGNORE or upsert logic to prevent duplicates.
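One way to derive such an ID is to hash a canonical encoding of the request fields. The sketch below assumes Node's built-in crypto module; the AuditKey shape and the deterministicAuditId name are illustrative, and the server listing earlier in this article uses random UUIDs instead.

```typescript
import { createHash } from 'node:crypto';

interface AuditKey {
  agentId: string;
  tenantId: string;
  strategyHash: string;
  requestedAt: number; // client-supplied timestamp; must be reused unchanged on retry
}

// Derives a stable identifier so a retried request maps to the same audit row.
function deterministicAuditId(key: AuditKey): string {
  // JSON array encoding keeps field boundaries unambiguous:
  // ("ab", "c") and ("a", "bc") serialize differently.
  const canonical = JSON.stringify([key.agentId, key.tenantId, key.strategyHash, key.requestedAt]);
  return createHash('sha256').update(canonical).digest('hex');
}
```

This only deduplicates if the retry carries the original timestamp. A client that stamps requested_at with a fresh Date.now() on every attempt would produce a new ID each time, so the timestamp needs to be fixed once per logical run and the audit table needs a unique key on the ID for INSERT IGNORE to take effect.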
6. Ignoring Clock Skew in Token Validation
Explanation: Agent runtimes and auth servers operating on unsynchronized clocks cause valid tokens to appear expired or not-yet-valid.
Fix: Enforce NTP synchronization across all nodes. Add a 5-second tolerance window for exp and nbf claims. Prefer stateless JWT validation with strict issuer and audience checks.
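The tolerance window can be expressed as a small pure function. This is a sketch under the assumption that exp and nbf are seconds since the epoch, as in JWT claims; checkValidity and its types are hypothetical names, and signature, issuer, and audience checks are out of scope here.

```typescript
interface TimeClaims {
  exp: number;  // expiry, seconds since epoch (required: a token with no expiry is not accepted here)
  nbf?: number; // not-before, seconds since epoch
}

type Validity = 'valid' | 'expired' | 'not_yet_valid';

// Applies a symmetric tolerance to absorb small clock differences between nodes.
function checkValidity(claims: TimeClaims, nowSeconds: number, toleranceSeconds = 5): Validity {
  // Without tolerance a token is expired once now >= exp; tolerance shifts that boundary later.
  if (nowSeconds - toleranceSeconds >= claims.exp) return 'expired';
  // Without tolerance a token is premature while now < nbf; tolerance shifts that boundary earlier.
  if (claims.nbf !== undefined && nowSeconds + toleranceSeconds < claims.nbf) return 'not_yet_valid';
  return 'valid';
}
```

Keeping the tolerance small matters in a fail-closed design: every second of leeway is a second during which an expired credential is still honored.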
7. Under-Provisioning Auth Service Resources
Explanation: Assuming auth will always be lightweight leads to under-provisioned databases or connection pools, causing bottlenecks during traffic spikes.
Fix: Monitor auth throughput independently. Use connection pooling with waitForConnections: true and queueLimit: 0 (in mysql2, 0 leaves the wait queue unbounded, so the client deadline is what caps waiting). Scale horizontally based on p99 latency, not average load.
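To show why p99 and the average can tell different stories, here is a minimal nearest-rank percentile sketch. The percentile function is an illustrative helper; production systems would typically rely on their metrics stack rather than computing this by hand.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

function mean(samples: number[]): number {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}
```

For 98 calls at 2 ms and 2 calls at 400 ms, the mean stays just under 10 ms while the p99 is 400 ms, which is already close to the 500 ms client deadline used in this article. Scaling on the average would miss that tail entirely.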
Production Bundle
Action Checklist
- Define explicit gRPC contract: Establish strict proto3 definitions for capability, quota, and execution validation with clear success/failure semantics.
- Implement fail-closed server: Ensure all exception paths return authorized: false with descriptive rejection reasons and audit IDs.
- Enforce atomic quota checks: Use database row locking or distributed locks to prevent race conditions during concurrent limit validation.
- Configure client timeouts: Set hard 500ms deadlines on authorization calls and integrate circuit breakers to prevent cascading failures.
- Generate idempotent audit records: Derive audit IDs from request hashes to prevent duplicate compliance entries during retries.
- Synchronize system clocks: Deploy NTP across all nodes and add tolerance windows for token expiration validation.
- Monitor auth independently: Track p99 latency, error rates, and quota throughput separately from business logic metrics.
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Low-frequency UI actions | Inline middleware | Latency sensitivity outweighs audit requirements | Low infrastructure cost |
| High-frequency agent runs | Isolated gRPC gateway | Fail-closed semantics and atomic quota enforcement prevent financial incidents | Moderate compute cost, high risk mitigation |
| Compliance-heavy environments | Dedicated auth service with immutable audit log | Regulatory requirements demand replayable, tamper-evident authorization trails | Higher storage cost, mandatory for audits |
| Startup/MVP phase | Inline checks with feature flag | Faster iteration, defer complexity until scale justifies separation | Low initial cost, high technical debt later |
Configuration Template
# authgate.config.yaml
grpc:
port: 50052
maxReceiveMessageLength: 4194304
maxSendMessageLength: 4194304
database:
host: ${DB_HOST}
user: ${DB_USER}
password: ${DB_PASS}
database: auth_audit
connectionLimit: 20
waitForConnections: true
queueLimit: 0
acquireTimeout: 5000
circuitBreaker:
timeout: 500
errorThresholdPercentage: 50
resetTimeout: 10000
volumeThreshold: 10
audit:
table: execution_audit
retentionDays: 365
idempotency: true
index:
- agent_id
- tenant_id
- recorded_at
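The ${DB_HOST} style placeholders in the template are not expanded by YAML itself, so the service has to substitute them when it loads the file. A minimal sketch of that step follows, assuming a hypothetical expandEnv helper that runs over the raw file text before parsing. It refuses to start on a missing variable instead of falling back to an empty value.

```typescript
// Replaces ${NAME} placeholders with values from `env`.
// Throws on any unset or empty variable so a misconfigured service
// refuses to start instead of running with blank credentials.
function expandEnv(template: string, env: Record<string, string | undefined>): string {
  return template.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const value = env[name];
    if (value === undefined || value === '') {
      throw new Error(`missing required environment variable: ${name}`);
    }
    return value;
  });
}
```

In a Node.js service this might be called as expandEnv(rawYaml, process.env) before handing the text to a YAML parser. Failing at startup is consistent with the fail-closed theme: a gateway that cannot reach its audit database should not come up at all.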
Quick Start Guide
- Initialize the project: Install the core dependencies with: npm init -y && npm install @grpc/grpc-js @grpc/proto-loader mysql2 uuid opossum
- Define the contract: Create authgate.proto with the service definitions for ExecutionGate, PermissionRequest, and LimitRequest.
- Deploy the server: Start the Node.js gRPC service on port 50052. Verify connectivity using grpcurl or a test client.
- Integrate the client: Import authorizeAgentRun into your agent runtime. Wrap trade execution calls with the authorization check and handle rejection explicitly.
- Validate failure semantics: Simulate database outages and network partitions. Confirm that the system rejects execution requests and logs service_unavailable audit records without silent pass-through.
