Phase 1: Define the Receipt Schema
The receipt must enforce strict separation between four domains: infrastructure promise, capacity verification, execution trace, and output state. Each domain maps to a distinct cryptographic boundary.
interface InfrastructureClaim {
  providerId: string;
  gpuModel: string;
  vramCapacity: number;
  region: string;
  reservationWindow: { start: string; end: string };
  quoteSignature: string;
}

interface CapacityVerification {
  challengeId: string;
  vramAllocated: number;
  driverVersion: string;
  verificationTimestamp: string;
  status: 'PASS' | 'FAIL';
}

interface ExecutionTrace {
  containerDigest: string;
  entryCommand: string;
  modelArtifactHash: string;
  inputManifestHash: string;
  startedAt: string;
  terminatedAt: string;
  failureClass: string | null;
  resourceSnapshot: { cpuPct: number; gpuMemUsed: number; gpuMemLimit: number };
}

interface OutputState {
  artifactHash: string | null;
  evaluationStatus: 'NOT_STARTED' | 'PENDING' | 'COMPLETED' | 'FAILED';
  qualityScore: number | null;
}

interface ComputeReceipt {
  jobId: string;
  infrastructure: InfrastructureClaim;
  capacity: CapacityVerification;
  execution: ExecutionTrace;
  output: OutputState;
  settlementState: 'HELD' | 'RELEASED' | 'REFUNDED' | 'ESCALATED';
  ledgerHash: string;
}
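TypeScript interfaces are erased at compile time, so a receipt that arrives as JSON from a provider still needs a runtime check before it is trusted. The following is a minimal sketch of such a check for the capacity domain only. The function name isCapacityVerification and the sample values are illustrative assumptions, not part of the schema above.

```typescript
// Mirrors the CapacityVerification interface from the schema.
interface CapacityVerification {
  challengeId: string;
  vramAllocated: number;
  driverVersion: string;
  verificationTimestamp: string;
  status: 'PASS' | 'FAIL';
}

// Hypothetical runtime guard: returns true only if every field has the expected shape.
function isCapacityVerification(value: unknown): value is CapacityVerification {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.challengeId === 'string' &&
    typeof v.vramAllocated === 'number' &&
    Number.isFinite(v.vramAllocated) &&
    typeof v.driverVersion === 'string' &&
    typeof v.verificationTimestamp === 'string' &&
    !Number.isNaN(Date.parse(v.verificationTimestamp)) &&
    (v.status === 'PASS' || v.status === 'FAIL')
  );
}

// Made-up sample payload, as it might arrive over the wire.
const sampleCapacity: unknown = JSON.parse(
  '{"challengeId":"c-1","vramAllocated":80,"driverVersion":"1.0","verificationTimestamp":"2024-01-01T00:00:00Z","status":"PASS"}'
);
console.log(isCapacityVerification(sampleCapacity)); // true
console.log(isCapacityVerification({ status: 'MAYBE' })); // false
```

The same pattern would apply to the other three domains; a schema-validation library could replace the hand-written guard.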
Phase 2: Implement Lifecycle Telemetry Hooks
Telemetry must be captured at container creation, memory allocation, and termination. The execution trace should never be inferred; it must be emitted by the runtime orchestrator.
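class TelemetryCollector {
  private trace: Partial<ExecutionTrace> = {};
  // CPU and wall-clock baselines captured when execution starts.
  private cpuAtStart: NodeJS.CpuUsage | null = null;
  private startedAtMs = 0;

  onContainerPull(digest: string): void {
    this.trace.containerDigest = digest;
  }

  onExecutionStart(command: string, modelHash: string, manifestHash: string): void {
    this.trace.entryCommand = command;
    this.trace.modelArtifactHash = modelHash;
    this.trace.inputManifestHash = manifestHash;
    this.trace.startedAt = new Date().toISOString();
    this.cpuAtStart = process.cpuUsage();
    this.startedAtMs = Date.now();
  }

  onTermination(failureClass: string | null, gpuMemUsed: number, gpuMemLimit: number): void {
    this.trace.terminatedAt = new Date().toISOString();
    this.trace.failureClass = failureClass;
    this.trace.resourceSnapshot = {
      cpuPct: this.getCurrentCpuUsage(),
      gpuMemUsed,
      gpuMemLimit
    };
  }

  seal(): ExecutionTrace {
    const t = this.trace;
    // Refuse to seal unless every field was emitted by a lifecycle hook.
    if (!t.containerDigest || !t.entryCommand || !t.modelArtifactHash || !t.inputManifestHash ||
        !t.startedAt || !t.terminatedAt || t.failureClass === undefined || !t.resourceSnapshot) {
      throw new Error('Incomplete execution trace');
    }
    return t as ExecutionTrace;
  }

  // Average CPU utilization (percent of one core) since onExecutionStart.
  // process.cpuUsage() reports microseconds of CPU time, so it is divided by
  // elapsed wall-clock time. Note: this measures the collector's own Node.js
  // process, not the workload container.
  private getCurrentCpuUsage(): number {
    if (this.cpuAtStart === null) return 0;
    const delta = process.cpuUsage(this.cpuAtStart);
    const elapsedMicros = (Date.now() - this.startedAtMs) * 1000;
    return elapsedMicros > 0 ? ((delta.user + delta.system) / elapsedMicros) * 100 : 0;
  }
}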
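To illustrate what "emitted by the runtime orchestrator" could look like in practice, here is a small sketch that records trace fields from lifecycle events on a Node.js EventEmitter. The event names ('container:pulled', 'exec:started', 'exec:terminated') and the attachTraceRecorder helper are hypothetical; a real orchestrator would expose its own hooks.

```typescript
import { EventEmitter } from 'node:events';

// Subset of the execution trace that this sketch fills in.
interface RecordedTrace {
  containerDigest?: string;
  startedAt?: string;
  terminatedAt?: string;
  failureClass?: string | null;
}

// Hypothetical wiring: each field is written only when the runtime emits the
// matching event, so nothing in the trace is inferred after the fact.
function attachTraceRecorder(runtime: EventEmitter): RecordedTrace {
  const trace: RecordedTrace = {};
  runtime.on('container:pulled', (digest: string) => {
    trace.containerDigest = digest;
  });
  runtime.on('exec:started', () => {
    trace.startedAt = new Date().toISOString();
  });
  runtime.on('exec:terminated', (failureClass: string | null) => {
    trace.terminatedAt = new Date().toISOString();
    trace.failureClass = failureClass;
  });
  return trace;
}

// Simulated lifecycle; emit() calls listeners synchronously.
const runtimeEvents = new EventEmitter();
const recorded = attachTraceRecorder(runtimeEvents);
runtimeEvents.emit('container:pulled', 'sha256:abc');
runtimeEvents.emit('exec:started');
runtimeEvents.emit('exec:terminated', 'CONTAINER_OOM');
console.log(recorded.containerDigest, recorded.failureClass);
```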
Phase 3: Build the Settlement Router
Settlement must be deterministic. The router evaluates the receipt fields against a policy matrix and transitions the payment state without manual intervention.
type SettlementAction = 'RELEASE' | 'HOLD' | 'REFUND' | 'ESCALATE';

class SettlementRouter {
  route(receipt: ComputeReceipt): SettlementAction {
    const { capacity, execution, output } = receipt;
    if (capacity.status === 'FAIL') {
      return 'REFUND';
    }
    if (execution.failureClass === 'CONTAINER_OOM' || execution.failureClass === 'DRIVER_MISMATCH') {
      return 'HOLD';
    }
    if (execution.failureClass === null && output.artifactHash !== null) {
      return 'RELEASE';
    }
    if (execution.failureClass === null && output.artifactHash === null) {
      return 'ESCALATE';
    }
    return 'HOLD';
  }
}
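The router returns an action (RELEASE, HOLD, and so on), while ComputeReceipt.settlementState stores a state (RELEASED, HELD, and so on). One possible way to connect the two is sketched below. The applyAction helper and the rule that RELEASED and REFUNDED are terminal are assumptions for illustration, not something the receipt schema specifies.

```typescript
type SettlementAction = 'RELEASE' | 'HOLD' | 'REFUND' | 'ESCALATE';
type SettlementState = 'HELD' | 'RELEASED' | 'REFUNDED' | 'ESCALATED';

// Each router action maps to exactly one resulting state.
const NEXT_STATE: Record<SettlementAction, SettlementState> = {
  RELEASE: 'RELEASED',
  HOLD: 'HELD',
  REFUND: 'REFUNDED',
  ESCALATE: 'ESCALATED',
};

// Assumption: once funds have moved (RELEASED or REFUNDED), no further
// transition is allowed.
function applyAction(current: SettlementState, action: SettlementAction): SettlementState {
  if (current === 'RELEASED' || current === 'REFUNDED') {
    throw new Error(`Cannot apply ${action} to terminal state ${current}`);
  }
  return NEXT_STATE[action];
}

console.log(applyAction('HELD', 'RELEASE')); // RELEASED
console.log(applyAction('ESCALATED', 'REFUND')); // REFUNDED
```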
Phase 4: Construct the Append-Only Dispute Ledger
Every state transition must be recorded as an immutable row. The ledger stores hashes of seller and buyer packets, not raw payloads, preserving privacy while enabling auditability.
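import { createHash } from 'node:crypto';

interface LedgerEntry {
  jobId: string;
  transition: string;
  capacityStatus: string;
  executionStatus: string;
  outputStatus: string;
  sellerPacketHash: string;
  buyerPacketHash: string;
  decision: string;
  timestamp: string;
}

// A stored row: the caller's entry plus the links that form the chain.
interface ChainedLedgerEntry extends LedgerEntry {
  previousHash: string;
  entryHash: string;
}

class DisputeLedger {
  private entries: ChainedLedgerEntry[] = [];

  append(entry: LedgerEntry): void {
    const previousHash = this.entries.length > 0
      ? this.entries[this.entries.length - 1].entryHash
      : 'genesis';
    // Hash the previous link together with this row, so editing any earlier
    // row changes every later entryHash. The packet hashes stay untouched.
    // JSON.stringify follows property insertion order, so callers should
    // build entries with a consistent field order.
    const entryHash = createHash('sha256')
      .update(previousHash)
      .update(JSON.stringify(entry))
      .digest('hex');
    this.entries.push({ ...entry, previousHash, entryHash });
  }

  getAuditTrail(jobId: string): ChainedLedgerEntry[] {
    return this.entries.filter(e => e.jobId === jobId);
  }
}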
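A hash chain only helps if someone can re-verify it. The sketch below shows one way an auditor could detect a modified row, assuming each stored row carries a previousHash and an entryHash computed with SHA-256. The ChainRow shape and the helper names are hypothetical and simplified to a single payload string per row.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical, simplified row: payload stands in for the serialized entry.
interface ChainRow {
  payload: string;
  previousHash: string;
  entryHash: string;
}

function linkHash(previousHash: string, payload: string): string {
  return createHash('sha256').update(previousHash).update('\n').update(payload).digest('hex');
}

function appendRow(rows: ChainRow[], payload: string): ChainRow[] {
  const previousHash = rows.length > 0 ? rows[rows.length - 1].entryHash : 'genesis';
  return [...rows, { payload, previousHash, entryHash: linkHash(previousHash, payload) }];
}

// Returns the index of the first row whose links do not recompute, or -1 if intact.
function firstBrokenIndex(rows: ChainRow[]): number {
  let expectedPrevious = 'genesis';
  for (let i = 0; i < rows.length; i++) {
    const row = rows[i];
    if (row.previousHash !== expectedPrevious || row.entryHash !== linkHash(row.previousHash, row.payload)) {
      return i;
    }
    expectedPrevious = row.entryHash;
  }
  return -1;
}

let chain: ChainRow[] = [];
for (const payload of ['job-1:HELD', 'job-1:ESCALATED', 'job-1:REFUNDED']) {
  chain = appendRow(chain, payload);
}
// Simulate a retroactive edit of the second row.
const tampered = chain.map((row, i) => (i === 1 ? { ...row, payload: 'job-1:RELEASED' } : row));
console.log(firstBrokenIndex(chain)); // -1
console.log(firstBrokenIndex(tampered)); // 1
```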
Architecture Rationale
- Layer Separation: Infrastructure, capacity, execution, and output are isolated to prevent blame diffusion. A driver mismatch should not invalidate a capacity check.
- Hash-Based Immutability: Ledger entries chain hashes so that any retroactive modification is detectable. This removes the need to trust support representatives.
- State Machine Settlement: Payment decisions are derived from receipt fields, not marketing claims. This eliminates ambiguous escrow holds.
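- TypeScript Enforcement: Strict typing catches malformed receipts at compile time. Note that independent nullable fields do not by themselves rule out inconsistent combinations, such as a populated output.artifactHash alongside a non-null execution.failureClass; excluding those requires modeling outcomes as a discriminated union or checking the invariant at runtime.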
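As a sketch of how typing could rule out the impossible state mentioned in the last bullet, the outcome can be modeled as a discriminated union in which only a successful run may carry an artifact hash. The JobOutcome type and its kind tags are illustrative assumptions and are not part of the receipt schema.

```typescript
// Hypothetical union: the 'kind' tag decides which fields may be populated.
type JobOutcome =
  | { kind: 'succeeded'; failureClass: null; artifactHash: string }
  | { kind: 'no_output'; failureClass: null; artifactHash: null }
  | { kind: 'failed'; failureClass: string; artifactHash: null };

// The compiler narrows on 'kind', so artifactHash is known to be a string
// only in the 'succeeded' branch.
function settleableArtifact(outcome: JobOutcome): string | null {
  switch (outcome.kind) {
    case 'succeeded':
      return outcome.artifactHash;
    case 'no_output':
    case 'failed':
      return null;
  }
}

const okRun: JobOutcome = { kind: 'succeeded', failureClass: null, artifactHash: 'sha256:abc' };
const oomRun: JobOutcome = { kind: 'failed', failureClass: 'CONTAINER_OOM', artifactHash: null };

// The following would be rejected at compile time, because a failed run
// cannot carry an artifact hash:
// const bad: JobOutcome = { kind: 'failed', failureClass: 'CONTAINER_OOM', artifactHash: 'sha256:abc' };

console.log(settleableArtifact(okRun)); // sha256:abc
console.log(settleableArtifact(oomRun)); // null
```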
Pitfall Guide
1. Conflating Capacity Verification with Execution Success
Explanation: Platforms often treat a passed hardware challenge as proof that the workload ran correctly. Capacity checks only verify that VRAM, drivers, and PCIe lanes are functional. They do not validate container compatibility or model execution.
Fix: Enforce strict schema boundaries. Never allow capacity.status === 'PASS' to auto-settle payment. Require explicit execution.failureClass === null and output.artifactHash !== null before release.
2. Omitting Machine-Readable Failure Classes
Explanation: Logging generic errors like "job failed" or "container exited" forces manual log parsing. Automated systems cannot route settlement or trigger retries without structured failure taxonomy.
Fix: Implement a standardized failure classification enum: CONTAINER_OOM, DRIVER_MISMATCH, MODEL_LOAD_FAIL, INPUT_MANIFEST_INVALID, NETWORK_TIMEOUT. Map each class to a specific settlement action.
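A minimal sketch of such a taxonomy follows. Typing the lookup table as a Record over the union means the compiler flags any failure class that lacks a settlement action. CONTAINER_OOM and DRIVER_MISMATCH map to HOLD as in the settlement router; the HOLD values for the other three classes simply mirror the router's fall-through and are placeholders to tune. The helper parseFailureClass is hypothetical.

```typescript
const FAILURE_CLASSES = [
  'CONTAINER_OOM',
  'DRIVER_MISMATCH',
  'MODEL_LOAD_FAIL',
  'INPUT_MANIFEST_INVALID',
  'NETWORK_TIMEOUT',
] as const;

type FailureClass = typeof FAILURE_CLASSES[number];
type SettlementAction = 'RELEASE' | 'HOLD' | 'REFUND' | 'ESCALATE';

// Omitting a key here is a compile error, so every class gets an action.
const ACTION_BY_FAILURE: Record<FailureClass, SettlementAction> = {
  CONTAINER_OOM: 'HOLD',
  DRIVER_MISMATCH: 'HOLD',
  MODEL_LOAD_FAIL: 'HOLD',        // placeholder, mirrors the router fall-through
  INPUT_MANIFEST_INVALID: 'HOLD', // placeholder, mirrors the router fall-through
  NETWORK_TIMEOUT: 'HOLD',        // placeholder, mirrors the router fall-through
};

// Converts a raw string from telemetry into a known class, or null if unknown.
function parseFailureClass(raw: string): FailureClass | null {
  for (const known of FAILURE_CLASSES) {
    if (known === raw) return known;
  }
  return null;
}

console.log(parseFailureClass('CONTAINER_OOM')); // CONTAINER_OOM
console.log(parseFailureClass('job failed')); // null
```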
3. Leaking Workload Secrets in Seller Telemetry
Explanation: Sellers often include full container logs, environment variables, or model weights in dispute packets to prove compliance. This violates buyer privacy and exposes proprietary inference pipelines.
Fix: Restrict seller packets to infrastructure metadata: worker ID, capacity challenge result, reservation acceptance, container pull log, start/stop timestamps, and failure class. Hash all workload artifacts instead of exposing raw content.
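One way to enforce that restriction is to build the seller packet by explicitly copying an allow-list of fields and replacing raw content with a digest. The sketch below assumes a hypothetical RawWorkerReport shape; the field names are illustrative.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical raw report held by the seller, including sensitive fields.
interface RawWorkerReport {
  workerId: string;
  challengeResult: 'PASS' | 'FAIL';
  startedAt: string;
  terminatedAt: string;
  failureClass: string | null;
  containerLog: string;
  env: Record<string, string>;
}

// What actually leaves the seller: metadata plus a digest of the log.
interface SellerPacket {
  workerId: string;
  challengeResult: 'PASS' | 'FAIL';
  startedAt: string;
  terminatedAt: string;
  failureClass: string | null;
  containerLogSha256: string;
}

function sha256Hex(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// Fields are copied one by one; anything not listed (env, raw logs) is dropped.
function toSellerPacket(report: RawWorkerReport): SellerPacket {
  return {
    workerId: report.workerId,
    challengeResult: report.challengeResult,
    startedAt: report.startedAt,
    terminatedAt: report.terminatedAt,
    failureClass: report.failureClass,
    containerLogSha256: sha256Hex(report.containerLog),
  };
}

const sellerPacket = toSellerPacket({
  workerId: 'worker-7',
  challengeResult: 'PASS',
  startedAt: '2024-01-01T00:00:00Z',
  terminatedAt: '2024-01-01T00:05:00Z',
  failureClass: 'CONTAINER_OOM',
  containerLog: 'loading weights from /secret/path',
  env: { API_KEY: 'do-not-leak' },
});
console.log(JSON.stringify(sellerPacket).includes('do-not-leak')); // false
```

Copying fields explicitly, rather than deleting sensitive keys from a spread copy, means a newly added sensitive field is excluded by default.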
4. Static Settlement Rules for Dynamic AI Workloads
Explanation: Hardcoding settlement logic (e.g., "always refund on failure") ignores nuanced scenarios like buyer-requested memory exceeding declared VRAM, or successful execution with poor output quality.
Fix: Implement a policy matrix that evaluates capacity, execution, and output states independently. Allow configurable thresholds for quality evaluation separate from compute fee release.
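The sketch below illustrates one shape such a policy matrix could take: an ordered list of rules where the first match wins, with the quality check kept separate from the compute-fee decision. The ReceiptView and PolicyRule types, the function names, and the 0.7 threshold are assumptions for illustration.

```typescript
type SettlementAction = 'RELEASE' | 'HOLD' | 'REFUND' | 'ESCALATE';

// Flattened view of the receipt fields the rules need.
interface ReceiptView {
  capacityStatus: 'PASS' | 'FAIL';
  failureClass: string | null;
  artifactHash: string | null;
  qualityScore: number | null;
}

interface PolicyRule {
  label: string;
  when: (r: ReceiptView) => boolean;
  action: SettlementAction;
}

// Ordered rules; evaluation stops at the first match.
const computeFeeRules: PolicyRule[] = [
  { label: 'capacity failed', when: r => r.capacityStatus === 'FAIL', action: 'REFUND' },
  { label: 'workload OOM', when: r => r.failureClass === 'CONTAINER_OOM', action: 'HOLD' },
  { label: 'ran with output', when: r => r.failureClass === null && r.artifactHash !== null, action: 'RELEASE' },
  { label: 'ran without output', when: r => r.failureClass === null && r.artifactHash === null, action: 'ESCALATE' },
];

function decideComputeFee(rules: PolicyRule[], receipt: ReceiptView): SettlementAction {
  for (const rule of rules) {
    if (rule.when(receipt)) return rule.action;
  }
  return 'HOLD'; // nothing matched: keep funds in escrow
}

// Quality is judged separately; null means no score is available yet.
function qualityAccepted(receipt: ReceiptView, minScore: number): boolean | null {
  return receipt.qualityScore === null ? null : receipt.qualityScore >= minScore;
}

const lowQualityRun: ReceiptView = {
  capacityStatus: 'PASS',
  failureClass: null,
  artifactHash: 'sha256:abc',
  qualityScore: 0.4,
};
console.log(decideComputeFee(computeFeeRules, lowQualityRun)); // RELEASE
console.log(qualityAccepted(lowQualityRun, 0.7)); // false
```

In this example the compute fee is released because the hardware ran the job, while the poor output quality is surfaced as a separate signal.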
5. Mutable Dispute Ledgers
Explanation: Allowing support teams to edit or delete ledger rows after payment processing destroys auditability. Disputes become he-said-she-said rather than cryptographic fact.
Fix: Enforce append-only writes. Use hash-chained entries. Store sensitive payloads off-chain with only their SHA-256 digests in the ledger. Implement write permissions restricted to orchestrator services, not human operators.
6. Unhashed Input Manifests and Output Artifacts
Explanation: Without hashing input manifests and output artifacts, buyers can claim a different dataset was used, or sellers can claim output was generated when it wasn't. Reproducibility collapses.
Fix: Require execution.inputManifestHash and output.artifactHash in every receipt. Validate hashes against pinned storage (IPFS, Arweave, or centralized object storage with immutable versioning).
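As a sketch of how a manifest digest could be computed and checked against a pinned value, the example below hashes a canonical, sorted listing of file digests. The ManifestFile shape and the canonical line format are assumptions; any deterministic serialization would serve.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical manifest entry: a file path and the SHA-256 of its contents.
interface ManifestFile {
  path: string;
  sha256: string;
}

// Sorts by path so the digest does not depend on listing order.
function manifestHash(files: ManifestFile[]): string {
  const canonical = [...files]
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .map(f => `${f.sha256}  ${f.path}`)
    .join('\n');
  return createHash('sha256').update(canonical, 'utf8').digest('hex');
}

function matchesPinned(files: ManifestFile[], pinnedHash: string): boolean {
  return manifestHash(files) === pinnedHash;
}

const fileA: ManifestFile = { path: 'data/a.jsonl', sha256: 'aa' };
const fileB: ManifestFile = { path: 'data/b.jsonl', sha256: 'bb' };
const pinned = manifestHash([fileA, fileB]);

console.log(matchesPinned([fileB, fileA], pinned)); // true
console.log(matchesPinned([fileA, { ...fileB, sha256: 'cc' }], pinned)); // false
```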
7. Treating OOM as an Infrastructure Failure
Explanation: Out-of-memory crashes are frequently misclassified as infrastructure failures, triggering unnecessary refunds and provider penalties. OOM is usually a workload configuration issue.
Fix: Classify OOM as CONTAINER_OOM and route to HOLD for inspection. Compare execution.resourceSnapshot.gpuMemUsed against infrastructure.vramCapacity. If the workload's memory use reached or exceeded the declared capacity, attribute the failure to the buyer's workload specification.
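The comparison can be sketched as a small attribution function. It assumes all three numbers use the same unit. The PROVIDER_SHORTFALL branch, which covers an enforced limit below the declared capacity, goes beyond the fix described above and is included only as an illustrative assumption, as are the type and function names.

```typescript
type OomAttribution = 'BUYER_WORKLOAD' | 'PROVIDER_SHORTFALL' | 'NEEDS_INSPECTION';

interface GpuMemSnapshot {
  gpuMemUsed: number;
  gpuMemLimit: number;
}

// Assumes gpuMemUsed, gpuMemLimit, and declaredVramCapacity share one unit.
function attributeOom(snapshot: GpuMemSnapshot, declaredVramCapacity: number): OomAttribution {
  // Assumption: a limit below the declared capacity means the provider under-delivered.
  if (snapshot.gpuMemLimit < declaredVramCapacity) return 'PROVIDER_SHORTFALL';
  // The full declared capacity was available and the workload consumed all of it.
  if (snapshot.gpuMemUsed >= declaredVramCapacity) return 'BUYER_WORKLOAD';
  // OOM reported although memory was not exhausted: leave it for manual review.
  return 'NEEDS_INSPECTION';
}

console.log(attributeOom({ gpuMemUsed: 80, gpuMemLimit: 80 }, 80)); // BUYER_WORKLOAD
console.log(attributeOom({ gpuMemUsed: 40, gpuMemLimit: 40 }, 80)); // PROVIDER_SHORTFALL
console.log(attributeOom({ gpuMemUsed: 50, gpuMemLimit: 80 }, 80)); // NEEDS_INSPECTION
```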
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-Value Training Job | Full structured receipt with quality evaluation | Training runs require precise attribution for multi-hour compute spend | +15% overhead for telemetry, -40% dispute cost |
| Batch Inference Pipeline | Lightweight receipt with execution trace only | Throughput matters more than granular dispute routing | +5% overhead, neutral settlement accuracy |
| Debug/Development Run | Minimal receipt with capacity + execution status | Fast iteration requires low latency; disputes are internal | +2% overhead, high developer velocity |
| Multi-Provider Fallback | Structured receipt with provider scoring | Enables automatic routing to reliable nodes based on failure class history | +10% overhead, +25% uptime reliability |
Configuration Template
receipt:
  schema_version: "1.0"
  layers:
    infrastructure:
      required_fields: [providerId, gpuModel, vramCapacity, region, reservationWindow]
      validation: sha256_signature
    capacity:
      required_fields: [challengeId, vramAllocated, driverVersion, status]
      validation: hourly_challenge_proof
    execution:
      required_fields: [containerDigest, entryCommand, modelArtifactHash, inputManifestHash, failureClass]
      validation: runtime_emitter
    output:
      required_fields: [artifactHash, evaluationStatus]
      validation: content_addressable_storage
settlement:
  policy_matrix:
    - condition: "capacity.status == FAIL"
      action: REFUND
    - condition: "execution.failureClass == CONTAINER_OOM"
      action: HOLD
    - condition: "execution.failureClass == null AND output.artifactHash != null"
      action: RELEASE
    - condition: "execution.failureClass == null AND output.artifactHash == null"
      action: ESCALATE
ledger:
  append_only: true
  hash_chain: true
  privacy_mode: "packet_digest_only"
Quick Start Guide
- Initialize Receipt Builder: Deploy the TelemetryCollector and SettlementRouter as sidecar containers alongside your AI workload orchestrator. Configure them to listen to container lifecycle events.
- Pin Manifests: Hash your input datasets and model artifacts before job submission. Store the digests in the ComputeReceipt schema. Use IPFS or S3 with version locking for retrieval.
- Configure Settlement Policy: Load the YAML policy matrix into your payment gateway. Map HOLD states to escrow contracts and RELEASE states to automatic token transfers.
- Validate with Failure Injection: Submit a test job that intentionally allocates more GPU memory than the declared VRAM. Verify that the receipt captures CONTAINER_OOM, the ledger records the transition, and settlement routes to HOLD instead of REFUND.
- Enable Audit Queries: Expose a read-only endpoint that returns the hash-chained ledger trail for any jobId. Integrate this with your support dashboard to resolve disputes in under four hours.