I Built a Soulver Clone for Android Using Only Claude — Here's the Stack and the Lessons
Beyond Monolithic Grammars: Architecting Offline Natural-Language Parsers with AI-Assisted Workflows
Current Situation Analysis
Building offline, natural-language computation tools for mobile platforms remains one of the most architecturally demanding tasks in modern application development. The core challenge isn’t rendering a UI or handling touch events—it’s designing a parsing engine that can interpret unstructured input like “project budget $12k minus 15% tax” without relying on cloud APIs or external inference services. Traditional approaches rely on monolithic grammars, recursive descent parsers, or heavy regex chains. These work adequately for simple arithmetic but fracture under domain complexity. Every new token type (dates, units, percentages, currency symbols) introduces edge cases that cascade through the entire parsing tree, forcing developers into constant refactoring cycles.
This problem is frequently misunderstood in the AI-assisted development era. Many teams treat large language models as autonomous architects, expecting them to generate complete, production-ready systems from vague prompts. In practice, LLMs excel at implementation but struggle with systemic constraint management. Without explicit architectural guardrails, generated code tends toward over-engineering, defensive null-checking, and fragile abstractions. The data reflects this reality: solo developers using AI for complex mobile projects report initial code rejection rates hovering around 30%, primarily due to unnecessary abstraction layers and hallucinated framework APIs. The bottleneck has shifted from “can I write the code?” to “can I define the constraints clearly enough for the AI to execute them reliably?”
Offline-first mobile applications eliminate backend costs and latency but demand robust local state management and deterministic execution. When combined with AI pair-programming, the development model requires strict role separation. The human must own product logic, UX validation, and architectural boundaries, while the AI handles implementation, boilerplate generation, and regression testing. Attempting to reverse this dynamic—letting the AI choose features or design the architecture—consistently results in bloated codebases and unpredictable behavior.
WOW Moment: Key Findings
The most significant architectural breakthrough in this space comes from abandoning monolithic parsing in favor of a priority-driven recognizer pipeline. When comparing traditional grammar-based approaches against modular domain matchers, the operational differences are stark.
| Architecture | Extension Cost (Lines) | Error Isolation | Refactor Frequency |
|---|---|---|---|
| Monolithic Grammar | 400–600 | Low (cascading failures) | Every 2–3 features |
| Priority Pipeline | 50–80 | High (domain-specific) | Near zero |
This finding matters because it decouples domain logic from execution flow. Instead of rewriting the entire parser when adding temperature, currency, or unit support, developers register a new matcher with a defined priority. Unrecognized tokens simply pass through to the next handler or fall back to a fuzzy scanner. The pipeline approach transforms parsing from a fragile, tightly coupled system into a composable, testable engine. It also fits AI-assisted workflows well: the LLM can generate isolated matchers with far less risk of regressions in unrelated domains. Because matchers are always tried in the same priority order, behavior is deterministic, and the cost is O(n × m) for n tokens and m matchers, which stays linear in token count for a fixed matcher set as long as each matcher inspects only a bounded number of tokens.
Core Solution
Building a production-grade offline calculator requires three interconnected layers: a tokenizer, a priority-based matcher pipeline, and a deterministic state management layer. Below is the implementation strategy, using Flutter 3.5, Riverpod 2, and Hive for local persistence.
Step 1: Tokenization Layer
Raw input strings must be normalized before parsing. Instead of processing the entire string at once, split it into discrete units that preserve type, position, and length. This prevents the parser from making assumptions about string boundaries.
enum TokenCategory { numeric, operator, literal, currency, temporal, unknown }

class ParsedSegment {
  final String raw;
  final TokenCategory category;
  final int offset;
  final int span;

  const ParsedSegment({
    required this.raw,
    required this.category,
    required this.offset,
    required this.span,
  });
}

class InputTokenizer {
  List<ParsedSegment> segment(String input) {
    final segments = <ParsedSegment>[];
    var cursor = 0; // offset where the currently buffered token starts
    final buffer = StringBuffer();

    for (var i = 0; i < input.length; i++) {
      final char = input[i];
      if (_isBoundary(char)) {
        // Flush the buffered token before emitting the boundary itself.
        if (buffer.isNotEmpty) {
          segments.add(_classify(buffer.toString(), cursor));
          buffer.clear();
        }
        segments.add(ParsedSegment(
          raw: char,
          category: _mapBoundary(char),
          offset: i,
          span: 1,
        ));
        // The next token starts after this boundary, whether or not one was flushed.
        cursor = i + 1;
      } else {
        buffer.write(char);
      }
    }
    if (buffer.isNotEmpty) {
      segments.add(_classify(buffer.toString(), cursor));
    }
    return segments;
  }

  // Boundary detection and classification helpers omitted for brevity
}
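The helpers left out of the tokenizer could take roughly the following shape. This is a minimal sketch, not the original implementation: the function names (`isBoundary`, `mapBoundary`, `classifyRaw`), the boundary character set, the numeric pattern, and the temporal word list are all assumptions chosen for illustration, and the functions return only a category so the block runs on its own.

```dart
// Hypothetical helper sketch: names, boundary set, and patterns are assumptions.
enum TokenCategory { numeric, operator, literal, currency, temporal, unknown }

const _operators = {'+', '-', '*', '/', '%', '(', ')'};
const _currencySymbols = {r'$', '€', '£', '¥'};

/// True when [char] ends the token currently being buffered.
bool isBoundary(String char) =>
    char == ' ' || _operators.contains(char) || _currencySymbols.contains(char);

/// Category for a single boundary character.
TokenCategory mapBoundary(String char) {
  if (_operators.contains(char)) return TokenCategory.operator;
  if (_currencySymbols.contains(char)) return TokenCategory.currency;
  return TokenCategory.unknown; // whitespace and anything unrecognized
}

// Digits with an optional decimal part and an optional k/m magnitude suffix.
final _numeric = RegExp(r'^\d+(\.\d+)?[kKmM]?$');
const _temporalWords = {'today', 'tomorrow', 'yesterday'};

/// Category for a buffered multi-character token.
TokenCategory classifyRaw(String raw) {
  if (_numeric.hasMatch(raw)) return TokenCategory.numeric;
  if (_temporalWords.contains(raw.toLowerCase())) return TokenCategory.temporal;
  return TokenCategory.literal;
}
```

Treating `%` and currency symbols as boundaries keeps inputs such as `15%` or `$12k` as two segments each, which leaves the pairing decision to the matchers rather than the tokenizer.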
Step 2: Priority-Based Matcher Interface
Each domain (currency, math, dates, units) implements a single contract. The priority integer dictates execution order. Higher values execute first, ensuring specific patterns (like currency symbols or temporal markers) are captured before generic text or numeric literals.
abstract class DomainMatcher {
  int get priority;
  MatchClaim? evaluate(List<ParsedSegment> segments, int position);
}

class MatchClaim {
  final int consumed;
  final dynamic resolved;

  const MatchClaim({required this.consumed, required this.resolved});
}
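To make the contract concrete, here is one way a percentage matcher might implement it. The article names `PercentageMatcher` but never shows its body, so the behavior below (claiming a numeric segment followed by `%` and resolving it to a fraction) is an assumption, as is the choice to resolve to a `double`. The supporting types are repeated so the sketch runs on its own.

```dart
enum TokenCategory { numeric, operator, literal, currency, temporal, unknown }

class ParsedSegment {
  final String raw;
  final TokenCategory category;
  final int offset;
  final int span;
  const ParsedSegment({
    required this.raw,
    required this.category,
    required this.offset,
    required this.span,
  });
}

class MatchClaim {
  final int consumed;
  final dynamic resolved;
  const MatchClaim({required this.consumed, required this.resolved});
}

abstract class DomainMatcher {
  int get priority;
  MatchClaim? evaluate(List<ParsedSegment> segments, int position);
}

/// Assumed behavior: claims "<number>%" and resolves it to a fraction.
class PercentageMatcher implements DomainMatcher {
  @override
  int get priority => 70;

  @override
  MatchClaim? evaluate(List<ParsedSegment> segments, int position) {
    // Two segments are needed: the number and the percent sign.
    if (position + 1 >= segments.length) return null;
    final number = segments[position];
    final sign = segments[position + 1];
    if (number.category != TokenCategory.numeric || sign.raw != '%') {
      return null;
    }
    final value = double.tryParse(number.raw);
    if (value == null) return null; // e.g. "12k" would need its own handling
    return MatchClaim(consumed: 2, resolved: value / 100);
  }
}
```

Returning `null` on every non-match path is what lets the router move on to the next matcher without any special-case logic.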
Step 3: Pipeline Execution Engine
The engine iterates through segments, querying matchers in descending priority order. If a matcher claims segments, the cursor advances. Unclaimed segments are skipped or passed to a fallback handler. This design ensures that adding a new domain requires zero modifications to existing matchers.
class ExpressionRouter {
  final List<DomainMatcher> _handlers;

  // Sort a copy so the caller's list (which may be const) is never mutated.
  ExpressionRouter(List<DomainMatcher> handlers)
      : _handlers = List.unmodifiable(
            [...handlers]..sort((a, b) => b.priority.compareTo(a.priority)));

  List<dynamic> resolve(List<ParsedSegment> segments) {
    final results = <dynamic>[];
    var cursor = 0;
    while (cursor < segments.length) {
      var claimed = false;
      for (final handler in _handlers) {
        final outcome = handler.evaluate(segments, cursor);
        // Ignore zero-length claims; they would stall the cursor forever.
        if (outcome != null && outcome.consumed > 0) {
          results.add(outcome.resolved);
          cursor += outcome.consumed;
          claimed = true;
          break;
        }
      }
      if (!claimed) cursor++; // no matcher claimed this segment: skip it
    }
    return results;
  }
}
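The interaction between priority order and the fallback path is easier to see with a stripped-down run. The sketch below is illustrative only: it works on plain strings instead of `ParsedSegment`, and the matcher bodies and the string results (`currency:…`, `number:…`, `unparsed:…`) are assumptions made to keep the output easy to inspect.

```dart
// Trimmed-down stand-ins for the article's types so the sketch runs on its own.
class MatchClaim {
  final int consumed;
  final Object? resolved;
  const MatchClaim(this.consumed, this.resolved);
}

abstract class DomainMatcher {
  int get priority;
  MatchClaim? evaluate(List<String> tokens, int position);
}

/// Claims "$" followed by a number, e.g. ["$", "50"].
class CurrencyMatcher implements DomainMatcher {
  @override
  int get priority => 80;

  @override
  MatchClaim? evaluate(List<String> tokens, int position) {
    if (tokens[position] != r'$' || position + 1 >= tokens.length) return null;
    final amount = double.tryParse(tokens[position + 1]);
    return amount == null
        ? null
        : MatchClaim(2, 'currency:${tokens[position + 1]}');
  }
}

/// Claims any bare number.
class NumberMatcher implements DomainMatcher {
  @override
  int get priority => 50;

  @override
  MatchClaim? evaluate(List<String> tokens, int position) =>
      double.tryParse(tokens[position]) == null
          ? null
          : MatchClaim(1, 'number:${tokens[position]}');
}

/// Lowest priority: claims exactly one token so nothing is dropped silently.
class FallbackMatcher implements DomainMatcher {
  @override
  int get priority => 10;

  @override
  MatchClaim? evaluate(List<String> tokens, int position) =>
      MatchClaim(1, 'unparsed:${tokens[position]}');
}

List<Object?> resolve(List<DomainMatcher> handlers, List<String> tokens) {
  // Registration order does not matter; priority decides who is asked first.
  final ordered = [...handlers]
    ..sort((a, b) => b.priority.compareTo(a.priority));
  final results = <Object?>[];
  var cursor = 0;
  while (cursor < tokens.length) {
    var advanced = false;
    for (final handler in ordered) {
      final claim = handler.evaluate(tokens, cursor);
      if (claim != null && claim.consumed > 0) {
        results.add(claim.resolved);
        cursor += claim.consumed;
        advanced = true;
        break;
      }
    }
    if (!advanced) cursor++;
  }
  return results;
}
```

With a fallback matcher registered, the skip branch is never reached, so unrecognized words stay visible in the results instead of disappearing. Whether that or silent skipping is preferable depends on how the UI presents partial parses.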
Step 4: State Management & Persistence
Riverpod 2 handles reactive state without boilerplate. Hive provides fast, offline-first serialization. The architecture separates computation from UI state, ensuring the parser remains pure and testable.
@riverpod
class ComputationEngine extends _$ComputationEngine {
  // Created once per notifier instance. Assigning a `late final` field inside
  // build() would throw if build() is ever re-run (for example after the
  // provider is invalidated).
  final ExpressionRouter _router = ExpressionRouter([
    CurrencyMatcher(),
    PercentageMatcher(),
    MathMatcher(),
    FallbackMatcher(),
  ]);

  @override
  EngineState build() => EngineState.initial();

  void ingest(String raw) {
    final segments = InputTokenizer().segment(raw);
    final resolved = _router.resolve(segments);
    state = state.copyWith(
      segments: segments,
      resolvedValues: resolved,
      fault: null,
    );
  }
}
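`EngineState` is not shown in the article. One plausible shape is sketched below; the field types and the sentinel-based `copyWith` are assumptions. The sentinel matters because a conventional `fault ?? this.fault` implementation would treat `fault: null` as "keep the old value", so the `ingest` call above would never clear a previous error.

```dart
/// Hypothetical state class; field names mirror the copyWith call in ingest().
class EngineState {
  final List<Object?> segments; // List<ParsedSegment> in the real engine
  final List<Object?> resolvedValues;
  final String? fault;

  const EngineState({
    required this.segments,
    required this.resolvedValues,
    this.fault,
  });

  factory EngineState.initial() =>
      const EngineState(segments: [], resolvedValues: []);

  // Sentinel that distinguishes "argument omitted" from "explicitly null".
  static const Object _keep = Object();

  EngineState copyWith({
    List<Object?>? segments,
    List<Object?>? resolvedValues,
    Object? fault = _keep,
  }) {
    return EngineState(
      segments: segments ?? this.segments,
      resolvedValues: resolvedValues ?? this.resolvedValues,
      // Only replace the fault when the caller actually passed one.
      fault: identical(fault, _keep) ? this.fault : fault as String?,
    );
  }
}
```

A code generator such as `freezed` handles the same null-versus-omitted distinction automatically, at the cost of an extra build step.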
Architecture Rationale
- Priority Queue over Regex: Regex chains become unreadable and computationally expensive when handling overlapping patterns. A priority-ordered matcher list, sorted once when the router is constructed, provides deterministic execution and a cost that grows linearly with segment count for a fixed set of matchers.
- Riverpod 2 over Provider: Riverpod’s compile-time safety and provider disposal model prevent memory leaks in long-running calculator sessions. It also enables easy testing through provider overrides.
- Hive over SQLite: Hive’s binary serialization is significantly faster for small, frequent writes (like calculation history). It requires zero native dependencies, simplifying the Android build pipeline and reducing APK size.
- Pure Parser Design: The pipeline never mutates UI state directly. It returns resolved values that the Riverpod provider consumes. This separation enables unit testing the parser in isolation from Flutter’s widget tree.
Pitfall Guide
Abstraction Inflation
- Explanation: LLMs default to adding base classes, factories, and defensive null-checks for scenarios that cannot occur in a controlled mobile environment. This bloats the codebase and obscures core logic.
- Fix: Enforce a “no backward-compatibility shims” rule. Explicitly instruct the AI to skip error handling for impossible states and remove unused parameters during code review. Use static analysis tools to flag unused abstractions.
API Hallucination
- Explanation: LLMs confidently generate parameters or methods that do not exist in the current framework version. This is especially common with Flutter widget properties and platform channel signatures.
- Fix: Never trust generated UI code without a compilation check. Run `flutter analyze` immediately after AI-generated changes. Maintain a pinned SDK version in `pubspec.yaml` and lock dependency versions to prevent silent API drift.
UX Delegation
- Explanation: AI can implement a UI spec flawlessly but cannot evaluate spacing, typography, or animation curves. Delegating design leads to interfaces that are functionally correct but visually broken, which hurts user retention.
- Fix: Keep all visual decisions human-authored. Provide the AI with exact pixel values, color hex codes, and animation durations. Treat the AI as a layout engine, not a designer. Conduct visual regression testing on physical devices.
Context Window Drift
- Explanation: As projects grow, LLMs lose track of earlier architectural decisions, introducing inconsistent patterns, duplicating logic, or violating established naming conventions.
- Fix: Maintain a single source-of-truth document (e.g., `ARCHITECTURE.md`) at the project root. Update it after every major refactor. Feed this file to the AI at the start of each session to re-anchor context and enforce consistency.
Test Case Neglect
- Explanation: Developers often let the AI generate both tests and implementation, leading to circular validation where tests only verify the AI’s assumptions rather than actual product requirements.
- Fix: Write test cases as plain English sentences first (e.g., “Input ‘$50 + 10%’ must return 55.00”). Let the AI convert these to code. This forces explicit requirement definition before implementation and catches logical gaps early.
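A plain-English requirement can be carried into code verbatim, so a failing check reports the sentence the human wrote. The sketch below assumes a stand-in `evaluate` function that handles only the `$<amount> +/- <percent>%` shape; the real engine would replace it. The `Requirement` class and the second test sentence are illustrative additions, not taken from the project.

```dart
/// Hypothetical stand-in for the real engine: handles only "$<amount> +/- <pct>%".
double evaluate(String input) {
  final match = RegExp(r'^\$(\d+(?:\.\d+)?)\s*([+-])\s*(\d+(?:\.\d+)?)%$')
      .firstMatch(input.trim());
  if (match == null) throw FormatException('Unsupported input', input);
  final amount = double.parse(match.group(1)!);
  final percent = double.parse(match.group(3)!) / 100;
  final delta = amount * percent;
  return match.group(2) == '+' ? amount + delta : amount - delta;
}

/// One human-written requirement plus the machine-checkable form of it.
class Requirement {
  final String sentence;
  final String input;
  final double expected;
  const Requirement(this.sentence, this.input, this.expected);
}

const requirements = [
  Requirement(r"Input '$50 + 10%' must return 55.00", r'$50 + 10%', 55.00),
  Requirement(r"Input '$200 - 15%' must return 170.00", r'$200 - 15%', 170.00),
];

/// Returns the sentences whose requirement is not met (within half a cent).
List<String> failures() => [
      for (final r in requirements)
        if ((evaluate(r.input) - r.expected).abs() > 0.005) r.sentence,
    ];
```

Keeping the sentence next to the expected value means the AI only translates requirements into `Requirement` rows and never gets to decide what the correct answer is.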
Feature Drift
- Explanation: Asking the AI “what should we build next?” yields generic, low-value suggestions that dilute the product’s core utility and scatter development focus.
- Fix: Maintain a strict product roadmap. The AI should only receive implementation tasks, not strategic direction. Feature selection remains a human responsibility. Use issue trackers with explicit acceptance criteria to constrain AI scope.
Offline State Corruption
- Explanation: Hive boxes can become corrupted if writes occur during app termination or if serialization adapters change without migration logic.
- Fix: Implement explicit box compaction routines. Version your Hive adapters and provide migration paths. Wrap all write operations in try-catch blocks that fall back to in-memory state if disk I/O fails.
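The try-catch fallback can be kept independent of the storage library by injecting the write function. In this sketch, `ResilientHistory` and its members are hypothetical names, and the `persist` callback stands in for whatever disk write the app uses (with Hive, that would typically be a call on an open box).

```dart
/// Hypothetical wrapper: [_persist] stands in for the real disk write.
class ResilientHistory {
  ResilientHistory(this._persist);

  final Future<void> Function(String entry) _persist;
  final List<String> _memory = [];

  /// True once at least one write has failed during this session.
  bool degraded = false;

  /// Entries that could not be persisted and live only in memory.
  List<String> get pending => List.unmodifiable(_memory);

  Future<void> record(String entry) async {
    try {
      await _persist(entry);
    } catch (_) {
      // Disk I/O failed: keep the entry in memory so the session keeps working.
      degraded = true;
      _memory.add(entry);
    }
  }
}
```

The catch-all clause is deliberate here because storage libraries may signal failures with `Error` subclasses as well as exceptions. Exposing `pending` gives the app a way to retry the writes or warn the user that history may not survive a restart.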
Production Bundle
Action Checklist
- Define token categories and boundary characters before writing any parser logic
- Implement the `DomainMatcher` interface with explicit priority values
- Register matchers in descending priority order during pipeline initialization
- Write test cases as English sentences, then convert to executable code
- Run `flutter analyze` after every AI-generated code block
- Maintain `ARCHITECTURE.md` and update it after each refactoring cycle
- Isolate UI state from computation state using Riverpod providers
- Benchmark Hive serialization speed against expected write frequency
- Implement explicit adapter versioning for local storage migrations
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Simple arithmetic only | Regex chain | Low overhead, fast to implement | Minimal dev time, high maintenance later |
| Multi-domain parsing (currency, dates, units) | Priority Pipeline | Isolates domains, prevents cascading failures | Higher initial setup, near-zero regression cost |
| Cloud-dependent calculator | REST API + SQLite | Offloads computation, enables sync | Server costs, latency, privacy concerns |
| Offline-first calculator | Local Pipeline + Hive | Zero latency, full privacy, no backend | Slightly larger APK, requires careful state management |
Configuration Template
// lib/core/pipeline_setup.dart
import 'package:flutter/widgets.dart';
import 'package:hive_flutter/hive_flutter.dart';
import 'package:riverpod_annotation/riverpod_annotation.dart';

part 'pipeline_setup.g.dart';

@Riverpod(keepAlive: true)
ExpressionRouter pipeline(PipelineRef ref) {
  return ExpressionRouter([
    CurrencyMatcher(priority: 80),
    PercentageMatcher(priority: 70),
    UnitMatcher(priority: 60),
    MathMatcher(priority: 50),
    FallbackMatcher(priority: 10),
  ]);
}

// Hive.box() only returns boxes that are already open, so initializeApp()
// must finish before this provider is first read.
@Riverpod(keepAlive: true)
Box calculationHistory(CalculationHistoryRef ref) {
  return Hive.box('calc_history');
}

// Initialize in main.dart
Future<void> initializeApp() async {
  WidgetsFlutterBinding.ensureInitialized();
  await Hive.initFlutter();
  // Register adapters here, before opening any box that stores custom types
  // Hive.registerAdapter(CalculationRecordAdapter());
  await Hive.openBox('calc_history');
}
Quick Start Guide
- Initialize Project: Run `flutter create calc_engine --platforms=android` and add `flutter_riverpod`, `riverpod_annotation`, `riverpod_generator`, `hive_flutter`, and `build_runner` to `pubspec.yaml`. Execute `flutter pub get`.
- Define Matchers: Create a `DomainMatcher` abstract class and implement at least three concrete matchers (e.g., `NumberMatcher`, `OperatorMatcher`, `CurrencyMatcher`) with distinct priority values. Ensure each returns `null` when it cannot claim tokens.
- Wire the Pipeline: Instantiate `ExpressionRouter` with your matchers, sort them by priority, and expose it via a Riverpod provider. Connect the provider to a state notifier that handles input ingestion.
- Connect UI: Bind a `TextField` to the Riverpod state notifier. Call `pipeline.resolve()` on input changes, rendering results reactively. Ensure the UI never mutates the parser directly.
- Validate: Run `flutter test` with English-to-code test cases. Verify that unrecognized tokens are safely skipped without throwing exceptions. Profile memory usage during rapid input to ensure no provider leaks.
