# Compare Modes Guide
Detailed token cost analysis across all context generation modes to help you make informed decisions about token budgets.
## Syntax

```shell
stamp context --compare-modes [options]
```

## Options

| Option | Description |
|---|---|
| `--stats` | Emit single-line JSON stats (ideal for CI). When combined with `--compare-modes`, writes `context_compare_modes.json` for MCP (Model Context Protocol) integration. |
| `--quiet` | Suppress non-error output (useful for CI/CD pipelines). |
## Overview

Context generation supports multiple modes that balance information completeness against token cost:

- `none` - Contracts only (props, state, hooks, dependencies) with no source code
- `header` - Contracts plus JSDoc headers and function signatures
- `header+style` - Header mode plus extracted style metadata (Tailwind, SCSS, animations, layout)
- `full` - Everything, including complete source code

The `--compare-modes` flag generates a detailed comparison table showing token costs for all modes, helping you understand the tradeoffs.
**Token estimation:** `@dqbd/tiktoken` (GPT-4) and `@anthropic-ai/tokenizer` (Claude) ship as optional dependencies and are installed automatically when possible; if they are missing, the tool falls back to character-based estimation. See the Token Estimation section below for details, including manual installation.
## When to Use

- **Optimize token budgets** - See exact costs for different modes before committing
- **Evaluate style overhead** - Understand the token impact of including style metadata
- **Compare against raw source** - Calculate savings from using LogicStamp vs raw file dumps
- **Plan AI workflows** - Choose the most cost-effective mode for your use case
- **Budget for scale** - Project token costs for larger codebases
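For the last point, a rough scale projection can be scripted from one measured run. This is a sketch with made-up sample numbers (a 120-file project measured at 77,533 header-mode tokens, projected to 500 files); it is not output from the tool:

```shell
# Hypothetical projection: scale a measured per-file token cost to a larger
# codebase. The sample figures below are assumptions for illustration.
project_tokens() {
  # $1 = measured tokens, $2 = measured file count, $3 = target file count
  awk -v tokens="$1" -v files="$2" -v target="$3" \
    'BEGIN { printf "%d\n", tokens / files * target }'
}

project_tokens 77533 120 500   # prints 323054
```

Treat the result as an order-of-magnitude estimate; per-file token cost varies widely across components.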
## Output Format

The `--compare-modes` output includes three sections:

### 1. Token Estimation Method

Shows which tokenizers are being used:

```
📊 Mode Comparison
Token estimation: GPT-4o (tiktoken) | Claude (tokenizer)
```

Or, if tokenizers aren't installed (automatic installation failed or was skipped):

```
📊 Mode Comparison
Token estimation: GPT-4o (approximation) | Claude (approximation)

💡 Tip: Tokenizers are included as optional dependencies. If installation failed, manually install @dqbd/tiktoken (GPT-4) and/or @anthropic-ai/tokenizer (Claude) for accurate token counts
```

### 2. Comparison vs Raw Source
Shows token savings compared to dumping raw source files:

```
Comparison:
Mode         | Tokens GPT-4o | Tokens Claude | Savings vs Raw Source
-------------|---------------|---------------|------------------------
Raw source   | 229,087       | 203,633       | 0%
Header       | 77,533        | 84,245        | 66%
Header+style | 158,696       | 172,061       | 31%
```

Interpretation:

- **Raw source** - Baseline showing tokens for all source code without any processing
- **Header** - Typical savings of 60-70% by extracting only contracts and signatures
- **Header+style** - Moderate savings of 25-40% when including style metadata
### 3. Mode Breakdown vs Full Context

Shows savings compared to the maximum-information mode:

```
Mode breakdown:
Mode         | Tokens GPT-4o | Tokens Claude | Savings vs Full Context
-------------|---------------|---------------|--------------------------
none         | 46,520        | 50,547        | 85%
header       | 77,533        | 84,245        | 75%
header+style | 158,696       | 172,061       | 48%
full         | 306,620       | 287,878       | 0%
```

Interpretation:

- **none** - Maximum compression, 80-90% savings, contracts only
- **header** - Balanced compression, 70-80% savings, includes function signatures
- **header+style** - Moderate compression, 40-60% savings, adds visual context
- **full** - No compression, includes all source code
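The savings columns in both tables are simple ratios. A quick sketch reproducing them from the sample token counts above (rounding to the nearest percent is an assumption about how the tool rounds):

```shell
# Savings = 1 - mode_tokens / baseline_tokens, shown as a rounded percent.
savings_pct() {
  awk -v mode="$1" -v base="$2" \
    'BEGIN { printf "%d\n", (1 - mode / base) * 100 + 0.5 }'
}

savings_pct 77533  229087   # header vs raw source, prints 66
savings_pct 158696 229087   # header+style vs raw,  prints 31
savings_pct 46520  306620   # none vs full context, prints 85
```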
## Token Estimation

### Default: Character-Based Approximation
By default, token estimation uses character-based approximations:
- GPT-4o-mini: ~4 characters per token
- Claude: ~4.5 characters per token
These approximations are reasonably accurate for most codebases (typically within 10-15% of actual token counts).
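A minimal sketch of this character-based estimate (the truncating division is an assumption; the tool's exact rounding may differ):

```shell
# Character-based approximation: ~4 chars/token (GPT-4o-mini), ~4.5 (Claude).
approx_tokens() {
  # $1 = text, $2 = chars-per-token ratio
  printf '%s' "$1" | awk -v ratio="$2" \
    '{ chars += length($0) } END { printf "%d\n", chars / ratio }'
}

text="export function add(a: number, b: number) { return a + b }"
approx_tokens "$text" 4     # GPT-4o-mini estimate
approx_tokens "$text" 4.5   # Claude estimate
```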
### Accurate: Optional Tokenizers
LogicStamp Context includes @dqbd/tiktoken (GPT-4) and @anthropic-ai/tokenizer (Claude) as optional dependencies. npm will automatically attempt to install them when you install logicstamp-context. If installation succeeds, you get model-accurate token counts. If installation fails or is skipped (normal for optional dependencies), the tool gracefully falls back to character-based estimation.
Behavior:

- npm automatically tries to install the tokenizers when installing `logicstamp-context`
- If installed, they are automatically detected and used for accurate counts
- If not installed (installation failed or was skipped), the tool gracefully falls back to approximation
- No configuration required; detection works automatically
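The fallback can be pictured in a few lines. This is a hypothetical sketch only: checking `node_modules` directly is an assumption, and the real tool resolves the packages at runtime rather than inspecting the filesystem:

```shell
# Hypothetical fallback logic: prefer an installed tokenizer, else approximate.
estimation_method() {
  if [ -d "node_modules/@dqbd/tiktoken" ]; then
    echo "tiktoken"
  else
    echo "approximation"
  fi
}

estimation_method
```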
**Manual Installation (Optional):**

The tool works fine without tokenizers (it uses approximation). Only install manually if:

- You need accurate token counts (not approximations)
- AND automatic installation failed or was skipped

```shell
# For accurate GPT-4 token counts
npm install @dqbd/tiktoken

# For accurate Claude token counts
npm install @anthropic-ai/tokenizer

# Install both for complete accuracy
npm install @dqbd/tiktoken @anthropic-ai/tokenizer
```

## Mode Selection Guide
### none - Maximum Compression
Best for:
- CI/CD contract validation
- Dependency graph analysis
- Architecture reviews without implementation details
- Maximum token efficiency
Limitations:
- No source code or implementation details
- No visual/styling information
- Limited context for code generation tasks
Token cost: Typically 15-20% of raw source
### header - Balanced Compression
Best for:
- General AI chat workflows (default for llm-chat profile)
- Code review and refactoring
- Understanding component interfaces
- Most LLM interactions
Includes:
- Full contracts (props, state, hooks)
- JSDoc headers and comments
- Function signatures and types
- Dependency relationships
Token cost: Typically 28-32% of raw source
### header+style - Visual Context
Best for:
- UI/UX discussions with AI
- Design system maintenance
- Frontend code generation
- Visual consistency reviews
Includes:
- Everything from header mode
- Tailwind/CSS class patterns
- SCSS/CSS module analysis
- Animation metadata
- Layout patterns (flex/grid)
- Color and spacing patterns
Token cost: Typically 58-65% of raw source
### full - Complete Context
Best for:
- Deep implementation reviews
- Complex refactoring tasks
- Bug investigation requiring source
- When AI needs all implementation details
Includes:
- Everything from header+style mode
- Complete source code for all components
- Full file contents
Token cost: Typically 125-135% of raw source (plus contract overhead)
## Example Workflows

### Budget Planning

```shell
# See costs before generating context
stamp context --compare-modes

# Choose appropriate mode based on budget
stamp context --include-code header --max-nodes 50
```

### Style Cost Analysis
```shell
# Compare with and without style metadata
stamp context --compare-modes

# Enable style only if budget allows
stamp context style
```

### Production Optimization
```shell
# Audit token costs across the repository
stamp context --compare-modes | tee token-analysis.txt

# Switch to a more efficient mode if needed
stamp context --include-code none --profile ci-strict
```

### Multi-Repo Comparison
```shell
# Compare token costs across multiple projects
for repo in api web mobile; do
  echo "=== $repo ==="
  cd "$repo"
  stamp context --compare-modes --quiet
  cd ..
done
```

### MCP Integration
```shell
# Generate comparison stats for MCP servers; creates context_compare_modes.json
# with structured data for MCP integration
stamp context --compare-modes --stats
```

## Common Questions
### Why are my numbers different from raw file sizes?
Token counts are not the same as character counts. Tokenizers split text into semantic units: common words = 1 token, rare words = multiple tokens, code symbols vary widely, and whitespace is typically compressed.
### Should I always use accurate tokenizers?
Use approximation when: rough estimates are sufficient, during development/prototyping phase, token costs aren't critical, or you want zero-configuration setup. Use tokenizers when: precise costs matter for budgeting, production deployments, cost-sensitive workflows, or comparing against other tools.
### How much overhead do contracts add?
In full mode, contracts add ~30% overhead compared to raw source due to the structured contract metadata (JSON structure, type information, dependency relationships). However, contracts enable structured dependency graphs, semantic component analysis, missing dependency tracking, reproducible builds, and better AI comprehension through structured data. The overhead is typically worth it for AI context generation, but consider using header mode (which adds minimal overhead) for most use cases to maximize token efficiency.
### Why do the savings percentages seem generous?
The "savings vs raw source" figures compare LogicStamp output against a simple concatenation of all source files:

- **Raw source** - just the file contents concatenated (no structure, no metadata)
- **Full mode** - contracts + complete source, roughly 130% of raw source (due to contract overhead)
- **Header mode** - contracts + signatures only, roughly 30% of raw source

So while header mode saves ~70% compared to raw source, if you need complete source code, full mode actually costs MORE than raw source because of the structured contract format. The real value is that header mode provides most of the semantic information at a fraction of the cost. In practice, use header mode for most AI interactions: it delivers contracts, types, and function signatures (what you need 90% of the time) at roughly 30% of the raw source token cost.
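Using the sample numbers from the tables earlier, these ratios are easy to verify (rounding to the nearest percent is an assumption):

```shell
# Mode cost as a percentage of raw-source tokens.
cost_vs_raw_pct() {
  awk -v mode="$1" -v raw="$2" 'BEGIN { printf "%d\n", mode / raw * 100 + 0.5 }'
}

cost_vs_raw_pct 306620 229087   # full mode:   prints 134 (costs more than raw)
cost_vs_raw_pct 77533  229087   # header mode: prints 34
```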
### Can I compare specific folders?

Yes, run the comparison on a subset:

```shell
stamp context ./src/components --compare-modes
```
### Does `--compare-modes` write files?
By default, --compare-modes is analysis-only: generates contracts in memory, computes token estimates, displays comparison tables, and exits without writing files. However, when combined with --stats, it writes context_compare_modes.json for MCP integration. Use stamp context (without the flag) to actually generate context files.
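If you consume `context_compare_modes.json` in scripts, note that the field layout shown below is a guess for illustration only; the actual schema is not documented in this guide. A jq-free sketch against that assumed structure:

```shell
# Hypothetical schema - the real file's structure may differ. Written here
# only so the extraction below has something to run against.
cat > context_compare_modes.json <<'EOF'
{ "modes": { "none": { "gpt4o": 46520 }, "full": { "gpt4o": 306620 } } }
EOF

# Pull one figure with POSIX tools (use jq instead if it is available):
grep -o '"none": { "gpt4o": [0-9]*' context_compare_modes.json \
  | grep -o '[0-9][0-9]*$'
```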
## Performance Notes

- `--compare-modes` takes longer than normal generation (2-3x)
- Regenerates contracts with and without style for an accurate comparison
- Uses in-memory processing; no disk writes
- Typical execution: 5-15 seconds for medium projects (50-150 files)