# Performance Tuning Guide

Optimize spectryn for large-scale projects with thousands of stories.

## Performance Overview

spectryn is designed to handle large workloads efficiently. This guide covers:
- Parallel processing configuration
- Caching strategies
- Network optimization
- Memory management
- Monitoring and profiling
## Quick Wins

### 1. Enable Parallel Processing

```yaml
# spectryn.yaml
performance:
  parallel_sync: true
  max_workers: 4  # Adjust based on CPU cores
```

```bash
# Or via CLI
spectryn sync --parallel --workers 4 --markdown EPIC.md
```

**Impact:** 3-4x faster for large epics with many stories.
### 2. Use Incremental Sync

Only sync changed items instead of everything:

```bash
# Sync only changed stories
spectryn sync --incremental --markdown EPIC.md

# Check what would be synced
spectryn diff --markdown EPIC.md --epic PROJ-123
```

**Impact:** 10-50x faster for subsequent syncs.
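Under the hood, incremental sync needs some form of change detection. A minimal sketch of the idea using content hashes (illustrative only; spectryn's actual sync-state format is not shown here):

```python
import hashlib


def snapshot(stories):
    """Record a content hash per story after a successful sync."""
    return {
        sid: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for sid, body in stories.items()
    }


def changed_stories(stories, last_synced):
    """Return only the stories whose content differs from the last snapshot."""
    return {
        sid: body
        for sid, body in stories.items()
        if last_synced.get(sid) != hashlib.sha256(body.encode("utf-8")).hexdigest()
    }
```

Skipping unchanged stories is what turns a full sync into a handful of API calls on subsequent runs.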
### 3. Enable Caching

```yaml
# spectryn.yaml
cache:
  enabled: true
  ttl: 3600  # 1 hour
  backend: memory  # or "redis" for multi-process
```

Cached data:

- Project metadata (issue types, statuses, fields)
- User lists
- Recently accessed issues
- Schema validation results

**Impact:** Reduces API calls by 40-60%.
## Configuration Reference

### Full Performance Configuration

```yaml
# spectryn.yaml
performance:
  # Parallel processing
  parallel_sync: true
  max_workers: 4

  # Batching
  batch_size: 50
  batch_delay_ms: 100

  # Rate limiting
  rate_limit: 100  # requests per second
  rate_limit_burst: 20

  # Timeouts
  request_timeout: 30
  connect_timeout: 10

  # Memory
  max_memory_mb: 512
  streaming_threshold_mb: 10

cache:
  enabled: true
  backend: memory  # memory, file, redis
  ttl: 3600
  max_size: 1000

  # Redis settings (if backend: redis)
  redis:
    url: redis://localhost:6379
    db: 0
    prefix: "spectryn:"

network:
  # Connection pooling
  pool_size: 10
  pool_maxsize: 20
  pool_block: false

  # Retries
  max_retries: 3
  retry_backoff: exponential
  retry_backoff_factor: 0.5
```

## Parallel Processing
### How It Works

```text
┌─────────────────────────────────────────────────────────┐
│                      Main Process                       │
│ ┌─────────────────────────────────────────────────────┐ │
│ │                     Thread Pool                     │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Worker 1 │ │ Worker 2 │ │ Worker 3 │ │ Worker 4 │ │ │
│ │ │  Epic A  │ │  Epic B  │ │ Story 1  │ │ Story 2  │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```

### Tuning Workers
| System | Recommended Workers |
|---|---|
| 2 cores | 2-3 |
| 4 cores | 4-6 |
| 8+ cores | 6-8 |
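Worker counts interact with rate limits: the `rate_limit` / `rate_limit_burst` pair in the configuration describes a token bucket, which can be sketched as follows (a simplified model, not spectryn's implementation):

```python
import time


class TokenBucket:
    """Token-bucket limiter: sustained `rate` requests/s, bursts up to `burst`."""

    def __init__(self, rate=100.0, burst=20, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)  # start full: a burst is allowed immediately
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; return False if the caller should wait."""
        now = self.clock()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Each worker asks the bucket before issuing a request, so adding workers never pushes the aggregate request rate past the configured limit.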
> **Warning:** More workers ≠ faster. Too many workers can cause:
>
> - Rate limiting from APIs
> - Memory exhaustion
> - Context switching overhead
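Because sync work is network-bound, the worker-pool model above maps naturally onto threads. A sketch using Python's `concurrent.futures` (the `sync_story` body is a placeholder, not spectryn's API):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 4  # mirrors performance.max_workers


def sync_story(story_id):
    """Placeholder for the per-story sync call (network-bound work)."""
    return f"{story_id}: synced"


def parallel_sync(story_ids, max_workers=MAX_WORKERS):
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(sync_story, sid): sid for sid in story_ids}
        for future in as_completed(futures):
            sid = futures[future]
            try:
                results[sid] = future.result()
            except Exception as exc:  # collect failures instead of aborting the run
                results[sid] = f"failed: {exc}"
    return results
```

Threads (rather than processes) are the right fit here because the workers spend most of their time waiting on HTTP responses, not the CPU.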
### Per-Tracker Limits

Different trackers have different rate limits:

```yaml
# spectryn.yaml
trackers:
  jira:
    rate_limit: 100  # Jira Cloud: ~100 req/s
    max_workers: 4
  github:
    rate_limit: 30   # GitHub: 5000/hour ≈ 83/min
    max_workers: 2
  linear:
    rate_limit: 60   # Linear: ~60 req/s
    max_workers: 3
```

## Caching Strategies
### Memory Cache (Default)

Best for single-process usage:

```yaml
cache:
  backend: memory
  ttl: 3600
  max_size: 1000
```

**Pros:** Fast, no setup. **Cons:** Lost on restart, not shared between processes.
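The `ttl` / `max_size` semantics can be illustrated with a minimal in-memory cache (a sketch, not spectryn's implementation):

```python
import time
from collections import OrderedDict


class TTLCache:
    """Tiny in-memory cache with TTL expiry and a max_size bound (LRU eviction)."""

    def __init__(self, ttl=3600, max_size=1000):
        self.ttl = ttl
        self.max_size = max_size
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        self._data.move_to_end(key)  # mark as recently used
        return value

    def set(self, key, value):
        if len(self._data) >= self.max_size and key not in self._data:
            self._data.popitem(last=False)  # evict least recently used
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
```

`max_size` is what keeps a long-running sync from hoarding memory: once the cache is full, the entries least likely to be reused are dropped first.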
### File Cache

Persists between runs:

```yaml
cache:
  backend: file
  path: .spectryn/cache
  ttl: 86400  # 24 hours
```

**Pros:** Survives restarts, simple. **Cons:** Slower than memory, single machine.
### Redis Cache

Best for CI/CD and multi-instance:

```yaml
cache:
  backend: redis
  redis:
    url: redis://localhost:6379
    db: 0
    prefix: "spectryn:"
  ttl: 3600
```

```bash
# Docker Redis for local development
docker run -d -p 6379:6379 redis:alpine
```

**Pros:** Shared across processes/machines, fast. **Cons:** Requires Redis setup.
### Cache Invalidation

```bash
# Clear all caches
spectryn cache clear

# Clear specific cache
spectryn cache clear --type metadata

# Clear for specific project
spectryn cache clear --project PROJ
```

## Network Optimization
### Connection Pooling

Reuse HTTP connections for better performance:

```yaml
network:
  pool_size: 10      # Initial connections
  pool_maxsize: 20   # Maximum connections
  pool_block: false  # Don't block when pool exhausted
```

### Timeout Configuration
```yaml
network:
  connect_timeout: 10  # Connection establishment
  request_timeout: 30  # Full request completion
  read_timeout: 60     # For large responses
```

### Retry Strategy
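Assuming urllib3-style semantics, `retry_backoff_factor: 0.5` with exponential backoff waits roughly 0.5s, 1s, then 2s between attempts. A sketch of the retry loop (illustrative; `send` stands in for the real HTTP call):

```python
import time

RETRYABLE = {429, 500, 502, 503}  # mirrors the retry_on list below


def request_with_retries(send, max_retries=3, backoff_factor=0.5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or retries run out.

    Waits backoff_factor * 2**attempt seconds between attempts
    (0.5s, 1s, 2s for the defaults).
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_retries:
            sleep(backoff_factor * (2 ** attempt))
    return status, body
```

Exponential growth is what lets the client back off quickly enough to recover from rate limiting without giving up on transient failures.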
```yaml
network:
  max_retries: 3
  retry_backoff: exponential  # exponential, linear, constant
  retry_backoff_factor: 0.5
  retry_on:
    - 429  # Rate limited
    - 500  # Server error
    - 502  # Bad gateway
    - 503  # Service unavailable
```

## Memory Management
### Streaming Parser

For very large markdown files (>10MB):

```yaml
performance:
  streaming_threshold_mb: 10
  chunk_size_kb: 64
```

```bash
# Force streaming mode
spectryn sync --streaming --markdown huge-epic.md
```

### Memory Limits
```yaml
performance:
  max_memory_mb: 512
```

When the limit is approached:
- Caches are cleared
- Processing switches to streaming mode
- Batch sizes are reduced
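The streaming switch boils down to reading the file in bounded chunks instead of loading it whole. A sketch (a real streaming parser must also handle story boundaries that straddle chunk edges, which this ignores):

```python
def read_in_chunks(path, chunk_size_kb=64):
    """Yield a large file in fixed-size chunks instead of loading it whole,
    keeping peak memory near chunk_size_kb regardless of file size."""
    chunk_size = chunk_size_kb * 1024
    with open(path, "r", encoding="utf-8") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Because the generator holds only one chunk at a time, memory stays flat whether the epic file is 10MB or 1GB.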
## Batching Operations

### Batch Sync

Process stories in batches to optimize API calls:

```yaml
performance:
  batch_size: 50
  batch_delay_ms: 100  # Pause between batches
```

```bash
# Explicit batch mode
spectryn sync --batch --batch-size 100 --markdown EPIC.md
```

### GraphQL Batching
For GitHub and Linear, batch multiple queries:

```yaml
github:
  graphql_batch: true
  graphql_batch_size: 20

linear:
  graphql_batch: true
  graphql_batch_size: 50
```

## Monitoring & Profiling
### Built-in Stats

```bash
# Show sync statistics
spectryn stats --markdown EPIC.md

# Output:
# Stories: 150
# Subtasks: 423
# API calls: 45
# Cache hits: 312 (87%)
# Duration: 12.3s
# Throughput: 12.2 stories/s
```

### Detailed Timing
```bash
# Enable timing breakdown
spectryn sync --verbose --timing --markdown EPIC.md

# Output:
# Parse markdown: 0.2s
# Fetch tracker state: 2.1s
# Diff calculation: 0.1s
# Create stories: 4.5s
# Update subtasks: 3.2s
# Total: 10.1s
```

### Profiling
```bash
# Profile CPU usage
python -m cProfile -o profile.out -m spectryn sync --markdown EPIC.md
python -m pstats profile.out

# Profile memory
python -m memory_profiler -m spectryn sync --markdown EPIC.md
```

## Benchmarks
### Test Your Setup

```bash
# Run benchmark suite
spectryn benchmark --stories 100 --subtasks 500

# Output:
# Benchmark Results
# ─────────────────
# Parse (100 stories): 45ms
# Diff (100 stories): 12ms
# Serialize (100 stories): 8ms
# Full sync (dry-run): 1.2s
# Estimated real sync: 8.5s
```

### Reference Benchmarks
| Operation | 100 stories | 1000 stories | 5000 stories |
|---|---|---|---|
| Parse | 45ms | 350ms | 1.8s |
| Diff | 12ms | 95ms | 450ms |
| Sync (parallel) | 8s | 45s | 3.5min |
| Sync (sequential) | 25s | 4min | 20min |
*Tested on M1 MacBook Pro, Jira Cloud, 50ms latency.*
## Environment-Specific Tuning

### CI/CD Pipelines

```yaml
# spectryn.yaml for CI
performance:
  parallel_sync: true
  max_workers: 2  # CI runners often have 2 cores

cache:
  backend: file
  path: /tmp/spectryn-cache
  ttl: 300  # 5 minutes (single pipeline)
```

### Local Development
```yaml
# spectryn.yaml for development
performance:
  parallel_sync: true
  max_workers: 4

cache:
  backend: memory
  ttl: 3600
```

### Production Server
```yaml
# spectryn.yaml for server deployment
performance:
  parallel_sync: true
  max_workers: 8
  max_memory_mb: 2048

cache:
  backend: redis
  redis:
    url: ${REDIS_URL}
    prefix: "spectryn:prod:"
```

## Troubleshooting Performance
### Identifying Bottlenecks

```bash
# Detailed diagnostics
spectryn doctor --performance

# Output:
# Performance Diagnostics
# ─────────────────────────
# ✓ Python version: 3.11.5 (optimal)
# ✓ Available memory: 8GB
# ✓ CPU cores: 4
# ✓ Network latency to Jira: 45ms
# ⚠ Cache backend: memory (consider redis for CI)
# ✓ Connection pool: healthy
```

### Common Issues
| Symptom | Cause | Solution |
|---|---|---|
| Sync takes >1min for 50 stories | Sequential processing | Enable parallel_sync |
| High API call count | No caching | Enable cache |
| Memory spikes | Large files | Enable streaming |
| Rate limit errors | Too many workers | Reduce workers |
| Timeouts | Network issues | Increase timeouts |
## Best Practices

### Recommended Setup

- Enable parallel processing with 4 workers
- Enable caching (memory for local, Redis for CI)
- Use incremental sync after initial sync
- Set appropriate rate limits per tracker
- Monitor with the `--timing` flag periodically
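Taken together, the recommendations above might combine into a starting config like this (illustrative values drawn from this guide):

```yaml
# spectryn.yaml — a typical local starting point
performance:
  parallel_sync: true
  max_workers: 4

cache:
  enabled: true
  backend: memory  # switch to redis for CI/multi-process
  ttl: 3600
```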
### Avoid
- Running >8 workers (diminishing returns)
- Disabling retries (transient failures happen)
- Very large batch sizes (>200)
- Ignoring rate limit warnings