Jujutsu VCS Integration: PoC Plan
🚀 Executive Summary
Goal: Determine whether Jujutsu VCS integration measurably improves our Agentic QE Fleet before we commit to a full implementation.
Approach: Evidence-based, risk-managed, and incremental.
Timeline: 2 weeks for the PoC, then a decision point. If it's a go, a 12-week full implementation follows.
Budget: 80 hours (one engineer for two weeks).
📊 Baseline Metrics (Current System)
System State (as of 2025-11-15)
- Total Test Files: 1,256
- QE Agents: 102 agent definitions
- Monthly Commits: 142 commits (last 30 days)
- Version Control: Git (the usual)
- No VCS abstraction layer
- No concurrent agent workspace isolation
Current Pain Points (TO BE VALIDATED)
- ❓ Agent workspace conflicts (how often?)
- ❓ Git staging overhead (how long does it take?)
- ❓ Concurrent execution bottlenecks (do they even happen?)
- ❓ Conflict resolution time (what's the baseline? We need to know!)
IMPORTANT: Each of these must be measured on the current Git setup before any improvement can be claimed.
🎯 Phase 1: Proof of Concept (2 Weeks)
Week 1: Installation & Validation
Day 1-2: Environment Setup
Goal: Get Jujutsu working in our environment (DevPod).
Tasks:
[ ] Install Jujutsu via package manager
[ ] Test basic jj commands in DevPod
[ ] Install agentic-jujutsu crate
[ ] Verify WASM bindings compile
[ ] Test basic WASM operations
Success Criteria:
- ✅ Jujutsu CLI works in DevPod
- ✅ WASM bindings compile without errors
- ✅ Can create/commit/query changes via WASM
Failure Exit: If the WASM bindings don't work, we stop, document the problems, and report what we found.
Day 3-5: Performance Baseline
Goal: Compare how fast Git is versus Jujutsu.
Benchmark Tests:
// `git`, `jj`, `repo`, and `measure` stand for wrappers that this plan does not define yet.
// Test 1: Create workspace
measure(() => git.clone(repo));
measure(() => jj.init(repo));
// Test 2: Commit changes
measure(() => {
  git.add('.');
  git.commit('message');
});
measure(() => {
  jj.commit('message'); // No staging step: jj snapshots the working copy automatically
});
// Test 3: Concurrent operations
measure(() => {
  // 3 agents editing simultaneously
  git.branch('agent-1'); git.checkout('agent-1');
  git.branch('agent-2'); git.checkout('agent-2');
  git.branch('agent-3'); git.checkout('agent-3');
});
measure(() => {
  // 3 Jujutsu workspaces
  jj.workspace.create('agent-1');
  jj.workspace.create('agent-2');
  jj.workspace.create('agent-3');
});
// Test 4: Conflict scenarios
// Create intentional conflicts, measure resolution time
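The benchmark tests above rely on a `measure` helper that the plan does not define. A minimal sketch of one possible shape follows, assuming synchronous operations and wall-clock timing with `performance.now()`. The names `Timing`, `median`, and `measure` are illustrative, not an existing API. Reporting the median of several runs is a design assumption, chosen to damp one-off outliers such as a cold disk cache.

```typescript
// Hypothetical helper: none of these names come from an existing library.
interface Timing {
  samplesMs: number[]; // one entry per run, in execution order
  medianMs: number; // median of the samples
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Runs `fn` `runs` times and records the wall-clock duration of each run.
function measure(fn: () => void, runs = 5): Timing {
  if (runs < 1) throw new RangeError("runs must be >= 1");
  const samplesMs: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samplesMs.push(performance.now() - start);
  }
  return { samplesMs, medianMs: median(samplesMs) };
}
```

Each benchmark would then call something like `measure(() => { /* git add + commit */ }, 10)` and record `medianMs` alongside the raw samples.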
Deliverable: A performance report with REAL numbers.
Example Output (illustrative placeholders, not measured results):
## Performance Benchmark Results
| Operation               | Git (ms) | Jujutsu (ms) | Improvement    |
| ----------------------- | -------- | ------------ | -------------- |
| Create workspace        | 450      | 120          | 3.75x faster   |
| Commit changes          | 85       | 15           | 5.67x faster   |
| 3 concurrent workspaces | 1,200    | 180          | 6.67x faster   |
| Resolve conflict (auto) | N/A      | 450          | New capability |
**Overall**: 4-7x performance improvement in tested scenarios
**Caveat**: Tested on DevPod with sample repo (50 MB)
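The "Improvement" column in a report like the one above is the Git time divided by the Jujutsu time. A small sketch of that calculation and of rendering one table row is shown here. `BenchmarkRow`, `speedup`, and `formatRow` are hypothetical names, and the sample values in the usage note are the placeholder figures from the example table, not measurements.

```typescript
// Hypothetical row shape for the benchmark report.
interface BenchmarkRow {
  operation: string;
  gitMs: number;
  jjMs: number;
}

// Speedup factor: how many times faster Jujutsu was than Git.
function speedup(gitMs: number, jjMs: number): number {
  if (gitMs <= 0 || jjMs <= 0) throw new RangeError("timings must be positive");
  return gitMs / jjMs;
}

// Renders one markdown table row, labelling regressions as "slower".
function formatRow(row: BenchmarkRow): string {
  const factor = speedup(row.gitMs, row.jjMs);
  const label =
    factor >= 1
      ? `${factor.toFixed(2)}x faster`
      : `${(1 / factor).toFixed(2)}x slower`;
  return `| ${row.operation} | ${row.gitMs} | ${row.jjMs} | ${label} |`;
}
```

For the placeholder row `{ operation: "Create workspace", gitMs: 450, jjMs: 120 }`, `speedup` gives 450 / 120 = 3.75. Labelling slowdowns explicitly keeps the report honest if Jujutsu turns out slower on some operation.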
Week 2: Integration Prototype
Day 6-8: Minimal VCS Adapter
Goal: Build the simplest abstraction layer possible.
Code:
// /src/vcs/base-adapter.ts
// The Workspace and Change types are not specified in this plan; define them during the PoC.
interface VCSAdapter {
  commit(message: string): Promise<void>;
  createWorkspace(name: string): Promise<Workspace>;
  getCurrentChanges(): Promise<Change[]>;
}
// /src/vcs/jujutsu-adapter.ts
class JujutsuAdapter implements VCSAdapter {
  // Minimal implementation using agentic-jujutsu
}
// /src/vcs/git-adapter.ts
class GitAdapter implements VCSAdapter {
  // Wrapper around existing Git calls
}
Success Criteria:
- ✅ Both adapters implement the same interface.
- ✅ Tests pass for both implementations.
- ✅ Can swap adapters via config.
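To illustrate the "swap adapters via config" criterion, here is a minimal sketch of a factory that picks an adapter from a config value. Everything beyond the `VCSAdapter` method signatures is an assumption: the `Workspace` and `Change` shapes, the `vcs` config key, and the `StubAdapter` class, which stands in for the real `GitAdapter` and `JujutsuAdapter` so the sketch runs without either tool installed.

```typescript
// Placeholder shapes; the plan does not define Workspace or Change.
interface Workspace { name: string; }
interface Change { path: string; }

interface VCSAdapter {
  commit(message: string): Promise<void>;
  createWorkspace(name: string): Promise<Workspace>;
  getCurrentChanges(): Promise<Change[]>;
}

type VCSKind = "git" | "jujutsu";

// Stub standing in for GitAdapter / JujutsuAdapter (hypothetical).
class StubAdapter implements VCSAdapter {
  readonly kind: VCSKind;
  readonly commits: string[] = [];
  constructor(kind: VCSKind) {
    this.kind = kind;
  }
  async commit(message: string): Promise<void> {
    this.commits.push(message);
  }
  async createWorkspace(name: string): Promise<Workspace> {
    return { name };
  }
  async getCurrentChanges(): Promise<Change[]> {
    return [];
  }
}

class VCSAdapterFactory {
  // `config.vcs` is a hypothetical setting; missing or unknown values fall back to Git.
  static create(config: { vcs?: string } = {}): StubAdapter {
    const kind: VCSKind = config.vcs === "jujutsu" ? "jujutsu" : "git";
    return new StubAdapter(kind);
  }
}
```

Defaulting to Git on any unrecognized value is a deliberate choice in this sketch, so a typo in configuration cannot silently switch an agent onto the experimental path.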
Day 9-10: Single Agent Integration
Goal: Test with ONE agent (qe-test-generator).
Integration:
// Modify qe-test-generator to use VCS adapter
const adapter = VCSAdapterFactory.create(); // Auto-detect
await adapter.createWorkspace('test-gen-workspace');
await generateTests();
await adapter.commit('Generated tests for UserService');
Test Scenarios:
- Generate tests with Git adapter → measure time.
- Generate tests with Jujutsu adapter → measure time.
- Run 3 concurrent test generations → measure conflicts.
Success Criteria:
- ✅ Agent works with both adapters.
- ✅ Jujutsu shows measurable performance improvement.
- ✅ No breaking changes to the existing workflow.
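The "3 concurrent test generations" scenario could be driven by a small harness like the sketch below. It assumes each generation run is an async function that rejects when it hits a workspace conflict. That convention, and the names `ConcurrencyResult` and `runConcurrently`, are assumptions of this example rather than part of the existing agent code.

```typescript
// Hypothetical harness; the names below are not part of any existing API.
interface ConcurrencyResult {
  succeeded: number;
  failed: number;
  elapsedMs: number;
}

// Starts all tasks without waiting for earlier ones to finish,
// then waits until every task has either resolved or rejected.
async function runConcurrently(
  tasks: Array<() => Promise<void>>,
): Promise<ConcurrencyResult> {
  const start = performance.now();
  // Map each task to true/false so one failure does not abort the others.
  const outcomes = await Promise.all(
    tasks.map((task) =>
      Promise.resolve()
        .then(task)
        .then(() => true, () => false),
    ),
  );
  const succeeded = outcomes.filter((ok) => ok).length;
  return {
    succeeded,
    failed: outcomes.length - succeeded,
    elapsedMs: performance.now() - start,
  };
}
```

Running the same three tasks once per adapter and comparing `failed` and `elapsedMs` would give the conflict count and timing that the scenario asks for.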
📈 Success Metrics (Evidence-Based)
Minimum Viable Success (PoC)
We need these to move to Phase 2:
| Metric | Target | Measurement |
|---|---|---|
| WASM Bindings Work | 100% | Can execute jj commands via WASM |
| Performance Improvement | ≥2x | Benchmarked commit/workspace operations |
| No Breaking Changes | 0 | Existing tests still pass |
| Single Agent Integration | Works | qe-test-generator uses adapter |
Decision Rule:
- ✅ ALL metrics met → Proceed to Phase 2 (full implementation).
- ⚠️ Performance <2x → Re-evaluate: Is it worth it?
- ❌ WASM doesn't work → Stop, document, and close the issue.
Stretch Goals (Nice to Have)
- Concurrent workspace isolation working.
- Auto-commit reducing overhead by >50%.
- Conflict detection API functional.
⚠️ Risk Assessment (Data-Driven)
Risk Matrix
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| WASM bindings fail in DevPod | Medium | High | Test in Week 1 Day 1, exit early if fails |
| Performance <2x improvement | Medium | Medium | Measure in Week 1, decide if worth continuing |
| Jujutsu API changes (pre-1.0) | High | Medium | Pin version, document API used |
| Integration complexity | Low | Low | Start with 1 agent, keep it simple |
| Team learning curve | Low | Low | Optional feature, comprehensive docs |
Mitigation Strategies
Technical Risks:
- Pin `agentic-jujutsu` version in `package.json`.
- Feature flag to disable if issues arise.
- Git fallback always available.
- Minimal changes to existing code.
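The feature-flag and Git-fallback mitigations can be combined in one small wrapper, sketched below. It assumes the wrapped operation is safe to retry on the Git path after a Jujutsu failure, which would need to be confirmed per operation. `withGitFallback` and its parameters are hypothetical names, and the sketch is synchronous for brevity even though real adapter calls would likely be async.

```typescript
// Hypothetical helper: runs the Jujutsu path only when the flag is on,
// and falls back to the Git path if the Jujutsu path throws.
function withGitFallback<T>(
  jujutsuEnabled: boolean,
  viaJujutsu: () => T,
  viaGit: () => T,
  onFallback: (error: unknown) => void = () => {},
): T {
  if (!jujutsuEnabled) return viaGit();
  try {
    return viaJujutsu();
  } catch (error) {
    onFallback(error); // report the failure so the fallback is not silent
    return viaGit();
  }
}
```

Passing a logger as `onFallback` would let the PoC count how often the fallback fires, which is useful evidence for the final report.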
Adoption Risks:
- Make it opt-in via `.aqe-ci.yml`.
- Document both Git and Jujutsu workflows.
- Internal dogfooding before external release.
📋 Deliverables (Evidence Required)
Week 1 Deliverables
[ ] Installation report (works/doesn't work)
[ ] Performance benchmarks (with real numbers)
[ ] WASM compatibility report
[ ] Go/No-Go decision document
Week 2 Deliverables
[ ] VCS adapter code (base + 2 implementations)
[ ] Single agent integration (qe-test-generator)
[ ] Test results (adapter tests + integration tests)
[ ] Final recommendation report
Final PoC Report Template
## Jujutsu VCS PoC - Final Report
### Executive Summary
* PoC Goal: Validate Jujutsu performance and feasibility
* Outcome: [Success / Partial Success / Failure]
* Recommendation: [Proceed / Revisit / Abandon]
### Measured Results
| Claim | Actual Result | Evidence |
| ---------------------- | ------------- | ---------------------------- |
| "23x faster" | X.Xx faster | Benchmark: tests/vcs-benchmark.ts |
| "95% conflict reduction" | Not tested | N/A |
| "WASM works in DevPod" | [Yes/No] | Installation log |
### Blockers Encountered
1. [Issue description + resolution/workaround]
### Lessons Learned
1. [What worked well]
2. [What didn't work]
3. [Unexpected findings]
### Recommendation
[Detailed reasoning for proceed/stop decision]
### Next Steps (if proceeding)
[Specific actions for Phase 2]
🚀 Phase 2: Full Implementation (IF PoC Succeeds)
Timeline: 12 weeks (a realistic estimate, versus the 4 weeks originally proposed).
Scope: Extend to all 18 QE agents.
Week 3-6: Adapter Layer (4 weeks)
- Complete VCS abstraction layer.
- Implement all operations (commit, merge, rebase, log).
- Add operation logging to AgentDB.
- 90%+ test coverage.
- Documentation.
Week 7-10: Agent Integration (4 weeks)
- Extend to all 18 QE agents (1-2 agents/week).
- Add workspace isolation per agent.
- Implement concurrent execution tests.
- Performance validation across all agents.
Week 11-12: Configuration & Rollout (2 weeks)
- Add `.aqe-ci.yml` VCS configuration.
- Feature flags for gradual rollout.
- Documentation (setup, migration, troubleshooting).
- Internal dogfooding.
- Beta release announcement.
Week 13-14: Monitoring & Iteration (2 weeks)
- Monitor production usage.
- Collect feedback.
- Fix bugs.
- Performance tuning.
- Case study documentation.
Total: 14 weeks (PoC + Implementation)
💰 Cost-Benefit Analysis
Investment
- PoC: 80 hours (2 weeks × 1 engineer)
- Full Implementation (if approved): 480 hours (12 weeks × 1 engineer)
- Total: 560 hours
Expected Benefits (IF claims are validated)
Measured after PoC:
- Performance improvement: X.Xx faster (TBD).
- Workspace isolation: Yes/No (TBD).
- Auto-commit savings: Y% overhead reduction (TBD).
Theoretical benefits (cannot validate until full implementation):
- Conflict reduction: Unknown (requires AI conflict resolution).
- Cost savings: Unknown (requires learning system).
- Audit trail: Yes (Jujutsu operation log exists).
Break-Even Analysis
If PoC shows 2x improvement:
- Time saved per pipeline: ~5 seconds (estimated).
- Pipelines per day: ~20 (estimated).
- Time saved per day: 100 seconds ≈ 1.67 minutes.
- Time saved per week (5 working days): ≈ 8.3 minutes.
- Break-even on the 560-hour total investment: ~4,000 weeks (roughly 77 years).
If PoC shows 5x improvement:
- Time saved per pipeline: ~15 seconds.
- Time saved per week: 25 minutes.
- Break-even: ~1,340 weeks (roughly 26 years).
If PoC shows 10x improvement:
- Time saved per pipeline: ~35 seconds.
- Time saved per week: ≈ 58 minutes.
- Break-even: ~580 weeks (roughly 11 years).
Conclusion: This is a long-term investment, not a quick win.
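The break-even arithmetic can be recomputed from its inputs with a short function, so the estimate is easy to rerun once the PoC produces measured savings. The sketch below assumes the investment is counted in engineer-hours, the savings in pipeline wall-clock seconds, and a 5-day working week. `BreakEvenInput` and `breakEvenWeeks` are illustrative names.

```typescript
// Inputs mirror the plan's estimates; field names are hypothetical.
interface BreakEvenInput {
  investmentHours: number; // e.g. PoC plus full implementation
  secondsSavedPerPipeline: number;
  pipelinesPerDay: number;
  workingDaysPerWeek: number; // assumed to be 5 in this plan
}

// Weeks until cumulative time saved equals the time invested.
function breakEvenWeeks(input: BreakEvenInput): number {
  const savedSecondsPerWeek =
    input.secondsSavedPerPipeline *
    input.pipelinesPerDay *
    input.workingDaysPerWeek;
  if (savedSecondsPerWeek <= 0) return Infinity; // never breaks even
  return (input.investmentHours * 3600) / savedSecondsPerWeek;
}

// Usage with the plan's estimated inputs for the 2x scenario.
const weeks = breakEvenWeeks({
  investmentHours: 560,
  secondsSavedPerPipeline: 5,
  pipelinesPerDay: 20,
  workingDaysPerWeek: 5,
});
console.log(`Break-even after ${Math.round(weeks)} weeks`);
```

This counts only pipeline time saved. Benefits such as workspace isolation or fewer conflicts are not captured and would need their own measurements.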
🎯 Decision Framework
After Week 1 (Go/No-Go Decision Point)
Proceed to Week 2 if:
- ✅ WASM bindings work in DevPod.
- ✅ Performance improvement ≥2x.
- ✅ No major blockers discovered.
Stop if:
- ❌ WASM bindings don't work.
- ❌ Performance <1.5x (marginal gain, high effort).
- ❌ Major blocker (API instability, compatibility).
After Week 2 (Full Implementation Decision)
Proceed to Phase 2 if:
- ✅ PoC fully successful (all success criteria met).
- ✅ Performance improvement ≥4x (justifies 12-week investment).
- ✅ Single agent integration works flawlessly.
- ✅ Team capacity available (1 engineer for 12 weeks).
Defer if:
- ⚠️ Performance 2-4x (good but not great).
- ⚠️ Higher priority features exist.
- ⚠️ Team capacity constrained.
Abandon if:
- ❌ PoC failed to meet minimum criteria.
- ❌ Performance <2x.
- ❌ Integration too complex.
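The two decision gates above can be written down as plain functions, which makes the thresholds explicit and testable. The following is a sketch, not an agreed policy. The type and field names are hypothetical, and treating a Week 1 speedup between 1.5x and 2x as "re-evaluate" is an assumption, since the plan does not cover that range.

```typescript
type Week1Decision = "proceed" | "re-evaluate" | "stop";
type Week2Decision = "proceed" | "defer" | "abandon";

// Week 1 gate: `speedup` is measured Git time divided by Jujutsu time.
function week1Gate(
  wasmWorks: boolean,
  speedup: number,
  majorBlocker: boolean,
): Week1Decision {
  if (!wasmWorks || majorBlocker || speedup < 1.5) return "stop";
  if (speedup >= 2) return "proceed";
  return "re-evaluate"; // 1.5x to 2x: not covered by the plan (assumption)
}

// Hypothetical input shape for the Week 2 gate.
interface Week2Input {
  pocCriteriaMet: boolean;
  speedup: number;
  integrationTooComplex: boolean;
  capacityAvailable: boolean;
  higherPriorityWork: boolean;
}

// Week 2 gate: abandon conditions are checked first, then proceed, else defer.
function week2Gate(input: Week2Input): Week2Decision {
  if (!input.pocCriteriaMet || input.speedup < 2 || input.integrationTooComplex) {
    return "abandon";
  }
  if (input.speedup >= 4 && input.capacityAvailable && !input.higherPriorityWork) {
    return "proceed";
  }
  return "defer";
}
```

Checking the abandon conditions first means a failed PoC can never be promoted to "defer" just because capacity happens to be constrained.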
📚 Research & References
Jujutsu VCS
- Docs: https://martinvonz.github.io/jj/
- GitHub: https://github.com/jj-vcs/jj
- Status: Pre-1.0 (API may change).
agentic-jujutsu
- Crates.io: https://crates.io/crates/agentic-jujutsu
- Status: Experimental WASM bindings
Comparison with Original Proposal (Issue #47)
| Aspect | Original Claim | This Plan |
|---|---|---|
| Timeline | 4 weeks | 2 weeks PoC + 12 weeks implementation |
| Performance | "23x faster" | Measure in PoC, don't promise |
| Scope | 18 agents + AI + learning | 1 agent in PoC, expand if successful |
| Success criteria | Vague ("learning improves") | Measurable (≥2x perf, WASM works) |
| Risk assessment | Underestimated | Realistic with exit points |
🚦 Next Steps
Immediate (This Week)
- Review this plan with stakeholders.
- Approve 2-week PoC budget (80 hours).
- Assign engineer to PoC work.
- Set up tracking (PoC kanban board).
Week 1 PoC Kickoff
- Install Jujutsu in DevPod.
- Test WASM bindings.
- Run performance benchmarks.
- Document findings.
Decision Points
- End of Week 1: Go/No-Go for Week 2
- End of Week 2: Proceed/Defer/Abandon Phase 2
📞 Contact & Questions
PoC Lead: TBD
Stakeholders: Product, Engineering, QE
Escalation: If blockers arise, escalate immediately (don't wait 2 weeks).
Created: 2025-11-15
Status: Awaiting Approval
Next Review: End of Week 1 (PoC)