PRD-004: Task Scoring & Prioritization
Numeric scoring system with LLM-based reprioritization for intelligent task ordering
Status: Draft | Priority: P1 (Should Have) | Owner: TBD | Last Updated: 2025-01-28
Problem Statement
When agents query "what should I work on?", they need a single, comparable metric to rank tasks. Current approaches fail because:
- No scoring - agents pick arbitrarily or by creation order
- Human-only scoring - doesn't adapt to changing conditions
- Static priorities (P0/P1/P2) - too coarse, can't distinguish between 50 P1 tasks
We need numeric scores that:
- Can be set by humans for strategic priorities
- Can be updated by LLMs based on context
- Automatically factor in dependencies and blocking relationships
Scoring Model
Base Score (0-1000)
Set by humans or agents to indicate inherent importance:
| Range | Meaning | Example |
|---|---|---|
| 900-1000 | Critical / Blocking release | Security fix |
| 700-899 | High priority / Important feature | Core feature |
| 400-699 | Medium priority / Normal work | Standard task |
| 100-399 | Low priority / Nice to have | Polish |
| 0-99 | Backlog / Someday | Future idea |
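As a rough illustration of how the bands above could be used in code, the following sketch maps a base score to its band. The `ScoreBand` type and `classifyBaseScore` function are hypothetical names, not part of this PRD, and the sketch assumes the score has already been validated as an integer in the 0-1000 range.

```typescript
// Hypothetical helper: band names are shorthand for the "Meaning" column above.
type ScoreBand = "critical" | "high" | "medium" | "low" | "backlog";

// Assumes `score` is already a validated integer in 0-1000.
function classifyBaseScore(score: number): ScoreBand {
  if (score >= 900) return "critical"; // 900-1000
  if (score >= 700) return "high";     // 700-899
  if (score >= 400) return "medium";   // 400-699
  if (score >= 100) return "low";      // 100-399
  return "backlog";                    // 0-99
}
```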
Dynamic Adjustments
Applied at query time, not stored:
| Factor | Adjustment | Rationale |
|---|---|---|
| Blocking count | +25 per task | Unblocking work is valuable |
| Age > 48 hours | +100 | Old tasks shouldn't rot |
| Age > 24 hours, up to 48 | +50 | Mild age bonus; not stacked with the +100 bonus |
| Depth | -10 per level | Prefer root tasks over deep subtasks |
| Status = blocked | -1000 | Never show blocked tasks as ready |
Final Score Formula
final_score = base_score
            + (blocking_count * 25)
            + age_bonus
            - (depth * 10)
            + custom_adjustments
LLM Score Updates
Single Task Update
tx score-update tx-001 --reason "This is now blocking the release"
Batch Recalculation
tx reprioritize --context "We're focusing on performance this sprint"
What the LLM Considers
- Task title and description
- Current score and dependencies
- Provided context
- What tasks this blocks
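One way to picture how these four inputs might reach the LLM is a small prompt builder. This is only a sketch: the `TaskSummary` shape, the `buildReprioritizePrompt` name, and the prompt wording are all assumptions made for illustration, and the PRD does not specify a prompt format.

```typescript
// Illustrative shape only; the PRD does not define it.
interface TaskSummary {
  id: string;
  title: string;
  description: string;
  score: number;       // current base score
  dependsOn: string[]; // IDs of tasks this one waits on
  blocks: string[];    // IDs of tasks waiting on this one
}

// Renders the provided context plus one entry per task as plain text.
function buildReprioritizePrompt(tasks: TaskSummary[], context: string): string {
  const entries = tasks.map((t) =>
    [
      `- ${t.id}: ${t.title}`,
      `  description: ${t.description}`,
      `  current score: ${t.score}`,
      `  depends on: ${t.dependsOn.join(", ") || "none"}`,
      `  blocks: ${t.blocks.join(", ") || "none"}`,
    ].join("\n")
  );
  return [
    `Context: ${context}`,
    "Assign each task an integer base score from 0 to 1000.",
    "Tasks:",
    ...entries,
  ].join("\n");
}
```

Keeping the builder a pure function of tasks and context makes it easy to inspect exactly what the model would see before any API call is made.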
Requirements
Scoring Operations
| ID | Requirement | CLI Command |
|---|---|---|
| S-001 | Manually set base score | tx score <id> <value> |
| S-002 | Show current score breakdown | tx score <id> |
| S-003 | LLM recalculates all scores | tx reprioritize |
| S-004 | List sorted by score (default) | tx list --sort=score |
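For S-004, the ordering could be sketched as a descending sort on the final score. The `ScoredTask` shape and both function names below are hypothetical, and the tie-break on task ID is an assumption added so the order is deterministic; the PRD does not say how ties are resolved.

```typescript
// Hypothetical shape: finalScore is assumed to be computed before sorting.
interface ScoredTask {
  id: string;
  title: string;
  finalScore: number;
}

// Highest score first; ties broken by ID (an assumption, not a PRD rule).
function sortByScore(tasks: ScoredTask[]): ScoredTask[] {
  // Copy first so the caller's array is not mutated.
  return [...tasks].sort(
    (a, b) => b.finalScore - a.finalScore || a.id.localeCompare(b.id)
  );
}

// Formats each task as "<id> [<score>] <title>".
function formatList(tasks: ScoredTask[]): string[] {
  return sortByScore(tasks).map((t) => `${t.id} [${t.finalScore}] ${t.title}`);
}
```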
Scoring API
| Method | Description |
|---|---|
| TaskService.setScore(id, score) | Set base score |
| ScoreService.calculate(task) | Get final score with adjustments |
| ScoreService.recalculateAll(context?) | Batch LLM update |
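To make the `setScore` contract more concrete, here is a minimal in-memory sketch. Only the method name and its `(id, score)` parameters come from the table above; the return type, the synchronous signature, the `InMemoryTaskService` class, and the error behavior are assumptions. The range check follows the 0-1000 base score range and the integer constraint (S-005).

```typescript
interface Task {
  id: string;
  score: number; // base score
}

// Assumed signature: the PRD names the method but not its return type.
interface TaskService {
  setScore(id: string, score: number): Task;
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryTaskService implements TaskService {
  private tasks = new Map<string, Task>();

  constructor(initial: Task[]) {
    for (const t of initial) this.tasks.set(t.id, { ...t });
  }

  setScore(id: string, score: number): Task {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`unknown task: ${id}`);
    // Base scores are integers in the 0-1000 range.
    if (!Number.isInteger(score) || score < 0 || score > 1000) {
      throw new RangeError(`score must be an integer in 0-1000, got ${score}`);
    }
    task.score = score;
    return { ...task }; // return a copy so callers cannot mutate stored state
  }
}
```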
Constraints
| ID | Constraint | Rationale |
|---|---|---|
| S-005 | Scores are integers | No floating point comparison issues |
| S-006 | Base score stored in DB | Adjustments computed at runtime |
| S-007 | ANTHROPIC_API_KEY is optional | Manual tx score <id> <value> and dynamic adjustments work without an LLM; only LLM scoring commands such as tx reprioritize require the API key. If ANTHROPIC_API_KEY is set as an environment variable, it is used automatically. If it is not set, LLM scoring commands fail gracefully with a clear error message. |
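The graceful-failure behavior in S-007 might look something like the sketch below. The `LlmConfig` type, the `resolveLlmConfig` name, and the error message wording are hypothetical. The environment is passed in as a plain object so the function stays testable.

```typescript
// Result type: either a usable key or a message to show the user.
type LlmConfig =
  | { ok: true; apiKey: string }
  | { ok: false; message: string };

// Hypothetical helper; the message text is illustrative only.
function resolveLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const apiKey = env["ANTHROPIC_API_KEY"];
  if (apiKey === undefined || apiKey.trim() === "") {
    return {
      ok: false,
      message:
        "ANTHROPIC_API_KEY is not set. `tx reprioritize` requires it; " +
        "`tx score <id> <value>` works without it.",
    };
  }
  return { ok: true, apiKey };
}
```

In a Node.js CLI this would typically be called as `resolveLlmConfig(process.env)` at the start of an LLM scoring command, printing `message` and exiting with a non-zero status when `ok` is false.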
API Examples
Set Score
# Set base score
$ tx score tx-a1b2c3 800
Score updated: tx-a1b2c3 = 800
# View score breakdown
$ tx score tx-a1b2c3
Task: tx-a1b2c3
Base score: 800
Blocking bonus: +50 (2 tasks)
Age bonus: +100 (>48h)
Depth penalty: -20 (depth 2)
Final score: 930
Reprioritize
$ tx reprioritize --context "Focus on auth this week"
Analyzing 15 tasks...
Updated scores:
tx-auth1: 400 → 850 (auth-related)
tx-auth2: 300 → 800 (auth-related)
tx-perf1: 700 → 500 (not priority)
List by Score
$ tx list --sort=score
15 tasks (sorted by score):
tx-a1b2c3 [930] Implement JWT validation
tx-d4e5f6 [870] Add login endpoint
tx-g7h8i9 [650] Write auth tests
...
Score Calculation Implementation
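// Assumed shapes; the PRD does not define these types.
interface Task { score: number; status: string; createdAt: Date }
interface ScoreContext { blockingCount: number; depth: number }

function calculateFinalScore(task: Task, context: ScoreContext): number {
  let score = task.score // Base score from DB

  // Blocking bonus: tasks that unblock others are more valuable
  score += context.blockingCount * 25

  // Age bonus: don't let old tasks rot (the two tiers do not stack)
  const ageHours = (Date.now() - task.createdAt.getTime()) / (1000 * 60 * 60)
  if (ageHours > 48) {
    score += 100
  } else if (ageHours > 24) {
    score += 50
  }

  // Depth penalty: prefer root tasks over deep subtasks
  score -= context.depth * 10

  // Blocked penalty from the Dynamic Adjustments table
  // (assumes the status value is the string "blocked")
  if (task.status === "blocked") {
    score -= 1000
  }

  return score
}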
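The breakdown printed by tx score <id> in the API examples could come from a variant that returns each component instead of a single number. The sketch below is one possible shape: `ScoreBreakdown` and `scoreBreakdown` are hypothetical names, age is passed in as hours so the function is deterministic, and the blocked penalty is left out to match the four lines shown in the example output.

```typescript
interface ScoreBreakdown {
  base: number;
  blockingBonus: number;
  ageBonus: number;
  depthPenalty: number;
  final: number;
}

// Hypothetical helper mirroring the final score formula term by term.
function scoreBreakdown(
  base: number,
  blockingCount: number,
  ageHours: number,
  depth: number
): ScoreBreakdown {
  const blockingBonus = blockingCount * 25;
  // Age tiers do not stack: >48h gives 100, otherwise >24h gives 50.
  const ageBonus = ageHours > 48 ? 100 : ageHours > 24 ? 50 : 0;
  const depthPenalty = depth * 10;
  return {
    base,
    blockingBonus,
    ageBonus,
    depthPenalty,
    final: base + blockingBonus + ageBonus - depthPenalty,
  };
}

// Reproduces the tx-a1b2c3 example: 800 + 50 + 100 - 20
console.log(scoreBreakdown(800, 2, 50, 2).final); // 930
```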