Understanding task duration is the backbone of planning, scheduling, and time-aware reasoning. Agents that support work, study, training, or rest must understand how long activities take for each user. Variability in task duration is natural—and adapting to this variability is essential for agent utility across education, healthcare, productivity, and recreation.

This module formalizes duration tracking, feedback, and estimation as core components of Chronologue.


Concepts

  • Duration Estimate: Mean or typical time required to complete a task.
  • Range Estimate: Time bounds under known conditions.
  • Preference-Aware Adjustment: User-corrected duration refined through feedback.
  • Task-Instance Memory: Each observed duration stored as a discrete trace.

Test Suite Overview

Chronologue supports structured testing across domain-specific timing datasets:

  • Cooking Tasks – Vary by temperature, ingredient, and equipment.
  • Cleaning Tasks – Vary by square footage and mess level.
  • Exercise Tasks – Vary by capacity, intensity, and environment.
  • Travel Tasks – Vary by stops, mode, and traffic conditions.

Example trace from cooking:

{
  "task": "Roast 1 lb of salmon at 450°F",
  "model_estimate": "12–15 minutes",
  "actual_duration": 14,
  "feedback": "slightly overcooked",
  "preferred_duration": 12
}

Feedback Logging and User Correction

Duration estimation improves with user input. Logging outcomes helps refine future prompts.

Define per-user profile:

{
  "user_id": "u001",
  "preferences": {
    "salmon_doneness": "medium",
    "oven_runs_hot": true
  },
  "history": {
    "salmon_1lb_450F": 12
  }
}

Agents referencing this profile generate context-aware prompts:

“Last time you used 14 minutes for salmon and it was overcooked. Setting timer to 12 min.”


Conditional and Multi-Step Timing Chains

Support for conditional cooking logic:

If skin-on → roast 1 lb salmon at 450°F for 10–12 min
If skinless → check doneness at 9 min

This allows structured timing chains:

  • Initial estimate
  • Mid-task check
  • Final adjustment
  • Timer execution
[
  {"step": "Preheat oven to 450°F", "estimated_time": 8},
  {"step": "Roast salmon (1 lb, skin-on)", "estimated_time": 12},
  {"step": "Rest salmon", "estimated_time": 5}
]

Integrating with Chronologue Systems

The duration/ module integrates across the Chronologue architecture:

  • .ics Generator – Computes DTSTART, DURATION, and DTEND
  • Planner (MCP) – Allocates time blocks based on learned durations
  • Memory Module – Stores task-instance traces for retrieval
  • Feedback Logger – Updates model and user profiles from evaluations

Training Preference-Adaptive Agents

Agents adapt through repeated use and feedback. They can:

  • Parse recipe content into steps
  • Adjust estimates using historical memory
  • Reference internal sources (e.g., USDA guidelines, Bon Appétit)
  • Suggest timers with contextual notes

Use OpenAI function calling:

  • suggest_timer_duration(ingredient, weight, temp, past_outcomes)
  • log_cooking_outcome(task, result, feedback)

Duration as a Core Abstraction

Reasons to isolate duration/ into its own module:

Modularity

  • Prevents logic duplication
  • Keeps estimation separate from storage or UI

Reusability

  • Used across .ics generation, planning, memory replay, agent prompts

Extensibility

  • Add support for:
    • User-specific classifiers
    • Conditional adjustments
    • Duration uncertainty and confidence scores

9. Model Output Schema

Prompt:

Estimate how long it takes to grill a 1.5 lb skirt steak at 425°F.
For each doneness level (rare, medium-rare, medium, well-done), provide:

  • Mean cook time in minutes
  • A 2-value range
  • Recommended internal temperature (°F)
  • A brief description of technique or doneness
{
  "task": "Grill 1.5 lb skirt steak at 425°F",
  "doneness_profiles": {
    "rare": {
      "mean_minutes": 3.0,
      "range_minutes": [2.5, 3.5],
      "recommended_internal_temp_f": 120,
      "notes": "Flip after 1.5 min per side. Very red center."
    },
    "medium_rare": {
      "mean_minutes": 4.0,
      "range_minutes": [3.5, 4.5],
      "recommended_internal_temp_f": 130,
      "notes": "Pink center. Flip every 2 min."
    },
    "medium": {
      "mean_minutes": 5.0,
      "range_minutes": [4.5, 5.5],
      "recommended_internal_temp_f": 140,
      "notes": "Slightly pink center. More firm."
    },
    "well_done": {
      "mean_minutes": 6.0,
      "range_minutes": [5.5, 6.5],
      "recommended_internal_temp_f": 160,
      "notes": "Little or no pink. Can dry out easily."
    }
  }
}

10. Future Directions

Contextual Benchmarks Move from static test sets to user-specific duration evaluations.

Ground Truth Logging Verify ranges using user feedback to shrink distributional uncertainty.

Time-Aware Planning Agents Extend into work/study planning, lab scheduling, and task recovery chains using accurate personal time estimates.