Reasoning Models¶

Flux supports reasoning/chain-of-thought models that show their thinking process alongside tool calling. When enabled, the model's reasoning traces are captured, passed through on subsequent turns, and stored in working memory.

Quick Start¶

from flux.tasks.ai import agent, system_tools
from flux.tasks.ai.memory import working_memory

assistant = await agent(
    "You are a helpful assistant.",
    model="ollama/qwen3",
    tools=system_tools(workspace="/path/to/project"),
    working_memory=working_memory(),
    reasoning_effort="high",  # "low", "medium", "high"
)

answer = await assistant("Analyze the codebase")

How It Works¶

When reasoning_effort is set, the model produces a reasoning trace before its response. Flux automatically:

Captures the reasoning from the model response
Passes it back on subsequent turns (required by Anthropic, OpenAI, Gemini)
Stores it in working memory as a reasoning role for pause/resume persistence

Provider Support¶

Provider	Reasoning Field	Effort Mapping	Passback Required
Ollama	`message.thinking`	`think=True` for all levels	No
Anthropic	`thinking` content blocks	Adaptive thinking with effort level	Yes (with signature)
OpenAI	`message.reasoning_content`	`reasoning_effort` parameter	Yes (where available)
Gemini	`part.thought=True` parts	`thinking_budget` in tokens	Yes

Working Memory¶

Reasoning is stored as reasoning role messages in working memory, positioned before the associated assistant response or tool call. This preserves the reasoning context across pause/resume cycles.

user       → "Find all Python files"
reasoning  → {"text": "I should use find_files...", "opaque": {...}}
tool_call  → {"calls": [...]}
tool_result → {"call_id": "...", "output": "..."}
reasoning  → {"text": "Now I need to read each...", "opaque": {...}}
assistant  → "I found 3 Python files..."

Configuration¶

reasoning_effort accepts:

None — reasoning disabled (default)
"low" — minimal reasoning
"medium" — balanced reasoning
"high" — deep reasoning

The mapping to provider-specific settings is handled internally. Non-reasoning models silently ignore the parameter.

Supported Models¶

Provider	Models
Ollama	Qwen3, DeepSeek-R1, QwQ, Gemma 3
Anthropic	Claude Sonnet 4, Claude Opus 4 (adaptive thinking)
OpenAI	o1, o3, o4-mini (via compatible endpoints)
Gemini	Gemini 2.5 Flash, Gemini 2.5 Pro