HumanAgent Interface

elenchus_env

---
title: ElenchusEnv
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
base_path: /web
---

ElenchusEnv



Multi-turn Agentic Deductive Reasoning Environment for RLVR

ElenchusEnv trains LLMs to solve logical reasoning problems using a step-by-step agentic loop with MCP tools. Instead of answering in one shot, the agent explores the problem over multiple turns before committing to a final answer.

Tools



| Tool | Description |
|------|-------------|
| check_rule | Returns the rule or principle governing this episode |
| get_facts | Returns the facts / premises to reason over |
| derive | Records an intermediate derived conclusion |
| submit_answer | Submits the final answer and ends the episode |

Episode Flow



reset()
→ receive problem statement (rule + facts + question)

step(list_tools)
→ discover available tools

step(call_tool: check_rule)
→ get the rule to apply

step(call_tool: get_facts)
→ get the premises

step(call_tool: derive, statement="...")
→ record intermediate reasoning (repeat as needed)

step(call_tool: submit_answer, answer="Yes")
→ terminal step: reward = 1.0 (correct) or 0.0 (wrong)


Max steps per episode: 8 (configurable via ELENCHUS_MAX_STEPS)

Reward



Binary reward only:
  • 1.0 — correct final answer

  • 0.0 — wrong answer or step limit exceeded


Datasets



| Driver | Dataset | Task names | Difficulty |
|--------|---------|------------|------------|
| SyllogismGenerator | Procedural (no download) | syllogism_d1–syllogism_d6 | 1–6 |
| KnightsKnavesDriver | Procedural | knights_knaves | 1–4 |
| ProofWriterDriver | [tasksource/proofwriter](https://huggingface.co/datasets/tasksource/proofwriter) | proofwriter_d2–proofwriter_d5 | 3–5 |
| FOLIODriver | [tasksource/folio](https://huggingface.co/datasets/tasksource/folio) | folio | 2–5 |
| FOLNLIDriver | [tasksource/FOL-nli](https://huggingface.co/datasets/tasksource/FOL-nli) | fol_nli | 2–5 |
| LegalBenchDriver | [nguha/legalbench](https://huggingface.co/datasets/nguha/legalbench) | hearsay, abercrombie | 3–4 |

Usage



from openenv.core.env_client import EnvClient
from openenv.core.env_server.mcp_types import CallToolAction, ListToolsAction

env = EnvClient("https://farffadet-elenchus-env.hf.space")
env.connect()

result = env.reset()
print(result.observation) # problem statement

result = env.step(CallToolAction(tool_name="check_rule", arguments={}))
result = env.step(CallToolAction(tool_name="get_facts", arguments={}))
result = env.step(CallToolAction(
tool_name="derive",
arguments={"statement": "Since all A's are B's and X is an A, X is a B."}
))
result = env.step(CallToolAction(
tool_name="submit_answer",
arguments={"answer": "Yes"}
))
print("Reward:", result.reward) # 1.0 or 0.0
env.disconnect()


Environment Variables



| Variable | Default | Description |
|----------|---------|-------------|
| ELENCHUS_TASK_MODE | mixed | mixed or single |
| ELENCHUS_TASK_NAME | ` | Task name when mode=single |
|
ELENCHUS_MAX_STEPS | 8` | Max tool calls per episode |

Take Action

Action type discriminator
Name of the tool to call
Arguments to pass to the tool

Current State

Status: Not initialized
Episode ID: -
Step Count: 0
State Observer

Current Observation

No observation yet

Action History

No actions taken yet