AI & ML Open Release

Independently reproduces OpenAI's gpt-oss-20b scores by reverse-engineering undisclosed tool-calling formats and agent harnesses.

April 2, 2026

Original Paper

In harmony with gpt-oss

Borislav Mavrin

arXiv · 2604.00362

The Takeaway

Opens up the methodology behind frontier tool-calling models by showing that gpt-oss carries tool-calling priors from its training distribution, and by providing a native 'harmony' harness that bypasses lossy Chat Completions conversions.
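As a rough illustration of what measuring such a prior could look like, the sketch below tallies which tool each sampled completion calls when no tool definitions are supplied, then reports a conservative lower bound on the most frequent tool's share. It is not the paper's code: the `to=` recipient marker, the tool names `tool_a` and `tool_b`, and the sample strings are all assumptions made up for the example.

```python
import math
import re
from collections import Counter

# Assumption: a tool call names its recipient after a "to=" marker.
RECIPIENT = re.compile(r"to=([A-Za-z_][\w.]*)")


def tool_call_counts(samples):
    """Count the first tool recipient named in each sampled completion."""
    counts = Counter()
    for text in samples:
        match = RECIPIENT.search(text)
        counts[match.group(1) if match else "<no tool call>"] += 1
    return counts


def wilson_lower_bound(successes, total, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion."""
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


# Made-up completions for illustration only; not data from the paper.
samples = [
    "commentary to=tool_a <|message|>{}",
    "commentary to=tool_a <|message|>{}",
    "commentary to=tool_b <|message|>{}",
    "commentary to=tool_a <|message|>{}",
    "final answer with no tool call",
]

counts = tool_call_counts(samples)
top_tool, top_count = counts.most_common(1)[0]
bound = wilson_lower_bound(top_count, len(samples))
print(top_tool, top_count, round(bound, 3))
```

With only five samples the lower bound is far below the observed share; a claim of high statistical confidence would need many more samples per prompt.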

From the abstract

No one has independently reproduced OpenAI's published scores for gpt-oss-20b with tools, because the original paper discloses neither the tools nor the agent harness. We reverse-engineered the model's in-distribution tools: when prompted without tool definitions, gpt-oss still calls tools from its training distribution with high statistical confidence -- a strong prior, not a hallucination. We then built a native harmony agent harness (this https URL) that encodes messages in the model's native
