MiroFlow

MiroMind team

Aug 25, 2025

MiroFlow v0.2: A High-Performance and Cost-Effective Open-Source Framework for Deep Research Agents

Deep Research Demo is live - Try Now

TL;DR

MiroFlow is an open-source deep research agent framework offering 1) High Performance: open-source SoTA results on the GAIA, HLE, xBench-DeepSearch, and BrowseComp benchmarks, and 2) Cost-Effective Deployment: the compact MiroFlow deep research service can run on a single RTX 4090 GPU.

MiroFlow is an open-source deep research agent framework that gives any LLM capabilities similar to OpenAI Deep Research. It calls a variety of web-based tools and conducts complex, multi-step research, completing in minutes tasks that would otherwise take humans hours.


  1. Overall Architecture

  • Frontend

A simple Gradio frontend

  • Backend

MiroFlow automatically handles user queries through multi-tool collaboration (such as web browsers and Python tools), performing multi-step web research, comprehensively analyzing a large number of online resources, and ultimately completing the task. The specific process includes:

Query Augmentation: The user input is first analyzed by a large language model (LLM) to identify the user’s intent and enrich query details, enabling a more accurate understanding of the requirements.

Task Planning: The main agent formulates a detailed execution plan based on the enhanced query content, coordinating the entire workflow, including invoking different tools, assigning tasks to sub-agents, and driving task progress.

Sub-Agent Delegation: For complex or specialized tasks, the main agent delegates parts of the work to sub-agents with relevant expertise (e.g., agent-browsing). These sub-agents can independently plan and execute tasks, as well as call upon necessary tools.

Tool Calling: When external functionalities need to be invoked, agents connect to the MCP (Model Context Protocol) server to obtain and use the corresponding specialized tools.

Result Synthesis: After task completion, the system consolidates results from multiple information sources to ensure the output is of high quality and meets user requirements or preset formats.
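The five stages above can be sketched as a small pipeline. This is a minimal illustration only, not MiroFlow's actual code: the class names (`MainAgent`, `SubAgent`), the method names, and the `fake_search` tool are all hypothetical, and a plain dictionary of Python callables stands in for the MCP server. In the real system, query augmentation and planning would be LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an MCP tool server: tool name -> callable.
Tool = Callable[[str], str]


@dataclass
class SubAgent:
    """Hypothetical sub-agent that owns a set of tools."""
    name: str
    tools: Dict[str, Tool]

    def run(self, subtask: str) -> str:
        # A real sub-agent would plan and call tools over several steps;
        # this sketch calls each tool once and joins the outputs.
        outputs = [tool(subtask) for tool in self.tools.values()]
        return f"[{self.name}] " + " | ".join(outputs)


@dataclass
class MainAgent:
    """Hypothetical main agent that coordinates the workflow."""
    sub_agents: Dict[str, SubAgent]
    trace: List[str] = field(default_factory=list)

    def augment(self, query: str) -> str:
        # Query augmentation (stubbed; an LLM call in practice).
        self.trace.append("augment")
        return query.strip() + " (intent: factual lookup)"

    def plan(self, augmented: str) -> List[Tuple[str, str]]:
        # Task planning: a list of (sub-agent name, subtask) pairs.
        self.trace.append("plan")
        return [("agent-browsing", f"search: {augmented}")]

    def delegate(self, plan: List[Tuple[str, str]]) -> List[str]:
        # Sub-agent delegation; tool calling happens inside SubAgent.run.
        self.trace.append("delegate")
        return [self.sub_agents[name].run(subtask) for name, subtask in plan]

    def synthesize(self, results: List[str]) -> str:
        # Result synthesis: merge sub-agent outputs into one report.
        self.trace.append("synthesize")
        return "\n".join(results)

    def answer(self, query: str) -> str:
        return self.synthesize(self.delegate(self.plan(self.augment(query))))


def fake_search(q: str) -> str:
    # Hypothetical tool; a real one would query the web.
    return f"results for '{q}'"


browsing = SubAgent("agent-browsing", {"web_search": fake_search})
agent = MainAgent({"agent-browsing": browsing})
report = agent.answer("  Who maintains MiroFlow? ")
print(report)
```

Keeping the tools behind a name-to-callable registry mirrors the idea that agents obtain tools from a server at run time rather than hard-coding them, so a sub-agent can be given a different tool set without changing the main agent.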

  2. MiroFlow v0.2 vs. Existing Deep Research Agent Frameworks (2025-08-21)

  • One Framework for Multiple Benchmarks: Supports GAIA, HLE, BrowseComp, and xBench-DeepSearch.

  • High Performance: Open-source SoTA results on GAIA, xBench-DeepSearch, BrowseComp, and HLE.

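| Agent Framework | Open-Source Reproducible | GAIA val | GAIA test | HLE | HLE text-only | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
|---|---|---|---|---|---|---|---|---|
| OpenAI Deep Research | - | 67.4 | - | 26.6 | - | 51.5 | 42.9 | - |
| Gemini Deep Research | - | - | - | 26.9 | - | - | - | 50.0+ |
| Kimi Researcher | - | - | - | - | 26.9 | - | - | 69.0 |
| Perplexity AI | - | - | - | 21.1 | - | - | 22.6 | - |
| Manus | - | 73.3 | - | - | - | - | - | - |
| Aworld | - | 61.8 | 81.7 | - | - | - | - | - |
| OWL | - | 69.1 | - | - | - | - | - | - |
| Grok-4 | - | - | - | - | 41.0 | - | - | - |
| WebSailor-72B | - | 55.4* | - | - | - | - | 30.1 | 55.0 |
| WebShaper-72B | - | 60.2* | - | - | - | - | - | - |
| MiroFlow | ✔️ | 82.4 | 73.1 | 27.2 | 29.5 | 33.2 | 47.1 | 72.0 |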

“Open-Source Reproducible” means that the model, code, and runtime environment are all open source, and by running the provided test scripts, the experimental metric results can be reproduced.

“*” means: the model is evaluated on the GAIA text-103 subset.

Gemini Deep Research “50.0+” on xBench-DeepSearch is from the Kimi Researcher report.

  • Open-Source Reproducible: Unlike commercial frameworks or “partly” open-sourced research projects, all MiroFlow performance metrics can be fully reproduced using the publicly available code at https://github.com/MiroMindAI/MiroFlow.

  • Cost-Effective: You can set up and run your own deep research agent on a single RTX 4090 GPU, using only free, open-source tools.

Demo available: https://miromind.ai/demo

Code available: https://github.com/MiroMindAI/MiroThinker

| Method | Open-Source Free Tool Set | GAIA Text-103 Best Pass@1 | GAIA Text-103 Pass@1 (Avg@8) | GAIA Val Best Pass@1 | GAIA Val Pass@1 (Avg@8) |
|---|---|---|---|---|---|
| MiroThinker-8B | ✔️ | 46.6 | 44.8 | 37.0 | 35.4 |
| MiroThinker-8B | - | 50.5 | 46.7 | 38.2 | 35.9 |
| MiroThinker-14B | ✔️ | 48.5 | 46.6 | 42.4 | 39.2 |
| MiroThinker-14B | - | 52.4 | 48.5 | 45.5 | 42.0 |
| MiroThinker-32B | ✔️ | 57.3 | 54.1 | 48.5 | 45.9 |
| MiroThinker-32B | - | 60.2 | 57.9 | 50.9 | 48.9 |
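The post does not define the two metric columns, so the sketch below rests on an assumption: "Best Pass@1" is read as the highest single-run accuracy, and "Pass@1 (Avg@8)" as the mean accuracy over 8 independent runs. The function names and the run data are made up for illustration (3 runs over 4 tasks rather than 8 runs over the full benchmark).

```python
from statistics import mean
from typing import Dict, List


def pass_at_1(run: List[bool]) -> float:
    """Percentage of tasks answered correctly in a single run."""
    return 100.0 * sum(run) / len(run)


def summarize(runs: List[List[bool]]) -> Dict[str, float]:
    """Assumed definitions: best = max over runs, avg = mean over runs."""
    scores = [pass_at_1(r) for r in runs]
    return {"best_pass_at_1": max(scores), "avg_pass_at_1": mean(scores)}


# Made-up per-task outcomes for illustration only.
runs = [
    [True, False, True, False],   # 50.0
    [True, True, True, False],    # 75.0
    [False, False, True, False],  # 25.0
]
summary = summarize(runs)
print(summary)
```

Under this reading, the averaged column is the more conservative number, since the best-run column picks the most favorable of several samples.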