
Open Deep Research

Open Deep Research is an experimental, fully open-source research assistant that automates deep research and produces comprehensive reports on any topic. It features two implementations - a workflow and a multi-agent architecture - each with distinct advantages. You can customize the entire research and writing process with specific models, prompts, report structure, and search tools.

Workflow

[Diagram: open-deep-research-overview]

Multi-agent

[Diagram: multi-agent-researcher]

🚀 Quickstart

Clone the repository:

git clone https://github.com/langchain-ai/open_deep_research.git
cd open_deep_research

Then copy the example environment file and edit it to set the environment variables (for model selection, search tools, and other configuration settings):

cp .env.example .env

Launch the assistant with the LangGraph server running locally; the Studio UI will open in your browser:

Mac

# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies and start the LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking

Windows / Linux

# Install dependencies 
pip install -e .
pip install -U "langgraph-cli[inmem]" 

# Start the LangGraph server
langgraph dev

Use the following URLs to access the server and open the Studio UI:

- 🚀 API: http://127.0.0.1:2024
- 🎨 Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
- 📚 API Docs: http://127.0.0.1:2024/docs

Multi-agent

(1) Chat with the agent about your topic of interest, and it will initiate report generation:


(2) The report is produced as markdown.

Workflow

(1) Provide a Topic:


(2) This will generate a report plan and present it to the user for review.

(3) We can pass a string ("...") with feedback to regenerate the plan based on that feedback.


(4) Or, we can just pass true to the JSON input box in Studio to accept the plan.


(5) Once accepted, the report sections will be generated.


The report is produced as markdown.


Search Tools

Available search tools: Tavily (default), Perplexity, Exa, ArXiv, PubMed, and Linkup. Provider-specific options are covered in Search API Configuration below.

Open Deep Research is compatible with many different LLMs: any model available through the init_chat_model() API can be used for planning, writing, and research. See Model Considerations below.

Using the package

pip install open-deep-research

See src/open_deep_research/graph.ipynb and src/open_deep_research/multi_agent.ipynb for example usage in a Jupyter notebook.
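
For reference, here is a minimal sketch of running the graph workflow from the installed package. The builder export, the topic input key, and the final_report state key follow graph.ipynb; the topic and configuration values are placeholders:

import uuid

from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import Command
from open_deep_research.graph import builder  # export used in graph.ipynb

# Compile the plan-and-execute workflow with an in-memory checkpointer so the
# human-in-the-loop interrupt can pause for plan feedback and resume.
graph = builder.compile(checkpointer=MemorySaver())

thread = {"configurable": {"thread_id": str(uuid.uuid4()),
                           "search_api": "tavily",
                           "max_search_depth": 1}}

# Run until the graph interrupts with the proposed report plan.
graph.invoke({"topic": "Overview of the Model Context Protocol"}, thread)

# Resume with True to accept the plan (or a string to give feedback), then
# read the final report from the graph state.
graph.invoke(Command(resume=True), thread)
final_state = graph.get_state(thread)
print(final_state.values.get("final_report"))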

Open Deep Research Implementations

Open Deep Research features two distinct implementation approaches, each with its own strengths:

1. Graph-based Workflow Implementation (src/open_deep_research/graph.py)

The graph-based implementation follows a structured plan-and-execute workflow:

  • Planning Phase: Uses a planner model to analyze the topic and generate a structured report plan
  • Human-in-the-Loop: Allows for human feedback and approval of the report plan before proceeding
  • Sequential Research Process: Creates sections one by one with reflection between search iterations
  • Section-Specific Research: Each section has dedicated search queries and content retrieval
  • Supports Multiple Search Tools: Works with all search providers (Tavily, Perplexity, Exa, ArXiv, PubMed, Linkup, etc.)

This implementation provides a more interactive experience with greater control over the report structure, making it ideal for situations where report quality and accuracy are critical.

You can customize the research assistant workflow through several parameters (see the example configuration after this list):

  • report_structure: Define a custom structure for your report (defaults to a standard research report format)
  • number_of_queries: Number of search queries to generate per section (default: 2)
  • max_search_depth: Maximum number of reflection and search iterations (default: 2)
  • planner_provider: Model provider for the planning phase (default: "anthropic"; can be any provider supported by init_chat_model, as listed here)
  • planner_model: Specific model for planning (default: "claude-3-7-sonnet-latest")
  • planner_model_kwargs: Additional keyword arguments passed to the planner model
  • writer_provider: Model provider for the writing phase (default: "anthropic"; can be any provider supported by init_chat_model, as listed here)
  • writer_model: Model for writing the report (default: "claude-3-5-sonnet-latest")
  • writer_model_kwargs: Additional keyword arguments passed to the writer model
  • search_api: API to use for web searches (default: "tavily"; options include "perplexity", "exa", "arxiv", "pubmed", "linkup")
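
For example, these options are passed through the configurable dict of the run configuration, following the same pattern as the search API example later in this README (the values below are illustrative):

import uuid

# Example run configuration for the graph workflow; each key mirrors one of
# the parameters listed above.
thread = {"configurable": {"thread_id": str(uuid.uuid4()),
                           "search_api": "tavily",
                           "planner_provider": "anthropic",
                           "planner_model": "claude-3-7-sonnet-latest",
                           "writer_provider": "anthropic",
                           "writer_model": "claude-3-5-sonnet-latest",
                           "number_of_queries": 2,
                           "max_search_depth": 2,
                           }}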

2. Multi-Agent Implementation (src/open_deep_research/multi_agent.py)

The multi-agent implementation uses a supervisor-researcher architecture:

  • Supervisor Agent: Manages the overall research process, plans sections, and assembles the final report
  • Researcher Agents: Multiple independent agents work in parallel, each responsible for researching and writing a specific section
  • Parallel Processing: All sections are researched simultaneously, significantly reducing report generation time
  • Specialized Tool Design: Each agent has access to specific tools for its role (search for researchers, section planning for supervisors)
  • Currently Limited to Tavily Search: The multi-agent implementation currently only works with Tavily for search, though the framework is designed to support additional search tools in the future

This implementation focuses on efficiency and parallelization, making it ideal for faster report generation with less direct user involvement.
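
A minimal sketch of invoking the multi-agent implementation from Python, assuming multi_agent.py exposes a supervisor_builder graph builder and a messages-based input (check src/open_deep_research/multi_agent.ipynb for the exact symbol and output keys):

from langgraph.checkpoint.memory import MemorySaver
from open_deep_research.multi_agent import supervisor_builder  # assumed export; see multi_agent.ipynb

# Compile the supervisor-researcher graph with an in-memory checkpointer.
graph = supervisor_builder.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "example-thread", "search_api": "tavily"}}

# The multi-agent graph is driven by chat messages: the supervisor plans the
# sections and dispatches researcher agents that work in parallel. The result
# state contains the assembled report (see the notebook for the output keys).
result = graph.invoke(
    {"messages": [{"role": "user", "content": "Write a report on the Model Context Protocol."}]},
    config,
)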

Search API Configuration

Not all search APIs support additional configuration parameters. Here are the ones that do:

  • Exa: max_characters, num_results, include_domains, exclude_domains, subpages
    • Note: include_domains and exclude_domains cannot be used together
    • Particularly useful when you need to narrow your research to specific trusted sources, ensure information accuracy, or when your research requires using specified domains (e.g., academic journals, government sites)
    • Provides AI-generated summaries tailored to your specific query, making it easier to extract relevant information from search results
  • ArXiv: load_max_docs, get_full_documents, load_all_available_meta
  • PubMed: top_k_results, email, api_key, doc_content_chars_max
  • Linkup: depth

Example with Exa configuration:

thread = {"configurable": {"thread_id": str(uuid.uuid4()),
                           "search_api": "exa",
                           "search_api_config": {
                               "num_results": 5,
                               "include_domains": ["nature.com", "sciencedirect.com"]
                           },
                           # Other configuration...
                           }}
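
The same pattern applies to the other providers; continuing the example above, a sketch of an ArXiv configuration using the parameters listed in this section:

thread = {"configurable": {"thread_id": str(uuid.uuid4()),
                           "search_api": "arxiv",
                           "search_api_config": {
                               "load_max_docs": 5,
                               "get_full_documents": True,
                           },
                           # Other configuration...
                           }}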

Model Considerations

(1) You can use models supported with the init_chat_model() API. See full list of supported integrations here.

(2) The workflow planner and writer models need to support structured outputs: Check whether structured outputs are supported by the model you are using here.

(3) The agent models need to support tool calling: Ensure tool calling is well supported; tests have been done with Claude 3.7, o3, o3-mini, and gpt-4.1. See here.
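
As a quick sanity check (a sketch, not part of the package), you can initialize a candidate model with init_chat_model() and confirm it handles structured output before wiring it into the workflow:

from pydantic import BaseModel
from langchain.chat_models import init_chat_model

class Section(BaseModel):
    name: str
    description: str

# Any provider:model combination supported by init_chat_model() can be used.
model = init_chat_model("anthropic:claude-3-7-sonnet-latest")

# Planner and writer models must support structured outputs; agent models
# must support tool calling.
section = model.with_structured_output(Section).invoke(
    "Propose one section for a report on quantum computing."
)
print(section)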

(4) With Groq, there are token per minute (TPM) limits if you are on the on_demand service tier:

  • The on_demand service tier has a limit of 6000 TPM
  • You will want a paid plan for section writing with Groq models

(5) deepseek-R1 is not strong at function calling, which the assistant uses to generate structured outputs for report sections and report section grading. See example traces here.

  • Consider providers that are strong at function calling, such as OpenAI and Anthropic, or open-weight models served by Groq such as llama-3.3-70b-versatile.
  • If you see the following error, it is likely due to the model not being able to produce structured outputs (see trace):
groq.APIError: Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.

(6) To use OpenRouter, follow the guidance in #75 (comment).

(7) For working with local models via Ollama, see here.

Testing Report Quality

To compare the quality of reports generated by both implementations:

# Test with default Anthropic models
python tests/run_test.py --all

# Test with OpenAI o3 models
python tests/run_test.py --all \
  --supervisor-model "openai:o3" \
  --researcher-model "openai:o3" \
  --planner-provider "openai" \
  --planner-model "o3" \
  --writer-provider "openai" \
  --writer-model "o3" \
  --eval-model "openai:o3" \
  --search-api "tavily"

The test results will be logged to LangSmith, allowing you to compare the quality of reports generated by each implementation with different model configurations.

UX

Local deployment

Follow the quickstart to start the LangGraph server locally.

Hosted deployment

You can easily deploy to LangGraph Platform.
