MindSearch / README.md
vansin's picture
fix: update the title
60ed594

A newer version of the Gradio SDK is available: 5.12.0

Upgrade
metadata
title: MindSearch
emoji: ๐Ÿ“Š
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.7.1
app_file: app.py
pinned: false

โœจ MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

๐Ÿ“… Changelog

  • 2024/11/05: ๐Ÿฅณ MindSearch is now deployed on Puyu! ๐Ÿ‘‰ Try it ๐Ÿ‘ˆ
    • Refactored the agent module based on Lagent v0.5 for better performance in concurrency.
    • Improved the UI to embody the simultaneous multi-query search.

โšฝ๏ธ Build Your Own MindSearch

Step1: Dependencies Installation

git clone https://github.com/InternLM/MindSearch
cd MindSearch
pip install -r requirements.txt

Step2: Setup Environment Variables

Before setting up the API, you need to configure environment variables. Rename the .env.example file to .env and fill in the required values.

mv .env.example .env
# Open .env and add your keys and model configurations

Step3: Setup MindSearch API

Setup FastAPI Server.

python -m mindsearch.app --lang en --model_format internlm_silicon --search_engine DuckDuckGoSearch --asy 
  • --lang: language of the model, en for English and cn for Chinese.

  • --model_format: format of the model.

    • internlm_server for InternLM2.5-7b-chat with local server. (InternLM2.5-7b-chat has been better optimized for Chinese.)
    • gpt4 for GPT4. if you want to use other models, please modify models
  • --search_engine: Search engine.

    • DuckDuckGoSearch for search engine for DuckDuckGo.
    • BingSearch for Bing search engine.
    • BraveSearch for Brave search web api engine.
    • GoogleSearch for Google Serper web search api engine.
    • TencentSearch for Tencent search api engine.

    Please set your Web Search engine API key as the WEB_SEARCH_API_KEY environment variable unless you are using DuckDuckGo, or TencentSearch that requires secret id as TENCENT_SEARCH_SECRET_ID and secret key as TENCENT_SEARCH_SECRET_KEY.

  • --asy: deploy asynchronous agents.

Step4: Setup MindSearch Frontend

Providing following frontend interfaces,

  • React

First configurate the backend URL for Vite proxy.

HOST="127.0.0.1"  # modify as you need
PORT=8002
sed -i -r "s/target:\s*\"\"/target: \"${HOST}:${PORT}\"/" frontend/React/vite.config.ts
# Install Node.js and npm
# for Ubuntu
sudo apt install nodejs npm

# for windows
# download from https://nodejs.org/zh-cn/download/prebuilt-installer

# Install dependencies

cd frontend/React
npm install
npm start

Details can be found in React

  • Gradio
python frontend/mindsearch_gradio.py
  • Streamlit
streamlit run frontend/mindsearch_streamlit.py

๐ŸŒ Change Web Search API

To use a different type of web search API, modify the searcher_type attribute in the searcher_cfg located in mindsearch/agent/__init__.py. Currently supported web search APIs include:

  • GoogleSearch
  • DuckDuckGoSearch
  • BraveSearch
  • BingSearch
  • TencentSearch

For example, to change to the Brave Search API, you would configure it as follows:

BingBrowser(
    searcher_type='BraveSearch',
    topk=2,
    api_key=os.environ.get('BRAVE_API_KEY', 'YOUR BRAVE API')
)

๐Ÿž Using the Backend Without Frontend

For users who prefer to interact with the backend directly, use the backend_example.py script. This script demonstrates how to send a query to the backend and process the response.

python backend_example.py

Make sure you have set up the environment variables and the backend is running before executing the script.

๐Ÿž Debug Locally

python -m mindsearch.terminal

๐Ÿ“ License

This project is released under the Apache 2.0 license.

Citation

If you find this project useful in your research, please consider cite:

@article{chen2024mindsearch,
  title={MindSearch: Mimicking Human Minds Elicits Deep AI Searcher},
  author={Chen, Zehui and Liu, Kuikun and Wang, Qiuchen and Liu, Jiangning and Zhang, Wenwei and Chen, Kai and Zhao, Feng},
  journal={arXiv preprint arXiv:2407.20183},
  year={2024}
}

Our Projects

Explore our additional research on large language models, focusing on LLM agents.

  • Lagent: A lightweight framework for building LLM-based agents
  • AgentFLAN: An innovative approach for constructing and training with high-quality agent datasets (ACL 2024 Findings)
  • T-Eval: A Fine-grained tool utilization evaluation benchmark (ACL 2024)