Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
🔍 Core Parsing Capabilities
DOCX, PPTX, and XLSX parsing
🔌 Integration
| Use Case | Solution |
|---|---|
| AI Coding Tools | MCP Server — Cursor · Claude Desktop · Windsurf |
| RAG Frameworks | LangChain · LlamaIndex · RAGFlow · RAG-Anything · Flowise · Dify · FastGPT |
| Development | Python / Go / TypeScript SDK · CLI · REST API · Docker |
| No-Code | mineru.net online · Gradio WebUI · Desktop client |
🖥️ Deployment (Private · Fully Offline)
| Inference Backend | Best For |
|---|---|
| pipeline | Fast & stable, no hallucination, runs on CPU or GPU |
| vlm-engine | High accuracy, supports vLLM / LMDeploy / mlx ecosystem |
| hybrid-engine | High accuracy, native text extraction, low hallucination |
Domestic AI chips: Ascend · Cambricon · Enflame · MetaX · Moore Threads · Kunlunxin · Iluvatar · Hygon · Biren · T-Head
2026/04/18 3.1.0 Released
This release focuses on licensing openness, parsing accuracy, and full-format native support. The main updates include:
- The license changes from AGPLv3 to the MinerU Open Source License, a custom license based on Apache 2.0.
- The new MinerU2.5-Pro-2604-1.2B model brings overall parsing accuracy to a state-of-the-art level.
- Native parsing is added for PPTX and XLSX.
- Supported formats now span PDF, DOCX, PPTX, and XLSX, providing a more complete multi-format document understanding workflow.

With the 3.1.0 release, MinerU becomes more open, more accurate, and easier to adopt in production. The new license lowers the barrier for both community and commercial use, MinerU2.5-Pro-2604-1.2B improves parsing quality on complex content, and native PPTX / XLSX support completes end-to-end coverage of mainstream document formats.
2026/03/29 3.0.0 Released
This release delivers a systematic upgrade centered on parsing capability, system architecture, and engineering usability. The main updates include:
DOCX parsing
- Native DOCX parsing is supported, delivering high-precision results without hallucinations.
- Compared with converting DOCX to PDF and then parsing it, end-to-end speed is improved by tens of times, making it better suited for scenarios with high requirements for both accuracy and throughput.

pipeline backend upgrade
- The pipeline backend achieves a score of 86.2 on OmniDocBench (v1.5), surpassing the accuracy of the previous-generation mainstream VLM MinerU2.0-2505-0.9B.

API / CLI / Router orchestration upgrade
- mineru now runs as an orchestration client based on mineru-api; when --api-url is not provided, it automatically starts a temporary local service.
- mineru-api adds a new asynchronous task endpoint, POST /tasks, supporting task submission, status querying, and result retrieval; it retains the synchronous parsing endpoint POST /file_parse for compatibility with legacy plugins.
- mineru-router is added, designed for unified entry deployment and task routing across multiple services and multiple GPUs; its interfaces are fully compatible with mineru-api and support automatic task load balancing.
- … torch >= 2.8; the base image has been upgraded to vllm0.11.2 + torch2.9.0, unifying installation paths across different Compute Capabilities.
- pipeline now supports streaming writes to disk, allowing completed parsing results to be written out in time and further improving the experience for long-running tasks.
- … mineru-router, this enables one-click multi-GPU deployment and makes it easy to build high-concurrency, high-throughput parsing systems.
- … doclayoutyolo and mfd_yolov8) and one CC-BY-NC-SA 4.0 model (layoutreader).

This update is not just a set of feature enhancements, but a key leap forward in MinerU's overall system capabilities. We specifically addressed the peak memory usage issue in long-document parsing. Through optimizations such as sliding windows and streaming writes to disk, ultra-long document parsing has moved from “requiring manual splitting and careful handling” to being “stable, scalable, and ready for production workloads.” At the same time, we completed thread-safety optimization and fully enabled multi-threaded concurrent inference, further improving single-machine resource utilization and runtime stability under high-concurrency workloads.
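To illustrate why windowed processing with streaming writes keeps peak memory bounded on long documents, here is a minimal sketch. It is not MinerU's implementation: `page_windows`, `parse_streaming`, the `parse_page` callable, and the JSON Lines output format are all hypothetical, and the windows here are simple non-overlapping page ranges.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterator


def page_windows(total_pages: int, window_size: int) -> Iterator[range]:
    """Yield consecutive, non-overlapping page ranges of at most window_size pages."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, total_pages, window_size):
        yield range(start, min(start + window_size, total_pages))


def parse_streaming(
    total_pages: int,
    window_size: int,
    parse_page: Callable[[int], Any],  # hypothetical per-page parser
    out_path: str,
) -> int:
    """Parse window by window and append each page result as one JSON line.

    Returns the number of page results written.
    """
    written = 0
    with Path(out_path).open("w", encoding="utf-8") as out:
        for window in page_windows(total_pages, window_size):
            # Only the results of the current window are held in memory.
            results = [parse_page(page) for page in window]
            for result in results:
                out.write(json.dumps(result, ensure_ascii=False) + "\n")
            # Flush so finished windows reach disk before the next one starts.
            out.flush()
            written += len(results)
    return written
```

With this shape, memory use depends on `window_size` rather than on the total page count, and results from finished windows are already on disk if a later window fails.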
On top of this, with mineru-router and the new API / CLI orchestration framework, MinerU now supports one-click multi-GPU deployment, unified access across multiple services, and automatic task load balancing, significantly reducing the difficulty of large-scale deployment. As a result, MinerU is evolving from a standalone data production tool into a large-scale document parsing foundation for high-concurrency and high-throughput scenarios, providing enterprise-grade document data processing with infrastructure that is more stable, more efficient, and easier to scale.
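The asynchronous flow described for `POST /tasks` (submit a task, query its status, retrieve the result) usually implies a polling loop on the client side. The sketch below shows only that loop and makes no HTTP calls: the status names in `TERMINAL_STATES` and the `fetch_status` callable are assumptions, since the release notes do not specify the response format of mineru-api.

```python
import time
from typing import Callable

# Hypothetical status names; the values returned by mineru-api may differ.
TERMINAL_STATES = {"done", "failed"}


def wait_for_task(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll fetch_status until it reports a terminal state or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout} seconds")
        sleep(interval)
```

In real use, `fetch_status` would wrap whatever status query the service exposes. Passing `sleep` and `clock` as parameters keeps the loop easy to test without waiting.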
📝 View the complete Changelog for more historical version information
MinerU is a document parsing tool that converts PDF, image, DOCX, PPTX, and XLSX inputs into machine-readable formats such as Markdown and JSON for downstream retrieval, extraction, and processing.
MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models.
Compared to well-known commercial products, MinerU is still young. If you encounter any issues or if the results are not as expected, please submit an issue and attach the relevant document or sample file.
Supports PDF, image, DOCX, PPTX, and XLSX inputs.

Document parsing is a difficult and complex task. In scenarios such as complex layouts, scanned pages, and handwritten content, the parsing results may fall short of expectations. We recommend trying the online demo first to evaluate MinerU's parsing quality and suitability before choosing an appropriate deployment method based on your actual needs. If you have document samples with unsatisfactory parsing results, feel free to share them in an issue. We will continue improving the parsing capabilities. If you encounter any installation issues, please first consult the FAQ.
The official online version has the same functionality as the client, with a polished interface and rich features. Login is required.
A Gradio-based WebUI with a simple interface and only core parsing functionality. No login is required.
[!WARNING] Pre-installation Notice—Hardware and Software Environment Support
To ensure the stability and reliability of the project, we only optimize and test for specific hardware and software environments during development. This ensures that users deploying and running the project on recommended system configurations will get the best performance with the fewest compatibility issues.
By focusing resources on the mainline environment, our team can more efficiently resolve potential bugs and develop new features.
In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.
| Parsing Backend | pipeline | *-auto-engine | *-http-client | ||
|---|---|---|---|---|---|
| hybrid | vlm | hybrid | vlm | ||
| Backend Features | Good Compatibility | High Hardware Requirements | For OpenAI Compatible Servers<sup>2</sup> | ||
| Accuracy<sup>1</sup> | 85+ | 95+ | |||
| Operating System | Linux<sup>3</sup> / Windows<sup>4</sup> / macOS<sup>5</sup> | ||||
| Pure CPU Support | ✅ | ❌ | ✅ | ||
| GPU Acceleration | Volta and later architecture GPUs or Apple Silicon | Not Required | |||
| Min VRAM | 4GB | 8GB | 8GB | 2GB | |
| RAM | Min 16GB, Recommended 32GB or more | Min 16GB | |||
| Disk Space | Min 20GB, SSD Recommended | Min 2GB | |||
| Python Version | 3.10-3.13 | ||||
<sup>1</sup> Accuracy metrics are the End-to-End Evaluation Overall scores from OmniDocBench (v1.6), based on the latest version of MinerU.
<sup>2</sup> Servers compatible with the OpenAI API, such as local model servers or remote model services deployed via inference frameworks like vLLM/SGLang/LMDeploy.
<sup>3</sup> On Linux, only distributions released in 2019 or later are supported.
<sup>4</sup> Since the key dependency ray does not support Python 3.13 on Windows, only Python 3.10 to 3.12 is supported there.
<sup>5</sup> macOS requires version 14.0 or later.
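The Python and operating system constraints from the table and its footnotes can be checked before installation. The following is a minimal sketch that covers only those rules (Python 3.10 to 3.13, no 3.13 on Windows, macOS 14.0 or later); the function names are hypothetical and the script is not part of MinerU. It does not check GPU, VRAM, RAM, or disk space.

```python
import platform
import sys
from typing import List, Optional, Sequence

SUPPORTED_SYSTEMS = {"Linux", "Windows", "Darwin"}  # Darwin is macOS


def check_environment(
    python_version: Sequence[int],
    system: str,
    macos_version: Optional[Sequence[int]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the checked rules pass."""
    problems = []
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported operating system: {system}")
    # ray does not support Python 3.13 on Windows, so the upper bound is 3.12 there.
    max_minor = 12 if system == "Windows" else 13
    major, minor = python_version[0], python_version[1]
    if major != 3 or not (10 <= minor <= max_minor):
        problems.append(
            f"Python 3.10 to 3.{max_minor} is required on {system}, found {major}.{minor}"
        )
    if system == "Darwin" and macos_version and macos_version[0] < 14:
        problems.append("macOS 14.0 or later is required")
    return problems


def current_environment_problems() -> List[str]:
    """Run check_environment against the interpreter and OS executing this script."""
    mac = platform.mac_ver()[0]  # empty string on non-macOS systems
    macos_version = tuple(int(part) for part in mac.split(".")) if mac else None
    return check_environment(sys.version_info[:2], platform.system(), macos_version)


if __name__ == "__main__":
    for problem in current_environment_problems():
        print("WARNING:", problem)
```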
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[all]"
git clone https://github.com/opendatalab/MinerU.git
cd MinerU
uv pip install -e ".[all]"
[!TIP]
- mineru[all] includes all core features, is compatible with Windows / Linux / macOS, and is suitable for most users.
- If CUDA acceleration is unavailable after installing on Windows, see the Windows CUDA acceleration FAQ.
- If you need to specify the inference framework for the VLM model, or only intend to install a lightweight client on an edge device, please refer to the documentation Extension Modules Installation Guide.
MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.
[!TIP]
- Docker deployment is only supported on Linux and Windows environments with WSL2 support;
- macOS users should refer to the two installation methods above for installation instead of using Docker deployment.
You can get the Docker Deployment Instructions in the documentation.
If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing:
mineru -p <input_path> -o <output_path>
If your device does not meet the GPU acceleration requirements, you can specify the backend as pipeline to run in a pure CPU environment:
mineru -p <input_path> -o <output_path> -b pipeline
mineru currently supports local PDF, image, DOCX, PPTX, and XLSX file or directory inputs, and can be used for document parsing through the CLI, API, WebUI, and mineru-router. For detailed instructions, please refer to the Usage Guide.
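When calling the CLI from a script, the two commands above can be assembled programmatically. This is a minimal sketch that uses only the `-p`, `-o`, and `-b` flags shown in this section; `build_mineru_command` and `run_mineru` are hypothetical helper names, and it assumes the `mineru` executable is on `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Union

PathLike = Union[str, Path]


def build_mineru_command(
    input_path: PathLike,
    output_path: PathLike,
    backend: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `mineru -p <input> -o <output> [-b <backend>]`."""
    cmd = ["mineru", "-p", str(input_path), "-o", str(output_path)]
    if backend is not None:
        # For example, "pipeline" to run in a pure CPU environment.
        cmd += ["-b", backend]
    return cmd


def run_mineru(
    input_path: PathLike,
    output_path: PathLike,
    backend: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run mineru and raise CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(build_mineru_command(input_path, output_path, backend), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.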
This repository is licensed under the MinerU Open Source License, based on Apache 2.0 with additional conditions.
@article{wang2026mineru2,
title={MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale},
author={Wang, Bin and He, Tianyao and Ouyang, Linke and Wu, Fan and Zhao, Zhiyuan and Chu, Tao and Qu, Yuan and Jin, Zhenjiang and Zeng, Weijun and Miao, Ziyang and others},
journal={arXiv preprint arXiv:2604.04771},
year={2026}
}
@article{dong2026minerudiffusion,
title={MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding},
author={Dong, Hejun and Niu, Junbo and Wang, Bin and Zeng, Weijun and Zhang, Wentao and He, Conghui},
journal={arXiv preprint arXiv:2603.22458},
year={2026}
}
@article{niu2025mineru2,
title={Mineru2.5: A decoupled vision-language model for efficient high-resolution document parsing},
author={Niu, Junbo and Liu, Zheng and Gu, Zhuangcheng and Wang, Bin and Ouyang, Linke and Zhao, Zhiyuan and Chu, Tao and He, Tianyao and Wu, Fan and Zhang, Qintong and others},
journal={arXiv preprint arXiv:2509.22186},
year={2025}
}
@article{wang2024mineru,
title={Mineru: An open-source solution for precise document content extraction},
author={Wang, Bin and Xu, Chao and Zhao, Xiaomeng and Ouyang, Linke and Wu, Fan and Zhao, Zhiyuan and Xu, Rui and Liu, Kaiwen and Qu, Yuan and Shang, Fukai and others},
journal={arXiv preprint arXiv:2409.18839},
year={2024}
}
@article{he2024opendatalab,
title={Opendatalab: Empowering general artificial intelligence with open datasets},
author={He, Conghui and Li, Wei and Jin, Zhenjiang and Xu, Chao and Wang, Bin and Lin, Dahua},
journal={arXiv preprint arXiv:2407.13773},
year={2024}
}