---
title: Grobid OCR
emoji: 📄
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---

# Grobid PDF Document Processor

This space uses Grobid to extract structured information from PDF documents, particularly academic papers.

## Features

- **Header Extraction**: Fast extraction of title, authors, and abstract
- **Full Text Processing**: Complete document processing including introduction sections
- **Academic Focus**: Optimized for scholarly documents and research papers

## Usage

1. Upload a PDF document
2. Choose extraction type:
   - **Header Only**: Quick extraction of metadata
   - **Full Text**: Complete processing including introduction
3. Click "Process PDF" to get structured results

## Technology

- [Grobid](https://github.com/kermitt2/grobid): Machine learning library for PDF extraction
- [Gradio](https://gradio.app/): Web interface framework
- Docker: Containerized deployment

Perfect for researchers who need to quickly extract key information from academic papers!
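
For readers curious how the two extraction types might map onto Grobid, here is a minimal sketch. It is not this space's actual code. It assumes a Grobid server at its default address (`http://localhost:8070`), that "Header Only" and "Full Text" correspond to Grobid's `processHeaderDocument` and `processFulltextDocument` REST endpoints, and that the response is TEI XML. The helper names `endpoint_for` and `parse_header` and the sample TEI string are hypothetical and made up for illustration. Uploading the PDF itself (a multipart POST to the chosen endpoint) is left out to keep the example dependency-free.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI XML documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Assumed mapping from the UI choices in this README to Grobid REST paths.
ENDPOINTS = {
    "Header Only": "/api/processHeaderDocument",
    "Full Text": "/api/processFulltextDocument",
}


def endpoint_for(extraction_type, base_url="http://localhost:8070"):
    """Build the service URL for an extraction type (base_url is an assumption)."""
    return base_url.rstrip("/") + ENDPOINTS[extraction_type]


def parse_header(tei_xml):
    """Pull title, authors, and abstract out of a TEI XML string."""
    root = ET.fromstring(tei_xml)

    title_el = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    title = (title_el.text or "").strip() if title_el is not None else ""

    # Restrict the author search to sourceDesc so that authors of cited
    # references (present in full-text output) are not picked up.
    authors = []
    source = root.find(".//tei:sourceDesc", TEI_NS)
    if source is not None:
        for pers in source.findall(".//tei:author/tei:persName", TEI_NS):
            parts = [el.text.strip() for el in pers if el.text and el.text.strip()]
            if parts:
                authors.append(" ".join(parts))

    abstract_el = root.find(".//tei:profileDesc/tei:abstract", TEI_NS)
    abstract = ""
    if abstract_el is not None:
        # Collapse whitespace across nested paragraph elements.
        abstract = " ".join("".join(abstract_el.itertext()).split())

    return {"title": title, "authors": authors, "abstract": abstract}


# Hypothetical TEI snippet, hand-written for this example (not real output).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader>
<fileDesc>
  <titleStmt><title level="a" type="main">A Sample Paper</title></titleStmt>
  <sourceDesc><biblStruct><analytic><author><persName>
    <forename type="first">Ada</forename><surname>Lovelace</surname>
  </persName></author></analytic></biblStruct></sourceDesc>
</fileDesc>
<profileDesc><abstract><p>We study things.</p></abstract></profileDesc>
</teiHeader></TEI>"""

if __name__ == "__main__":
    print(endpoint_for("Header Only"))
    print(parse_header(SAMPLE_TEI))
```

Keeping the endpoint choice in a small lookup table means a third extraction type could be added without touching the parsing code.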