Update README.md
README.md
CHANGED
|
@@ -1,31 +1,50 @@
|
|
| 1 |
-
yaml
|
| 2 |
---
|
| 3 |
-
title:
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
[project]
|
| 9 |
-
---
|
| 10 |
-
name: "parserpdf" - PDF & HTML parser to markdown
|
| 11 |
-
#title: "parserPDF"
|
| 12 |
-
title: "parser2md"
|
| 13 |
sdk: gradio
|
| 14 |
-
#sdk_version: 5.0.1
|
| 15 |
command: python main.py
|
| 16 |
app_file: main.py
|
| 17 |
-
|
| 18 |
-
colorFrom: yellow
|
| 19 |
-
colorTo: purple
|
| 20 |
-
name: "parser2md"
|
| 21 |
-
pinned: false
|
| 22 |
license: mit
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
requires-python: ">=3.12"
|
| 27 |
dependencies: []
|
| 28 |
-
owner:
|
| 29 |
---
|
| 30 |
|
| 31 |
# parserPDF
|
|
@@ -184,4 +203,6 @@ MIT License. See [LICENSE](LICENSE) for details.
|
|
| 184 |
## Acknowledgments
|
| 185 |
- Built with [Gradio](https://gradio.app/) for the UI.
|
| 186 |
- PDF parsing via [Marker](https://github.com/VikParuchuri/marker).
|
| 187 |
-
- LLM integrations using Hugging Face Transformers and OpenAI APIs.
|
| 1 |
---
|
| 2 |
+
title: parser2md - PDF & HTML parser to markdown
|
| 3 |
+
emoji: π
|
| 4 |
+
colorFrom: yellow
|
| 5 |
+
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
command: python main.py
|
| 8 |
app_file: main.py
|
| 9 |
+
python_version: 3.12
|
| 10 |
license: mit
|
| 11 |
+
pinned: true
|
| 12 |
+
short_description: PDF & HTML parser to markdown
|
| 13 |
+
models: [meta-llama/Llama-4-Maverick-17B-128E-Instruct, openai/gpt-oss-120b, openai/gpt-oss-20b]
|
| 14 |
+
tags: [markdown, PDF, parser, converter, extractor]
|
| 15 |
+
#hf_oauth: true
|
| 16 |
+
preload_from_hub: [https://huggingface.co/datalab-to/surya_layout, https://huggingface.co/datalab-to/surya_tablerec, huggingface.co/datalab-to/line_detector0, https://huggingface.co/tarun-menta/ocr_error_detection/blob/main/config.json]
|
| 17 |
+
owner: research-semmyk
|
| 18 |
+
#---
|
| 19 |
+
#
|
| 20 |
+
#[Project]
|
| 21 |
+
#---
|
| 22 |
+
#title: parser2md - PDF & HTML parser to markdown
|
| 23 |
+
#emoji: \U0001F4C4ππ
|
| 24 |
+
#colorFrom: yellow
|
| 25 |
+
#colorTo: purple
|
| 26 |
+
#sdk: gradio
|
| 27 |
+
#python_version: 3.12
|
| 28 |
+
#sdk_version: 5.44.1
|
| 29 |
+
#app_file: main.py
|
| 30 |
+
#command: python main.py
|
| 31 |
+
#models:
|
| 32 |
+
# - meta-llama/Llama-4-Maverick-17B-128E-Instruct
|
| 33 |
+
# - openai/gpt-oss-120b
|
| 34 |
+
#pinned: false
|
| 35 |
+
#license: mit
|
| 36 |
+
#name: parser2md
|
| 37 |
+
#short_description: PDF & HTML parser to markdown
|
| 38 |
+
version: 0.1.0
|
| 39 |
+
readme: README.md
|
| 40 |
requires-python: ">=3.12"
|
| 41 |
dependencies: []
|
| 42 |
+
#owner: research-semmyk
|
| 43 |
+
#preload_from_hub:
|
| 44 |
+
# - https://huggingface.co/datalab-to/surya_layout
|
| 45 |
+
# - https://huggingface.co/datalab-to/surya_tablerec
|
| 46 |
+
# - huggingface.co/datalab-to/line_detector0
|
| 47 |
+
# - https://huggingface.co/tarun-menta/ocr_error_detection/blob/main/config.json
|
| 48 |
---
|
| 49 |
|
| 50 |
# parserPDF
|
| 203 |
## Acknowledgments
|
| 204 |
- Built with [Gradio](https://gradio.app/) for the UI.
|
| 205 |
- PDF parsing via [Marker](https://github.com/VikParuchuri/marker).
|
| 206 |
+
- LLM integrations using Hugging Face Transformers and OpenAI APIs.
|
| 207 |
+
- HuggingFace Spaces Configuration Reference [HF Spaces Configuration Reference](https://huggingface.co/docs/hub/en/spaces-config-reference)
|
| 208 |
+
- IBM Research: [HF Spaces Guide](https://huggingface.co/spaces/ibm-granite/granite-vision-demo/blob/main/DEVELOPMENT.md)
|
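Note on the front matter in this change: the new block mixes Space metadata keys (`title`, `sdk`, `app_file`, `python_version`, ...) with keys that look like they came from a `pyproject.toml` (`requires-python`, `dependencies`, `version`, `readme`). A quick way to spot the latter might be a small check like the sketch below. It is only an illustration: `KNOWN_SPACE_KEYS` is a hand-picked, partial list and an assumption, not the authoritative set from the Spaces configuration reference, and the helper names `front_matter` and `unrecognized_keys` are hypothetical.

```python
import re

# Assumption: a partial, hand-picked list of keys commonly used in Space
# README metadata. Check the Spaces configuration reference for the full set.
KNOWN_SPACE_KEYS = {
    "title", "emoji", "colorFrom", "colorTo", "sdk", "python_version",
    "sdk_version", "app_file", "app_port", "short_description", "models",
    "datasets", "tags", "pinned", "license", "hf_oauth", "preload_from_hub",
}

# A top-level key starts in column 0; commented-out lines ("#key: ...") and
# indented list items do not match.
KEY_RE = re.compile(r"^([A-Za-z_][\w-]*)\s*:")


def front_matter(readme_text):
    """Return the text between the first two '---' lines, or '' if absent."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return ""


def unrecognized_keys(readme_text):
    """List top-level front-matter keys that are not in KNOWN_SPACE_KEYS."""
    keys = []
    for line in front_matter(readme_text).splitlines():
        match = KEY_RE.match(line)
        if match and match.group(1) not in KNOWN_SPACE_KEYS:
            keys.append(match.group(1))
    return keys


if __name__ == "__main__":
    # Shortened copy of the new front matter from this diff.
    sample = "\n".join([
        "---",
        "title: parser2md - PDF & HTML parser to markdown",
        "sdk: gradio",
        "command: python main.py",
        "app_file: main.py",
        "python_version: 3.12",
        "#hf_oauth: true",
        "owner: research-semmyk",
        'requires-python: ">=3.12"',
        "dependencies: []",
        "---",
        "",
        "# parserPDF",
    ])
    print(unrecognized_keys(sample))
```

The check deliberately avoids a YAML parser so it has no dependencies; it only looks at top-level `key:` lines, so nested values and multi-line scalars are ignored. A key being flagged does not mean it is invalid, only that it is missing from the partial list above.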