AnyParse
any file to markdown
Parse any file to Markdown. Open source for now: PDF, image, Office, HTML, and text-based files, with more formats to come. This is the base AnyParse; the plus version is still in training.
    _                ____
   / \   _ __  _   _|  _ \ __ _ _ __ ___  ___
  / _ \ | '_ \| | | | |_) / _` | '__/ __|/ _ \
 / ___ \| | | | |_| |  __/ (_| | |  \__ \  __/
/_/   \_\_| |_|\__, |_|   \__,_|_|  |___/\___|
               |___/
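from anyparse.parser import AnyParse
from anyparse.settings import Settings

# Dump the default configuration to a dict; see Settings for the available options
args = Settings().model_dump()
model = AnyParse(args)

# Supported inputs: docx, pptx, xlsx, csv, txt, md, html, jpg, png, pdf
file = '1.pdf'

# Parse with the base OCR mode
res = model.invoke(file, ocr_mode="base", stream=False)
# Parse with the plus OCR mode (stored separately so the base result is not overwritten)
res_plus = model.invoke(file, ocr_mode="plus", stream=False)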
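When parsing a whole folder, it can help to filter out files AnyParse does not list as supported before calling `invoke`. The following is a minimal sketch, assuming the extension list in the comment above is complete; `SUPPORTED_EXTENSIONS`, `is_supported`, and `select_supported` are hypothetical helper names and are not part of the anyparse package.

```python
from pathlib import Path

# Extensions copied from the usage comment above (assumed to be the full list)
SUPPORTED_EXTENSIONS = {
    "docx", "pptx", "xlsx", "csv", "txt", "md", "html", "jpg", "png", "pdf",
}


def is_supported(path):
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def select_supported(paths):
    """Keep only the paths with a supported extension, preserving order."""
    return [p for p in paths if is_supported(p)]
```

The filtered list could then be passed to the model one file at a time, for example `for f in select_supported(files): model.invoke(f, ocr_mode="base", stream=False)`, using the same `invoke` arguments as in the snippet above.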