pikuri-extractors 0.0.6
pikuri-extractors plugs additional document formats into pikuri-core's +Pikuri::Extractor+ registry. The bundled +Pikuri::Extractors::DOCUMENTS+ extractor converts office documents (DOCX, ODT, XLSX, legacy XLS, PPTX, EPUB, RTF) to Markdown by piping their bytes through pandoc / markitdown. It prefers to run the converter inside a one-shot, networkless, locally built docker container, so the untrusted bytes never touch the host filesystem or network, and falls back to a host-installed pandoc / markitdown CLI when docker is absent. Registration is explicit, via +Pikuri::Extractors::DOCUMENTS.register+, so requiring the gem changes nothing by itself; the host script picks which extractors it wires in.
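The "docker first, host CLI as fallback" behaviour described above can be pictured with a small, self-contained sketch. This is not the gem's implementation: the image name, the helper names (`on_path?`, `conversion_command`) and the choice of pandoc alone are assumptions made for illustration. Only the docker flags (`--rm`, `-i`, `--network none`) and the pandoc options (`--from`, `--to`) are real.

```ruby
# Hypothetical sketch: none of these names come from pikuri-extractors itself.
IMAGE = "pikuri-extractors-sandbox:local" # assumed name of a locally built image

# Returns true when +tool+ is an executable file in one of the +path+ directories.
def on_path?(tool, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, tool)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Builds the argv that converts +format+ input on stdin to Markdown on stdout.
# +available+ is injectable so the choice can be exercised without docker.
def conversion_command(format, available: method(:on_path?))
  pandoc = ["pandoc", "--from", format, "--to", "markdown"]
  if available.call("docker")
    # --rm: one-shot container; -i: keep stdin open; --network none: no network.
    ["docker", "run", "--rm", "-i", "--network", "none", IMAGE, *pandoc]
  elsif available.call("pandoc")
    pandoc
  else
    raise "neither docker nor pandoc found on PATH"
  end
end
```

A caller could hand the resulting argv to something like `Open3.capture2(*argv, stdin_data: bytes, binmode: true)`, so the document bytes travel over stdin rather than through a file on the host. Whether the gem does exactly this is not stated in the description.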
Gemfile:
gem 'pikuri-extractors', '~> 0.0.6'
install:
gem install pikuri-extractors
Runtime Dependencies (1):
pikuri-core = 0.0.6