Dangerzone converts potentially unsafe files such as PDF and DOCX into safe PDFs
![Dangerzone](https://forkk.me/pictrs/image/2cc102e8-6810-4339-812c-e84c5b906075.png?format=jpg&thumbnail=256)
dangerzone.rocks
From a security standpoint, PDFs are literally one of the worst formats I deal with daily. Very useful and predictable for the end user, yes, but very dangerous because of the capabilities the format allows.
Dangerzone works like this: You give it a document that you don't know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn't already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
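To make the last step concrete, here is a minimal sketch of turning one page of raw RGB pixels back into a PDF, using only the Python standard library. This is not Dangerzone's actual code: the function name `pixels_to_pdf`, its parameters, and the 150 DPI default are all assumptions made for illustration, and the first half of the pipeline (rendering the untrusted document to pixels inside a sandbox) is left out. The point it shows is that the output PDF contains nothing but a compressed bitmap, so scripts, embedded files, and links from the original cannot carry over.

```python
import zlib


def pixels_to_pdf(rgb: bytes, width: int, height: int, dpi: int = 150) -> bytes:
    """Wrap one page of raw RGB pixels in a minimal single-page PDF.

    Hypothetical helper for illustration; not part of Dangerzone.
    """
    if len(rgb) != width * height * 3:
        raise ValueError("expected width * height * 3 bytes of RGB data")

    # Page size in PDF points (1/72 inch) at the assumed resolution.
    page_w = width * 72 / dpi
    page_h = height * 72 / dpi

    # The image is stored as zlib-compressed pixel data (FlateDecode).
    image = zlib.compress(rgb)
    # Content stream: scale the unit-square image to fill the page and draw it.
    content = f"q {page_w:.2f} 0 0 {page_h:.2f} 0 0 cm /Im0 Do Q".encode()

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R "
         f"/MediaBox [0 0 {page_w:.2f} {page_h:.2f}] "
         f"/Resources << /XObject << /Im0 4 0 R >> >> "
         f"/Contents 5 0 R >>").encode(),
        (f"<< /Type /XObject /Subtype /Image /Width {width} /Height {height} "
         f"/ColorSpace /DeviceRGB /BitsPerComponent 8 /Filter /FlateDecode "
         f"/Length {len(image)} >>\nstream\n").encode() + image + b"\nendstream",
        f"<< /Length {len(content)} >>\nstream\n".encode() + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += f"{number} 0 obj\n".encode() + body + b"\nendobj\n"

    # Cross-reference table: each entry must be exactly 20 bytes long.
    xref_at = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode()
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += f"{off:010d} 00000 n \n".encode()
    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_at}\n%%EOF\n").encode()
    return bytes(out)
```

Calling `pixels_to_pdf(rgb, width, height)` with a buffer of `width * height * 3` bytes returns the bytes of a one-page PDF that can be written straight to a file. A real tool would handle multiple pages and probably use a stronger image codec, but the structure is the same: pixels in, picture-only PDF out.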
So it basically rasterizes it? I wonder how it affects file size
No mention of OCR? Copy-pasting links or data will be a joy..
There is an optional OCR pass, from what I understand.
Oh, I think you already know.
Yeah, it definitely increases the size and removes some functionality that others may rely on. But for presenting content, which is what a PDF SHOULD BE for, it has typically worked fine. I've been using pandoc and some homegrown scripts to do this sort of thing for a while.
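For a rough sense of the size question raised above, the intermediate pixel data is easy to estimate: three bytes per pixel, times the page dimensions at the rendering resolution. The sketch below assumes a US Letter page at 150 DPI; both the resolution and the helper name `raw_page_bytes` are assumptions for illustration, not values taken from Dangerzone. The final PDF will usually be smaller than this, since the image gets compressed.

```python
def raw_page_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Uncompressed size of one RGB page: 3 bytes per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * 3


# US Letter (8.5 x 11 inches) at an assumed 150 DPI: 1275 x 1650 pixels.
print(raw_page_bytes(8.5, 11, 150))  # prints 6311250, about 6.3 MB per page
```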