I use the https://gotenberg.dev/ Docker image to convert HTML containing an image to PDF.
I have a WebP image of reasonable size.
I generate the PDF with Python code like this:
    import io

    import requests

    # HTML_TO_PDF_URL and img_webp (path to the image) are defined elsewhere.
    with io.BytesIO() as tmp_index_html:
        tmp_index_html.write(b"""
        <html>
        <head>
        <title>My img</title>
        </head>
        <body>
        <img src="img.webp" />
        </body>
        </html>
        """)
        # Rewind so requests reads the HTML from the start.
        tmp_index_html.seek(0)
        with open(img_webp, "rb") as img_file:
            response = requests.post(
                HTML_TO_PDF_URL,
                files={
                    "index.html": tmp_index_html,
                    "img.webp": img_file,
                },
                timeout=2400,
            )
    with open("result.pdf", "wb") as pdf:
        pdf.write(response.content)
The problem is that the PDF is considerably larger than the original WebP file.
I found that WebP is not a format PDF supports natively (https://en.wikipedia.org/wiki/PDF#Imaging_model), so I guess the image is re-encoded while "printing to PDF", with parameters that are not optimized for size.
Question: how should I preprocess my image to get a reasonably small PDF?
I did a little research on the issue. The specific solution to your question is to convert the WebP image to another format before generating the PDF. I tried converting it to JPG: the resulting PDF was about the same size as the JPG file itself, and about twice the size of the original WebP, which is ultimately reasonable. WebP is a relatively new format and may not be well supported by all applications.
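A minimal sketch of that conversion step, assuming Pillow is installed (it is not part of the original code). The helper name `to_jpeg_bytes` and the default quality of 80 are only illustrative; tune the quality to trade size against visual fidelity.

```python
import io

from PIL import Image  # Pillow, assumed installed: pip install Pillow


def to_jpeg_bytes(source, quality=80):
    """Re-encode an image (file path or binary file object) as JPEG bytes."""
    with Image.open(source) as img:
        # JPEG has no alpha channel; convert() drops it and returns a new image.
        rgb = img.convert("RGB")
    out = io.BytesIO()
    # optimize=True asks the encoder for an extra pass to shrink the file.
    rgb.save(out, format="JPEG", quality=quality, optimize=True)
    return out.getvalue()
```

The returned bytes can then be uploaded under a name such as "img.jpg" instead of the WebP file, with the `<img>` tag in index.html pointing at that name. Comparing `len(to_jpeg_bytes(path))` for a few quality values against the size of the resulting PDF is a quick way to check whether the tradeoff is acceptable for your image.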