pdf:images

Differences

This shows you the differences between two versions of the page.

pdf:images [2016/02/22 09:59] – [Images in PDF] christian
pdf:images [2016/03/02 15:19] (current) – [Implementation] christian
Line 2: Line 2:
  
 Bitmapped images are described in chapter 8.9 //Images// on page 203 of the {{pdf32000_2008.pdf#page=211|PDF specification}}, on 14 quite readable pages.
 +
 +The implementation is in package ''PDF Images''.
 ===== Usage =====
  
Line 23: Line 25:
     renderer paintXObject: anImage asPDF]].
 </code>
 +
 +{{demo20_ImagesUsage.pdf}} shows the result with an example image.
 +
 +{{demo21_Images.pdf}} shows some PDF features of images (masking, rotation, interpolation, inverting and alpha blended images).
  
 ===== Object Models =====
Line 83: Line 89:
 Useful images often have a mask, allowing only the masked pixels to be drawn. Smalltalk has the class ''Graphics.OpaqueImage'' for this. An OpaqueImage contains a ''figure'' for the image and a ''shape'' for the mask. The mask is a 1 bit image with the same dimensions as the figure and a ''CoveragePalette'' which interprets ''0'' as ''transparent'' (the background is drawn) and ''1'' as ''opaque'' (the pixel of the figure is drawn).
  
-''PDF.ImageXObject'' has an optional attribute ''/Mask'' which can hold a mask. The mask is a 1 bit /DeviceGray ImageXObject with the optional attribute ''/ImageMask'' set to true and without a ''/ColorSpace'' attribute. Interestingly, the resolution of the mask need not be the same as the resolution of the regular image. Since PDF interprets the bits in the opposite way ("0 shall mark the page with the current colour, and a 1 shall leave the previous contents unchanged"), converted masks have a /Decode array of ''[1 0]'' instead of the default ''[0 1]''.
+In PDF, an ''ImageXObject'' has an optional attribute ''/Mask'' which can hold a mask. The mask is a 1 bit /DeviceGray ImageXObject with the optional attribute ''/ImageMask'' set to true and without a ''/ColorSpace'' attribute. Interestingly, the resolution of the mask need not be the same as the resolution of the containing image. Since PDF interprets the bits in the opposite way ("0 shall mark the page with the current colour, and a 1 shall leave the previous contents unchanged"), converted masks have a /Decode array of ''[1 0]'' instead of the default ''[0 1]''.
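
The effect of the reversed /Decode array can be sketched outside of Smalltalk. The following Python snippet is only an illustration and not code from the ''PDF Images'' package; the function names ''pdf_mask_paints'' and ''smalltalk_coverage_is_opaque'' are made up for this example. It shows that a Smalltalk coverage bit can be stored unchanged in the PDF mask when /Decode is ''[1 0]'', so the mask bytes do not have to be inverted.

```python
def pdf_mask_paints(sample_bit, decode=(0, 1)):
    """Return True if a 1 bit mask sample lets the image mark the page.

    With the default /Decode [0 1], a sample of 0 paints and a sample of 1
    leaves the previous contents unchanged. /Decode [1 0] reverses this.
    """
    decoded = decode[0] + sample_bit * (decode[1] - decode[0])
    return decoded == 0


def smalltalk_coverage_is_opaque(bit):
    """CoveragePalette convention: 0 is transparent, 1 is opaque."""
    return bit == 1


for bit in (0, 1):
    # With /Decode [1 0] the PDF reading agrees with the Smalltalk reading.
    print(bit, pdf_mask_paints(bit, decode=(1, 0)), smalltalk_coverage_is_opaque(bit))
```

With the default ''[0 1]'' array the two readings disagree for both bit values, which is why converted masks carry the reversed array.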
  
 When converting an OpaqueImage to PDF with ''asPDF'', an ImageXObject with a mask is created. Conversely, converting an ImageXObject with a mask with ''asSmalltalkValue'' will produce an OpaqueImage. There are also ''UI.Icon'' objects with a layout similar to OpaqueImage. Icons do understand ''asPDF'', but the reverse conversion will result in an OpaqueImage (from which an icon can be created easily).
Line 95: Line 101:
  
 The conversion methods are implemented in the ''Graphics.Image'' hierarchy. To convert a Smalltalk image to PDF with ''asPDF'', the method ''writePixelsTo: anImageXObject'' transfers the actual pixels from the Smalltalk to the PDF image. The other direction uses the method ''readPixelsFrom: anImageXObject'' which transfers the PDF pixels to the Smalltalk image.
 +
 +{{ :pdf:bitfiddling200x266.png?nolink|Work sketch for bit fiddling}}
  
 The default behavior is to transfer the pixels one by one. For each pixel, the bits are read from the specified location in the source image bytes and interpreted as color (''valueAtPoint:''). This color is then converted to the target bits which are written to the specified location in the target image bytes (''valueAtPoint:put:''). While these two pixel accessors are correct and well tested for any kind of image, they are very slow.
  
-This default implementation (''readPixelsByPixelFrom: anImageXObject'' and ''writePixelsByPixelTo: anImageXObject'') is the reference for testing and the baseline for performance benchmarks (see the class ''ImageConversionBenchmarks'' in package ''[PDF Development]'').
+This default implementation (''readPixelsByPixelFrom:'' and ''writePixelsByPixelTo:'') is the reference for testing and the baseline for performance benchmarks (see the class ''ImageConversionBenchmarks'' in package ''[PDF Development]'').
  
-Some conversions can be greatly sped up by exploiting the internal byte organization of the image bits and transfering them directly. While this is possible for many useful forms, it is not possible in general (think of a Smalltalk image with a big palette of more than 255 colors). The following conversions are currently optimized:
+Some conversions can be greatly sped up (one or two orders of magnitude) by exploiting the internal byte organization of the image bits and transferring them directly. While this is possible for many useful forms, it is not possible in general (a Smalltalk image with a palette of more than 255 colors, for example).
 + 
 +The following conversions are currently optimized:
   * Depth1Image for black and white images and masks
   * Depth24Image with 8 bit RGB
   * Depth32Image for 8 bit RGB and BGR images taken from the ''Screen'' (the first byte is always zero)
   * Depth32Image for 8 bit ARGB and ABGR
-===== Disclaimer =====
+  * Depth{2 4 8}Image with a MappedPalette.
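
The difference between the pixel-by-pixel reference and a direct byte transfer can be illustrated with the 32 bit case whose first byte is always zero. The sketch below is written in Python for illustration and is not the package's Smalltalk code; the function names and the assumed byte order (padding byte, then R, G, B) are assumptions made for the example. Both functions produce the same 24 bit RGB bytes, but the second one moves whole color planes at once instead of touching every pixel individually.

```python
def strip_pad_pixel_by_pixel(data):
    """Reference version: handle one 4-byte pixel at a time."""
    out = bytearray()
    for i in range(0, len(data), 4):
        _pad, r, g, b = data[i:i + 4]  # assumed layout: 0, R, G, B
        out += bytes((r, g, b))
    return bytes(out)


def strip_pad_bulk(data):
    """Optimized version: copy each color plane with one slice assignment."""
    out = bytearray(len(data) // 4 * 3)
    out[0::3] = data[1::4]  # all red bytes
    out[1::3] = data[2::4]  # all green bytes
    out[2::3] = data[3::4]  # all blue bytes
    return bytes(out)


pixels = bytes([0, 10, 20, 30, 0, 40, 50, 60])  # two pixels
print(strip_pad_pixel_by_pixel(pixels) == strip_pad_bulk(pixels))
```

As in the package, the slow version serves as the reference against which the fast version is checked.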
  
-Not covered are the special {{pdf32000_2008.pdf#page=22|filters}} for various kinds of images:
+The direct conversion of an image with a mapped palette is special. Since RGB color components are represented with 13 bits in Smalltalk but with 8 bits in PDF, a Smalltalk palette may have more than one entry for the same 8 bit RGB color. This is handled correctly when converting the image pixel by pixel, because each color is stored as an 8 bit color in the PDF /Indexed colorspace, thereby mapping different 3x13 bit colors to the same 3x8 bit color.
 + 
 +When such an image is converted with the optimization (the palette is converted and the pixels keep their indexes, which allows direct reuse of the image bytes), the /Indexed colorspace may contain several entries for the same color. Converting such an ImageXObject back to Smalltalk will not recreate the least significant 5 bits, leading to colors that differ slightly from the original. For 8 bit RGB usage, this makes no difference. Although this does not feel proper, the speed-up of the optimization is worth it.
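
The palette effect described above can be made concrete with a few numbers. The following Python sketch is an illustration only: it assumes that a 13 bit component is reduced to 8 bits by dropping the low 5 bits and restored by filling them with zeros, which may differ from the scaling the package actually uses. All names are hypothetical.

```python
def to_8bit(c13):
    """Assumption: keep the top 8 of the 13 bits."""
    return c13 >> 5


def to_13bit(c8):
    """Assumption: the lost low 5 bits come back as zeros."""
    return c8 << 5


def convert_palette(palette13):
    """Convert every entry but keep its index, so pixel bytes stay valid."""
    return [tuple(to_8bit(c) for c in rgb) for rgb in palette13]


# Two slightly different reds and one green, as 13 bit components.
palette13 = [(8191, 0, 0), (8190, 0, 0), (0, 4096, 0)]
palette8 = convert_palette(palette13)

print(palette8)                                # entries 0 and 1 are now equal
print([to_13bit(c) for c in palette8[0]])      # differs from (8191, 0, 0)
```

The converted palette keeps three entries even though two of them denote the same 8 bit color, and the round trip of the first entry does not return the original 13 bit value.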
 +===== To be done ===== 
 + 
 +==== Filter ==== 
 + 
 +Although all Smalltalk images can be used for PDF, not all PDF images can be transformed to Smalltalk images. For one, several {{pdf32000_2008.pdf#page=22|filters}} specific to images are not implemented:
   * **RunLengthDecode** 8 bit monochrome images
   * **CCITTFaxDecode** CCITT encoded 1 bit monochrome images
   * **JBIG2Decode** JBIG2 encoded 1 bit monochrome images
   * **DCTDecode** JPEG encoded 8 bit grayscale or color images
   * **JPXDecode** JPEG2000 encoded grayscale or color images
 + 
 +This means that it is not possible to extract such images from PDF. Nor is it possible to store images in the most efficient way in a PDF. This feature is valuable and I hope to implement some of the filters in the not too distant future.
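
Of the missing filters, RunLengthDecode is the simplest. As an orientation for what an implementation involves, here is a minimal decoder sketch in Python, following the scheme in the PDF specification (a length byte below 128 copies the next length + 1 bytes, a length byte above 128 repeats the next byte 257 - length times, and 128 marks the end of data). It is not part of the package, and the function name is invented for the example.

```python
def run_length_decode(data):
    """Decode a RunLengthDecode stream (sketch without error handling)."""
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:
            # End-of-data marker
            break
        if length < 128:
            # Literal run: copy the next length + 1 bytes
            out += data[i:i + length + 1]
            i += length + 1
        else:
            # Repeated run: the next byte occurs 257 - length times
            out += data[i:i + 1] * (257 - length)
            i += 1
    return bytes(out)


encoded = bytes([2, 65, 66, 67, 254, 90, 128])
print(run_length_decode(encoded))
```

The other filters in the list are considerably more involved and would realistically need an existing codec library.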
  
-These are not implemented (yet), so that it is not possible to extract such images from PDF. Nor is it possible to store images in the most efficient way in a PDF. Just the basic **FlateDecode** filter is used by default to compress images.
+Secondly, PDF can have images in colorspaces other than RGB or Grayscale; most notable is ''/DeviceCMYK'' for print. For correctly extracting such images, proper color conversions to RGB need to be implemented. This feature is not interesting to me at the moment.
  
 +==== Inlined Images ====
  
 +Images in PDF can be inlined in the /Contents stream instead of storing them in the /Resources as /XObject. Only a subset of legal PDF images can be inlined and it is discouraged for large images. Even though I have not seen such an image in a real-world PDF, this feature should be implemented for completeness.
Last modified: 2016/02/22 09:59 by christian