pdf:paintingapage

Differences

This shows you the differences between two versions of the page.

pdf:paintingapage [2015/04/01 12:21] – [How does text get onto a PDF page] christianpdf:paintingapage [2016/02/24 17:49] (current) – [Hello World - How text gets onto a page] christian
Line 1: Line 1:
-====== How does text get onto a PDF page ======+====== Hello World - How text gets onto a page ======
  
-A PDF page has a content stream (**/Contents**) containing a list of graphics operators with their parameters. The operators are sequentially executed and can set aspects of the **GraphicsState** or paint in the context of the current GraphicsState.+A PDF page has a content stream (**''/Contents''**) containing a list of graphics operators with their parameters. The operators are sequentially executed and can set aspects of the **GraphicsState** or paint in the context of the current GraphicsState.
  
-The operators in the contents stream have no way to reference PDF objects outside of the contents, with one important exception: complex objects like raster images or fonts are held separately in the **/Resources** dictionary of the page. The **/Resources** dictionary has several entries for different kinds of objects like **/XObject** for raster images (not quite supported yet) and embedded graphics and **/Font** for the fonts used on the page. Each entry is a dictionary with named objects which can be used by specific graphics operators. Let's see an example:+The operators in the contents stream have no way to reference PDF objects outside of the contents, with one important exception: complex objects like raster images or fonts are held separately in the **''/Resources''** dictionary of the page. The **''/Resources''** dictionary has several entries for different kinds of objects like **''/XObject''** for raster images (not quite supported yet) and embedded graphics and **''/Font''** for the fonts used on the page. Each entry is a dictionary with named objects which can be used by specific graphics operators. Let's see an example:
 <code pdf> <code pdf>
 /Page << /Page <<
Line 15: Line 15:
       /F1 10 Tf       /F1 10 Tf
       10 5 Td       10 5 Td
-      (Hello) Tj+      (Hello World) Tj
       ET       ET
     endstream     endstream
Line 21: Line 21:
 </code> </code>
  
-This paints the string 'Hello' in black at 10@5 with font **/F1** in size 10.+This paints the string ''Hello World'' in black at 10@5 with font **''/F1''** in size 10.
  
 To achieve this in Smalltalk you can write the following: To achieve this in Smalltalk you can write the following:
 <code smalltalk> <code smalltalk>
-page := Page newInBounds: (0 @ 0 corner: 50 @ 20) colorspace: DeviceCMYK new render: [:renderer |+page := Page newInBounds: (0 @ 0 corner: 70 @ 20) colorspace: DeviceCMYK new render: [:renderer |
   renderer fillColor: CmykColor black.   renderer fillColor: CmykColor black.
   renderer textObjectDo: [   renderer textObjectDo: [
     renderer setFont: #Helvetica size: 10.     renderer setFont: #Helvetica size: 10.
-    renderer add: (TextPositioningOperator Td operands: #(10 5)). +    renderer add: (NextLineRelative operands: #(10 5)). 
-    renderer showString: 'Hello']].+    renderer showString: 'Hello World']].
 </code> </code>
-You notice that I did not use the font ID **/F1** but the font directly (referenced as the global #Helvetica). ''renderer'' **setFont:** takes care of that and puts the font into the resources and assigns it to an internal name which is used in the content stream. This mechanism works for all resource types so that the programmer can always use the appropriate object directly and never needs to care about the internal IDs.+{{demo01_HelloWorld.pdf}} See the class method ''demo01_HelloWorld'' in class ''Document''.
  
-The renderer you get when creating a Page takes care of the /Contents stream with its /Resources dictionary.+You notice that I did not use the font ID **''/F1''** but the font directly (referenced as the global #Helvetica). ''renderer **setFont:**'' takes care of that and puts the font into the resources and assigns it to an internal name which is used in the content stream. This mechanism works for all resource types so that the programmer can always use the appropriate object directly and never needs to care about the internal IDs.
  
-Common operators are implemented as renderer methods like **fillColor:** and **showString:**, but not all. In the end, all these methods boil down to expressions creating operators and adding them to the renderer as done with the **Td** operator.+The renderer you get when creating a Page takes care of the **''/Contents''** stream with its **''/Resources''** dictionary.
  
-The reason why **Td** is not covered is that there are several ways to put text on a page - and using **Td** is not a very practical one.+Common operators are implemented as renderer methods like **''fillColor:''** and **''showString:''**, but not all. In the end, all these methods boil down to expressions creating operators and adding them to the renderer as done with the **''NextLineRelative (Td)''** operator.
  
-==== Painting Text ====+**''NextLineRelative''** is not covered by a convenience method of the renderer, because there are several ways to put text on a page - and using **''NextLineRelative''** is not a very practical one. 
 + 
 +===== Painting Text =====
  
  
Line 46: Line 48:
  
   * set the relevant graphics state parameters including the font   * set the relevant graphics state parameters including the font
- 
   * set the position/Matrix   * set the position/Matrix
- 
   * show the string   * show the string
  
-While the state parameters are straightforward (see list below), positioning the string is best done using a matrix. The text matrix is set by **Tm** with 6 numbers as parameters. +While the state parameters are straightforward (see list below), positioning the string is best done using a matrix. The text matrix is set by the **''SetTextMatrix (Tm)''** operator with 6 numbers as parameters. 
- +<code smalltalk> 
- (TextPositioningOperator Tm operands: #(1 0 0 1 10 5))+(SetTextMatrix operands: #(1 0 0 1 10 5)) 
 +</code>
 produces produces
- 1 0 0 1 10 5 Tm +  1 0 0 1 10 5 Tm 
-which sets the scaling to 1 horizontally and vertically and adds an offset to point 10@5, i.e. it does the same as `10 5 Td`. A transformation matrix can express scaling, rotation, skewing and translation at once. For example +which sets the scaling to 1 horizontally and vertically and adds an offset to point 10 @ 5, i.e. it does the same as ''10 5 Td''. A transformation matrix can express scaling, rotation, skewing and translation at once. For example 
- 0.95 0 0 1 10 5 Tm+  0.95 0 0 1 10 5 Tm
 would compress the text horizontally by 5%. would compress the text horizontally by 5%.
  
 For example, Adobe Illustrator always sets the font size to 1 and uses the matrix to scale accordingly. For example, Adobe Illustrator always sets the font size to 1 and uses the matrix to scale accordingly.
- /F1 1 Tf +  /F1 1 Tf 
- 9.5 0 0 10 10 5 Tm+  9.5 0 0 10 10 5 Tm
  
-The text state operators: +The text state operators are
- +  * **''TextFont (Tf)''** 
-  * **Tf** text font and size +  * **''TextRenderingMode (Tr)''** 
- +  * **''CharacterSpacing (Tc)''** 
-  * **Tr** rendering mode +  * **''WordSpacing (Tw)''** 
- +  * **''Leading (TL)''** 
-  * **Tc** character spacing +  * **''TextRise (Ts)''** 
- +  * **''HorizontalScaling (Tz)''** (better done with Tm).
-  * **Tw** word spacing +
- +
-  * **TL** text leading +
- +
-  * **Ts** rise +
- +
-  * **Tz** horizontal scaling (should be done with Tm).+
  
 The two relevant text showing operators are: The two relevant text showing operators are:
  
-  * **Tj** show string +  * **''ShowText (Tj)''** 
- +  * **''ShowTextPositioned (TJ)''** show string with individual character positioning
-  * **TJ** show string with individual character positioning+
  
 There are no high-level operations for word wrapping, automatic kerning, hyphenation or even simple justification. This is only about putting characters at specific positions. How you get these positions is up to you or your layout program. There are no high-level operations for word wrapping, automatic kerning, hyphenation or even simple justification. This is only about putting characters at specific positions. How you get these positions is up to you or your layout program.
  
 For justification, you need to know the length of a string. For this you can use For justification, you need to know the length of a string. For this you can use
- aFont stringWidthOf: aString at: aFontsize.+<code smalltalk> 
 +aFont stringWidthOf: aString at: aFontsize. 
 +</code>
 Using our example you would write Using our example you would write
- (Graphics.Fonts.Font fontAt: #Helvetica) stringWidthOf: 'Hello' at: 10.+<code smalltalk> 
 +(Graphics.Fonts.Font fontAt: #Helvetica) stringWidthOf: 'Hello' at: 10. 
 +</code>
 returning 22.78 which is the width in PostScript points in an unscaled coordinate system. returning 22.78 which is the width in PostScript points in an unscaled coordinate system.
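
To illustrate how this width feeds into positioning, here is the arithmetic for right-aligned and centred text. The width w is the value returned above; the right edge of 60 and the line box from 0 to 70 are hypothetical numbers chosen only for this example.

```latex
% w: width of 'Hello' in Helvetica at size 10 (value from the text)
\[
w = 22.78
\]

% Right-aligned against a hypothetical right edge x_r = 60
\[
x = x_r - w = 60 - 22.78 = 37.22
\]

% Centred in a hypothetical line box from x_l = 0 to x_r = 70
\[
x = x_l + \frac{(x_r - x_l) - w}{2} = 0 + \frac{70 - 22.78}{2} = 23.61
\]
```

The resulting x goes into the fifth operand of the text matrix, for example ''1 0 0 1 37.22 5 Tm'' for the right-aligned case.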
  
-==== Program design ====+===== Program design =====
  
  
 The implementation may look a bit clumsy. Why should you use The implementation may look a bit clumsy. Why should you use
- renderer add: (TextPositioningOperator Td operands: #(10 5)) +<code smalltalk> 
-to get the simple string `'10 5 Td'`?+renderer add: (NextLineRelative operands: #(10 5)) 
 +</code> 
 +to get the simple string ''10 5 Td''?
  
-Firstly, I wanted operators as objects and not just as strings you write into the contents stream. The objects can be read from a PDF (try: from the PDFExplorer inspect a **/Contents** object and send it `#operations`) and the list of operators you create can be written to a PDF. There are some things a program could do with operators: +Firstly, I wanted operators as objects and not just as strings you write into the contents stream. The objects can be read from a PDF (try: from the [[PDFExplorer]] inspect a **''/Contents''** object and send it ''#operations'') and the list of operators you create can be written to a PDF. There are some things a program could do with operators:
- +
-  * check the consistency/validity. For example, /BT must be written before /ET and must enclose certain text operators; they cannot be nested etc.+
  
 +  * check the consistency/validity. For example, **''BeginText (BT)''** must be written before **''EndText (ET)''** and must enclose certain text operators; they cannot be nested etc.
   * implement a GraphicsState object to track the changes to it. With this, unnecessary operators can be avoided (this is on my todo list).   * implement a GraphicsState object to track the changes to it. With this, unnecessary operators can be avoided (this is on my todo list).
  
 In any case, it is good to have operators referable in the development image. In any case, it is good to have operators referable in the development image.
  
-Operators have a name and arguments, the operands. There are subclasses for each group of operators. They don't add state, but allow for different behavior (which is not exploited yet). An operator is created by sending the appropriate class message to the corresponding Operator class. +Secondly, this clumsy interface is meant to be used as backend by a higher level graphics framework. It is expected that you have an abstraction of **Text** which can render itself using the PDF primitives. In [[http://www.smallcharts.com|smallCharts]] I have Texts like 
- TextPositioningOperator Td +<code smalltalk> 
-The operands are added with +ChartText 
- <operator> operands: <anArrayOfValues> +  style: (Textstyle 
-which returns a copy of the operator with the operands. +    color: (CmykColor cyan: 1 magenta: 0.3 yellow: 0 black: 0.3) 
- +    font: #Helvetica 
-I am not sure that this design will stand the test of time. I will see when I use it in earnest. +    size: 12 
- +    trackKerning: -30 
-Secondly, this clumsy interface is meant to be used as backend by a higher level graphics framework. It is expected that you have an abstraction of **Text** which can render itself using the PDF primitives. In ''smallCharts'' I have Texts like +    withoutLeftSideBearing: true 
- ChartText +    scale: 0.95 @ 1) 
- style: (Textstyle +  string: 'Hello' 
- color: (CmykColor cyan: 1 magenta: 0.3 yellow: 0 black: 0.3) +  position: 10 @ 5 
- font: #Helvetica +</code>
- size: 12 +
- trackKerning: -30 +
- withoutLeftSideBearing: true +
- scale: 0.95 @ 1) +
- string: 'Hello' +
- position: 10 @ 5+
 which renders itself as PDF with which renders itself as PDF with
- ChartText>>renderPDFOn: aPDFRenderer +<code smalltalk> 
- aPDFRenderer textRenderingMode: 0. +ChartText>>renderPDFOn: aPDFRenderer 
- aPDFRenderer fillColor: self style color. +  aPDFRenderer textRenderingMode: 0. 
- aPDFRenderer textObjectDo: [+  aPDFRenderer fillColor: self style color. 
- self style renderPDFOn: aPDFRenderer. +  aPDFRenderer textObjectDo: [
- aPDFRenderer textMatrix: self pdfMatrixArray. +    self style renderPDFOn: aPDFRenderer. 
- aPDFRenderer showString: (self pdfStringFor: aPDFRenderer)]+    aPDFRenderer textMatrix: self pdfMatrixArray. 
 +    aPDFRenderer showString: (self pdfStringFor: aPDFRenderer)] 
 +</code>
  
-I did not open-source my graphics classes because they are specific to the needs of *smallCharts*. It does vector graphics and a bit of text as it is used for labels and such. For example since charts use mostly vertical and horizontal lines, I have classes for them :-). My objects can only scale and translate, but not rotate - they don't need to... This may be different for others.+I did not open-source my graphics classes because they are specific to the needs of smallCharts. It does vector graphics and a bit of text. For example, I have classes for horizontal and vertical lines, since charts use mostly those. My objects can only scale and translate, but not rotate - they don't need to... This may be different for others.
  
 I like to develop my abstractions from the bottom up and try to keep them as simple as possible. Maybe, over time, users will develop abstractions which are generally useful. In the end it should be a community discussion and consensus of what should be included. So far, only the bare metal on the spec will be available and you have to evolve your own abstractions. I like to develop my abstractions from the bottom up and try to keep them as simple as possible. Maybe, over time, users will develop abstractions which are generally useful. In the end it should be a community discussion and consensus of what should be included. So far, only the bare metal on the spec will be available and you have to evolve your own abstractions.
  
-=== Comments ===+===== Comments =====
  
-== Higher level abstractions ==+==== Higher level abstractions ====
  
 Submitted by bobcalco on Tue, 2012-01-24 10:32. Submitted by bobcalco on Tue, 2012-01-24 10:32.
Line 155: Line 149:
 I know about this because I used an earlier version of Prawn to code a PDF generation feature of a content delivery system, which I am now needing to replace in Smalltalk, having decided to make the switch. I am a bit sad at the state of PDF generation in Smalltalk. There are so many other strengths in Smalltalk for the kind of distributed system I am building that wooed me over, but this deficiency is going to cost me some late nights and lamp oil. I know about this because I used an earlier version of Prawn to code a PDF generation feature of a content delivery system, which I am now needing to replace in Smalltalk, having decided to make the switch. I am a bit sad at the state of PDF generation in Smalltalk. There are so many other strengths in Smalltalk for the kind of distributed system I am building that wooed me over, but this deficiency is going to cost me some late nights and lamp oil.
  
-== Here is a PDF manual ==+==== Here is a PDF manual ====
  
 Submitted by bobcalco on Tue, 2012-01-24 12:03. Submitted by bobcalco on Tue, 2012-01-24 12:03.
Line 163: Line 157:
 http://prawn.majesticseacreature.com/manual.pdf http://prawn.majesticseacreature.com/manual.pdf
  
-== Re: Higher level abstractions ==+==== Re: Higher level abstractions ====
  
 Submitted by ChristianHaider on Tue, 2012-01-24 11:58. Submitted by ChristianHaider on Tue, 2012-01-24 11:58.
Line 169: Line 163:
 Interesting. I am curious what experiences you have while porting from Prawn to PDF4Smalltalk. Maybe some good concepts can be integrated... If you have any questions, please ask in the forum - sometimes I am responsive :-) Interesting. I am curious what experiences you have while porting from Prawn to PDF4Smalltalk. Maybe some good concepts can be integrated... If you have any questions, please ask in the forum - sometimes I am responsive :-)
  
-== Thanks! ==+==== Thanks! ====
  
 Submitted by bobcalco on Tue, 2012-01-24 12:06. Submitted by bobcalco on Tue, 2012-01-24 12:06.
/var/www/virtual/code4hl/html/dokuwiki/data/attic/pdf/paintingapage.1427883715.txt.gz · Last modified: 2015/04/01 12:21 by christian