PD4ML – V4 Programmers Manual

User Manual:

Open the PDF directly: View PDF .
Page Count: 24

PD4ML v4 Programmers ManualPD4ML v4 Programmers Manual HOME HOME / / PD4ML V4 PROGRAMMERS MANUALPD4ML V4 PROGRAMMERS MANUAL

1. Getting Started

The major diﬀerence with the previous PD4ML versions is that the new API performs the conversion in two phases.

First you need to invoke readHTML() method to read a source document. By default an HTML (styled with CSS) is

expected. However it can be arbitrary XML, where tag nature is speciﬁed with CSS display display property, e.g. <sideblock

style="display: block; float: right">Side content</sideblock> .

After the document is read and parsed you can render and write it as PDFPDF, PDF/APDF/A, RTFRTF or an raster image with

writePDF() or another corresponding method. If you need a conversion result in diﬀerent formats – invoke

corresponding write*() methods multiple times, there is no need to reread the source HTML.

That’s all you need to know about PD4ML to write your ﬁrst converter application…

2. Obtaining PD4ML

Since v4 PD4ML is distributed as an universal library, whose features like TTF embedding support or PDF/A output are

activated depending on license key. So binaries for all license types can be obtained from a single location.

Each PD4ML v4.x release is published to our Maven repositoryour Maven repository. You may either download the library from the

location or, if you use Apache Maven, tune your project build to automatically download the latest library build.

1. PD4ML pd4ml = new PD4ML();

3. String html = "TEST<pd4ml:page.break><b>Hello, World!</b>";

4. ByteArrayInputStream bais =

5. new ByteArrayInputStream(html.getBytes());

7. // read and parse HTML

8. pd4ml.readHTML(bais);

10. File pdf = File.createTempFile("result", ".pdf");

11. FileOutputStream fos = new FileOutputStream(pdf);

12.

13. // render and write the result as PDF

14. pd4ml.writePDF(fos);

15.

16. // alternatively or additionally:

17. // pd4ml.writeRTF(rtfos, false);

18. // BufferedImage[] images = pd4ml.renderAsImages();

19.

20. // open the just-generated PDF with a default PDF viewer

21. Desktop.getDesktop().open(pdf);

Suggest edit

Last updated on September 4, 2018

The binaries in the repository are supplied with Javadoc JARs. The most actual Javadoc is always available online.

PD4ML source code licensees also have an access to debugger-friendly library versions with debug info not stripped oﬀ.

The library versions and their source code JARs are available from the access-restricted Maven repositorythe access-restricted Maven repository.

Nightly builds are available in both repositories as snapshots.

To use PD4ML in Maven environment you’d need to conﬁgure something like that:

1. Access to regular binaries

Project pom.xmlpom.xml

2. Access to the binaries with source code

~/.m2/settings.xml~/.m2/settings.xml

Of course, it is recommended to store the password in encrypted form there. See Maven documentationMaven documentation how to do

that.

Project pom.xmlpom.xml

1. ...

2. <dependncy>

3. <groupId>com.pd4ml</groupId>

4. <artifactId>pd4ml</artifactId>

5. <version>4.0.0</version>

6. </dependency>

7. ...

8. <repository>

9. <id>pd4ml</id>

10. <url>https://pd4ml.tech/maven2/</url>

11. <snapshots>

12. <enabled>true</enabled>

13. </snapshots>

14. </repository>

1. <server>

2. <id>pd4ml-src</id>

3. <username>login_name_from_license</username>

4. <password>password_from_license</password>

5. </server>

1. ...

2. <dependncy>

3. <groupId>com.pd4ml</groupId>

4. <artifactId>pd4ml</artifactId>

5. <version>4.0.0</version>

6. </dependency>

7. ...

8. <repository>

9. <id>pd4ml-src</id>

10. <url>https://pd4ml.tech/maven2-src/</url>

11. <snapshots>

12. <enabled>true</enabled>

13. </snapshots>

14. </repository>

Suggest edit

3. Running PD4ML Converter as a standalone application

PD4ML is not only a software library/component, but also a set of tools, which can be used as standalone applications.

First download the most recent version of PD4ML JAR ﬁle from our Maven repositoryour Maven repository, save it to a directory of your

choice and make sure there is a JRE v1.6+JRE v1.6+ installed on your workstation/server.

There are two tools available:

Pd4Cmd Pd4Cmd – command line wrapper of PD4ML API

PD4Browser PD4Browser – GUI viewer/converter

3.1. PD4ML as a command-line converter tool

The command line tool is implemented as com.pd4ml.tools.Pd4Cmd.class . The class is speciﬁed as a main class of

the JAR, so it can be executed a simple way:

java -jar pd4ml*.jar

where java*.jar is an actual JAR ﬁle name, like pd4ml-4.0.1.jar .

There are four typical usage scenarios:

1. HTML conversion to PDF, RTF or a raster image

2. PDF processing (merging, updating)

3. Reading of document meta information

4. Indexing of TTF fonts

HTML conversion to PDF, RTF or a raster image

HTML-to-PDF conversion with the absolute minimum of parameters

The command line overrides the default Java memory heap size limit with -Xmx512m-Xmx512m. Here it is set to 512Mb.

On UNIX platform -Djava.awt.headless=true allows to run the application on non-graphics-enabled servers or from

remote ssh/telnet sessions.

http://pd4ml.com" 1200 are

HTML source URL

and

htmlWidth

(virtual “browser” frame width) parameters.

Note: on Win32 the URL is enclosed, if needed, to double quotes, on UNIX – to single quotes.

The default PDF document format: A4 / PORTRAITA4 / PORTRAIT

In the example 1200pxpx width of rendered document will be mapped to 595ptpt widths of A4 page format.

As long as an output ﬁle path omitted, the output is sent to

STDOUTSTDOUT

and can be piped to another application.

Customized HTML-to-PDF conversion

Last updated on August 26, 2018

Suggest edit

Last updated on August 27, 2018

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar "http://pd4ml.com" 1200

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar "http://pd4ml.com" 1200 LETTER -

bookmarks HEADINGS -pdfforms -debug -out pd4ml.pdf

In the examples the generated PDF is written to a ﬁle, deﬁned with -out-out parameter. That makes possible to use

STDOUTSTDOUT

for debug

output (-debug-debug parameter).

The examples also force PD4ML to produce PDF outlines (bookmarks) from <h1>-<h6> hierarchy of the document ( -

bookmarks HEADINGS ) and to convert HTML forms to interactive PDF forms ( -pdfforms ).

PDF processing (merging, updating)

PDF page removal

The call extracts a selected range of document pages and saves them as a new document.

PDF documents merge

Note: -pagerange option is not available by a PDF merge

PDF permissions update

-permissions 28 is a sum of permissions: AllowDegradedPrint = 4 , AllowModify = 8 and AllowCopy =

16 .

See API reference for more details…See API reference for more details…

Reading of document meta information

The call prints to STDOUTSTDOUT basic PDF info: document permissions (as a hex number), document author, document title,

number of document pages (decimal number)

Indexing of TTF fonts

See the command line tool documentation…See the command line tool documentation…

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar -tools file:c:/docs/test.pdf -

pagerange 2-3,5+ -out c:/docs/newdoc.pdf

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar -tools file:c:/docs/test.pdf -merge

file:c:/docs/tomerge.pdf after -out c:/docs/newdoc.pdf

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar -tools file:c:/docs/test.pdf -

permissions 28 -out c:/docs/newdoc.pdf

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar -tools file:c:/docs/test.pdf -

printpermissions -printauthor -printtitle -printpagenum

java -Xmx512m -jar pd4ml.jar -configure.fonts <fontdir> [pd4fonts.properties location]

Suggest edit

Last updated on October 27, 2018

3.2. PD4ML as a GUI application

PD4ML library includes a simple GUI tool for HTML document rendering preview and for test conversion to PDF or to RTF.

The tool can be activated by a passing -gui parameter to the command line:

java -jar pd4ml*.jar -gui

where java*.jar is an actual JAR ﬁle name, like pd4ml-4.0.1.jar .

There is also a “legacy” way to run the GUI converter, kept for backward compatibility with PD4ML versions prior to v4.0.0v4.0.0

java -cp pd4ml*.jar org.zefer.pd4ml.tools.PD4Browser

After the ﬁrst start the GUI tool creates pd4browser.properties ﬁle (if no previously created instance found) with the

conversion parameter defaults. The ﬁle can be customized according your needs in a text editor.

In the top section of the application you can ﬁnd the controls:

[•••] – local HTML ﬁle open dialog. Accepts .htm(l) and .svg (and .dxl from v4.1.0v4.1.0 )

[GO] – reloads current document and render it in the main application view as an image

[PDF] – renders current document as PDF and opens it in default PDF viewer application

[RTF] – renders current document as RTF and opens it in default RTF viewer application

In the status bar there are:

A label with the target page format. The target ﬁle format can be changed in pd4browser.properties ﬁle or in the source

HTML/CSS with @page at-rule

A label with the actual htmlWidth conversion parameter value. The value can be changed by a horizontal resizing of the

application window. For a ﬁne tuning there are [+ and -] buttons

Target page margins. The margins can be changed in pd4browser.properties ﬁle or in the source HTML/CSS with @page

at-rule.

Current page number label with page browse buttons [<< and >>]

Mouse position / drag box coordinates

A pointing of the mouse over the status bar labels opens a balloon with additional details.

A clicking [PDF] button converts current document to PDF and open the resulting ﬁle in a separate PDF viewer window:

The conversion defaults can be adjusted in pd4browser.properties . The default properties still include some entries,

inherited from older PD4ML versions, and not always usable in v4.0.0v4.0.0 .

Here we’ll review the important ones.

debug.info.enable=0 – logging/debug output verbosity. The value is a bit mask to control logging aspects

document.author= – a string to override document author metadata

document.title= – a string to override document title metadata

insets.bottom=10 – bottom page margin value

insets.left=10 – left page margin value

insets.right=10 – right page margin value

insets.top=10 – top page margin value

insets.units=mm – margin value units: mm or pt

page.bookmarks.destinations=false – enables a generation of PDF page bookmarks (outlines) from named anchors

(<a name="an1"/> ). Not compatible with page.bookmarks.headings=true

page.bookmarks.headings=false – enables a generation of PDF page bookmarks (outlines) from <H1>-<H6> hierachy.

Not compatible with page.bookmarks.destinations=true

page.format=A4 – speciﬁed target page format (e.g. A6 , LETTER )

page.hyperlinks=true – enables a conversion of HTML hyperlinks to active PDF hyperlinks

page.orientation=false – by true rotates given target page format to 90 degrees (e.g. portrait to landscape)

pdf.forms.enable=false – enabled a conversion of HTML forms to submittable native PDF forms.

pdf.pdfa=false – enables PDF/A output

pdf.protect.pud=false – forces PD4ML to keep HTML object sizes given in physical units (mm, pt, in etc) in resulting PDF,

independently on HTML-to-PDF scale factor

proxy.host= – proxy host address (if needed)

proxy.port=0 – proxy port (if needed)

style.add= – CSS style string to be applied to loaded documents

ttf.fonts.default.monospace= – default monospaced font. Not recommended, use CSS style for that instead

ttf.fonts.default.sansserif= – default Sans Serif font. Not recommended, use CSS style for that instead

ttf.fonts.default.serif= – default Serif font. Not recommended, use CSS style for that instead

ttf.fonts.dir= – location of TTF fonts dir directory with pd4fonts.properties index ﬁle

userSpace.adjustToContent=false – if true , forces PD4ML to set htmlWidth value to the measured width of the

HTML content.

4. PD4ML Java API

API Documentation

Each release of PD4ML comes with actual JavaDocs. When you use Maven for dependency management, your IDE will

automatically download the JavaDocs JARs and when you hover on a PD4ML class or method, show you the corresponding

documentation.

For reference we also publish the API documentation online so you can link to it from emails or websites:

https://pd4ml.tech/javadoc/com/pd4ml/PD4ML.html

Pre-Requisites

The following are the pre-requisites needed to use the PD4ML Java API library:

* JDK installed on your PC

* Apache Maven build automation tool

* Eclipse installed on your PC (While Eclipse is not compulsory it is highly recommended)

Suggest edit

Last updated on October 28, 2018

* PD4ML Java API can be downloaded from our website directly (or let Maven do that automatically)

* PD4ML evaluation license

Developing your ﬁrst application

Creating blank Maven Java project

As a good starting point we recommend to create a blank Java project as described in the tutorial:

https://maven.apache.org/guides/getting-started/maven-in-ﬁve-minutes.html

After you’ve got the example application to run, let’s add to it PD4ML.

First add PD4ML library dependency to project’s pom.xml . See lines 17-22 below:

In order to resolve the dependency, add a location of PD4ML repository. The right location for the repository address is

1. <project xmlns="http://maven.apache.org/POM/4.0.0"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

2. xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-

v4_0_0.xsd">

3. <modelVersion>4.0.0</modelVersion>

4. <groupId>com.mycompany.app</groupId>

5. <artifactId>test-pd4ml</artifactId>

6. <packaging>jar</packaging>

7. <version>1.0-SNAPSHOT</version>

8. <name>test-pd4ml</name>

9. <url>http://maven.apache.org</url>

10. <dependencies>

11. <dependency>

12. <groupId>junit</groupId>

13. <artifactId>junit</artifactId>

14. <version>3.8.1</version>

15. <scope>test</scope>

16. </dependency>

17. <dependency>

18. <groupId>com.pd4ml</groupId>

19. <artifactId>pd4ml</artifactId>

20. <version>[1.0.0,)</version>

21. </dependency>

22. </dependencies>

23. <build>

24. <plugins>

25. <plugin>

26. <groupId>org.apache.maven.plugins</groupId>

27. <artifactId>maven-compiler-plugin</artifactId>

28. <version>3.1</version>

29. <configuration>

30.

31. <source>1.7</source>

32. <target>1.7</target>

33. </configuration>

34. </plugin>

35. </plugins>

36. </build>

37. <repositories>

38. <repository>

39. <id>pd4ml</id>

40. <url>https://pd4ml.tech/maven2/</url>

41. <snapshots>

42. <enabled>true</enabled>

43. </snapshots>

44. </repository>

45. </repositories>

46. </project>

$HOME/.m2/settings.xml (See Maven documentation). We’ll keep the example conﬁguration simpler and add the

repository info to pom.xml , despite it is not a recommended practice. See lines 37-45 in the example above.

If you own PD4ML license with product source code access, use https://pd4ml.tech/maven2-src/ as the

repository URL. You would also need to conﬁgure the repository access credentials (login/password pair) obtained from the

license email.

Also for runtime environment simplicity we’ll add the example PD4ML API calls to the application test suite. A typical test

case class looks like the example:

We are going to extend testApp() method with PD4ML usage example code.

Creating a PD4ML object

When writing an application with PD4ML API, the very ﬁrst thing to do is to create the PD4ML object:

The constructor in the example has no parameters and it expects to ﬁnd license pd4ml.lic ﬁle in the working dir of the

application or in the root folder of the classpath or of pd4ml*.jar .

Alternatively you may pass to PD4ML constructor a string with a serial number (obtained from the license email) or a string

with an URL of an alternative pd4ml.lic location.

Where to obtain pd4ml.lic ? Download it from your license page or save the serial number from the license email to a

text ﬁle and name the ﬁle pd4ml.lic

Reading HTML

After you’ve got an instance of PD4ML you can read and parse a source HTML document with one of a variety of

readHTML() methods.

In the example it reads from a prepared input stream. But it can also read from an arbitrary URL.

If you want to point the converter to a font directory or to apply an additional style, headers, footers, watermarks etc – the

1. public class AppTest extends TestCase {

2. public AppTest(String testName) {

3. super(testName);

4. }

6. public static Test suite() {

7. return new TestSuite(AppTest.class);

8. }

10. public void testApp() {

11. }

12. }

1. try {

2. PD4ML pd4ml = new PD4ML();

3. } catch (Exception e) {

4. e.printStackTrace();

5. }

1. String html = "<b>Hello, <i>World!</i></b>";

2. ByteArrayInputStream bais =

3. new ByteArrayInputStream(html.getBytes());

5. // read and parse HTML

6. pd4ml.readHTML(bais);

corresponding API calls need to be performed before readHTML()

Writing target format

A parsed source document can be rendered as a sequence of images, PDF or RTF documents.

Here it writes a PDF:

Launching default document viewer (optional)

It is rarely needed in real-life applications, but very handy in our case: lookup for a default viewer and launch it for just

generated PDF.

It looks for a viewer, associated with given ﬁle extension. For PDF it launches Adobe Reader (or similar). If we write to a

ﬁle with .rtf extension it likely launches MS Word or WordPad

First PD4ML application complete

The application, in its entirety, is as follows:

1. File pdf = File.createTempFile("result", ".pdf");

2. FileOutputStream fos = new FileOutputStream(pdf);

4. // render and write the result as PDF

5. pd4ml.writePDF(fos);

1. // open the just-generated PDF with a default PDF viewer

2. Desktop.getDesktop().open(pdf);

1. package com.mycompany.app;

3. import java.awt.Desktop;

4. import java.io.ByteArrayInputStream;

5. import java.io.File;

6. import java.io.FileOutputStream;

8. import com.pd4ml.PD4ML;

10. import junit.framework.Test;

11. import junit.framework.TestCase;

12. import junit.framework.TestSuite;

13.

14. public class AppTest extends TestCase {

15. public AppTest(String testName) {

16. super(testName);

17. }

18.

19. public static Test suite() {

20. return new TestSuite(AppTest.class);

21. }

22.

23. public void testApp() {

24. try {

25. PD4ML pd4ml = new PD4ML();

26.

27. String html = "<b>Hello, <i>World!</i></b>";

28. ByteArrayInputStream bais =

29. new ByteArrayInputStream(html.getBytes());

30.

31. // read and parse HTML

32. pd4ml.readHTML(bais);

33.

34. File pdf = File.createTempFile("result", ".pdf");

Running the application

Now we are ready to run the demo application.

Go to the Maven project root (where pom.xml is) and execute the command line:

The test target builds the projects and runs its test suite. As our example code is implemented as a test case, it will be

executed. If you did everything correct you’ll see a PDF viewer window with a generated PDF ﬁle in it:

35. FileOutputStream fos = new FileOutputStream(pdf);

36.

37. // render and write the result as PDF

38. pd4ml.writePDF(fos);

39.

40. // alternatively or additionally:

41. // pd4ml.writeRTF(rtfos, false);

42. // BufferedImage[] images = pd4ml.renderAsImages();

43.

44. // open the just-generated PDF with a default PDF viewer

45. Desktop.getDesktop().open(pdf);

46.

47. } catch (Exception e) {

48. e.printStackTrace();

49. }

50. }

51. }

1. mvn test

The document is minimalistic and built using PD4ML defaults: A4A4 page format, 10mm10mm margins, 727px727px htmlWidth, no TTF

fonts.

727px are calculated from (A4.width - margins.left - margins.right) * PIXELS_PER_POINT , which is

(595pt - (10mm + 10mm) * 2.835pt/mm) * 1.35px/pt

Customizing the conversion

PDF page format and content scaling

There are 3 important PD4ML properties, impact resulting PDF page format and layout: pageSize , pageMargins ,

htmlWidth .

pageSize deﬁnes output paper format: A4 , ISOB4 , LETTER , LEGAL etc.

Corresponding constants are deﬁned in java.pd4ml.Constants class, but you may deﬁne any non-standard page

dimension with, for instance, new com.pd4ml.PageSize(400,400) (by default the dimensions are in typographic

points) and pass it to pd4ml.setPageSize() method.

Default page format is A4 , orientation portrait .

PDF ﬁle format does not explicitly speciﬁes on meta level if a particular page is landscapelandscape– or portraitportrait-oriented. If a page

has its width greater than height – it is landscape, otherwise it is portrait. com.pd4ml.PageSize implements a page

rotation by a simple swapping of page height and width.

pageMargins deﬁne blank page area width around the page content. The default value for it new PageMargins(10,

10, 10, 10, Units.MM)

The page margins can be additionally increased by HTML document margins (e.g. <body style="margin: 50px">).

The HTML document margins have some speciﬁcs, when converted to PDF: the top margin is applied to the top of the

document (in other words, on the ﬁrst page only), the bottom margin on the last page only.

Document with no margins:Document with no margins:

htmlWidth value deﬁnes "virtual web browser" frame width (in screen pixels), and by default for A4 it is 727px727px (to

keep 1:1 scale factor by 72dpi).

PD4ML renders source HTML page using htmlWidth parameter and maps the resulting layout to the eﬃcient PDF page

width (which is pageFormat.width - pageMargins.left - pageMargins.right ).

That makes HTML-to-PDF scale factor computed like that:

scale = (pageFormat.width - pageInsets.left - pageInsets.right) / htmlWidth

From the above it is obvious, that an increasing of htmlWidth makes the resulting document content appears smaller.

Examples:

htmlWidth = 920htmlWidth = 920

htmlWidth = 1320htmlWidth = 1320

Deﬁning page header and footer

setPageHeader() and setPageFooter() methods are used to deﬁne page headers/footers using HTML as a layout

deﬁnition.

Optional scopescope parameter allows to apply particular header or footer to a speciﬁed range of pages.

The HTML code may include the placeholders: $[page]$[page] – to be substituted with current page number; $[total]$[total] – total

number of pages; $[title]$[title] – document title as deﬁned in <title><title> HTML tag or overridden with setDocumentTitle() API

call.

1. // define page header for the first page 30px high

2. pd4ml.setPageHeader("$[title]", 30, "1");

E003SetPageHeaderFooter.java

Protecting resulting document

setPermissions() method allows to apply the standard PDF security options: deﬁne a document password or restrict

particular document actions (like a hi-res print).

See a list of applicable permission ﬂags ( Allow* and DefaultPermissions ).

It is possible to deﬁne a positive list of permissions:

or to disable only selected ones:

E009SetDocumentPassword.java

Reading resources from non-standard sources

If some HTML resources like images or stylesheets are not accessible with the standard methods (ﬁle read, HTTP(S), etc),

you may deﬁne your own resource reading “driver”.

First, deﬁne a resource addressing syntax, that matches your needs. For example <a

src="database:table=pictures;id=4711">

Second, implement a resource loader, which knows what to do with “database:table=pictures;id=4711”“database:table=pictures;id=4711” URI.

The loader has to be derived from com.pd4ml.ResourceProvidercom.pd4ml.ResourceProvider class and to implement two methods: boolean

canLoad(String resource, FileCache cache) to test if it can read the URL; BufferedInputStream

getResourceAsStream(String resource, FileCache cache) to actually read the resource bytes.

4. // define page footer for the first page

5. pd4ml.setPageFooter("Total pages: $[total]", 30, "1");

7. // define page header for the second page

8. pd4ml.setPageHeader("<b>$[title]</b> $[page]/$[total]", 30, "2+");

10. // define page footer for the second page

11. pd4ml.setPageFooter("<div style='width: 100%; text-align: right'>Page: $[page]</div>",

30, "2+");

1. pd4ml.setPermissions(null, Constants.AllowAnnotate | Constants.AllowDegradedPrint);

1. pd4ml.setPermissions(null, Constants.DefaultPermissions ^ Constants.AllowModify);

1. // protect the document with "test" password. No permission restrictions applied

2. pd4ml.setPermissions("test", Constants.DefaultPermissions);

1. public class DummyProvider extends ResourceProvider {

3. public final static String PROTOCOL = "dummy";

5. @Override

6. public BufferedInputStream getResourceAsStream(String resource, FileCache cache)

throws IOException {

7. if (!resource.toLowerCase().startsWith(PROTOCOL)) {

8. return null;

9. }

10.

11. // interpret the "resource" parameter according to your protocol (e.g. as a key to

a database record etc)

A005AddCustomResourceLoader.java

5. PD4ML JSP taglib and Web applications

Using PD4ML custom tags in JSP

PD4ML distribution among other features also includes JSP custom tag library. PD4ML JSP custom tags provide a simple

mechanism for a converting of HTML or JSP page output into PDF in Web scenarios.

PD4ML JSP taglib deployment

PD4ML tag library classes and the tag library descriptor (TLD) are packaged together in PD4ML JAR ﬁle. To make the classes

and the TLD accessible for your Web application, you only need to copy pd4ml*.jar library to the lib directory WEB-

INF/lib

Using <pd4tl:transform> tag

Once PD4ML custom tags are deployed, its actions can be called from your HTML using an XML syntax.

First, reference a JAR ﬁle containing a tag library from your JSP: <%@ taglib uri="http://pd4ml.com/tlds/4.0"

prefix="pd4tl" %>

The uri attribute is a unique TLD identiﬁer. It does not point to an existing Web document, but refers to pd4tl.tld

ﬁle, packaged to pd4ml*.jar and registered there under the same uri identiﬁer.

In PD4ML v4.0.0 we changed the taglib structure and removed some wrapper classes. Now we recommend to use

pd4tl: tag preﬁx instead of previous pd4ml: in order to distinguish the custom JSP tags from PD4ML proprietary

tags (e.g. <pd4ml:page.break> ).

Next, you need to surround your HTML or JSP content with <pd4tl:transform> and </pd4tl:transform> tags.

That’s it:

12.

13. // in the example we simply dump the resource parameter string

14. String buf = "[" + resource.substring(PROTOCOL.length()+1) + "]";

15. ByteArrayInputStream baos = new ByteArrayInputStream(buf.getBytes());

16. return new BufferedInputStream(baos);

17. }

18.

19. @Override

20. public boolean canLoad(String resource, FileCache cache) {

21. if (resource.toLowerCase().startsWith(PROTOCOL)) {

22. return true;

23. }

24. return false;

25. }

26. }

27.

28. pd4ml.addCustomResourceProvider("advanced.DummyProvider");

Suggest edit

Last updated on November 10, 2018

1. <%@ taglib uri="http://pd4ml.com/tlds/4.0"

2. prefix="pd4tl"%><%@page

3. contentType="text/html; charset=ISO8859_1"%><pd4tl:transform

Comments:

Line 1.Line 1. Make sure there is no whitespaces before the directive start (before <%@ ). The taglib is intended to return a PDF

(binary) ﬁle. Any content or whitespace (incl. new line characters) before <pd4tl:transform> implicitly switches HTTP

output to text mode and preﬁxes the PDF bytes with unexpected characters. As the last resort PD4ML taglib tries to reset

such undesired content (if any), but it is not possible in all environments and java.lang.IllegalStateException:

getOutputStream() has already been called for this response is thrown.

Other symptoms of the undesired content are either corrupted PDF, or PDF, which is always being reindexed when opened

by a PDF Viewer.

Line 2 and 3Line 2 and 3 No characters or whitespaces between tags or directives allowed (See %><% and %>

<pd4tl:transform ). A reason is as in the comment above.

Line 8Line 8 From this point there is no whitespace restriction – you may use an arbitrary HTML or JSP code.

Line 23Line 23 The proprietary tag forces PD4ML converter to insert a page break to the output PDF.

Line 31Line 31 The closing transform tag. Any content follows the tag is ignored.

Click here to see PD4ML taglib documentation

Prompt to save just-generated PDF

By default if you open a PD4ML-enabled JSP with a Web browser, it opens a generated PDF inline (within the browser

frame). If you’d like the browser to propose to save the PDF under particular name or open it with the system PDF viewer,

add two transform attributes inline and fileName

4. screenWidth="400"

5. pageFormat="A5"

6. pageOrientation="landscape"

7. pageInsets="100,100,100,100,points">

8. <html>

9. <head>

10. <title>pd4ml test</title>

11. <style type="text/css">

12. body {

13. color: red;

14. font-family: Tahoma, "Sans-Serif";

15. font-size: 10pt;

16. }

17. </style>

18. </head>

19. <body>

20. <p>

21. Hello, World!

22. </p>

23. <pd4ml:page.break />

24. <table style="border: 1px solid gray; border-radius: 5px; background-color: #f8f8f8;

color: #000000">

25. <tr>

26. <td>Hello, New Page!</td>

27. </tr>

28. </table>

29. </body>

30. </html>

31. </pd4tl:transform>

1. <%@ taglib uri="http://pd4ml.com/tlds/4.0"

2. prefix="pd4tl"%><%@page

3. contentType="text/html; charset=ISO8859_1"%><pd4tl:transform

4. screenWidth="800"

5. pageFormat="A4"

6. pageOrientation="landscape"

7. inline="false"

Deﬁning PDF document footer (or header) with JSP custom tag.

Since PD4ML v4.0.0v4.0.0 a dedicated header/footer JSP custom tags are not supported anymore. Instead of it use PD4ML

proprietary <pd4ml:page.header> and <pd4ml:page.footer> tags (they take eﬀect not only in JSP context, but

also in regular HTML).

Identically you may deﬁne a watermark with <pd4ml:watermark> tag.

Comments:Comments:

Line 13Line 13 Page header deﬁnition, whose scope is limited only by the ﬁrst page. $[title] placeholder is to be substituted

with a title value taken from HTML’s <title> > tag.

Line 14.Line 14. Page footer deﬁnition to be applied to all pages (as there is no scope attribute). Placeholder $[page] and

$[total] to be substituted with the current page number and total number of pages correspondingly.

How to add dynamic data (like current date) to PDF header or footer

It is JSP. Forget about PDF and do it you would normally do it in a regular JSP:

8. fileName="file.pdf"

9. pageInsets="10,10,10,10,mm">

10. <html>

11. <head>

12. <title>pd4ml test</title>

13. </head>

14. <body>

15. Hello, World!

16. </body>

17. </html>

18. </pd4tl:transform>

1. <%@ taglib uri="http://pd4ml.com/tlds/4.0"

2. prefix="pd4tl"%><%@page

3. contentType="text/html; charset=ISO8859_1"%><pd4tl:transform

4. screenWidth="800"

5. pageFormat="A4"

6. pageOrientation="landscape"

7. pageInsets="10,10,10,10,mm">

8. <html>

9. <head>

10. <title>Page title.</title>

11. </head>

12. <body>

13. <pd4ml:page.header height=30 scope="1">$[title] Visible only on the first

page</pd4ml:page.header>

14. <pd4ml:page.footer height=30>Page $[page] of $[total]</pd4ml:page.footer>

15. Page 1

16. <pd4ml:page.break>

17. Page 2

18. </body>

19. </html>

20. </pd4tl:transform>

1. <%

2. String template = getFormattedDate() ;

3. %>

4. <pd4ml:page.footer height=30><span style="color: tomato"><%=template%></span>, page

$[page]</pd4ml:page.footer>

Temporary saving generated PDF to hard drive.

With <pd4tl:savefile> tag you have a possibility to store just generated PDF to a hard drive (on server side) and

redirect user’s browser to read the PDF as a static document or to redirect the request to another URL for PDF post-

processing.

Note: the tag should be nested to <pd4ml:transform> and have no body.

Usage 1.

The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an autogenerated name.

It is expected, that local directory D:/spool/generated_pdfs corresponds to URL

http://yourserver.com/WEB/savefile/saved/ (as given in uri attribute)

After generation PD4ML will send to client’s browser a redirect command with URL like that:

Usage 2.

The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an autogenerated name.

After that it forwards to /mywebapp/send_pdf_by_email.jsp with a parameter filename=<pdfname> .

So send_pdf_by_email.jsp can read ﬁle name with

build the full path

read the just-generated PDF ﬁle and perform post-processing or other actions (like a sending of the PDF by email).

In both cases above you can predeﬁne PDF ﬁle name with name attribute. If a ﬁle with the name is already exists in

D:/spool/generated_pdfs , then the new ﬁle name is appended with an auto-incremented numeric value.

Using PD4ML custom tags with Struts or with any other J2EE UI frameworks, if JSP taglib integration is

problematic

The speciﬁcs of many UI frameworks (like StrutsStruts) is that in some casesin some cases they take control and open output stream in text

mode before PD4ML-enabled JSP page is loaded. On the other hand PD4ML needs an exclusive control over the output

stream: it outputs binary data and any other output writers can corrupt it.

1. <pd4ml:savefile

2. uri="/WEB/savefile/saved/"

3. dir="D:/spool/generated_pdfs"

4. redirect="pdf"

5. debug="false"/>

http://yourserver.com/WEB/savefile/saved/generated_name.pdf

1. <pd4ml:savefile

2. dir="D:/spool/generated_pdfs"

3. redirect="/mywebapp/send_pdf_by_email.jsp"

4. debug="false"/>

String fileName = request.getParameter("filename");

String path = "D:/spool/generated_pdfs" + "/" + fileName;

In order to solve the problem there are workarounds.

As described in the above chapter, a just generated PDF can be temporally stored to the hard drive and the current HTTP

request can be forwarded to the new static PDF.

Alternatively PD4ML transformer JSP pages can be deployed to a separate web application outside of the framework (e.g.

Struts) context. It can be even the same web application, but the PD4ML-enabled JSP pages should be out of control of the

Struts dispatching servlet. (Can be achieved by web.xml settings).

For our example the separate Web application is associated with name separate_web_app .

Create a dedicated PD4ML transformer JSP page transformer.jsp in separate_web_app like the following.

Replace url attribute value of <pd4ml:transform> with the actual URL of your Struts (JSP) page, that should be

converted to PDF.

Because of url attribute presence (line 11), the HTML content of transformer.jsp is ignored completely. But as

usually: it should be no any whitespace character before <pd4ml:transform> tag.

At this point you can access the transformer.jsp and force PDF conversion of the speciﬁed source. However in most of

the cases a session ID ( JSESSIONID ) propagation is important for proper Struts (or other JSP-based frameworks)

functionality.

If the url attribute of <pd4ml:transform> is set, then PD4ML takes its value and appends to it JSESSIONID

parameter. The value for the parameter it takes from HTTP request variable pd4session

A request for PDF generation from Struts context can look like that:

6. PD4ML and IBM Notes/Domino

1. <%@ taglib uri="http://pd4ml.com/tlds/4.0"

2. prefix="pd4tl"%><%@page

3. contentType="text/html; charset=ISO8859_1"%><pd4tl:transform

4. screenWidth="400"

5. pageFormat="A5"

6. pageOrientation="landscape"

7. pageInsets="15,15,15,15,points"

8. enableImageSplit="false"

9. inline="true"

10. fileName="myreport.pdf"

11. url="http://mainserver/struts/report.jsp">

12. <pd4ml:footer

13. titleTemplate="$[title]"

14. pageNumberTemplate="page $[page] of $[total]"

15. titleAlignment="left"

16. pageNumberAlignment="right"

17. fontSize="12"

18. areaHeight="14"/>

19. </pd4ml:transform>

1. <%

2. response.sendRedirect("/separate_web_app/transformer.jsp?pd4session=" +

3. session.getId() );

4. %>

Suggest edit

Last updated on November 4, 2018

v3.x.xv3.x.x v4.1.0v4.1.0

PD4ML can be used to convert IBM Notes documents to PDF (as well as to RTF or to a raster image) a variety of ways.

The most straightforward method is to capture HTML documents, returned by Notes Domino via HTTP: an URL like one of

the following is to be passed to render() method of PD4ML.

http://Host/Database/PageUNID?OpenPage

(i.e. http://www.acme.com/discussion.nsf/35AE8FBFA573336A852563D100741784?OpenPage)

http://Host/Database/View/DocumentUniversalID?OpenDocument

(http://www.acme.com/leads.nsf/By+Rep/35AE8FBFA573336A852563D100741784?OpenDocument)

http://Host/Database/FormUniversalID?ReadForm

(http://www.acme.com/products.nsf/625E6111C597A11B852563DD00724CC2?ReadForm)

More IBM Notes URL syntax info…

If you run JSP infrastructure on IBM Domino server, another online method of PDF generation would be to use PD4ML

JSP custom tag library (TODO)(TODO).

If an online document generation method (involving HTTP, JSP etc) is undesired, there is a possibility to generate PDF

documents from their XML representations (so called DXL).

DXL is an adaptation of XML used to describe, in detail, the structure and data contained in a Domino database. It describes

data and design elements such as views, forms, and documents and provides a basis for importing or exporting XML

representations of data to and from a Domino/Notes application.

Schematically a process of Notes document conversion to PDF can be seen like that:

All conversion steps can be implemented at once as a IBM Lotus Notes Java agent, or can be implemented as two-step

batch: DXL export request followed by a conversion with Pd4Cmd command-line tool

The command line overrides the default Java memory heap size limit with -Xmx512m . Here it is set to 512Mb.

On UNIX platform -Djava.awt.headless=true allows to run the application on non-graphics-enabled servers or from

remote ssh/telnet sessions.

test.dxl 1200 are DXL locationDXL location and htmlWidthhtmlWidth (virtual “browser” frame width) parameters.

On Win32 the path is enclosed, if needed, to double quotes, on UNIX – to single quotes.

-xsl notesdefault applies XSL stylesheet to the input document. In the case it refers to the built-in default DXL-to-

HTML XSL, but it can be an URI of an arbitrary external stylesheet.

The default PDF document format: A4 / PORTRAITA4 / PORTRAIT

In the example 1200pxpx width of rendered document will be mapped to 595ptpt widths of A4 page format.

As long as an output ﬁle path omitted, the output is sent to STDOUTSTDOUTand can be piped to another application.

See more command line arguments…See more command line arguments…

PD4ML as Notes Agent…PD4ML as Notes Agent…

java -Djava.awt.headless=true -Xmx512m -jar pd4ml.jar test.dxl 1200 -xsl notesdefault

ProductsProducts LicensesLicenses SupportSupport Try FreeTry Free Buy NowBuy Now [[]]

7. Usage Examples

Go to Usage Examples section…Go to Usage Examples section…

Suggest edit

Last updated on November 4, 2018

Suggest edit

Last updated on September 4, 2018

PD4ML – V4 Programmers Manual

Navigation menu

Versions of this User Manual:

Views

Navigation