pdftk

Package: WA2L/WinTools 1.2.08
Section: Library Functions Manual (3)
Updated: 19 October 2017
Index Return to Main Contents

 

NAME

pdftk - a Handy Tool for Manipulating PDF Documents

 

SYNOPSIS

pdftk <input PDF files | - | PROMPT> [ input_pw <input PDF owner passwords | PROMPT> ][ <operation> <operation arguments> ][ output <output filename | - | PROMPT> ][ encrypt_40bit | encrypt_128bit ][ allow <permissions> ] [ owner_pw <owner password | PROMPT> ][ user_pw <user password | PROMPT> ][ flatten ] [ compress | uncompress ] [ keep_first_id | keep_final_id ] [ drop_xfa ] [ verbose ] [ dont_ask | do_ask ]

Where:

<operation> may be empty, or: [ cat | attach_files | unpack_files | burst | fill_form | background | stamp | generate_fdf | dump_data | dump_data_fields | update_info ]

pdftk --help

 

AVAILABILITY

WA2L/WinTools

 

DESCRIPTION

If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. pdftk is a simple tool for doing everyday things with PDF documents.

Use it to:
 * Merge PDF Documents
 * Split PDF Pages into a New Document
 * Rotate PDF Documents or Pages
 * Decrypt Input as Necessary (Password Required)
 * Encrypt Output as Desired
 * Fill PDF Forms with X/FDF Data and/or Flatten Forms
 * Generate FDF Data Stencil from PDF Forms
 * Apply a Background Watermark or a Foreground Stamp
 * Report PDF Metrics such as Metadata and Bookmarks
 * Update PDF Metadata
 * Attach Files to PDF Pages or the PDF Document
 * Unpack PDF Attachments
 * Burst a PDF Document into Single Pages
 * Uncompress and Re-Compress Page Streams
 * Repair Corrupted PDF (Where Possible)

 

OPTIONS

--help, -h
Show summary of options.

<input PDF files | - | PROMPT>
A list of the input PDF files. If you plan to combine these PDFs (without using handles) then list files in the order you want them combined. Use - to pass a single PDF into pdftk via stdin. Input files can be associated with handles, where a handle is a single, upper-case letter:

<input PDF handle>=<input PDF filename>

Handles are often omitted. They are useful when specifying PDF passwords or page ranges, later.

For example: A=input1.pdf B=input2.pdf

input_pw <input PDF owner passwords | PROMPT>
Input PDF owner passwords, if necessary, are associated with files by using their handles:

<input PDF handle>=<input PDF file owner password>

If handles are        not given, then passwords are associated with
input files by order.

Most pdftk features require that encrypted input PDF are accompanied by the ~owner~ password. If the input PDF has no owner password, then the user password must be given, instead.        If the
input PDF has no passwords, then no password should be given.

When running in do_ask mode, pdftk will prompt you for a password if the supplied password is incorrect or none was given.

<operation> <operation arguments>
If this optional argument is omitted, then pdftk runs in 'filter' mode. Filter mode takes only one PDF input and creates a new PDF after applying all of the output options, like encryption and compression.

Available operations are: cat, attach_files, unpack_files, burst,fill_form, background, stamp, dump_data, dump_data_fields, generate_fdf, update_info. Some operations takes additional arguments, described below.

cat [ <page ranges> ]
Catenates pages from input PDFs to create a new PDF. Page order   in the new PDF is specified by the order of the given
page ranges. Page ranges are described like this:

<input  PDF  handle>[<begin  page  number>[-<end  page   num-
ber>[<qualifier>]]][<page rotation>]

Where   the handle identifies one of the input PDF files, and
the beginning and ending page numbers   are one-based  refer-
ences to pages in the PDF file, and the qualifier can be even or odd, and the page rotation can be N, S, E, W, L, R, or D.

If the handle is omitted from the page range, then the pages are taken from the first input PDF.

The even qualifier causes pdftk to use only the even-numbered PDF pages, so 1-6even yields pages 2, 4 and 6 in that order. 6-1even yields pages 6, 4 and 2 in that order.

The odd qualifier works similarly to the even.

The page rotation setting can cause pdftk to rotate pages and documents. Each option sets the page rotation as follows (in degrees): N: 0, E: 90, S: 180, W: 270, L: -90, R: +90, D: +180. L, R, and D make relative adjustments to a page's rota- tion.

If no arguments are passed to cat, then pdftk combines all input PDFs in the order they were given to create the output.

NOTES:
 * <end page number> may be less than <begin page number>.
 * The keyword end may be used to reference the final page  of a document instead of a page number.
 * Reference a single page by omitting the ending page number.
 * The handle may be used alone to represent  the  entire  PDF document, e.g., B1-end is the same as B.

Page Range Examples w/o Handles:

 1-endE - rotate entire document 90 degrees
 5 11 20
 5-25oddW - take odd pages in range, rotate 90 degrees
 6-1

Page Range Examples Using Handles:
 Say A=in1.pdf B=in2.pdf, then:

 A1-21
 Bend-1odd
 A72
 A1-21 Beven A72
 AW - rotate entire document 90 degrees
 B
 A2-30evenL  -  take  the even pages from the range, remove 90
               degrees from each page's rotation
 A A
 AevenW AoddE
 AW BW BD

attach_files <attachment filenames | PROMPT> [ to_page <page number | PROMPT> ]
Packs arbitrary files into a PDF using PDF's file attachment features. More than   one attachment may be listed after
attach_files. Attachments are added at the document level unless the optional   to_page option is given, in which case
the files are attached to the given page number (the first page is 1, the final page is end). For example:

pdftk in.pdf attach_files table1.html table2.hml to_page 6
output out.pdf

unpack_files
Copies all of the attachments from the input   PDF into the
current folder or to an output directory given after output. For example:

pdftk report.pdf unpack_files output ~/atts/

or, interactively:

pdftk report.pdf unpack_files output PROMPT

burst Splits a single, input PDF document into individual    pages.
Also creates a report named doc_data.txt which is the same as the output from dump_data. If the output section is omitted, then PDF pages are   named: pg_%04d.pdf, e.g.: pg_0001.pdf,
pg_0002.pdf, etc. To name these pages yourself, supply a printf-styled   format  string  via the output section. For
example, if you want pages named: page_01.pdf, page_02.pdf, etc.,   pass output page_%02d.pdf to
pdftk. Encryption can be applied to the output by appending output options such as owner_pw, e.g.:

pdftk in.pdf burst owner_pw foopass

fill_form <FDF data filename | XFDF data filename | - | PROMPT>
Fills  the single input PDF's form fields with the data from
an FDF file, XFDF file or stdin. Enter the   data filename
after   fill_form, or  use - to pass the data via stdin, like
so:

pdftk form.pdf fill_form data.fdf output form.filled.pdf

After filling a form,   the form fields remain interactive
unless you also use the flatten output option. flatten merges the form fields with the PDF   pages.  You can use flatten
alone, too, but only on a single PDF:

pdftk form.pdf fill_form data.fdf output out.pdf flatten

or:

pdftk form.filled.pdf output out.pdf flatten

If the input FDF file includes Rich Text formatted data in addition to plain text, then the Rich   Text data is  packed
into the form fields as well as the plain text. pdftk also sets a flag that cues Acrobat/Reader to generate new field appearances based on the Rich Text data. That way, when the user opens the PDF, the viewer will create the Rich Text fields on the spot.   If the user's PDF viewer does not sup-
port Rich Text, then the user will see the plain text data instead. If   you flatten this form before Acrobat has a
chance to create (and save) new field appearances, then the plain text field data is what you'll see.

background <background PDF filename | - | PROMPT>
Applies a PDF watermark to the background of a single input PDF. Pass the background PDF's filename after background like so:

pdftk in.pdf background back.pdf output out.pdf

pdftk uses only the first page from the background PDF and applies it to every page of the input PDF.    This page is
scaled and rotated as needed to fit the input page. You can use - to pass a background PDF into pdftk via stdin.

If the input PDF does not have a transparent background (such as a   PDF created from page scans) then the resulting back-
ground won't be visible -- use the stamp feature instead.

stamp <stamp PDF filename | - | PROMPT>
This behaves just like the background feature except it over- lays the stamp PDF page on top of the input PDF document's pages. This works best if the stamp PDF page has a transpar- ent background.

dump_data
Reads  a single, input PDF file and reports various statis-
tics, metadata, bookmarks (a/k/a outlines), and page   labels
to the given output filename or (if no output is given) to stdout. Does not create a new PDF.

dump_data_fields
Reads a single, input PDF file and reports form field statis- tics to the given output filename or (if no output is given) to stdout. Does not create a new PDF.

generate_fdf
Reads a single, input PDF file and generates a FDF file suit- able for fill_form out of it to the given output filename or (if no output is given) to stdout. Does not   create  a new
PDF.

update_info <info data filename | - | PROMPT>
Changes the metadata stored in a single PDF's Info dictionary to match the input data file. The input data file uses the same syntax as the   output  from dump_data. This does not
change the metadata stored in the PDF's XMP stream, if it has one. For example:

pdftk in.pdf update_info in.info output out.pdf

output <output filename | - | PROMPT>
The output PDF filename may not be set to the name of an input filename. Use - to output to stdout. When using        the dump_data
operation, use output to set the name of the output data file. When using the unpack_files operation, use output to set the name of        an output directory. When using the burst operation,
you can use output to control the resulting PDF page filenames (described above).

encrypt_40bit | encrypt_128bit
If an output PDF user or owner password is given, output PDF encryption strength defaults to 128 bits. This can be overrid- den by specifying encrypt_40bit.

allow <permissions>
Permissions are applied to the output PDF only if an encryption strength is specified or an owner or user password is given. If permissions are        not specified, they default to 'none,' which
means all of the following features are disabled.

The permissions section may include one or more of the following features:

Printing
 Top Quality Printing

DegradedPrinting
 Lower Quality Printing

ModifyContents
 Also allows Assembly

Assembly

CopyContents
 Also allows ScreenReaders

ScreenReaders

ModifyAnnotations
 Also allows FillIn

FillIn

AllFeatures
 Allows  the  user      to  perform  all of the above, and top

 quality printing.

owner_pw <owner password | PROMPT>

user_pw <user password | PROMPT>
If an encryption strength is given but no passwords are sup- plied, then the owner        and user passwords remain empty, which
means that the resulting PDF may        be opened and its security
parameters altered by anybody.

compress | uncompress
These are only useful when you want to edit PDF code in a text editor like vim or emacs. Remove PDF page stream compression by applying        the uncompress filter. Use the compress filter to
restore compression.

flatten
Use this option to merge an input PDF's interactive form       fields
(and their data) with the PDF's pages. Only one input PDF may be given. Sometimes used with the fill_form operation.

keep_first_id | keep_final_id
When combining pages from multiple PDFs, use       one of these
options to copy the document ID from either the first or final input document into the new output PDF. Otherwise pdftk creates a new document        ID for the output PDF. When no operation is
given, pdftk always uses the ID from the (single) input PDF.

drop_xfa
If your input PDF is a form created using Acrobat 7 or Adobe Designer, then it probably has XFA data. Filling such a form using pdftk yields a PDF with data that     fails to display in
Acrobat 7 (and        6?). The workaround solution is to remove the
form's XFA data, either before you fill the form using pdftk or at the time you fill the form. Using this option causes pdftk to omit the XFA data from the output PDF form.

This option is only useful when running pdftk on a single input PDF. When assembling a PDF from multiple inputs using pdftk, any XFA data in the input is automatically omitted.

verbose By default, pdftk runs quietly. Append verbose to the end and it
will speak up.

dont_ask | do_ask
Depending on the compile-time settings (see ASK_ABOUT_WARNINGS), pdftk might prompt you for further input when it encounters a problem,        such as a bad password. Override this default behavior
by adding dont_ask (so pdftk won't ask you what to do) or do_ask (so pdftk will ask you what to do).

When running in dont_ask mode, pdftk will over-write files with its output without notice.

 

ENVIRONMENT

-

 

EXIT STATUS

-

 

FILES

-

 

EXAMPLES

Decrypt a PDF
pdftk secured.pdf input_pw foopass output unsecured.pdf

Encrypt a PDF using 128-bit strength (the default), withhold all permissions (the default)
pdftk 1.pdf output 1.128.pdf owner_pw foopass

Same as above, except password 'baz' must also be used to open output PDF

pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz

Same as above, except printing is allowed (once the PDF is open)
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing

Join in1.pdf and in2.pdf into a new PDF, out1.pdf
pdftk in1.pdf in2.pdf cat output out1.pdf

or (using handles):

pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf

or (using wildcards):

pdftk *.pdf cat output combined.pdf

Remove 'page 13' from in1.pdf to create out1.pdf
pdftk in.pdf cat 1-12 14-end output out1.pdf

or:

pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf

Apply 40-bit encryption to output, revoking all permissions (the default). Set the owner PW to 'foopass'.

pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass

Join two files, one of which requires the password 'foopass'. The output is not encrypted.
pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf

Uncompress PDF page streams for editing the PDF in a text editor (e.g., vim, emacs)
pdftk doc.pdf output doc.unc.pdf uncompress

Repair a PDF's corrupted XREF table and stream lengths, if possible
pdftk broken.pdf output fixed.pdf

Burst a single PDF document into pages and dump  its data to doc_data.txt

pdftk in.pdf burst

Burst a single PDF document into encrypted pages. Allow low-quality printing
pdftk in.pdf burst owner_pw foopass allow DegradedPrinting

Write a report on PDF document metadata and bookmarks to report.txt
pdftk in.pdf dump_data output report.txt

Rotate the first PDF page to 90 degrees clockwise
pdftk in.pdf cat 1E 2-end output out.pdf

Rotate an entire PDF document to 180 degrees
pdftk in.pdf cat 1-endS output out.pdf

 

SEE ALSO

wintoolsintro(1), pdfrotate(1)

 

NOTES

This manpage is an extract of the pdftk --help output of pdftk version 1.41 and modified to fit into the WA2L/WinTools structure.

 

BUGS

-

 

AUTHOR

pdftk was developed by Sid Steward and integrated into WA2L/WinTools by Christian Walther. Send suggestions and bug reports related to the integration to wa2l@users.sourceforge.net .

 

COPYRIGHT

Copyright © 2020 Christian Walther

This is free software; see WA2LWinTools/man/COPYING for copying conditions. There is ABSOLUTELY NO WARRANTY; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 

Index

NAME
SYNOPSIS
AVAILABILITY
DESCRIPTION
OPTIONS
ENVIRONMENT
EXIT STATUS
FILES
EXAMPLES
SEE ALSO
NOTES
BUGS
AUTHOR
COPYRIGHT

This document was created by man2html using the manual pages.
Time: 16:32:38 GMT, September 14, 2024