Like other files, a PDF document also has document properties. These properties are key-value pairs. Each property gives particular information about the document.
Following are the properties of a PDF document −
S.No. | Property & Description |
---|---|
1 | File This property holds the name of the file. |
2 | Title Using this property, you can set the title for the document. |
3 | Author Using this property, you can set the name of the author for the document. |
4 | Subject Using this property, you can specify the subject of the PDF document. |
5 | Keywords Using this property, you can list the keywords by which the document can be searched. |
6 | Created Using this property, you can set the date created for the document. |
7 | Modified Using this property, you can set the date modified for the document. |
8 | Application Using this property, you can set the name of the application that created the document (in PDFBox, this is the Creator property). |
Following is a screenshot of the document properties table of a PDF document.
PDFBox provides a class named PDDocumentInformation, which has a set of setter and getter methods.
The setter methods set the values of a document's properties, and the getter methods retrieve those values.
Following are the setter methods of the PDDocumentInformation class.
S.No. | Method & Description |
---|---|
1 | setAuthor(String author) This method is used to set the value for the property of the PDF document named Author. |
2 | setTitle(String title) This method is used to set the value for the property of the PDF document named Title. |
3 | setCreator(String creator) This method is used to set the value for the property of the PDF document named Creator. |
4 | setSubject(String subject) This method is used to set the value for the property of the PDF document named Subject. |
5 | setCreationDate(Calendar date) This method is used to set the value for the property of the PDF document named CreationDate. |
6 | setModificationDate(Calendar date) This method is used to set the value for the property of the PDF document named ModificationDate. |
7 | setKeywords(String keywords) This method is used to set the value for the property of the PDF document named Keywords. |
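The two date setters in the table take a java.util.Calendar, and Calendar counts months from zero (0 is January), which is easy to get wrong when building the value to pass in. The following is a minimal sketch using only the JDK; the class name CalendarMonthSketch and the helper dateOf are hypothetical names introduced for illustration and are not part of PDFBox.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class CalendarMonthSketch {

    // Hypothetical helper: builds a Calendar from a 1-based month (1 = January),
    // converting to the 0-based month field that java.util.Calendar expects.
    static Calendar dateOf(int year, int month, int day) {
        Calendar cal = new GregorianCalendar();
        cal.clear();                    // drop the current time-of-day fields
        cal.set(year, month - 1, day);  // Calendar months are 0-based
        return cal;
    }

    public static void main(String[] args) {
        Calendar created = dateOf(2015, 11, 5);
        // Calendar.NOVEMBER is the constant 10, not 11
        System.out.println(created.get(Calendar.MONTH) == Calendar.NOVEMBER); // prints true
        System.out.println(created.get(Calendar.DAY_OF_MONTH));               // prints 5
    }
}
```

A Calendar built this way can then be passed to setCreationDate or setModificationDate. Using the named constants such as Calendar.NOVEMBER directly is an equally reasonable choice.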
The PDDocumentInformation class provides methods to set the properties of a document and to retrieve them.
This example demonstrates how to add properties such as Author, Title, Date, and Subject to a PDF document. Here, we will create a PDF document named doc_attributes.pdf, add various attributes to it, and save it in the path C:/PdfBox_Examples/. Save this code in a file with name AddingAttributes.java.
import java.io.IOException;
import java.util.Calendar;
import java.util.GregorianCalendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.PDPage;

// The public class name must match the file name, AddingAttributes.java
public class AddingAttributes {
   public static void main(String[] args) throws IOException {

      // Creating PDF document object
      PDDocument document = new PDDocument();

      // Creating a blank page
      PDPage blankPage = new PDPage();

      // Adding the blank page to the document
      document.addPage(blankPage);

      // Getting the PDDocumentInformation object of the document
      PDDocumentInformation pdd = document.getDocumentInformation();

      // Setting the author of the document
      pdd.setAuthor("Howcodex");

      // Setting the title of the document
      pdd.setTitle("Sample document");

      // Setting the creator of the document
      pdd.setCreator("PDF Examples");

      // Setting the subject of the document
      pdd.setSubject("Example document");

      // Setting the created date of the document
      // (Calendar months are zero-based, so the named constants are used)
      Calendar date = new GregorianCalendar();
      date.set(2015, Calendar.NOVEMBER, 5);
      pdd.setCreationDate(date);

      // Setting the modified date of the document
      date.set(2016, Calendar.JUNE, 5);
      pdd.setModificationDate(date);

      // Setting keywords for the document
      pdd.setKeywords("sample, first example, my pdf");

      // Saving the document
      document.save("C:/PdfBox_Examples/doc_attributes.pdf");
      System.out.println("Properties added successfully");

      // Closing the document
      document.close();
   }
}
Compile and execute the saved Java file from the command prompt using the following commands.
javac AddingAttributes.java
java AddingAttributes
Upon execution, the above program adds all the specified attributes to the document and displays the following message.
Properties added successfully
Now, if you visit the given path, you can find the PDF created in it. Right-click the document and select the document properties option.
This opens the document properties window, where you can see that all the properties of the document are set to the specified values.
You can retrieve the properties of a document using the getter methods provided by the PDDocumentInformation class.
Following are the getter methods of the PDDocumentInformation class.
S.No. | Method & Description |
---|---|
1 | getAuthor() This method is used to retrieve the value for the property of the PDF document named Author. |
2 | getTitle() This method is used to retrieve the value for the property of the PDF document named Title. |
3 | getCreator() This method is used to retrieve the value for the property of the PDF document named Creator. |
4 | getSubject() This method is used to retrieve the value for the property of the PDF document named Subject. |
5 | getCreationDate() This method is used to retrieve the value for the property of the PDF document named CreationDate. |
6 | getModificationDate() This method is used to retrieve the value for the property of the PDF document named ModificationDate. |
7 | getKeywords() This method is used to retrieve the value for the property of the PDF document named Keywords. |
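The date getters in the table return a java.util.Calendar rather than a ready-to-print string. The sketch below shows one way to turn such a value into readable text using only the JDK. It assumes the getter returns null when the property is absent, which is worth confirming against the PDFBox version in use; the class name DatePropertySketch, the helper describe, and the "(not set)" placeholder are hypothetical choices made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

class DatePropertySketch {

    // Hypothetical helper: formats a date property for display and returns a
    // placeholder when the property is missing (assumed to be reported as null).
    static String describe(Calendar date) {
        if (date == null) {
            return "(not set)";
        }
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        format.setTimeZone(date.getTimeZone()); // format in the calendar's own time zone
        return format.format(date.getTime());
    }

    public static void main(String[] args) {
        // Stand-in for a value that getCreationDate() might return
        Calendar created = new GregorianCalendar(2015, Calendar.NOVEMBER, 5);
        System.out.println(describe(created)); // prints 2015-11-05
        System.out.println(describe(null));    // prints (not set)
    }
}
```

Printing a Calendar directly produces its long internal toString form, so a small formatting step like this usually gives more useful output.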
This example demonstrates how to retrieve the properties of an existing PDF document. Here, we will create a Java program and load the PDF document named doc_attributes.pdf, which is saved in the path C:/PdfBox_Examples/, and retrieve its properties. Save this code in a file with name RetrivingDocumentAttributes.java.
import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;

public class RetrivingDocumentAttributes {

   // Formats a date property as month/day/year; returns null if the property is not set
   private static String format(Calendar date) {
      if (date == null) {
         return null;
      }
      SimpleDateFormat formatter = new SimpleDateFormat("M/d/yyyy");
      formatter.setTimeZone(date.getTimeZone());
      return formatter.format(date.getTime());
   }

   public static void main(String[] args) throws IOException {

      // Loading an existing document (PDDocument.load is the PDFBox 2.x API)
      File file = new File("C:/PdfBox_Examples/doc_attributes.pdf");
      PDDocument document = PDDocument.load(file);

      // Getting the PDDocumentInformation object
      PDDocumentInformation pdd = document.getDocumentInformation();

      // Retrieving the info of a PDF document
      System.out.println("Author of the document is :" + pdd.getAuthor());
      System.out.println("Title of the document is :" + pdd.getTitle());
      System.out.println("Subject of the document is :" + pdd.getSubject());
      System.out.println("Creator of the document is :" + pdd.getCreator());
      System.out.println("Creation date of the document is :" + format(pdd.getCreationDate()));
      System.out.println("Modification date of the document is :" + format(pdd.getModificationDate()));
      System.out.println("Keywords of the document are :" + pdd.getKeywords());

      // Closing the document
      document.close();
   }
}
Compile and execute the saved Java file from the command prompt using the following commands.
javac RetrivingDocumentAttributes.java
java RetrivingDocumentAttributes
Upon execution, the above program retrieves all the attributes of the document and displays them as shown below.
Author of the document is :Howcodex
Title of the document is :Sample document
Subject of the document is :Example document
Creator of the document is :PDF Examples
Creation date of the document is :11/5/2015
Modification date of the document is :6/5/2016
Keywords of the document are :sample, first example, my pdf
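The Keywords property comes back as a single string, so a program that needs the individual keywords has to split it. The following is a minimal sketch using only the JDK. It assumes the keywords are separated by commas, as in the example above; this is a convention of this example rather than something the property enforces. The class name KeywordsSketch and the helper splitKeywords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class KeywordsSketch {

    // Hypothetical helper: splits a comma-separated Keywords value into
    // trimmed, non-empty entries. A null value yields an empty list.
    static List<String> splitKeywords(String keywords) {
        List<String> result = new ArrayList<>();
        if (keywords == null) {
            return result;
        }
        for (String part : keywords.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the string that getKeywords() returned in the example
        List<String> keywords = splitKeywords("sample, first example, my pdf");
        System.out.println(keywords.size()); // prints 3
        System.out.println(keywords);        // prints [sample, first example, my pdf]
    }
}
```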