What is Metadata?
Documents may contain information about the document itself: we call this metadata. For instance, a raster image file contains metadata recording the image's width and height; a word processing document may contain metadata recording the document's author and title. Metadata can be represented by key-value pairs. For instance, a document's title can be represented as the key "Title" and the value "Annual Report". We refer to a single metadata key-value pair as a metadata field.
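As a rough illustration of the key-value idea, a single metadata field can be modelled as a small immutable pair. The class below, MetadataFieldSketch, is a hypothetical stand-in written for this example. It is not a KeyView class and says nothing about how KeyView stores fields internally.

```java
import java.util.Objects;

/** Illustrative stand-in for one metadata field; NOT part of the KeyView API. */
public final class MetadataFieldSketch {
    private final String key;
    private final String value;

    public MetadataFieldSketch(String key, String value) {
        // Reject nulls so that every field always has both a key and a value.
        this.key = Objects.requireNonNull(key, "key");
        this.value = Objects.requireNonNull(value, "value");
    }

    public String key() {
        return key;
    }

    public String value() {
        return value;
    }

    @Override
    public String toString() {
        return key + " = " + value;
    }

    public static void main(String[] args) {
        // The document title from the example above, expressed as one field.
        MetadataFieldSketch title = new MetadataFieldSketch("Title", "Annual Report");
        System.out.println(title); // prints "Title = Annual Report"
    }
}
```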
Containers (documents with subfiles) can contain metadata about their subfiles. For instance, a Personal Folders (.pst) file is a container that can have multiple email messages as subfiles. A PST file may contain metadata, including the "To" and "From" fields of these subfiles.
Access Metadata using the Java API
After specifying the source document by calling the setInputSource() method, you can access document metadata using the method getMetadata() on your Filter object.
You can access subfile metadata using the method extGetSubFileMetadataList() on your Filter object.
Both methods return a MetadataList object, so you can process their output in the same way.
KeyView uses MetadataList and MetadataElement objects to represent metadata.
- A MetadataList object represents a list of MetadataElement objects, allowing you to iterate over them. The order of elements within the list is not significant and may change in the future.
- A MetadataElement object represents one metadata field.
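Because the order of elements is not guaranteed, code that consumes metadata is safer looking fields up by key than by position. The sketch below shows that pattern using plain java.util types as stand-ins: the fields are modelled as Map.Entry pairs, and the class MetadataLookupSketch and its method indexByKey are hypothetical names invented for this example. It does not use the KeyView classes, whose accessor methods are not described in this section. Grouping values into a list per key is an assumption that allows for a key appearing more than once.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: indexes key-value fields by key; NOT part of the KeyView API. */
public final class MetadataLookupSketch {

    /** Groups values by key so that callers never depend on list position. */
    public static Map<String, List<String>> indexByKey(List<Map.Entry<String, String>> fields) {
        Map<String, List<String>> byKey = new TreeMap<>();
        for (Map.Entry<String, String> field : fields) {
            // Create the value list on first sight of a key, then append to it.
            byKey.computeIfAbsent(field.getKey(), k -> new ArrayList<>()).add(field.getValue());
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> fields = new ArrayList<>();
        fields.add(new SimpleEntry<>("Title", "Annual Report"));
        fields.add(new SimpleEntry<>("Author", "A. Writer"));

        Map<String, List<String>> byKey = indexByKey(fields);
        System.out.println(byKey.get("Title")); // prints "[Annual Report]"
    }
}
```

Reordering the input list leaves the lookup result for each key unchanged, which is the property you want when the list order may change between releases.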
Mail Metadata
The metadata for an email message (the header fields such as "To", "CC", "Subject", and so on) is typically stored in the mail container (such as an MSG or EML file). To access this metadata, call the method extGetSubFileMetadataList().
KeyView treats the message body and any attachments as subfiles of the container. When you extract the message body, KeyView includes the header fields by default. If you do not want to include this information, call the method setExcludeMailHeader(true) on your ExtSubFileExtractConfig object before you call extExtractSubFile(). You might want to do this if you have already accessed the metadata and do not want to process it again.