New Hampshire Bar Association
Bar News - February 23, 2001


Metadata Solutions, Part II

By: Todd Cheesman

Metadata is information about how and when a file was created and edited, as well as the file changes themselves. To some extent, there is metadata in all Microsoft products, including Excel and PowerPoint, as well as in non-Microsoft products like WordPerfect. But it is Microsoft Word that may pose the greatest concern to attorneys.

Microsoft Word automatically creates and stores a great deal of information in a document file other than the text that appears on the screen. Because the information is generated automatically by Word, the author or editor may be completely unaware of its existence.

The obvious risk of metadata is disclosing confidential client information and waiving attorney-client privilege. The older the document, the more information the file is likely to contain. Moreover, even if the file is password-protected, metadata can still be accessible. For instance, the document properties that you see when you click Properties on the File menu can be viewed with a text editor such as Notepad regardless of any password protection.
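To illustrate what "viewing a file with a text editor" amounts to, here is a minimal Python sketch that pulls runs of readable text out of raw bytes. It is an illustration only, not a description of the Word file format: the function name, the sample bytes, and the names "Jane Attorney" and "Smith & Jones LLP" are all hypothetical, and the sketch assumes the text is stored either as plain single-byte characters or as UTF-16LE.

```python
import re

def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of readable text found in raw bytes.

    Looks for single-byte ASCII runs and for UTF-16LE runs (an ASCII
    character followed by a zero byte), since a binary file may hold either.
    """
    ascii_runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    utf16_runs = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    found = [run.decode("ascii") for run in ascii_runs]
    found += [run.decode("utf-16-le") for run in utf16_runs]
    return found

# Hypothetical stand-in for a binary document file: unreadable bytes with
# an author name and a firm name embedded in two different encodings.
sample = (
    b"\x00\x01\xd0\xcf"
    + b"Jane Attorney"
    + b"\x00\xff\x02"
    + "Smith & Jones LLP".encode("utf-16-le")
    + b"\x03\x04"
)

for text in printable_strings(sample):
    print(text)
```

To try this on a real file, you could read it with `open(path, "rb").read()` and pass the result to `printable_strings`. What turns up will depend on the file and on the version of Word that saved it.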

This hidden document information has been around for years, but new resources in Word 97 and Word 2000 allow savvy users to access it easily. Microsoft has acknowledged the seriousness of this issue, and for those firms brazen enough to wait, the next version of Microsoft Office will offer a privacy option that removes all personal information and warns users if metadata is being stored in the file. However, if you choose not to wait for a solution, you have a number of options. Which option you choose should depend on what specific information you send electronically that might compromise your firm.

If you send Word documents via e-mail or save them to disk, you need to know what information other than the text of the document you are sending at the same time. Microsoft has identified over a dozen types of metadata that Word creates and updates automatically. Additionally, there are several switches or settings that can create additional hidden file information. Together these metadata and switches can be classified into three general categories pertinent to practicing attorneys: Personal Identification, Document History and Work Product.

Personal Identification embedded in a Word document file includes: (1) author's name and initials; (2) previous authors'/editors' names; (3) system manager's name; (4) your firm's name; (5) the "name" of your computer; and (6) the name of the network server or hard disk where the document was saved (the file save location). Word automatically creates and updates all of this information.

Document History includes information such as (1) document title; (2) creation time; (3) when the document was last edited; (4) when it was last accessed; (5) total time spent working on the file; (6) subject; and (7) revision number of the document. Word generates all of this information, except Subject, automatically.

Work Product is a generic term meaning the document content you create. Word Work Product may include: (1) comments; (2) hidden text; (3) document versions; (4) document revisions; (5) non-visible portions of embedded objects; and (6) keywords. While Word does not initially generate this information, if someone collaborating with you activates these "switches," the switches stay activated when the file is shared. This means that Word continues to generate this information while you are working on the document and does not outwardly indicate that it is doing so. Thus it is very easy for attorneys who believe they are working on "clean" documents to later find out that they sent compromising information.

To combat this problem, many firms are starting to control the hidden content of their documents. As you consider how to address metadata in your firm's documents, you should be aware that because it is created in a variety of ways, there is presently no single way to eliminate all metadata from your documents. However, you can eliminate a great deal of document information by simply going to Properties under the File Menu and then clicking the Summary or Custom tabs.

Firm's name. When you first installed Word, it stored the name and company name that you entered. Each time you create a new Word document, Word automatically enters this information into the Summary tab under File Properties. You can delete this information easily from the Summary tab, but you must do so each time you create a new document.

Subject and system manager's name. While you may find information in these fields of the Summary tab, Word does not automatically enter it. Word allows you to delete this information like regular text from the Summary tab.

You can eliminate other information from the Tools menu by clicking Options and then any of the 10 tabs that you see.

Author name and initials. When you first install Word, it stores this information and enters it in the Name and Initials fields of the User Information tab. To eliminate this data, you will need to enter a space in the Name field and in the Initials field. If you leave the fields completely blank and then save the document, Word will fill in the information again for you.

Fast saves. This option tells Word to append changes, including deleted text, to the end of the file instead of rearranging the text every time a change is made. This allows quicker interim saves until the user is ready to save the final version of the document. The trouble is that someone with a text editor can view these normally invisible changes to your document. To get rid of any deleted text hidden in your file, disable the Fast Saves option. From the Tools menu, choose Options and click the Save tab. If the box labeled "Allow Fast Saves" is checked, uncheck it. When you save the document, the changes will be incorporated into the document and the hidden text will be deleted.
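One way to check whether text you believe you deleted is still sitting in a saved file is to search the file's raw bytes for it. The Python sketch below does that under stated assumptions: the function name and the sample bytes are hypothetical, the "file" is a simulation and not the real Word layout, and the text is assumed to be stored either in the Windows cp1252 encoding or in UTF-16LE.

```python
def phrase_survives(data: bytes, phrase: str) -> bool:
    """Return True if the phrase appears in the raw bytes in any tried encoding."""
    if not phrase:
        raise ValueError("phrase must not be empty")
    for encoding in ("cp1252", "utf-16-le"):
        try:
            needle = phrase.encode(encoding)
        except UnicodeEncodeError:
            continue  # phrase cannot be written in this encoding; try the next
        if needle in data:
            return True
    return False

# Simulated file: the visible sentence, followed by padding and a leftover
# fragment of earlier wording (a made-up example of appended deleted text).
saved_file = (
    b"\xd0\xcfThe parties agree to settle."
    + b"\x00" * 8
    + "settle for no less than $50,000".encode("utf-16-le")
)

print(phrase_survives(saved_file, "no less than $50,000"))
print(phrase_survives(saved_file, "arbitration"))
```

A negative result from a check like this only shows that the phrase was not found in the encodings tried. It does not prove the text is gone.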

Other fixes require more creative measures:

Previous author. Word automatically stores the names of the last 10 authors/editors of the document. This feature cannot be disabled, but you can eliminate this information by first saving the document in .RTF format and then re-saving it in Word format.

Creation, last accessed, last modified, last printed, revision number and total editing time. Word automatically records the time and date when a file was first saved as well as when it was accessed, edited or printed. Additionally, Word records how many times the document has been changed and how much time has been spent editing the document. To eliminate this information, from the Edit menu, choose Select All and Copy. Then create a new document and paste the information into the new document. The text will go to the new document but this metadata will not transfer to the new file.

Document title. Even though you name the document file, Word automatically derives the "document title" from the first 125 characters of the first paragraph if nothing has been manually entered in the Title field of the Summary tab under File Properties. Once you delete the title there and save the document, the "Document Title" will disappear.
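The effect of that rule is easy to underestimate, so here is a small Python sketch of it. It only mimics the behavior described above and makes no claim about Word's internals: the function name and the sample text are hypothetical, and skipping blank leading paragraphs is an assumption.

```python
def default_title(document_text: str, limit: int = 125) -> str:
    """Mimic the described rule: take the first `limit` characters of the
    first paragraph. Skipping blank leading paragraphs is an assumption."""
    for paragraph in document_text.splitlines():
        if paragraph.strip():
            return paragraph.strip()[:limit]
    return ""

# Made-up draft whose opening line was never meant to leave the office.
draft = (
    "DRAFT - client will accept far less than demanded; do not disclose\n"
    "Dear Counsel:\n"
    "Enclosed please find our revised proposal."
)
print(default_title(draft))
```

The point of the example is that whatever happens to be at the top of the document when the title is derived, including an internal note, can end up stored as the title.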

Comments. Comments that you would make while reading along in a document usually appear as highlighted text and contain the name of the person who wrote the comments as well as the comments themselves. Deleting comments is easy, if tedious. You can scroll through your document, right-clicking on each highlighted section of text and clicking Delete Comment. However, this is a slow process and it is possible for comments to not appear as highlighted text. Therefore, you can perform a search for comments by going to Edit, Find, Special, Comment Mark. A list of all comments will appear at the bottom of the screen that you can then edit or delete.

Embedded objects. Embedded objects, such as data you cut-and-paste from Excel into Word, can contain extra information that the recipient can access. For example, if you cut-and-paste several lines of an Excel spreadsheet into Word, the entire workbook can be included along with it. Eliminating this information can be difficult. Microsoft provides an explanation of how to do this at http://support.microsoft.com/support/kb/articles/Q223/7/90.ASP.

Hidden text. To remove existing hidden text, first make it visible: from the Tools menu, choose Options and click the View tab. From there, select Hidden Text and click OK. Next, from Edit, choose Replace and then click the More button. Then click Format and Font and then Hidden. Finally, click OK, leave the Replace With box empty, and click the Replace All button.

Track changes. This function is not automatically enabled, but if another user has enabled it, every change contains the author's name, the time and date when the change was made as well as the text that was inserted or deleted. While it would be prudent to stay away from ever using Track Changes, if you accept changes prior to distributing the file, the changes will be incorporated into the document and should not be viewable with a text editor by a third party.

Versions. This is another feature that is not automatically enabled. Nevertheless, Word does allow you to save multiple versions of the same document in the same file. If you have older versions of your document, you can delete them by selecting Versions from the File menu and then choosing the version(s) that you want to delete. The next time you save the document, Word will automatically delete the prior versions you selected.

 

In review, there is some metadata generated by Word that is relatively easy to eliminate:

► Author name and initials

► Comments

► Embedded objects

► Fast saves

► Firm's name

► Hidden text

► Keywords

► Subject

► Track changes

► Versions

Information that you may be unable to eliminate in Word includes:

► Creation time and date

► Document statistics (such as word and character count, etc.)

► File location

► Previous authors

► Revision number

► Time spent editing the document

► Total editing time

► When last accessed

► When last modified

► When last printed

► Who it was last saved by

If any information like that above would concern you or your client or if you would simply prefer not to manually remove metadata from your documents, you might want to consider exporting the document to Acrobat PDF format. This option is becoming increasingly popular with many attorneys. Since Acrobat Reader is free, anyone can download Reader if they don't already have it on their systems and at least read your file. But even if they have the full Acrobat Writer, they won't be able to edit the file as you could in Word because the file will be saved as an image.

Alternatives include saving the document in .RTF format, but this will not eliminate much metadata. Saving a document as HTML is absolutely no answer. Word has an excellent HTML converter that will save your metadata there just as effectively as it would in the Word document.
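Because an .RTF file is plain text, you can check for yourself what document information it carries. The Python sketch below looks for a handful of control words that RTF uses in its document information group (for example `\author`, `\operator`, and `\edmins`). The function name and the sample content are hypothetical, and the list of control words is a partial selection, not a complete inventory.

```python
import re

# A partial list of RTF document-information control words.
INFO_WORDS = (
    "author", "operator", "company", "manager", "title",
    "subject", "keywords", "creatim", "revtim", "edmins",
)

def rtf_info_fields(rtf_text: str) -> list[str]:
    """Return the sorted names of the info control words present in the text."""
    # (?![a-z]) stops "\title" from matching longer words such as "\titlepg".
    pattern = re.compile(r"\\(" + "|".join(INFO_WORDS) + r")(?![a-z])")
    return sorted(set(pattern.findall(rtf_text)))

# Made-up RTF fragment with an author, a last editor, and editing minutes.
sample_rtf = (
    r"{\rtf1\ansi{\info{\author Jane Attorney}"
    r"{\operator J. Smith}{\edmins42}}\titlepg Hello}"
)
print(rtf_info_fields(sample_rtf))
```

An empty result from this check means only that none of the listed control words were found. Other information may still be present elsewhere in the file.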

You can run a program like Payne Consulting Group's Metadata Assistant, available at http://www.payneconsulting.com/MetadataAssistant/. This free program analyzes your Word document and cleans it of much metadata but cannot purge it of the information listed above. If you buy another metadata-purging utility, you will want to consider what kinds of metadata remain in your document after it is "cleaned," whether the utility runs automatically or must be run manually, and whether you must rerun it every time you save the document.

If the person on the other end must edit the document, you might want to consider sending it as a .TXT file. This is a rather blunt solution as all formatting and font selections are lost, as well as other key information you might want to include in your document. Another option is instead to create an automated process like a macro. This will automatically enter certain metadata and prevent others. If you opt for a macro, you will need to consider its adaptability to your needs as well as what metadata remains in the document.

If you decide that you would rather avoid the headache of metadata entirely, you can always fax the paper document. Note that if you have fax software on your computer, some programs "fax" the actual Word file itself, rather than just the image of the Word document.

The safest method, of course, is to decide not to distribute documents in electronic form. Instead, print them and send them by mail. If you are collaborating with a client or co-counsel, you might find this an overreaction, but you can only ever be sure that you yourself do not transmit undesirable metadata. Once your document leaves your hands, you must rely on the knowledge of those who will then see it.

You could also wait until Microsoft releases the next version of Word, which it says will include a privacy feature to prevent the collection of document metadata. Regardless of which option you choose, it is always wise to become familiar with the menu features of your word processor and how they may be adding unintended content to your documents.

Todd Cheesman is the courtroom director at Franklin Pierce Law Center, where he teaches Techno-Advocacy. He also performs law and technology workshops for the public, teaches technology-related continuing legal education programs, performs office and courtroom automation consulting and is an adjunct member of the New Hampshire Technology 2000 Task Force.

 

