Copyright 2003, William Silvert
Shrinking Large Word Documents

Anyone who works with Microsoft Word has had to endure the astonishing growth in size of Word documents (as well as other Microsoft Office files) over time. You start with a simple draft of 20 KB, make a few minor changes, and it suddenly grows to 60 KB or more. Aside from wasting space on your hard drive, this slows down loading and creates problems when you try to email the document. Fortunately there are some simple tricks that can be used to shrink a bloated document file down to a reasonable size.

The Microsoft Way

The official Microsoft recommendation, which I received from their support staff, is to copy the text of the document to a new document. To do this you first go to the end of the document with Ctl-End (hold down the Control key and press End), and then press Ctl-Shift-Home to select the entire contents of the document (see note at end*). Copy all this and paste it into a new document. In practice I have found that this works fine for small documents, but for large ones that is a lot of material to copy to the clipboard.

I have recently heard that doing this does not always copy the header and footer into the new document. It does for me, but perhaps there are problems with this technique that I am not aware of. In any case, you should always check the new document against the old to make sure that it is OK.

I stumbled on a much simpler solution which works very well: just change the page format. Suppose that your paper is formatted for A4 paper. Go to Page Setup > Paper Size, and change the document to Letter, Legal, or some other paper size. Save it. Change it back and save it again. The file will look just the same as it did in the first place, but it will be much smaller, often by a factor of two or three.

Two cautions: first, make a backup copy of the file before you try this trick, and make sure that the document is OK before you delete the backup.
Second, you may have problems if you have objects that do not fit in the alternate page size, such as full-page tables. Also, make sure that you reformat the entire document, and not just one section of it.

I recently (January 2010) heard from Robin Findley about another technique. On the File menu click "New..." and then select "From existing document...". Select the original document and you end up with an exact copy, but one that is smaller. (This technique has only been tested with Word 2003.)

Figures

Figures in a document can swell its size incredibly. I recently inserted two 20 KB GIFs in a 35 KB Word document, and the result came to over 800 KB! If you run into this problem, you may decide to keep the figures separate until the last minute, and only include them when necessary for final printing.

But suppose that you have a Word document with embedded figures, and you want to email it to someone (to me, for example, for editing). Can you save space by somehow extracting the figures? Yes, especially if you have a recent version of Word with the option to save the file as HTML (this is the format used by Web documents like this one). When you save the file as HTML the text is converted to a special format, and the graphics are converted to GIF files (a standard image format). You can discard the converted text file and delete the images from a copy of the original document, and you end up with a single Word file containing only text, plus several GIF files. It is a good idea to rename the GIF files, and then you can send the whole package off.

Here is a more detailed example: You have a document called MY_PAPER.DOC which contains two figures. Save it as HTML, and give it the name DUMMY.HTM. You will also find that Word creates two files called something like IMAGE13.GIF and IMAGE14.GIF (the numbering may appear arbitrary). Discard DUMMY.HTM and rename the two image files to FIGURE1.GIF and FIGURE2.GIF.
Reopen MY_PAPER.DOC, delete the figures, and save the file as TEXT_ONLY.DOC (you may want to change the page size twice as indicated above to save further space). These three files, TEXT_ONLY.DOC and the two figures, contain all the material in the original file and may be much smaller.

Keep in mind that you do not always gain a major saving in file size in this way. Some graphics files are quite compact in Word: I have found that charts copied from Excel usually do not require a lot of space in Word. The only way to know for sure is to experiment, but that is what scientists do best, isn't it?

You can also use a file compression program like ZIP to compact the file, or to combine several files in one archive, but you may not save much space in this way: graphics files do not compress easily, and JPEG files are already compressed.

Additional Information

A Scottish colleague, William Dalziel, sent the following information. I have not had a chance to check this, but it is worth considering:

There is a setting in Microsoft Word, available from the Tools Menu > Options > "Save" tab, entitled "Allow Fast Saves". The purpose of this preference is described by Word's help file as: "Speeds up saving by recording only the changes in a document. When you finish working on a document, clear this option and save the complete file with a full save. A full save may decrease the file size of your document."

What happens is that Fast Saves retain all the deleted and edited text that is not visible to the user, and when the document is reopened and resaved with this option enabled (i.e. checked), it increasingly bloats the document with hidden data. In addition to this unwanted side-effect, it is possible to see this hidden data if the document is opened in other viewers such as the MS-DOS Edit.com program. In that respect, there is obviously the risk of revealing sensitive text that the user believed to have been permanently deleted.

When the option is disabled (i.e. unchecked) to allow a final FULL Save, as described in the Word Help file extract above, this normally removes all the hidden data from edits made while "Fast Saves" was enabled. I am sure that "The Microsoft Way" technique you gave in your web page is doing much the same process by stripping out the hidden data, but the user setting I have detailed above provides an alternative method.

He even provided the following references:

Frequently Asked Questions About "Allow Fast Saves" - Word 2000
Opening Word Document in Text Editor Displays Deleted Text
Avoid sending deleted text along with your docs

I thank him for this information and add the caveat that Microsoft Word has gone through several revisions and new versions since I wrote this, and our suggestions may not apply to all versions.
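
For readers who are comfortable with a little scripting, the ZIP suggestion from the Figures section can be tried out with a few lines of Python. This is only a sketch and not something I have built into any workflow: the helper name pack_files and the archive name MY_PAPER.ZIP are my own inventions, and the file names are simply the ones from the example above. It combines the files in one archive and reports the total size before and after, so you can see for yourself how little the graphics shrink.

```python
import os
import zipfile


def pack_files(paths, archive_path):
    """Combine the given files in one ZIP archive.

    Returns (total_bytes_before, archive_bytes) so the saving can be judged.
    pack_files is a hypothetical helper name, not part of any library.
    """
    total_before = sum(os.path.getsize(p) for p in paths)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for p in paths:
            # Store each file under its bare name, without folder paths.
            archive.write(p, arcname=os.path.basename(p))
    return total_before, os.path.getsize(archive_path)


if __name__ == "__main__":
    # File names taken from the example above; only files that exist are packed.
    wanted = ["TEXT_ONLY.DOC", "FIGURE1.GIF", "FIGURE2.GIF"]
    found = [name for name in wanted if os.path.exists(name)]
    if found:
        before, after = pack_files(found, "MY_PAPER.ZIP")
        print(f"{before} bytes before, {after} bytes in the archive")
    else:
        print("None of the example files were found in this folder.")
```

If the archive turns out to be nearly as large as the original files, that is the graphics at work; the text file on its own will usually compress much better.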
* The reason this works is that all kinds of formatting information, much of it obsolete, are stored in the final paragraph mark of the document. When you go to the end of the document with Ctl-End, the cursor is positioned just before this final paragraph mark. When you select everything back to the start of the document with Ctl-Shift-Home you select everything but the final mark. On the other hand, if you select everything with Select All, you include the final paragraph mark with all the junk in it, so copying to a new document doesn't necessarily do you any good (although sometimes it works).