Need to edit 3000 PDF files fast.

By shamsael
I have about 3000 pdf files to edit.

The files are each 18 pages long, and the last two pages of every file contain confidential information which must be removed after 1 year.

The files have sequential hex codes for file names, so the first, oldest, file is called 00000001.pdf, the 15th file is called 0000000f.pdf, etc.
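As a small illustration of that naming scheme, the sketch below converts between a sequence number and its file name. The function names are made up for this example, and it assumes every name is exactly eight lowercase hex digits followed by .pdf, as in the two examples given.

```python
from pathlib import Path


def sequence_to_name(n: int) -> str:
    """Return the file name for the n-th file (1-based), e.g. 15 -> 0000000f.pdf."""
    # 08x = zero-padded to 8 characters, lowercase hexadecimal
    return f"{n:08x}.pdf"


def name_to_sequence(name: str) -> int:
    """Return the sequence number encoded in a name such as 0000000f.pdf."""
    # Path(...).stem drops the .pdf extension; int(..., 16) parses the hex digits
    return int(Path(name).stem, 16)
```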

I need a program, tool, script, macro or any other automated process that will remove the last two pages from every pdf file in the directory.
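One possible shape for such a script, offered only as a rough sketch: it assumes Python with the third-party pypdf library is available, and the directory names in the usage comment are placeholders. It writes trimmed copies to a separate directory rather than overwriting the originals, so the originals would still have to be deleted separately once the copies have been checked.

```python
from pathlib import Path


def pages_to_keep(total_pages: int, drop_last: int = 2) -> range:
    """Indices of the pages that remain once the last `drop_last` are removed."""
    return range(max(total_pages - drop_last, 0))


def trim_pdf(src: Path, dst: Path, drop_last: int = 2) -> None:
    """Write a copy of `src` to `dst` without its last `drop_last` pages."""
    # Imported here so pages_to_keep() can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(src))
    writer = PdfWriter()
    for i in pages_to_keep(len(reader.pages), drop_last):
        writer.add_page(reader.pages[i])
    with open(dst, "wb") as fh:
        writer.write(fh)


def trim_directory(src_dir: Path, out_dir: Path, drop_last: int = 2) -> None:
    """Trim every .pdf in `src_dir`, writing the results into `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.pdf")):
        trim_pdf(src, out_dir / src.name, drop_last)


# Example call (placeholder directory names):
# trim_directory(Path("scans"), Path("scans_trimmed"))
```

It would be wise to try this on a handful of files first and confirm that the trimmed copies open correctly and no longer contain the confidential pages.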

All Answers

I'm not sure that anything exists to do this as you want

by OH Smeg Moderator In reply to Need to edit 3000 PDF fil ...

All I can suggest is looking at your PDF editor's website to see if they have any plug-ins or something else that may be of assistance.

If they do not, whoever is responsible for creating these files messed up big time, and it's going to require manual editing to remove the bits that need to be deleted.


If these PDFs are so professional (with hex numbers etc)...

by OldER Mycroft In reply to Need to edit 3000 PDF fil ...

Chances are that they all have a Contents index at the front, which is equally likely to refer to the last two pages. The confidential information may even be referenced from within the body copy itself.

Thus (simply) removing pp. 17 & 18 could result in a shoddy, badly referenced main text.

If ALL 3000 are becoming 1 year old at the same time, then they must all have been created at the same time, and chances are that the original source files exist somewhere. These could be suitably edited and all 3000 created again, one year on.

Publishing Law requires that all amended documents bear a legend to that effect on Page 1 anyway. As there are only four types of publication (New, Revised, Amended, Reprinted) what you are advocating does not conform to any of them.

3000 PDF Files....

by shamsael In reply to If these PDFs are so prof ...

The files were created from scanned data forms. The PDF files are only a few months old, but the data forms themselves are one year old. The forms were scanned using Kofax Ascent Capture software.

Presumably if these PDFs are scanned 1-yr old data forms, then....

by OldER Mycroft In reply to 3000 PDF Files....

The data forms are due to expire and be replaced with ones that are up to date.

Perhaps you should 'hold off' until the new ones arrive, then re-scan them all. It is likely that their body text will have been updated too. :)

Any future scans could be organised into TWO parts:

16 pages of body text as 1 x PDF
2 pages of time-limited text as 1 x PDF

That way, when the year-end occurs, you simply stop issuing the 2pp PDF. Plus, if the body text has not altered, you just scan any new tech specs and issue them as a Supplement with last year's 16-page PDF.

