Nick Reddan – PDF Indexer Excel

Background

I wanted to create an index from a PDF file in a database format so it could be uploaded to a the web and made more accessible to more people. When you copy a phrase in Acrobat reader it may contain additional cariage returns which when pasted to Excel create multiple cells. I developed a way of removing these extranious cells then developed macros to add the indexing classes and to rearrange "John Smith" to "SMITH, John". This latter rearrangement is necessary for genelogical purposes.

How it works

Set up

Defaults Sheet
The defaults sheet data

First go to the "Defaults" sheet. Ther are two things you may want to change. You can enter up to 10 levels for the first level terms in your index. Typically you will have at least "Person" and "Place". You might include say "Famous Person". The macro uses them to name the buttons in the userforms that guide you through building the index.

The second parameter is the "size_bunch" which is the number of SQL insert statements in each column of the "SQL out" sheet. This sheet is used to load your data to an MySQL or other remote database.

Defaults Sheet
The index sheet data

The index sheet

The index sheet contains you index data. It can be edited by hand say, to correct mistakes or remove duplicates.

It has buttons that can be used to run various macros. The first button "Index term in the clipboard and add row" copies the data in the clip board to the next available row and puts all the data into one cell. It then starts a series of userforms which allow you to decide the way these data will be indexed.

Selection level one term
How you select the first level indexing term

Once you select the first level term another selection needs to be made. That is whether the data is a given name family name that needs to be reversed in the indexing process. If you say it is a family name indexing the family name will appear in colum "C" and the Given name in column "D".

 

 

 

 

Selection family name indexing
Index Family name first

Selection Volume part and Page
Select Volume part and Page

Finally you select the Volume, Part and Page numbers. The defaults for this selection are the values in these columns in the row immediately above the new one.

 

 

 

 

 

 

 

 

 

Creating the SQL

The SQL Base sheet

On this sheet you can add the name of the SQL table and its fields. There are then used to create the SQL for putting you data on the web. The create table statement is in the cell D12 and the header for the insert statenments is in cell E12.

To run the macro go to the Visual basic window or click on the macros button in the View ribbon and run the export_sql macro.

The excel macro workbook

Also see the "How to use this workbook" sheet in the workbook.