Reading music in a book with searchable pdfs
I am looking for fellow musicians who want to locate and display songs without the time-consuming extraction of individual songs or the use of Excel spreadsheets.  I am hoping that there are others out there in the ether who, like me, have been considering searchable pdfs as a solution.

Here's the technical summary (skip if you wish).  It's essentially impossible to search for text in a traditional scanned pdf, since each page is simply an image.  A conversion (OCR) program can look for patterns that appear to be text and record the text content and its location in an invisible "layer" below the image.  With a search program that can read that layer (e.g., for a title, composer, lyric...), it becomes very easy to locate and display the entire page image with the desired text highlighted.  Not bad for a start.

The sequence here is to take the original scanned image pdf of the printed music, convert it to a searchable pdf, search that pdf for what you want (a title, composer, etc.), and then display the result the way a musician would like it.
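As a rough illustration of the "search" step in that sequence, here is a minimal Python sketch. It assumes the pdf has already been made searchable by some conversion program, and that the third-party pypdf package is available to read the text layer. The function names page_texts and find_pages and the sample titles are made up for this example.

```python
def page_texts(pdf_path):
    """Yield the text-layer content of each page, in page order.

    Assumptions: the third-party pypdf package is installed, and the pdf
    already has a text layer (a plain scanned image gives empty strings).
    """
    from pypdf import PdfReader  # imported lazily so the rest runs without it
    for page in PdfReader(pdf_path).pages:
        yield page.extract_text() or ""


def find_pages(texts, query):
    """Return the 1-based page numbers whose text contains the query."""
    needle = query.casefold()  # case-insensitive comparison
    return [n for n, text in enumerate(texts, start=1)
            if needle in text.casefold()]


# Stand-in for page_texts("some_book.pdf"): three pages of extracted text.
sample = ["Autumn Leaves", "All of Me", "Autumn in New York"]
print(find_pages(sample, "autumn"))  # [1, 3]
```

With a real book, find_pages(page_texts(path), "autumn") would give the page numbers to hand to whatever program displays the chart.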

Unfortunately, MS does not detect that text information.  There is software out there that, although not originally intended for this purpose, can do the detection and does a reasonably good job of displaying the charts.  For example, in Windows, Explorer can easily locate the searchable data.  It can then pass that information along to a program such as Adobe Acrobat, which can display the chart looking just like a book and, trumpets please, also allows for turning pages by keyboard or foot pedal.  I want the system to successfully pass the data all the way to MS because its display and bells and whistles are so good.

I'm currently at the stage of making sure each program in the line can do its job, and then working out how to pass information along from program to program with as little human intervention as possible.  Got any questions or experience here?  Leave me a message and I'll write back promptly.

The PDF library I'm using does support searching for text in a PDF. So if you use another program to make the PDF searchable, my library could go through that text and highlight words. I just haven't had time to add support for that yet.  Allowing a search on an opened PDF makes sense, but would you want to be able to search through every PDF in the entire library to find something? With thousands of files, that could take a considerable amount of time.

Dear Mike,

Thanks for your note -- sorry for the delay in responding.  


As for speed of searching, by indexing my searchable pdf library, I was able to search every word on every page of my music in a fraction of a second.  However, the time overhead of maintaining indices, keeping track of consistent naming conventions, etc. made me think that you were truly wise to concentrate on presenting and navigating musical notation instead of dealing with issues of cross-platform interpretation of images, etc.

I spent a long time trying to make searchable pdfs work with my sheet music but black spots, tiny print, missing info on scans, and other practical problems made it not worth the time.  I think that most of us don't have an exactly pristine music collection.  I'm now using macros and a good PDF editor (Kofax Power PDF Standard) to make the extraction process produce less pain in less time with more results.


My music library started as normal image-based pdfs, with one pdf for each individually published song and one pdf for each book of music.

Here's the good news.  

OK, one pdf = one song.  Good!  If that were all I had, then I could search the filename and, poof, the correct chart would appear.

Books were the motivating factor.  I really created this process so that I didn't have to extract lots and lots of individual songs from books.  After converting all my sheet music to searchable pdfs (that took just a few hours while I was sleeping), I indexed every word in my collection using Adobe Acrobat (yes, another nap).  With that index in hand I could do a search in less than a second.  
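To show why an index makes the search so fast, here is a minimal Python sketch of a word index. It is not how Acrobat builds its index (I don't know those internals); it is only the general idea, and the names build_index and search, along with the sample file names, are invented for this example. It assumes the page text has already been pulled out of the searchable pdfs.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(pages):
    """Map each lowercased word to the set of (file, page) places it occurs.

    pages: iterable of (file_name, page_number, text) tuples.
    """
    index = defaultdict(set)
    for file_name, page_number, text in pages:
        for word in WORD.findall(text.lower()):
            index[word].add((file_name, page_number))
    return index


def search(index, query):
    """Return the sorted locations that contain every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    # Each lookup is a dictionary access, so no pdf is reopened at search time.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical sample data standing in for text taken from two books.
pages = [
    ("book_a.pdf", 12, "Blue Moon"),
    ("book_a.pdf", 40, "Moon River"),
    ("book_b.pdf", 3, "Blue Skies"),
]
index = build_index(pages)
print(search(index, "blue moon"))  # [('book_a.pdf', 12)]
```

The slow part (reading every page) happens once when the index is built; after that each search only touches the dictionary. The catch is that the index has to be rebuilt or updated whenever files are added or renamed.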

Now the bad news:  

Unfortunately, I discovered some problems as I kept evaluating how well Acrobat could search every word in the sheet music (title, composer, year of publication, lyrics, etc.).  Failures to find words in the sheet music occurred when:

1)  there was too much visual noise on the sheet near the title (e.g., dark spots)
2)  the title was written in an original font that Adobe couldn't recognize
3)  the year of the publication (at the very bottom of the page) was sliced off when I did the original scan
4)  the year of the publication was so small that it just wasn't legible in even a good quality scan

Etcetera.  Etcetera.  Etcetera.  

I had to swallow my pride because the fixes were going to take longer than extraction.  Fooey!

I'm using two macros (created by the free Macro Recorder program). I've moved back to a simpler PDF program (Kofax Power PDF Standard -- old name Nuance -- about $130) which isn't as powerful as Acrobat, but it works for me and costs a bunch less than Acrobat.  Add to that strictly consistent naming conventions and I have an extraction process that is not especially painful or time consuming.


I would have completed a finished and useful process using searchable pdf's if I had had a pristine collection of music and time to deal with all sorts of other practical and coding problems.  I'm retired now, and if I were still playing jobs, even more problems would have reared their ugly heads.

You were truly wise in staying away from adding a way to avoid extracting pieces from books.  New problems popped up every time I turned a corner. The thought of writing code for searchable pdfs on multiple platforms gives me the willies.  I'm also not sure that a lot of my musical colleagues would be happy being required to use a proper naming convention -- you know, a perfectly consistent system with no spaces, no apostrophes, no commas, etc.
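For anyone curious what such a naming convention might look like in practice, here is a small Python sketch that turns a song title into a file name with no spaces, apostrophes, or commas. The exact scheme (underscores between words, accents stripped) is only my assumption for illustration, and safe_name is a made-up function name.

```python
import re
import unicodedata


def safe_name(title, suffix=".pdf"):
    """Turn a song title into a file name with no spaces or punctuation."""
    # Strip accents so that, for example, "é" becomes "e".
    plain = (unicodedata.normalize("NFKD", title)
             .encode("ascii", "ignore")
             .decode("ascii"))
    # Drop apostrophes outright so "Ain't" becomes "Aint", not "Ain_t".
    plain = plain.replace("'", "")
    # Collapse every other run of non-alphanumerics (spaces, commas, ...)
    # into a single underscore, and trim underscores from the ends.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", plain).strip("_")
    return cleaned + suffix


print(safe_name("Ain't Misbehavin'"))  # Aint_Misbehavin.pdf
```

Running every title through one function like this is what keeps the names perfectly consistent; doing it by hand is where the slips creep in.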

Thanks for all your work.
