Extracting songs from pdf files... ideas?
#1
Hi. I have been using the MobileSheets companion to extract song sheets from a large pdf song book. For example, song "A" is on sheets 1-5, and song "B" is on sheets 6-8 in the pdf. I would like to create independent pdf files for each song A and B. While MS companion does do this, it creates .png (image) files which are harder to work with than pdf files in MS.
I can do this manually by 'printing to a file' from the large pdf, one at a time, but this takes a lot of work/time to do all the songs in a pdf.

Are there any tools out there that will take a pdf file and automatically create all of the independent pdf files for each song?

thanks!
Rod
Surface 4, 4GB
TuneLab
#2
I believe that the only application that can do this is the (expensive) full version of Adobe Acrobat. However, there are some online sites that offer this functionality.

One such is: www.sodapdf.com/split-pdf/

It isn't totally anonymous, as you have to provide an email address to receive the files, and they will probably bombard you with advertising afterwards. I have used similar sites for other purposes, and I keep a second, anonymous email address for sites that I don't totally trust. mail.com offers free anonymous email addresses without asking for a mobile number or other data.

Good luck!
#3
I do not split my fakebooks. I keep each one as a complete book, prepare a CSV file for it, and import single songs as soon as I need them using MSP's "CSV import" feature.

But if your personal workflow is different and you still want to split your PDFs:
PDFsam (split and merge) is a fine tool for that (there are others too); the free version is fully sufficient.
https://pdfsam.org/pdfsam-basic/#split-pdf
https://pdfsam.org/download-pdfsam-basic/
first language: German
Acer A1-830, Android 4.4.2 - HP x2 210 G2 Detachable, Win 10 22H2 - Huawei Media Pad T5, Android 8.0 - Boox Tab Ultra C, Android 11
www.moonlightcrisis.de - www.basdjo.de - www.frankenbaend.de


#4
Splitting automatically will always require some sort of index that specifies the song titles and the pages.
If you have (or take the time to make) such an index, you can also use it to have MSPro treat your big pdf file as individual songs without splitting, as itsme wrote in the previous posting.
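To make "index" concrete, here is a minimal sketch of how a list of titles and start pages becomes page ranges. The two-column layout (title, start_page) is only an assumption for illustration; it is not the CSV format MSPro expects, and the function name is made up.

```python
import csv
import io

def index_to_ranges(index_csv: str, total_pages: int):
    """Turn 'title,start_page' rows into (title, first_page, last_page) tuples."""
    rows = [(r["title"].strip(), int(r["start_page"]))
            for r in csv.DictReader(io.StringIO(index_csv))]
    rows.sort(key=lambda r: r[1])
    ranges = []
    for i, (title, start) in enumerate(rows):
        # A song ends on the page before the next song starts,
        # or on the last page of the book.
        end = rows[i + 1][1] - 1 if i + 1 < len(rows) else total_pages
        ranges.append((title, start, end))
    return ranges

# Hypothetical index for the 8-page example from the first post.
SAMPLE_INDEX = """title,start_page
Song A,1
Song B,6
"""

print(index_to_ranges(SAMPLE_INDEX, total_pages=8))
# [('Song A', 1, 5), ('Song B', 6, 8)]
```

Only the start pages need to be typed in; the end pages follow from the next entry, which keeps the manual work small.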
Johan
johanvromans.nl — hetgeluidvanseptember.nl — mojore.nl -- howsagoin.nl
Samsung Galaxy Note S7FE (T733) 12.4", Android 13.0, AirTurn Duo & Digit (Gigs).
Samsung Galaxy Note S4 (T830) 10.5", Android 10.0 (maintenance and backup).
Samsung A3 (A320FL), Android 8.0.0 (emergency).
#5
As itsme and Sciurius already said, there will always be manual work required since you have to tell the program what you want to split and how to name the files.

If you only want to use it in MSP, a CSV file plus MSP's import is the way to go. There is already a big thread about it in this forum, with several CSV files for fake books available to download.

If you really want or need separate pdfs, I wouldn't use Acrobat. I would split the pdf into single-page files with a tool like pdfsam or pdftk, then make an index, or rather a script/batch file, that renames the single pdfs (for one-page songs) and joins the pages of multi-page songs (again in a batch with pdftk). As I said, it's a bit of work, but you get results quicker than by doing it in a PDF editor.
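A rough sketch of what such a batch could look like. Note that it takes a shortcut compared to the split-then-join route described above: it asks pdftk to cut each page range directly with its `cat` operation. The file names, the song list and both helper functions are invented for illustration, and the script only prints the commands rather than running them.

```python
import re

def safe_filename(title: str) -> str:
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^\w\- ]+", "_", title).strip() + ".pdf"

def pdftk_commands(book: str, songs):
    """Build one 'pdftk BOOK cat FIRST-LAST output FILE' argument list per song."""
    return [
        ["pdftk", book, "cat", f"{first}-{last}", "output", safe_filename(title)]
        for title, first, last in songs
    ]

# Hypothetical index: (title, first page, last page), pages counted from 1.
songs = [("Song A", 1, 5), ("Song B", 6, 8)]

for cmd in pdftk_commands("songbook.pdf", songs):
    # Shown for review only (no shell quoting); to really split, pass the
    # list itself to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Keeping each command as a list of arguments means titles with spaces need no quoting when the list is handed to `subprocess.run`.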
#6
Thanks for the ideas, everyone. As Sciurius says, all (present) methods require an index. I guess I was hoping that there was a "magical tool" that could do this without an index...

I'm no software guru, but I could see how a tool could input a large pdf songbook, reading it page by page. If a page had >40 point text (for example) at the top of the page (the song's title/name), this would be marked as a "first page" (and this large text would also be used as the split-out pdf name). The tool would continue reading the next pages, until it hit the next "first page", or the end of the file. Just dreaming ;-)
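The grouping part of that idea is simple enough to sketch. The hard part, reading the text and font size at the top of each page out of a real PDF, is not shown here and would need a PDF library; `PageInfo` below is a made-up stand-in for whatever such a library would report, and the 40-point threshold is just the example figure from above.

```python
from dataclasses import dataclass

@dataclass
class PageInfo:
    top_text: str         # text found at the top of the page
    top_font_size: float  # size of that text in points

def find_songs(pages, min_title_size=40.0):
    """Group pages into songs; a page with large top text starts a new song."""
    songs = []  # each entry: [title, first_page, last_page], pages are 1-based
    for number, page in enumerate(pages, start=1):
        if page.top_font_size > min_title_size:
            songs.append([page.top_text, number, number])
        elif songs:
            songs[-1][2] = number  # continuation page of the current song
        # Pages before the first title (cover, contents) are skipped.
    return [tuple(s) for s in songs]

# Hand-made sample data, not extracted from a real file.
pages = [
    PageInfo("Song A", 48), PageInfo("", 12), PageInfo("", 12),
    PageInfo("Song B", 48), PageInfo("", 12),
]
print(find_songs(pages))
# [('Song A', 1, 3), ('Song B', 4, 5)]
```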
#7
Theoretically, someone (e.g. me) could whip up such a tool. But I foresee that the criteria for a new "first page" will be very different from case to case.
Often there is a table of contents in the PDF that could be useful.

But creating an index is not such a hard job, needs to be done only once, and has multiple purposes.
Johan
#8
If the PDF has bookmarks, PDFsam can use them for splitting, and MSP can use them instead of a CSV for import (just song title and pages, no additional metadata).
If I need a CSV, I first try to find an existing one. See the thread in this forum, or ask me or Robipad.
If the PDF has some kind of table of contents, copy/paste often works.
If it is a scanned PDF, the TOC can be created via OCR. That usually needs proofreading.
If that fails, I search the web. Many websites that sell fakebooks provide a list of the titles they contain.
There are also websites that offer fake book indexes.
Once a list of song titles exists, it needs proofreading and, at the least, adding or correcting the page numbers. I usually also add the keys of the songs.

If you have invested all that effort, please share your results.
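For the copy/paste and OCR cases, a small script can do the first cleanup pass before proofreading. This is only a sketch that assumes each pasted line looks like "Title ..... 12" (a title, optional dots or spaces, then a page number); real tables of contents vary a lot, so lines that do not fit are collected separately for manual checking. The function name and sample text are made up.

```python
import re

# Title, then dots/spaces as filler, then a page number at the end of the line.
TOC_LINE = re.compile(r"^(?P<title>.+?)[\s.]+(?P<page>\d+)\s*$")

def parse_toc(text: str):
    """Return (entries, rejects): parsed (title, page) pairs and unparsed lines."""
    entries, rejects = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TOC_LINE.match(line)
        if match:
            entries.append((match.group("title").strip(), int(match.group("page"))))
        else:
            rejects.append(line)  # needs a human look
    return entries, rejects

# Invented sample of pasted TOC text.
SAMPLE_TOC = """CONTENTS
All of Me ........ 12
Route 66 ... 45
Take Five.....7
"""

entries, rejects = parse_toc(SAMPLE_TOC)
print(entries)  # [('All of Me', 12), ('Route 66', 45), ('Take Five', 7)]
print(rejects)  # ['CONTENTS']
```

The page numbers found this way are the printed ones, which often differ from the PDF page numbers by a fixed offset, so they still need checking.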



