import meta data from PDF
#16
Hi Mike,

I tried a lot of things, and I think I set up the metadata the way you described.
But two fields (Collection and Setlist) were not picked up.
I am on 3.8.6, and I tried it on Windows and Android.

I created a demo PDF and attached it here. Can you please tell me whether it is formatted correctly for you?

This feature is really helpful for me and I really like it.

Regards,
Stephan


Attached Files
.pdf   test.pdf (Size: 7.08 KB / Downloads: 6)
#17
Stephan,

It looks like your file is built perfectly, but it's not going to work on Windows unfortunately because the company that makes the PDF library I use has not added support for extracting the XMP information from the PDF in the Windows version of their library. I contacted them three weeks ago about this and have not gotten a response (which is unusual), so I'm going to follow up again with them. This should work on Android and iPadOS though, as the method exists in those versions. 

Having said that, the easy-to-use "GetMeta" function that is provided by their library does not return the Collection or Setlist you specified, which is strange, as their "SetMeta" adds it to the PDF in the exact same way as in your file. However, there is a workaround for this - I will just parse out the metadata using the XMP XML itself instead of relying on their GetMeta function. So this should work as you would expect with version 3.8.7 which I'm aiming to finish today or tomorrow at the latest. Let's revisit this once you have that update, and thanks for your patience while we've figured this all out.
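
As a rough illustration of the workaround described above (reading the fields from the XMP XML directly), here is a minimal Python sketch. It is not MobileSheets code. The `ex:` namespace URI, the sample packet, and the function name `read_xmp_fields` are assumptions made up for the example; the real namespace and property layout in a given PDF may differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical XMP packet; the "ex" namespace is invented for illustration.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:ex="http://example.com/ns/sheetmusic/">
      <ex:Collection>Orchestra 2023</ex:Collection>
      <ex:Setlist>Spring Concert</ex:Setlist>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def local_name(qualified):
    """Drop the '{namespace}' prefix that ElementTree puts on names."""
    return qualified.rsplit("}", 1)[-1]


def read_xmp_fields(xmp_xml, wanted):
    """Return {field: value} for the wanted local property names."""
    root = ET.fromstring(xmp_xml)
    found = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties to be written as attributes...
        for key, value in desc.attrib.items():
            if local_name(key) in wanted:
                found.setdefault(local_name(key), value)
        # ...or as child elements with text content.
        for child in desc:
            name = local_name(child.tag)
            if name in wanted and child.text and child.text.strip():
                found.setdefault(name, child.text.strip())
    return found


fields = read_xmp_fields(SAMPLE_XMP, {"Collection", "Setlist"})
```

Matching on the local name only keeps the sketch short; a stricter reader would also check the namespace URI so that unrelated properties with the same name are not picked up.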

Thanks,
Mike
#18
Hi Mike,

Sounds good. Feel free to contact me for testing. 
By the way, are you planning to support setting all fields from metadata?
As I wrote earlier, that would be very useful for me.

Regards,
Stephan
#19
Stephan,

I currently support about 80% of the fields, so I will go ahead and add the last 20% for the next update which I'm finalizing right now.

Thanks,
Mike
#20
Great!
Thanks a lot for that. I think (and hope) that this feature will lead many forScore users in our orchestra to switch to MobileSheets ;)
#21
Does/will this feature support adding PDFs with multiple song entries? 

It would also be useful to see a list of the keywords that this import supports as compared to the keywords on CSV import.
#22
This feature pulls metadata out of a single PDF and adds that to the song that will be created for that PDF. If you import multiple PDFs at once, then a separate song will be created for each PDF and the metadata will be pulled out of each PDF for that song. So you can certainly import many PDFs at once using one of the Import options and the metadata will be extracted from those PDFs. However, this feature is not designed for taking one PDF and generating multiple songs from it - that's what the CSV feature is designed for. So if that is your objective, the CSV feature would serve you better than trying to utilize this. It's much more tedious to try to manage PDF metadata (in my opinion) than creating a CSV file to break up a large PDF.

Mike
#23
Good morning Mike,

The metadata import seems to work for me now.
But "Custom" and "Custom2" seem to be ignored.

Would it be possible to add them?

Regards,
Stephan
#24
Sure, I can add those.
#25
Mike - some observations (low priority) on what is by now, I'm sure, your favorite topic: metadata.

I see that in the latest update the SourceType field now imports on the iPad; and that you've even added the ability to pick up the custom fields.

In order to lock in my ongoing workflows and templates I spent the better part of the afternoon working through a variety of metadata experiments. In the end I have settled on a set of field definitions, and input parameters, for ExifTool to make it all work for me with MobileSheets.

However, in working through this I did notice some items that might be anomalies, or might be "works as designed." Absent a master's degree in metadata, I'm not sure which.

But I offer my findings here in case you want to take a look and see what you think.

The 'A' and 'B' documents show some of these potential anomalies.

The 'C' document is where I have settled on what works for me.

The 'E' document concerns the use of true 'list' fields in metadata. On Phil Harvey's ExifTool site there is a whole thing about list fields being truly list fields and not strings with separators. I have noticed that if I define a custom field as a 'list' type in the XMP group, and then with that field definition I set a list of items into a PDF doc's metadata, then when I have ExifTool show me the metadata in the PDF file, it does actually show them as separate line items in a list as opposed to a string with separator characters. I don't know if that might be useful to deal with commas versus semicolons and all that jazz? Anyway, an observation I have made.
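
To make the list-versus-string distinction concrete, here is a small Python sketch. It assumes the list field is stored as an XMP `rdf:Bag` of `rdf:li` items, which is how XMP represents unordered lists; the `ex:Genres` property, its namespace, and the function name `property_values` are invented for the example. It is not how MobileSheets or ExifTool is implemented.

```python
import re
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# The same property written two ways (hypothetical "ex" namespace).
LIST_FORM = (
    '<ex:Genres xmlns:ex="http://example.com/ns/" '
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    "<rdf:Bag><rdf:li>Jazz</rdf:li><rdf:li>Swing, Big Band</rdf:li></rdf:Bag>"
    "</ex:Genres>"
)
STRING_FORM = (
    '<ex:Genres xmlns:ex="http://example.com/ns/">Jazz; Swing, Big Band</ex:Genres>'
)


def property_values(prop):
    """Return the values of one XMP property element as a list."""
    # A true list field: each rdf:li is one value, no separator guessing.
    items = [(li.text or "").strip() for li in prop.iter(RDF + "li")]
    items = [item for item in items if item]
    if items:
        return items
    # A plain string: fall back to splitting on commas and semicolons.
    parts = re.split(r"[;,]", prop.text or "")
    return [part.strip() for part in parts if part.strip()]


from_list = property_values(ET.fromstring(LIST_FORM))
from_string = property_values(ET.fromstring(STRING_FORM))
```

With the list form, a value such as "Swing, Big Band" survives as one entry; with the string form, the reader has to guess which characters are separators and ends up splitting it in two.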

Also, in a post I did on Phil's site, he made the comment: "XMP is the future..." This was in reference to a question I had about the use of the PDF group vs the XMP group.


Attached Files
.pdf   210_TEST_Score_B_SomeSemiColons.pdf (Size: 386.86 KB / Downloads: 12)
.pdf   210_TEST_Score_C_BestCanDo.pdf (Size: 387.46 KB / Downloads: 10)
.pdf   210_TEST_Score_E_ListFields-2.pdf (Size: 129.17 KB / Downloads: 5)
.pdf   210_TEST_Score_A_AllFields.pdf (Size: 452.12 KB / Downloads: 15)
#26
Thanks for all of the information. I'll spend some time testing and working on fixing whatever problems remain. It definitely looks like there are some issues with whitespace not being trimmed.

Mike
#27
(09-04-2023, 04:27 PM)Zubersoft Wrote: Thanks for all of the information. I'll spend some time testing and working on fixing whatever problems remain. It definitely looks like there are some issues with whitespace not being trimmed.

Mike

Thanks Mike.
#28
I wasn't able to get the fix included with version 3.8.14, but I've tested your files with my latest changes and it all works properly now. Some of the entries had lots of whitespace and newline characters, so I had to strip the newline characters, trim the whitespace and everything looks correct after doing that. I also had to add one important fix that prevents duplicate entries from being created if the same values are specified in the metadata in multiple fields.
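
The cleanup described above can be sketched in a few lines of Python. This is only an illustration, not the actual fix: the function names `clean_value` and `merge_values` are made up, and collapsing every run of whitespace into a single space is an assumption about what "strip the newlines and trim" should do to whitespace in the middle of a value.

```python
def clean_value(raw):
    """Remove newlines and surrounding whitespace from one metadata value.

    str.split() with no arguments splits on any whitespace run (spaces,
    tabs, newlines), so joining with one space also trims both ends.
    """
    return " ".join(raw.split())


def merge_values(*fields):
    """Combine values from several metadata fields without duplicates.

    Order of first appearance is kept; empty values are dropped.
    Comparison is exact (case-sensitive) after cleaning.
    """
    seen = set()
    merged = []
    for field in fields:
        for raw in field:
            value = clean_value(raw)
            if value and value not in seen:
                seen.add(value)
                merged.append(value)
    return merged


# The same genre appears in two fields, once padded with newlines.
genres = merge_values(["\n      Jazz   \n", "Big\n   Band"], ["Jazz", "   "])
```

Cleaning before comparing is what makes the duplicate check work: the padded and unpadded spellings of a value only match once the whitespace is gone.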

Mike
#29
(09-07-2023, 07:32 AM)Zubersoft Wrote: ...Some of the entries had lots of whitespace and newline characters, ....

Hmm.. that's interesting. Good that you are trimming to make that work. But now that has me curious as to why all that whitespace gets in there as I only added one blank space between comma/semi-colon separated entries. Not your puzzle to track down; and even I may just let that go for now.

Thanks for hangin' in there with me Mike on all this metadata stuff.
#30
I managed to set metadata fields in PDFs, but it seems to me that MobileSheets only uses them when doing an import from within the app. When opening a file from the internet (on my Android tablet, via "open with MobileSheets"), at least the title field is not used. Or am I missing something?
MobileSheets Companion seems to ignore all metadata?

(Great software, btw)