Fakebook Indexes for CSV import
#18
(01-19-2016, 10:57 PM)sciurius Wrote: I think that a collective approach towards fakebook indices is very good, and I appreciate you taking initiatives!

Thanks!  And I very much appreciate your useful feedback!

(01-19-2016, 10:57 PM)sciurius Wrote: However, looking at your CSVs I think they're too limited.

Absolutely - that's why I asked for feedback in the first place :-)  The existing repo is just a prototype.

(01-19-2016, 10:57 PM)sciurius Wrote: They contain just a starting and ending page, not page ranges.
If two pages were swapped in your copy of the book --and this happens-- this cannot be dealt with.

Very good point, but this is extremely easy to fix!  For example we could collapse the page selection into a single field which supports different types of values:
  • "5" would mean just page 5
  • "5-" would mean page 5 and all subsequent pages, stopping just before the next page claimed by another song
  • "5-8,10,9,11-" would mean the same as above but with pages 9 and 10 swapped round and assuming that the song doesn't finish before page 11
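
To make that concrete, here is a rough Python sketch of how such a field might be parsed. It is not taken from any existing tool; the function names are made up for illustration, and it assumes that an open-ended item like "11-" may only appear last and stops just before the next page claimed by another song (or at the end of the book).

```python
from typing import Iterable, List, Optional, Tuple


def parse_page_spec(spec: str) -> Tuple[List[int], Optional[int]]:
    """Parse a page field such as "5", "5-" or "5-8,10,9,11-".

    Returns (pages, open_from): the explicitly listed pages in reading
    order, plus the first page of a trailing open-ended range, if any.
    """
    pages: List[int] = []
    open_from: Optional[int] = None
    parts = [p.strip() for p in spec.split(",")]
    for i, part in enumerate(parts):
        if part.endswith("-"):
            # Assumption: an open-ended range only makes sense as the last item.
            if i != len(parts) - 1:
                raise ValueError(f"open-ended range must come last: {spec!r}")
            open_from = int(part[:-1])
        elif "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if last < first:
                raise ValueError(f"descending range in {spec!r}")
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages, open_from


def resolve_open_range(open_from: int, claimed: Iterable[int], last_page: int) -> List[int]:
    """Expand an open-ended range up to, but excluding, the next claimed page."""
    later = [p for p in claimed if p > open_from]
    end = min(later) - 1 if later else last_page
    return list(range(open_from, end + 1))
```

With this sketch, "5-8,10,9,11-" yields pages 5, 6, 7, 8, 10, 9 in that order plus an open range starting at 11, which a second pass can resolve once every other song's pages are known.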

(01-19-2016, 10:57 PM)sciurius Wrote: Also, there's no provision for information like key, composer, artist. Columns are fixed.

Yes - also easily corrected, and I agree that it obviously needs to be.

(01-19-2016, 10:57 PM)sciurius Wrote: Another potential trap of these indices is "The final page number is optional, because it can often be automatically inferred by the starting page of the next tune, ...".

In NewReal1.csv I see:

Code:
Airegin,17,
# ...
All Or Nothing At All,444,
Always There,18,

This would mean that song "Airegin" runs from page 17 thru 433, and "All or nothing at all" will probably crash the program :-)

Oh dear, you seem to have a very dim view of my programming skills ;-)  However you can set your fears to rest; my code easily handles this, e.g. https://github.com/aspiers/PDFexploder/b...ion.rb#L25 - and any other implementation would find it easy to do the same.
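
For anyone curious, the general idea can be sketched in a few lines of Python. To be clear, this is not the PDFexploder code (which is Ruby), just one possible approach: sort the entries by starting page before inferring each end page, so that alphabetical ordering in the CSV does no harm. The function name and the last_page parameter are made up for the example.

```python
from typing import List, Tuple


def infer_end_pages(entries: List[Tuple[str, int]], last_page: int) -> List[Tuple[str, int, int]]:
    """Given (title, start_page) pairs in any order, return
    (title, start_page, end_page) tuples sorted by start page.

    Each tune is assumed to end on the page before the next tune starts;
    the final tune is assumed to run to last_page.
    """
    by_start = sorted(entries, key=lambda entry: entry[1])
    result = []
    for i, (title, start) in enumerate(by_start):
        if i + 1 < len(by_start):
            next_start = by_start[i + 1][1]
            # Two tunes may share a page, so never end before the start.
            end = max(start, next_start - 1)
        else:
            end = last_page
        result.append((title, start, end))
    return result


# The three rows quoted above, in their alphabetical CSV order.
# The last_page value of 450 is a made-up figure for illustration.
rows = [("Airegin", 17), ("All Or Nothing At All", 444), ("Always There", 18)]
for title, start, end in infer_end_pages(rows, last_page=450):
    print(f"{title}: {start}-{end}")
```

Given only those three rows, "Airegin" comes out as a single page (17), because "Always There" starts on page 18, and nothing crashes.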

(01-19-2016, 10:57 PM)sciurius Wrote: It would be great to employ a more generic data standard for these indices. With a headings row, it would already be more flexible. For example, an index entry could use either "startpage" and "endpage", or a more powerful "pagerange". And it allows for additional, optional information without affecting tools that do not understand this.

Absolutely - great idea.
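
As a rough illustration of how little work a headings row needs on the reading side, here is a Python sketch using the standard csv module. The column names "title" and "composer" are only assumptions for the example ("startpage", "endpage" and "pagerange" come from your suggestion), and a start page with no end page is written as an open-ended range like "17-".

```python
import csv
import io
from typing import Dict, List, Optional, Tuple


def page_field(row: Dict[str, Optional[str]]) -> str:
    """Return the page selection for a row, whichever columns it uses."""
    if row.get("pagerange"):
        return row["pagerange"]
    start = row.get("startpage") or ""
    end = row.get("endpage") or ""
    if not start:
        raise ValueError(f"row has no page information: {row!r}")
    # No end page means "open-ended", to be inferred later.
    return f"{start}-{end}" if end else f"{start}-"


def read_index(text: str) -> List[Tuple[str, str]]:
    """Read an index with a headings row; unknown columns are simply ignored."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["title"], page_field(row)) for row in reader]


# Two hypothetical files describing the same book in different ways.
OLD_STYLE = "title,startpage,endpage\nAiregin,17,\nAlways There,18,\n"
NEW_STYLE = "title,pagerange,composer\nAiregin,17,Sonny Rollins\n"

print(read_index(OLD_STYLE))
print(read_index(NEW_STYLE))
```

A tool that knows nothing about "composer" keeps working unchanged, which is exactly the flexibility you describe.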

(01-19-2016, 10:57 PM)sciurius Wrote: A final remark: the github page reads "... it's the very well-known CSV (or Comma-Separated Values) format". I'm sorry to disappoint you, but although well-known, there is no such thing as a standard for CSV formatted data files. The de facto standard defined in RFC 4180 is a good starting point.

You are not disappointing me, because I entirely disagree ;-)  That is a separate discussion which probably does not belong on this forum, but the starting point would be to consider what is a reasonable definition of "standard", then to consider some other popular standards in the computing world (e.g. the myriad of standards relating to email), and finally compare how strictly those are ratified and adhered to by implementations, relative to CSV.  We live in a messy world, where even supposedly clearly defined standards often contain crucial ambiguities, flaws, and rival definitions.  However that does not preclude them from being standards; it just means that they could be improved.  But I'm not sure it's worth having that discussion on this forum.

Standard or not, it does not diminish the point I made elsewhere, which is that the potential conflict in CSV files between characters used both as delimiters and within data fields has a trivial solution, one which has been used successfully in CSV implementations all over the world for multiple decades: quote either just the data fields which contain delimiter characters, or all data fields.  (My preference is the former, since the latter leads to CSV files which are less readable by humans.)  So this is a solved problem, and I would strongly recommend that this community reuse that solution rather than aim for an alternative which relies on the delimiter character never being needed within any data field, since that is pretty much doomed to fail in some corner cases.
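
Here is a small Python sketch of the "quote only where needed" approach, using the standard csv module with csv.QUOTE_MINIMAL. The second title and its page number are made up purely to show a comma inside a data field.

```python
import csv
import io

# Hypothetical rows: the second title and page number are invented
# so that one field contains the delimiter character.
rows = [
    ["All Or Nothing At All", "444"],
    ["Example Tune, Part 2", "500"],
]

# QUOTE_MINIMAL quotes only the fields that need it.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerows(rows)
text = buf.getvalue()
print(text)

# Reading the text back recovers the original fields, comma included.
parsed = list(csv.reader(io.StringIO(text)))
assert parsed == rows
```

Only the second row comes out quoted, so the file stays readable by humans, and switching to csv.QUOTE_ALL would give the quote-everything variant instead.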

Anyway, thanks again for the great feedback!  Cheers, Adam