From: Paul Rohr (paul@abisource.com)
Date: Tue May 14 2002 - 13:43:02 EDT
At 12:23 PM 5/14/02 -0400, Dom Lachowicz wrote:
>On Tue, 2002-05-14 at 11:57, Karl Ove Hufthammer wrote:
>> [1] <URL: http://dublincore.org/documents/dces/ >
>> [2] <URL: http://dublincore.org/documents/dcmes-qualifiers/ >
>> [3] <URL: http://dublincore.org/documents/dcmes-xml/ >
>> [4] <URL: http://dublincore.org/resources/faq/ >
>
>I've taken Paul's suggestion, and now we use #define statements instead
>of methods. I've updated the MSWord importer as needed.
>
>I've also taken Karl's suggestion, and now we support a superset of the
>Dublin Core elements. The supersetted term is "keywords" which, IMO, is
>sorely lacking from their list.
>
>I'll probably export the properties as DC/RDF sometime soon to make our
>metadata and m tags go away. The AbiWord importer will have to get
>seriously smarter to properly handle the namespacing issues involved.
Karl and Dom,
I've just finished skimming the DC stuff too, and I agree that we'll need a
superset approach. By design, DC is a small well-defined set of metadata
that's considered useful for public indexing of all content. Transparently
capturing that data is a Good Thing. However, people and organizations use
word processor properties dialogs to store all kinds of other stuff. I
suspect that trying to shoehorn that whole mess into "pure" DC is a losing
battle.
As for switching to RDF, the big question for me is whether adding all that
code is really needed to help us solve this namespace problem. If not, I'm
tempted to follow the simpler precedent already in place for HTML:
http://www.ietf.org/rfc/rfc2731.txt
In short, the idea would be that we preface any of *our* keys which are
DC-compatible with the DC prefix. All others -- whether defined by Word or
by users -- go at top level. Implementing this should take just a quick
upgrade to Dom's current #defines.
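
To make the prefixing rule concrete, here is a minimal sketch in Python
rather than in Dom's actual #defines, which I haven't reproduced. The names
DC_ELEMENTS and qualify_key are made up for illustration. The rule assumed
here is that a key counts as DC-compatible when its first dotted component
is one of the fifteen DC 1.1 elements.

```python
# The fifteen elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def qualify_key(key: str) -> str:
    """Prefix DC-compatible keys with "DC."; leave all others at top level.

    Assumption: only the first dotted component is checked, so a qualified
    key such as "date.created" is treated as DC-compatible.
    """
    element = key.split(".", 1)[0]
    return f"DC.{key}" if element in DC_ELEMENTS else key


if __name__ == "__main__":
    for key in ("title", "date.created", "pages", "Checked by"):
        print(f"{key!r} -> {qualify_key(key)!r}")
```

Under this rule, Word-defined and user-defined keys pass through untouched,
so nothing outside the DC table needs to know about the prefix.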
the screw cases
---------------
For example, what would the RDF equivalent of the following markup be?
<m name="DC.title">World Domination</m>
<m name="DC.creator">Abi the Ant</m>
<m name="DC.language">en-US</m>
<m name="DC.subject">My secret 10-year plan. Shh! Don't tell!</m>
<m name="DC.date.created">1998-08-01T09:14:37-05:00</m>
<m name="DC.date.printed">2002-05-13T13:15:30Z</m>
<m name="pages">143</m>
<m name="Checked by">Legal Review Committee #43</m>
<m name="Typist">MLM</m>
<m name="Playlist">Dave Brubeck, Time Further Out; Tom Tom Club</m>
<m name="$%&^$$&$">See, that property name is in Inuktitut. I don't speak
Inuktitut -- it's too cold for us ants up there -- but I just love working
on a word processor that lets me do stuff like this.</m>
If this example seems contrived, think again. There are large organizations
who like to keep close track of who did what in the production and review
process. There are also creative individuals who like to mention which
tracks from their playlist helped inspire a given work.
Can you imagine someone like Abi doing the latter while operating inside the
corporate constraints of the former? I can. ;-)
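
For comparison, here is a rough sketch of how little an importer would need
to do under the prefix convention: read the <m> tags and sort them by
whether the name starts with "DC.". This is an illustration in Python, not
the AbiWord importer. The <metadata> wrapper element and the
split_properties name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# A well-formed subset of the markup above, inside a hypothetical wrapper.
SAMPLE = """<metadata>
<m name="DC.title">World Domination</m>
<m name="DC.date.created">1998-08-01T09:14:37-05:00</m>
<m name="pages">143</m>
<m name="Checked by">Legal Review Committee #43</m>
</metadata>"""


def split_properties(xml_text: str):
    """Return (dc, other): DC-prefixed properties and top-level ones."""
    dc, other = {}, {}
    for m in ET.fromstring(xml_text).iter("m"):
        name = m.get("name", "")
        value = m.text or ""
        if name.startswith("DC."):
            dc[name[len("DC."):]] = value  # strip the prefix for DC keys
        else:
            other[name] = value            # Word- or user-defined key
    return dc, other


if __name__ == "__main__":
    dc, other = split_properties(SAMPLE)
    print("DC:", dc)
    print("other:", other)
```

No namespace handling is involved; the prefix check is a plain string
comparison on the name attribute.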
a few other notes
-----------------
1. For those of you who read the above date examples carefully, I'm not
sure whether our canonical datetime output should include the timezone
offsets or not. For details, see:
http://www.w3.org/TR/NOTE-datetime
2. Where possible, we should certainly map the standard Word/RTF properties
onto their properly qualified DC equivalents. The user-visible
names obviously wouldn't have all that dotted DC gibberish, though. That
way, people can get decent DC compatibility (ignoring the controlled
vocabulary stuff, of course) by just typing in a friendly dialog. Or by
importing their existing documents to AbiWord. ;-)
3. FWIW, I'm not sure it's all that safe to map Word's company onto DC's
publisher. Word actually has a separate publisher keyword in their custom
tag.
4. Getting back to my original metadata vs. document properties thread, is
a DC.language property the right place to store the document's default
language, or should we be using PROPS for that instead?
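
On note 1, the two candidate output forms can be put side by side. The
sketch below uses Python's standard datetime module. The function names
with_offset and as_utc are hypothetical, and neither form is being proposed
as the canonical one.

```python
from datetime import datetime, timedelta, timezone


def with_offset(dt: datetime) -> str:
    """W3C datetime form that keeps the local timezone offset."""
    return dt.isoformat(timespec="seconds")


def as_utc(dt: datetime) -> str:
    """W3C datetime form normalized to UTC, written with a trailing Z.

    Assumes dt is timezone-aware; a naive datetime would be interpreted
    as local time by astimezone().
    """
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # The DC.date.created value from the example markup.
    created = datetime(1998, 8, 1, 9, 14, 37,
                       tzinfo=timezone(timedelta(hours=-5)))
    print(with_offset(created))  # 1998-08-01T09:14:37-05:00
    print(as_utc(created))       # 1998-08-01T14:14:37Z
```

Both strings name the same instant. The offset form also records where the
author's clock was set, which the UTC form discards.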
bottom line
-----------
I think DC is a small but very useful subset of the metadata our users will
want to capture. For me, the jury's still out on whether RDF adds any value
beyond that.
Paul,
trying to think ahead