Representation as translation

Words know more about themselves than we generally do, and one should always take the time to explore their etymology. Take representation, a word which becomes a bone of contention as soon as linguists, semioticians, ontologists, librarians and Web geeks happen to meet, and look back for a minute at its Latin origin. Repraesentatio is built on praesens, which has kept, in both French présent and English present, its double meaning of "being here and now" and "being given". Re-presenting is therefore giving or making present again, and the repetition of the praesens is important to notice. Whatever the representation stands for, signifies, means, points at, suggests ... in one word, gives, or simply puts us in the presence of, here and now, the praesens has already been (re)present(ed) before, somewhere else.
And since the story of representations is so old, and its beginning so misty, let's assume that in the long chain of representations, each new one leverages previous ones, and it's turtles all the way down. So, instead of wondering about the intractable issue of the first presentation, why not focus on the process through which one representation emerges from previous ones? This process has a name: translation.
In "Traduire - Défense et illustration du multilinguisme" (published May 2009, only available in French so far), François Ost presents the multiplicity of languages as a blessing for the life of thought, and calls for translation as our needed common language and new paradigm. This rich and thoughtful book is a must-read if your French is fluent enough. I cannot pretend to cover in one post everything it brings about, but will just re-present here a few main points, enough to show the convergence between Ost's thesis and the line of thought we've been defending here for years.
Primo, as indicated in the introduction, every text or production of language is the result of a process of translation, and this process happens not only at the borders of languages, but inside the language itself; not only between dialects and individual expressions, but inside the mind of the speaker herself. The dialectic of "what do you mean?" - "in other words" is not a bug of conversation; it's a constitutive and essential feature of efficient language.
Secundo, translation, at least when it's a good one, is not necessarily an entropic process through which the original meaning is degraded and betrayed. The meaning, or whatever you want to call the praesens in the original (and I will indeed call it praesens hereafter), is not necessarily lost in translation. If the translation is good, it's also present in the re-presentation. Moreover, this new presentation is likely to enrich the original one, and may even help us understand it better.
Tertio, to keep the praesens alive, we need to re-present it over and over again. Translation is a never-ending story. And the apparent paradox is that we need to do that all the more for things considered untranslatable. Ost quotes Barbara Cassin here: "L'intraduisible, c'est ce qu'on n'arrêtera pas de (ne pas) traduire" (the untranslatable is what we will never stop (not) translating). Saying the praesens is untranslatable simply means that no presentation exhausts its praesens, and new re-presentations are needed to get new viewpoints enlightening the previous ones. Pretending to achieve the final representation that rules them all is to fall back into the arrogance and stupidity of the builders of the Tower of Babel, who had lost the meaning by forgetting the diversity of languages. God's action was then not a punishment, but a liberation from this deadly road of single thought, as Ost shows in the first chapter of the book, a brilliant hermeneutic analysis of Babel's various translations.

Porting the translation paradigm into our local metaphor is quite easy. The wheel is the never-ending story of living knowledge and languages: so many spokes, so many directions to look from, converging in the untranslatable hub which is beyond, but praesens in, all re-presentations. But the translation paradigm is what was lacking to get the wheel rolling. As Umberto Eco put it about Europe: translation is our common language. Indeed, and to put this paradigm into action, each one of us engaged in the process has to learn several languages, in order to be able to look at one language with the eyes of another. Because no one understands her own language from inside it, and there is no meta-language to rule them all.

Last but not least, the ethical aspect of the translation paradigm: any other language is the language of an other, including mine. In translation, I learn not only to look at the other as myself, as an alter ego (which is still looking from my own viewpoint and measuring with my own metrics), but to look at myself as another, ego alter. Discovering the other-ness, the alterity, inside myself, my own language, my own view of the world, is indeed a paradigm change that we all need.


Asserting subclasses of open ranges or domains

I had an interesting exchange on the Semantic Web mailing list about this issue last week. You can browse the whole thread, but I would particularly recommend the answers by Pat Hayes and John Sowa.
Below are extracts of my final answer.

1. There is quite a difference between concepts in ontologies strongly defined by domain experts and targeted at feeding reasoners (e.g., bio-medical or legal ontologies), and lightweight ontologies such as FOAF, vCard, Dublin Core, Geonames ... which mainly target interoperability of data, and whose meaning (if not formal semantics) emerges from usage and population. I can't define formally what a Person is, but I can say that you and I are instances.

2. For the latter kind of ontologies, the main objective is to provide guidelines for applications harvesting and managing data. The actual formal semantics of those models is next to nothing, but implementations can reasonably leverage them on the basis of a common-sense interpretation. For example, the thousands of different data models used to represent a person can be re-engineered as so many specializations of the generic class foaf:Person, allowing a shallow but efficient level of data interoperability.
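A minimal sketch of what this re-engineering amounts to in practice: a tiny rdfs:subClassOf closure, so that a harvester can treat instances of arbitrary local "person" classes as foaf:Person. The local class names (ex:Employee, ex:Manager) are hypothetical, and RDF is modeled here as plain Python data for brevity.

```python
# Hypothetical subClassOf assertions harvested from local data models.
# foaf:Person is the shared generic class they all specialize.
SUBCLASS_OF = {
    "ex:Employee": "foaf:Person",   # hypothetical local model
    "ex:Manager": "ex:Employee",    # hypothetical local model
}

def superclasses(cls):
    """Yield cls and all its transitive superclasses."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)

def is_a(instance_class, target_class):
    """True if instance_class is target_class or a (transitive) subclass of it."""
    return target_class in superclasses(instance_class)

# A harvester can now aggregate ex:Manager instances as foaf:Person:
print(is_a("ex:Manager", "foaf:Person"))  # True
```

This is of course only the "shallow but efficient" level of interoperability argued for above: no reasoner, just following subclass links upward.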

3. There is no more, no less semantics nor potential usability in declaring skos:Concept to be in the range of dcterms:subject than in declaring foaf:Person to be a subclass of foaf:Agent. "Dont acte" (duly noted).

For some other ill-defined property ranges in popular Semantic Web ontologies, another path would be to use enumerated classes or, in a more flexible way, to point to a published vocabulary maintaining a reference enumeration. For example, when the Library of Congress publishes later this year the authoritative ISO 639-2 list of languages as a SKOS concept scheme, the range of dcterms:language could be restricted to the values in that list (using, e.g., a restriction on the value of skos:inScheme). This would spare Dublin Core the painful task of formally defining the class dcterms:LinguisticSystem, the currently specified range, which suffers from the same lack of definition as foaf:Agent. Referring to some authority is certainly the best way to deal with the issue here: we (DC) don't know what a language is, go ask the ISO 639-2 folks; apparently they know, since they are able to provide a list.
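To make the skos:inScheme idea concrete, here is a sketch of the check an application could run: accept as a dcterms:language value only a URI that is declared skos:inScheme the published concept scheme. The scheme and concept URIs below are hypothetical placeholders, since the LoC URIs were not yet published; RDF is again modeled as plain Python data.

```python
# Hypothetical URI of the ISO 639-2 SKOS concept scheme.
ISO639_SCHEME = "http://example.org/iso639-2"

# skos:inScheme statements harvested from the published vocabulary
# (concept URI -> scheme URI); values are illustrative.
IN_SCHEME = {
    "http://example.org/iso639-2/fra": ISO639_SCHEME,
    "http://example.org/iso639-2/eng": ISO639_SCHEME,
}

def valid_language_value(uri):
    """Emulate a restriction on the value of skos:inScheme:
    the URI is acceptable iff it belongs to the reference scheme."""
    return IN_SCHEME.get(uri) == ISO639_SCHEME

print(valid_language_value("http://example.org/iso639-2/fra"))   # True
print(valid_language_value("http://example.org/my-languages/x")) # False
```

No formal definition of "linguistic system" is needed: the authority's enumeration does the work.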


From meaningless to ambiguous

A previous post looked at concepts as names species. Several threads about URI meaning and ambiguity later, including this post from Karl Dubost on the www-tag@w3.org list, and with names and concepts as living things in mind again, here is yet another variation on morning, noon, and evening mountains.
Morning names, meaningless, useless
Names at noon, meaningful, useful
Evening names, abused, ambiguous
This is basically the life cycle of names, from no use to use and eventually abuse. Use leads to meaning, then abuse leads to meaning overloading. This is the natural course of things. And since URIs are names, this is also the natural URI life cycle. So let us use what is meaningful while it is, if we want meaning. Be prepared for ambiguity at the end of the day; but even if evening names have been abused into ambiguity, it does not mean they are useless.


URI species

The debate about the proliferation of URIs representing the same thing keeps rolling on various Semantic Web lists, coming back again and again to the same questions. How does one discover existing URIs for a thing, if any? Is it good or bad practice to mint a new URI for a thing which already has one? How does one link URIs identifying the same thing? Many smart and conflicting answers have been given, largely depending on each author's viewpoint on Web architecture and main use of URIs. Web pragmatists and linked data evangelists tend to consider the proliferation of URIs as perhaps not a good thing, but something we are bound to live with, whereas experts in knowledge representation tend to consider it should be avoided by all means. Trust, persistence, quality of resource descriptions, use and abuse of owl:sameAs have been discussed over and over, with no obvious technical answer.

Since life provides the oldest, proven, efficient ways to store, maintain, replicate and use information, I've tried to figure out whether we could learn from biology. Interestingly enough, biologists are no more able to come to a consensus about what a species is than Semantic Web gurus are to agree on what is behind a URI. Somehow, the two issues are very similar: both deal with the persistence of information over time. With the disclaimer that I am not a biologist, let me assume here the definition of a species as the set of individual expressions of some common genetic pool. Protection and persistence of the species' genetic pool is the main occupation of any form of life. Strategies to achieve this goal present an awesome diversity, but within this variety one can find some constants. Among them is the basic fact that individuals are bound to a short life span, so the protection of the genetic pool is best achieved by accepting the mortality of individuals and ensuring duplication and replication of the information in as many individuals as possible, not by defending a single representation behind firewalls.

How does that apply to the Semantic Web? A URI, along with the resource description it provides, can be seen as an individual expression of a species concept. Like any human artefact, any living individual, or any physical manifestation in this world, this expression is bound to be transient; the agent who created and maintains the URI is bound to disappear, among other things. It will be less costly, as life tells us, to have copies of the information in as many expressions as possible all over the place than to protect this specific one. Consider a URI not as the unique representation of a thing, but as an individual expression of a species.


Common Tag

Based on the Common Tag specification and the various APIs around it, there are certainly a lot of easy next steps towards interfacing the free-tagging and linked-data universes more efficiently. For example, it should not be too difficult to build an interface allowing the mapping of Twitter hashtags to DBpedia URIs, based on both the Twitter and Zemanta APIs. Faviki could open this path.
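The core of such an interface can be sketched in a few lines: a mapping table from hashtags to DBpedia URIs (standing in, as an assumption, for what a Twitter-plus-Zemanta-based service would produce), emitting triples loosely modeled on the Common Tag vocabulary, where ctag:means links a tag to the resource it denotes. The post URI and hashtag entries are illustrative.

```python
# Hypothetical hashtag -> DBpedia mapping (what the interface would maintain).
HASHTAG_TO_DBPEDIA = {
    "#semanticweb": "http://dbpedia.org/resource/Semantic_Web",
    "#linkeddata": "http://dbpedia.org/resource/Linked_data",
}

def common_tag_triples(post_uri, hashtags):
    """Yield (subject, predicate, object) triples, loosely following the
    Common Tag model: the post is tagged, and the tag means a resource."""
    for tag in hashtags:
        dbpedia_uri = HASHTAG_TO_DBPEDIA.get(tag.lower())
        if dbpedia_uri:
            yield (post_uri, "ctag:tagged", tag)
            yield (tag, "ctag:means", dbpedia_uri)

triples = list(common_tag_triples("http://example.org/post/1",
                                  ["#SemanticWeb", "#randomtag"]))
print(triples)
```

Unmapped hashtags are simply skipped, which keeps the output on the linked-data side clean while free tagging goes on undisturbed.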


Everything is a thing, everyone is many

So I'm now on Twitter, following a couple of people, hoping some interesting bits and pieces will float around to my shore. And well, some have. For example, this interesting piece on Multiple Personas. What Silona writes there on the essential multiplicity of a person can certainly also apply to things and representations. We know, more or less from inside, our multiplicity and ambiguity, and the importance of keeping multiple personas revolves around our fundamental and essential emptiness. But most of the time we've lost the capacity to look at things the same way. As Jean Rousselot put it, we are not simple enough any more to "enter things as things can enter things":
Il faudrait pouvoir entrer sans frémir
Dans les choses
Comme les choses
Entrent dans les choses.
(One should be able to enter, without trembling, into things, as things enter into things.)
The poets are the ones able to enter things, experience their multiplicity, and show how they appear to us as multiple personas.
We should look at the necessary convergence of the social and semantic web(s) with this paradigm of multiple personas in mind.



sameas.org

sameas.org is quite an implementation of hubjects for the linked data universe. It still relies a bit too much on owl:sameAs, but I'm beginning to believe that the eventual semantics of owl:sameAs will be whatever applications make of it. Only reasoners in closed universes will apply owl:sameAs for what it is in the standard (strict identity). The open Web and the linked data cloud will use it as "follow-your-nose" hubs to switch from one representation to another and aggregate information.
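The follow-your-nose reading of owl:sameAs can be sketched as a simple traversal: instead of strict identity reasoning, walk sameAs links in both directions and gather every description available for a thing across its co-referent URIs. The URIs and triples below are illustrative, and RDF is modeled as plain Python tuples.

```python
# Illustrative triples: three co-referent URIs for the same city,
# each carrying a different fragment of its description.
TRIPLES = [
    ("ex:paris_a", "owl:sameAs", "ex:paris_b"),
    ("ex:paris_b", "owl:sameAs", "ex:paris_c"),
    ("ex:paris_a", "rdfs:label", "Paris"),
    ("ex:paris_c", "ex:population", "2200000"),
]

def aggregate(uri, triples):
    """Collect all non-sameAs statements reachable from uri by
    following owl:sameAs links in either direction."""
    seen, todo, facts = set(), [uri], []
    while todo:
        u = todo.pop()
        if u in seen:
            continue
        seen.add(u)
        for s, p, o in triples:
            if p == "owl:sameAs" and s == u:
                todo.append(o)
            elif p == "owl:sameAs" and o == u:
                todo.append(s)
            elif s == u:
                facts.append((p, o))
    return facts

# Starting from any of the three URIs yields the full description:
print(sorted(aggregate("ex:paris_b", TRIPLES)))
```

Note that nothing here asserts the URIs are "the same" in the model-theoretic sense; the links are used purely as hubs for aggregation, which is the usage argued for above.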