Having the good fortune to work as CSO for Talis, an innovative UK software company, at one of the most exciting times for software and the internet, I thought I would share some ideas and insights I am finding exciting at the moment.

Entries in semantic web (5)

The puzzle of semantic web adoption

I am a believer in the rise of the web of data. In fact I am CTO of Talis, which is investing heavily in semantic web technologies. So don’t take this the wrong way, but I can’t help feeling that the semantic web community is ignoring a vital part of the semantic web jigsaw, and that this is creating a major credibility problem between it and large parts of the technology community. I am concerned because I think the semantic web currently lacks two critical things that drove mass adoption of the web.

To be fair, the W3C has created a semantic web outreach group and Talis has two representatives on it, so we are doing our bit to help spread the word :-) but this is only going to work if the semantic web community really understands what the major missing pieces are for mass adoption. Today, looking at the conversations in the semantic web community, I don’t think the real barrier is being seen clearly.

So here is my personal view on what is going on here.

Many in the semantic web community have been concerned mainly with the rightness of the technology and not the utility of the technology. That is fine for the invention process but badly wrong for the adoption process. Just ask the inventors of Betamax :-) . It doesn’t matter how right you are!

Adoption is a strong function of day 0 utility. That means: “What can I do better today by using semantic web technology rather than existing technology?” You can’t use the argument that when everyone has adopted RDF it will be really, really useful, because the people who need to adopt the technology in order to reach that critical mass of RDF won’t do it out of belief in the semantic web vision alone. These adopters are pragmatic and need technology to give them an advantage today, not in five years. Network-effect-based features always have this kind of initiation problem.

To overcome the network effect initiation problem there needs to be day 0 value to drive adoption until the network effect kicks in and takes over as the main reason for adoption. In short there needs to be a killer application of the technology.

What is the semantic web killer application?

So my question to the semantic web community is this: what exactly can I do far better today with relatively unproven semantic web technology than I can with more established approaches, such as agreeing simple XML standards?

A clear answer to this question is vital.

I actually think that for most specific instances of usage you could achieve faster adoption and lower risk through de facto standards agreement with a simple XML approach. Take RSS. Was it a success because it was RDF, or because it was the de facto emergence of a simple standard based on its raw day 0 utility rather than some far-off network-effect-based value? It is not a semantic web killer application.

So it seems to me that semantic web adoption is a very different problem from that of the web of documents in its early days.

The web had day 0 utility. Many people will remember that feeling of seeing the web for the first time and knowing that you could easily publish anything you liked and the whole world could read it instantly. Mind-blowing.
You didn’t need any special tool to write an HTML document; doing it by hand was easy enough.

The web was its own killer application. The semantic web is not.

But the web had a piece missing. You could start at any resource and navigate the links, but you couldn’t search the space itself to find a good starting resource in the first place. This meant that as the web grew, more and more of the content could not in practice add any extra value to a user’s experience.
Of course the missing piece was the search engine. This allowed a user of the web to query the whole space, and now every document, no matter how obscure, could potentially enrich a user’s web experience.
I don’t think it is right to characterise this as something missing from the architecture of the web, because search engines could be layered on top and that is better than building complexity into the core standards. But from a user’s point of view, the real potential of the web of documents could not be realised until it was possible to query the whole web space.

We talk about the semantic web in terms of the web as a database. But where is the database engine? Google is the free-text engine for the web of documents. Where is the equivalent for the semantic web?
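To make the gap concrete, here is a minimal sketch, assuming the Python rdflib library and invented example URIs and data, of the kind of cross-source query I mean. Each application can do this for the handful of sources it gathers itself; what is missing is a service that answers such queries over the whole web of data.

```python
# A minimal sketch of "querying the web of data" with rdflib.
# The sources and URIs are invented for illustration; the point is that today
# each application must gather and index the data itself -- there is no
# web-scale engine answering such queries over the whole space.
from rdflib import Graph

g = Graph()

# Imagine these fragments were published independently by two sites.
source_a = """
@prefix ex: <http://example.org/terms/> .
<http://example.org/books/1> ex:title "Harry Potter" ;
                             ex:publisher <http://example.org/pub/bloomsbury> .
"""
source_b = """
@prefix ex: <http://example.org/terms/> .
<http://example.org/pub/bloomsbury> ex:basedIn "London" .
"""
g.parse(data=source_a, format="turtle")
g.parse(data=source_b, format="turtle")

# A join across independently published statements.
query = """
PREFIX ex: <http://example.org/terms/>
SELECT ?title ?city WHERE {
    ?book ex:title ?title ;
          ex:publisher ?pub .
    ?pub  ex:basedIn ?city .
}
"""
for title, city in g.query(query):
    print(title, city)   # -> Harry Potter London
```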

So the semantic web appears to have little day 0 utility over a simpler, specific approach; it is not its own killer application; and it lacks the ability to query the semantic web information space itself.

This may appear a provocative conclusion, I don’t know. But is it correct? If it is, who is doing what about it?

If true, does this mean the semantic web will forever be a dream?
No, I don’t believe that. But I do believe it changes the way we should think about semantic web adoption.

For example, it would be crazy to believe all data must be in RDF; that would create a huge barrier. Instead the question should be how RDF and other data approaches can work together to create a powerful web of data; the superior value of the RDF approach should, over time, increase the amount of RDF relative to other approaches.

But I think the single biggest blocker on adoption of the web of data, and by extension the semantic web, is the lack of any ability to query the whole space. Where is the database engine for the semantic web?

Posted on Friday, December 8, 2006 at 09:39PM by Justin Leavesley

Ecosystem 1 - Physical technology meets social technology

I’m pretty sure that the concepts of co-operation, platforms, webs of data and webs of functions will be central to understanding how the internet and web will continue to change our world and the way technology companies can create defensible long-term value. In the next few posts I will look at Web 1.0, Web 2.0 and the semantic web from the point of view of ecosystems, drawing on the very useful new view of economics known variously as evolutionary or complexity economics. Over on Nodalities you can follow how Talis is putting these ideas to work in the real work of a high-tech, innovation-led business. It is this special combination of theory and practice that makes Talis such an intense and wonderful company to work for.

Ecosystem
It is the constant dance between physical and social technology,

Click to read more ...

Posted on Sunday, November 26, 2006 at 08:25AM by Justin Leavesley

Will the real Semantic Web please stand up

We live in an amazing and unique time. Most of you reading this blog were alive at the birth of the global computer, around 15 years ago. In that time the computer has never been switched off, never been rebooted, and has grown to an almost inconceivable size and complexity. The sheer storage and processing power is almost impossible to calculate. The computer is fed information and programmed by the actions of around a billion users, night and day, evolving at an incredible speed. For example, in the last two years, over 14 million blogs alone have appeared, seemingly with no effort or investment!

But there is something else going on other than computing on a grand scale. A new type of approach to computing is arising, one which fundamentally changes the relationship between the user and the computer. I am talking about a new approach based on tapping into the collaborative effort of millions of users to programme software through their everyday actions. The new programs are effectively learning systems that extract training and feedback from users’ actions on an unprecedented scale. Fuzziness, statistics and learning over programmatic logic.
The Google spell checker is a great example of this. Google could have sat a bunch of programmers down and coded a spell checker using a dictionary and lots of rules. Doing this in every language under the sun and keeping it current as new words come into being (e.g. blogging) would have been a huge effort. Instead, Google uses the actions of its users to programme the spell checking, extracting patterns of behaviour from users retyping misspelled words and from feedback when a user accepts a suggested spelling correction. Amazon’s “people who bought this book also bought these” system is a more limited example.
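As a toy illustration of that style of programming-by-usage (this is not Google’s actual method, and the log data below is invented), you can build a correction table purely from observing what users retype:

```python
# A toy sketch of learning spelling corrections from user behaviour rather
# than from a dictionary. The "logs" are invented; real systems work on vastly
# larger data and more signals (e.g. whether a suggestion is accepted).
from collections import Counter, defaultdict

# Pairs of (query, immediate retype) observed from users.
reformulation_log = [
    ("recieve", "receive"),
    ("recieve", "receive"),
    ("recieve", "recive"),
    ("bloging", "blogging"),
    ("bloging", "blogging"),
]

corrections = defaultdict(Counter)
for typed, retyped in reformulation_log:
    if typed != retyped:
        corrections[typed][retyped] += 1

def suggest(word):
    """Return the most frequently observed retype for this word, if any."""
    if word in corrections:
        candidate, _count = corrections[word].most_common(1)[0]
        return candidate
    return None

print(suggest("recieve"))   # -> "receive", learned from usage, not a dictionary
print(suggest("bloging"))   # -> "blogging"
```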
Built on participation between the users and the system, the result is what you might call collaborative intelligence.
It is emergent rather than programmed.
It is interesting to note that this is also the same transition that artificial intelligence went through. It became clear that predicate logic based solutions did not scale well and the field turned to fuzzy logic, statistics and neural networks where systems required training rather than programming.

The other important quality of this approach is scalability. Implicitly this scales; in fact, it thrives on scale.
Traditional programmatic approaches, essentially based on logic, have a harder time scaling.

Considering that it is only in the last few years that hardware costs and online community size have enabled experimentation at scale, I am very excited about what the next 10 years will bring in this direction.

So this brings me to the title of this blog. It seems to me that humans are very good at semantics, and that systems based on human-computer collaboration (i.e. the emergent properties of large numbers of users) will be very important in semantics-based systems. You could consider del.icio.us and Flickr and the massive rise of tagging and microformats to be very early examples. If the collaborative approach of del.icio.us could be synthesised with more sophisticated semantic methods such as RDF then we might really be cooking with gas.

So I conceive of the Semantic Web including applications built as collaborative emergent systems. 
Herein lies my problem. The Semantic Web as defined by Tim Berners-Lee and expressed in his paper on the design issues for the Semantic Web expressly excludes any type of fuzzy system from being a Semantic Web application (see excerpt below and comment). This is because he requires applications to be logically provable and guaranteed, so that first-order predicate calculus (predicate logic) is the only logic the Semantic Web admits. The example TBL gives is of a banking application needing to be guaranteed.
I have two main issues with this:

1) Why exclude the Semantic Web from the exciting possibilities of fuzzy and statistical approaches to semantic systems? Can't both be included? A banking application just requires stricter criteria on the statements it can operate on. Applications don't need to be guaranteed to be useful (although I admit banking applications do!!).

2) Will this massively scale? What gives us reason to believe it will? FOPC-based systems have proven difficult to scale in several fields so far. TBL admits that the Semantic Web approach is not very different from previous approaches that did fail to scale. The basic point is that FOPC-based systems cannot cope with inconsistency (as TBL points out), and as you scale, keeping consistency in practice becomes harder.

So, what will the semantic web be like? I guess in time the real semantic web will stand up.

The rest of this post looks at TBL's semantic web design paper in more detail and may not be of great interest to most readers.

First of all, thanks Rick and Ian for persevering with all my questions.

Fuzzy or not has been the main theme behind all my SW blogs to date. Tim Berners-Lee is quite clear: not.
I just don't get why not; certainty is just a special case of fuzziness, so why can't we include both?

We are back again to where I started, Perfect or sloppy - RDF, Shirky and Wittgenstein, which was based on the Tim Berners-Lee paper you mentioned, Rick.

This quote has almost the entire point I am trying to make in it. I'll take a few sentences at a time and explain what they mean to me.

"The FOPC inference model is extremely intolerant of inconsistency [i.e. P(x) & NOT (P(X)) -> Q], the semantic web has to tolerate many kinds of inconsistency.

Toleration of inconsistency can only be done by fuzzy systems. We need a semantic web which will provide guarantees, and about which one can reason with logic. (A fuzzy system might be good for finding a proof -- but then it should be able to go back and justify each deduction logically to produce a proof in the unifying HOL language which anyone can check) Any real SW system will work not by believing anything it reads on the web but by checking the source of any information. (I wish people would learn to do this on the Web as it is!). So in fact, a rule will allow a system to infer things only from statements of a particular form signed by particular keys. Within such a system, an inconsistency is a serious problem, not something to be worked around. If my bank says my bank balance is $100 and my computer says it is $200, then we need to figure out the problem. Same with launching missiles, IMHO. The semantic web model is that a URI dereferences to a document which parses to a directed labeled graph of statements. The statements can have URIs as parameters, so they can make statements about documents and about other statements. So you can express trust and reason about it, and limit your information to trusted consistent data."

1) Toleration of inconsistency can only be done by fuzzy systems. We need a semantic web which will provide guarantees, and about which one can reason with logic.
Here TBL specifically excludes fuzzy approaches from the semantic web. By extension, other statistical and learning-based approaches to knowledge systems are also excluded. The reason given is that being guaranteed and provable is an absolute requirement: if your app is not guaranteed, it is not a semantic web app. This immediately limits the concept of the semantic web to what is computable by logic rather than what is usefully computable by any means.
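To make that intolerance concrete, here is a tiny toy sketch of the "explosion" TBL alludes to with P(x) & NOT(P(x)) -> Q: once a classical reasoner admits a contradiction, it can derive any conclusion at all. This is a minimal propositional illustration in Python, not any real Semantic Web reasoner.

```python
# A toy illustration of classical "explosion": from P and NOT P, any unrelated
# Q follows. A minimal propositional sketch, not a real SW reasoner.

def follows(premises, q):
    """Return True if q is derivable from the premises using two classical
    rules: disjunction introduction and disjunctive syllogism."""
    facts = set(premises)
    if "P" in facts:
        facts.add(("P", "or", q))      # from P, infer (P or q) for ANY q
    if ("P", "or", q) in facts and "not P" in facts:
        facts.add(q)                   # from (P or q) and NOT P, infer q
    return q in facts

print(follows({"P", "not P"}, "the moon is made of cheese"))  # True
print(follows({"P"}, "the moon is made of cheese"))           # False
```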
Sure, banking applications do need to be guaranteed, so they should use rules that only operate on provable, trusted statements. But there are loads of applications of semantics where usefulness rather than guarantees is the goal.
I do not see why it need be one or the other; you just have stricter requirements for proof in a banking app than in a fuzzy app. See Semantic Superpositions for thoughts on a semantic web that includes fuzziness.

Considering FOPC approaches have been largely discredited in the field of AI and replaced by fuzziness, this would seem a risky limitation to impose.

2) Any real SW system will work not by believing anything it reads on the web but by checking the source of any information. (I wish people would learn to do this on the Web as it is!). So in fact, a rule will allow a system to infer things only from statements of a particular form signed by particular keys. Within such a system, an inconsistency is a serious problem, not something to be worked around.
The necessary consequence of 1) is, as TBL states here, that in any SW system an inconsistency is a serious problem. Because of the guaranteed requirement, it isn't even enough that the data is accidentally consistent; it must be logically consistent, i.e. you should only encounter an inconsistency if there is a programming fault or corruption; standard user action should not be a factor. That is, the statements a SW app is using must be guaranteed consistent.

This means semantic web applications are quite fragile: the larger the scale, the harder it is to maintain consistency in practice. Statistical approaches work the opposite way: the larger the scale, the better they work.

Any SW application therefore requires there to be only one version of the truth, i.e. it can only work with consistent statements. However, there are many things we wish to describe where there is no one version of the truth.
Here is the rub: this is a result only of the requirement to be logically guaranteed. There are many computational approaches that can operate on inconsistent statements: fuzzy systems, statistical approaches, neural networks. These can mine huge value out of those statements. None of that is possible with Semantic Web applications (as defined above); all those rich patterns must be collapsed into a single consistent version of the truth before the application can operate on them. The Google approach to spell checking is a great example of using such statistical approaches rather than logic to programme the spell checker.


The requirement for consistency is very tough in practice because humans are in the loop of data. Here we run straight into the fact that RDF is designed to allow multiple agencies to make statements about the same thing. Even if two agencies are using the same URI and the same definition of a particular property, when users come to enter data and have to make classification decisions based on that URI description, they will not classify the same thing in exactly the same way. The URI is not an authority; it cannot guarantee consistency between agencies. For example, you cannot show two copies of Harry Potter to the Editions URI and ask it whether they are different editions or the same. People make that call according to their own interpretation of the description of the concept.
Reversing that around: if you receive two statements about the number of editions that exist for a Harry Potter book, and one states 1 edition and the other states 2 editions, the only way to arbitrate between them is to get the actual books out and examine them against your own interpretation of the URI definition.
What I have described above is the fact that single authorities only make sense for certain classes of problem, i.e. where there is only one version of the truth. They make perfect sense for bank accounts, but in the library domain each library has an equal right to make statements about a book whilst cataloguing it, so there is no concept of one authority. Similarly, who is the authority that decides whether a photo is of a smiling face or a sad face?
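Here is a hedged sketch of that situation, using the Python rdflib library and invented URIs and property names: two agencies make perfectly well-formed statements about the same work, the merged graph simply disagrees with itself, and nothing in the data says which statement to believe.

```python
# A sketch of two agencies asserting different edition counts for the same
# work. URIs and property names are invented for illustration (rdflib library).
from rdflib import Graph, Literal, URIRef

work = URIRef("http://example.org/works/harry-potter")
editions = URIRef("http://example.org/terms/numberOfEditions")

g = Graph()
g.add((work, editions, Literal(1)))   # library A: covers differ, content the same
g.add((work, editions, Literal(2)))   # library B: different covers, different editions

counts = {lit.toPython() for lit in g.objects(work, editions)}
if len(counts) > 1:
    # The graph itself cannot arbitrate; a guaranteed-consistent application
    # must pick a single authority (or a person must inspect the books).
    print("Inconsistent statements about editions:", sorted(counts))
```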

The result of all that is that to guarantee consistency, for a particular SW system, there can be only one authority for statements, or else inconsistency will arise from user actions. This allows any conflict to be resolved by asking the authority to decree. Note also that it is not enough that statements don't conflict with published statements from the authority (the authority may not have published all possible statements); statements must actually agree with statements made by the authority.

TBL also says

"

A semantic web is not an exact rerun of a previous failed experiment

Other concerns at this point are raised about the relationship to Knowledge representation systems: has this not been tried before with projects such as KIF and Cyc? The answer is yes, it has, more or less, and such systems have been developed a long way. They should feed the semantic Web with design experience and the Semantic Web may provide a source of data for reasoning engines developed in similar projects.

Many KR systems had a problem merging or interrelating two separate knowledge bases, as the model was that any concept had one and only one place in a tree of knowledge. They therefore did not scale, or pass the test of independent invention. [see evolvability]. The RDF world, by contrast is designed for this in mind, and the retrospective documentation of relationships between originally independent concepts."

3) They therefore did not scale, or pass the test of independent invention
For any SW app to have guaranteed consistency, independent invention is not possible, because you would need to force all statements from two separate agencies to be the same, and that means they are not independent at all, i.e. one agency is not free to act independently of another because that would cause inconsistency.
It then rather seems that, for all intents and purposes, independent descriptions are excluded from any particular SW app by the requirement to achieve consistency. Exactly how, then, does a semantic web app differ from those failed experiments?

 

To sum up, I can't understand why the semantic web (at least as described by TBL) should exclude any approach based on fuzziness, statistics and inconsistency. The requirement of consistency, when taking statements from different systems, cannot be met, because humans cannot be made to all agree on classification statements (whatever training or manuals you give them) and therefore will make inconsistent statements through their use of the computer systems. Whilst RDF is free to describe all the variety in the world, the Semantic Web application can only make use of the tiniest portion of it.

From some of the comments I have received, clearly some people agree with the TBL vision and others don't.
In the end I guess it doesn't really matter. People will use RDF to do cool things and call them semantic apps even if they don't accord with TBL's FOPC requirement for proof. I do think it is at the basis of a lot of scepticism from outside the Semantic Web community, though, given the spectacular failure of FOPC to scale in previous attempts by the AI and KR communities. It might be an idea to present this stuff really clearly, to either face up to this criticism or prove it false.

I personally have had enough of this topic now and am going to think about other things for a while :-)

Thanks to all those who have contributed to the discussion. I'm sure there are lots of people out there who will disagree with things I have said above.  Just goes to show how hard it is to get people to share the same concept of things, the world is fuzzy after all.

Posted on Wednesday, August 10, 2005 at 06:27AM by Justin Leavesley

Schrödinger's Web

Looking back at my post Perfect or Sloppy - RDF, Shirky and Wittgenstein and Danny's detailed response, Wittgenstein's Laptop (sorry you lost the original post, Danny), a couple of things are clear. I didn't do a good job of explaining what I think the issue is, and it was a bit them-and-us (not the intention).

Ian also made a good point. I should clarify that the issues that were bugging me are not with RDF itself but with the layers further up the Semantic Web stack, specifically the logic and proof layers built on top of RDF.

I would like to describe how I understand the proposed Semantic Web stack and ask the community how certain questions have been covered off. It may be that I misunderstand the vision, or that the questions I have have already been answered.

As I understand it, the RDF and Ontology layers allow graphs of statements to be made and linked together. Multiple descriptions of a concept can be made, and RDF allows inconsistency. The query level allows portions of graphs to be selected or joined together, and the logic level allows new knowledge to be inferred from the statements and questions to be answered using the mass of RDF statements. I understand this logic to be first-order predicate calculus (FOPC)?
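To check my understanding of what the query and logic layers are meant to do, here is a rough sketch in plain Python over RDF-like triples (invented URIs, a single RDFS-style rule, nothing like a real reasoner): select fragments of the graph and derive new statements from them.

```python
# A toy sketch of the inference step I understand the logic layer to provide:
# given RDF-like triples and a subclass rule, derive new type statements.
# URIs are invented; a real reasoner would implement far more than this.

triples = {
    ("ex:HarryPotter", "rdf:type", "ex:Novel"),
    ("ex:Novel", "rdfs:subClassOf", "ex:Book"),
}

def infer_types(facts):
    """Apply one RDFS-style rule to a fixed point:
    if (x rdf:type C) and (C rdfs:subClassOf D), add (x rdf:type D)."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            for (c2, p2, d) in list(inferred):
                if p1 == "rdf:type" and p2 == "rdfs:subClassOf" and c == c2:
                    new = (x, "rdf:type", d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

print(("ex:HarryPotter", "rdf:type", "ex:Book") in infer_types(triples))  # True
```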

My concern is that the logic layer is very intolerant of inconsistency or error. From what I have been able to find, it seems the proposed answer to this is to limit the scope of the logic to trusted, consistent statements, or else user arbitration of conflicts will be required. This is the root of my concern: I cannot see how this is possible. Inconsistency is not just generated at the system logic or schema level; it is deeper. It is the necessary result of allowing multiple descriptions of the same thing.

Inconsistency will always arise whenever humans have to make classification choices. This was one of the points in my previous post.

Danny was quite right to point out that most software today requires consistency. We all know the lengths programmers go to to ensure consistency, and this is because programmatic methods are based on predicate logic. If a program enters an inconsistent state, usually that thread of execution must end. If the inconsistency is in persistent data you are in real trouble, because restarting won't fix the problem.

Compilers enforce the consistency of the code but the data in the system must also be consistent if programmatic LOGIC is to be based on it.

Two principal methods are used to achieve this:

1. Limit to one description of an entity, i.e. no competing descriptions, e.g. one record per entity ID.

2. Fields marked as non-programmatic, e.g. text descriptions. The contents of these fields will not be used by the program logic; they are for human use only.

With this approach any uncertainty in programmatic fields cannot generate inconsistency, principally because there is only one version of the truth, i.e. statements are orthogonal. A minimal sketch of this is below.
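A minimal sketch of those two methods, with invented record names and fields; this is the conventional approach described above, not a semantic web technique.

```python
# A tiny sketch of how conventional systems keep data consistent:
# one record per entity ID, plus free-text fields the logic never reads.
# Names and fields are invented for illustration.

records = {}  # one, and only one, description per entity ID

def upsert(entity_id, edition_count, notes):
    # Method 1: a new statement about an entity replaces the old one,
    # so competing descriptions can never coexist.
    records[entity_id] = {
        "edition_count": edition_count,  # programmatic: logic may use this
        "notes": notes,                  # method 2: human-readable only
    }

upsert("harry-potter", 1, "Adult and child covers; contents identical.")
upsert("harry-potter", 2, "Counting covers as separate editions.")

# Whatever the humans disagreed about, the program only ever sees one value.
print(records["harry-potter"]["edition_count"])  # -> 2 (last writer wins)
```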

Now contrast this with the semantic web, where by definition you will be working with descriptions from many systems. Inconsistency will be a natural feature, not an error condition.

Note the fundamental nature of the inconsistency: it is not a property of the different systems; two identical systems will still yield inconsistency, because it is a function of how people use a system, not of the system itself.

I confused the previous example by suggesting two different systems with slightly different schemas.

This time consider two identical library systems, both of which have a schema with the concept of editions of a work, defined by the same RDF URI. In one system the librarian considers there to be a new edition whenever there is any difference, such as the two different covers for the same Harry Potter, and catalogues accordingly. The librarian using the other system thinks it is a different edition only if the contents are different.

Now, taking descriptions from both systems, you will get an inconsistency: does the work Harry Potter have 1 or 2 editions?

This is not something you can fix by giving a different URI to the editions concept for each system, because the inconsistency is the result of the classification decision made by that person for that record in that system at that time, i.e. it is not systematic. The result is that inconsistency will arise in an unpredictable way even between identical software systems with identical schemas. (It is one reason why integration of different systems remains a pain even if you use RDF.)

This inconsistency isn't a problem in its own right. But if layers of predicate logic are working off this data, then things will become unstable very quickly.

My current understanding is that the SW community is suggesting either that inconsistency is avoided (how, given that it is a fundamental result of allowing multiple descriptions of the same thing?) or that the system should ask a user at that point to arbitrate (on what basis should they choose one over the other? Both are right).

It strikes me that if inconsistency is fundamental then it should be treated as such, not something to be avoided.

Isn't the SW approach today, based on predicate logic, simply using the wrong maths? Just as the AI community was before it embraced fuzziness, uncertainty and statistics? Or the classical physics community before quantum mechanics?

That transition saw AI moving from "programming" AI systems with rules and logic to creating learning systems that needed training.
It seems to me that the internet has 1 billion users capable of training it. We see examples of this in things like Google spell checking, which, rather than relying on a traditional dictionary, is based upon what people type and then retype when they get no results. When a spelling suggestion is given, the user choosing it provides further feedback, or training, as to what is useful spelling help and what is not. This turns out to work much better than the programmed approach. Other examples that spring to mind are del.icio.us and Flickr.

Realising that a work has both 1 and 2 editions at the same time seems to me to be exactly the position classical physics found itself in at the birth of quantum physics. The maths of classical physics could not cope with particles being at several locations at the same time. Neither could the classical physicist!

A new maths was required, one based upon uncertainty and probability. This maths is very well understood and forms the basis of solid-state physics, upon which electronic engineering is based, upon which of course the computer is based!

So I guess my question is this: is the logic layer intended to be FOPC, and if so, why? Who is ensuring that the SW community isn't falling into the same traps the AI community did? What can be learned from the AI community?
What is the problem with using probability-based maths? It works for physics!

Maybe all of these have good answers. If so, I wasn't able to find them easily. Or is my understanding of the logic layers wrong?

Please let me know.

 

 

Posted on Sunday, August 7, 2005 at 10:53AM by Justin Leavesley

Perfect or sloppy - RDF, Shirky and Wittgenstein 

The first thing to say here is that this is not an attack on RDF. I do think RDF is great and very useful.
But when I read various blogs from the semantic web community using a trivial argument to debunk Clay Shirky's essay I have to come to his defence.

Clay and Adam Bosworth are smart people; don't you think they understand that you can have multiple different RDF descriptions of the same concept? Of course they do. The point is that this STILL creates a single ontology (RDF itself, if you like) because RDF is based on the identity of concepts, not the comparability of concepts.
This point is profound and subtle. The same and similar are worlds apart. No two things in our world are the same.

It essentially hinges on this: do you believe two people have ever, in the history of humanity, shared the same (i.e. identical) concept? Do you believe that concepts exist as perfect entities that we share, or in fact do we say a concept is shared when we see a number of people using words in a similar enough way? I.e. is the world fuzzy, sloppy and uncertain, or is it perfect? Are concepts a priori or derived?

So I do not think the Semantic Web community is hearing what Wittgenstein and Shirky are saying. There is a subtle yet very profound error in the arguments for RDF and the semantic web.

The artificial intelligence community fell into exactly the same hole: many AI efforts were built on the premise that they just needed to collect enough assertions into one system and they would then be able to use propositional logic to infer answers to questions. The results were poor unless the system was kept trivial.
This seems to be exactly what the semantic web community is trying to recreate: the web contains the assertions in RDF, we pull them together into a central system (exactly as the AI guys did) and Bob's your uncle.
The reason it didn't work is the same reason that RDF doesn't exactly equate to the Wittgenstein view, or to the islands-of-meaning analogy.
The AI community tried propositional logic and it failed them. They discovered the need to develop means of dealing with uncertainty, incompleteness and fuzziness, because that is how our world is and how we describe it with propositions. Fuzzy logic and neural networks rule modern expert systems, not discrete propositional logic.

Even Wittgenstein himself had to go through a similar trial of propositional logic in his great work Tractatus Logico-Philosophicus, after completing which he realised the limitations of that approach, describing it thus: "the propositions of the Tractatus are meaningless, not profound insights, ethical or otherwise". He then went on to develop his famous works on the role of language and meaning.

This is the essential error that Wittgenstein points out in his later work. There is no single shared meaning that we all describe in our different ways. To believe so is to believe that a meaning exists a priori and that language is just our means of describing it. Instead Wittgenstein turns it on its head and says that meaning is nothing more than the way a word is actually used by people. No two people, let alone two groups, ever use a word in exactly the same way. The world is continuous, yet we break it up into discrete concepts; however, the exact boundary between these concepts is fuzzy and vague. Each person's concept is slightly wider or narrower than somebody else's. I might say "that is sleet" whereas someone else might say "that is snow"; where is the boundary between sleet and snow, or chair and stool?
The truth is that no two people's concepts of anything are identical... but they are comparable. The fact that concepts are comparable but never identical is why fuzzy logic, uncertainty and incompleteness needed to be the cornerstone of the AI approaches, not propositional logic. This is what Clay is talking about.
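As a toy illustration of "comparable but never identical", here is a fuzzy-membership sketch with invented curves; it is not a model of anyone's real concepts, just a way of showing that two people's notions of "snow" can overlap in degree without coinciding, which a hard true/false classification cannot express.

```python
# A toy fuzzy-membership sketch of the sleet/snow boundary. The curves are
# invented; the point is that two people's concepts can overlap (be comparable)
# without being identical, which hard true/false classification cannot express.

def membership(temp_c, cutoff, width=2.0):
    """Degree (0..1) to which precipitation at temp_c counts as 'snow'
    for someone whose personal cutoff is `cutoff` degrees C."""
    x = (cutoff - temp_c) / width
    return max(0.0, min(1.0, 0.5 + x / 2))

def alice_snow(t):
    return membership(t, cutoff=1.0)   # Alice calls it snow below about 1 C

def bob_snow(t):
    return membership(t, cutoff=0.0)   # Bob draws the line lower

for t in (-2.0, 0.5, 2.0):
    print(t, round(alice_snow(t), 2), round(bob_snow(t), 2))
# Near the boundary Alice and Bob disagree in degree, not in a way that
# classical logic or a shared URI can arbitrate.
```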

You say of RDF
"It allows you to describe something and then relate it to another person's description of the same thing that was made using _different terms_"

But this is exactly the error: RDF requires these two descriptions to be about an identical concept if you are to relate the two descriptions.
RDF is fundamentally built upon the premise that two different groups or individuals can describe an identical, not similar or comparable but identical, concept; it doesn’t allow for fuzziness.

Here is an example from a very well-defined domain. Two different RDF descriptions of Harry Potter and the Prince of Darkness exist. Both include many concepts like publication date (is that the date it is first printed, or warehoused, or in the shop, or when the ISBN is registered??) and they share the same concept called Editions, with the same URI. They have several other differences, but at least they share the concept of Editions. Problem is, when exactly is a book a new edition? There are two different covers for this work, adult and child, but the content is the same. Some librarians call these two different editions; others say it is one edition because the content is identical. So you have a contradiction between these two, because in reality the concept Edition, like all concepts, is fuzzy.

The natural reaction to this fuzziness by the RDF community is to create ever more fine-grained descriptions, separating editions with just cosmetic changes from those with content changes, and so on. But this just makes the problem worse. The more accurate you try to make the description, the more erroneous it becomes.
If you examine the details of exactly what any person means by a concept you will find they are all different: exactly when does a stool become a chair?
I have seen some truly amazing ontologies with such fine-grained concepts that I certainly couldn't say what the differences were meant to be.

So I say to the semantic web community: "Don't you think the problem is more fundamental than a shared data model? Is it not the fact that our world is fuzzy, the way we transcribe it into computers is fuzzy, and computers are not fuzzy and don't deal well with similarity?"

Does this mean that RDF and OWL are useless? Of course not. They will solve many problems, but as Tim Berners-Lee admits himself (http://www.w3.org/DesignIssues/RDFnot.html, search for FOPC), only over trusted, consistent data. Which is Clay's point: there is not much of that!

Posted on Tuesday, August 2, 2005 at 11:12PM by Justin Leavesley