August 07, 2003
AI Expert Newsletter - August 2003

Dennis Merritt
AI Expert Newsletter is all about artificial intelligence in practice. Features include case studies, technology tutorials, product reviews and AI news, plus classic articles from the original AI Expert magazine! Keep up with the latest in logic programming, expert systems, neural networks, genetic algorithms, and fuzzy logic.
Dr. Dobb's AI Expert Newsletter - 8/07/03

AI - The art and science of making computers do interesting things that are not in their nature.


As always, feedback is welcome. dennis@ddj.com


Semantic Web

Clearly an interesting thing to do with the Internet is to create robots that can search out answers to questions. Suppose you wanted to find out who was the editor of the Dr. Dobb's AI Expert Newsletter. Any human could answer that question in a minute or less by finding the DDJ Web site, clicking on newsletters and scanning down for the AI Expert description.

How would we write a program that could answer the same question? Well, it wouldn't be easy. It would require natural language understanding software to scan the document for words that suggest it had found the editor, assuming the program was able to figure out which page to look at in the first place.

It is a difficult program to write because the Web is designed for human use, not machine use.

As with many programming tasks, the problem can be made much simpler with a better choice of data structures. If, in addition to free-form text, a Web site had formal specifications of the content, then writing a program to answer the question becomes almost trivial.

For example, if there were some XML like this at the DDJ Web site:

<site name="ddj">
  ...
  <publication type="newsletter">
    <name>AI Expert</name>
    <description>blah blah blah</description>
    <editor>Dennis Merritt</editor>
  </publication>
  ...
</site>

then it would be easy to write a program to answer the question. Of course, it would help if all Web sites used similar XML so our program could search more than just the DDJ site.
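
To give a sense of how small that program could be, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The XML is the made-up markup from above with the elided "..." parts left out and the attribute values quoted so that it parses; the function name find_editor is a hypothetical choice for this illustration.

```python
import xml.etree.ElementTree as ET

# Made-up site description from the example above (attribute values quoted
# so that it is well-formed XML; the elided "..." parts are left out).
SITE_XML = """
<site name="ddj">
  <publication type="newsletter">
    <name>AI Expert</name>
    <description>blah blah blah</description>
    <editor>Dennis Merritt</editor>
  </publication>
</site>
"""


def find_editor(site_xml, publication_name):
    """Return the editor of the named publication, or None if not found."""
    root = ET.fromstring(site_xml)
    for publication in root.iter("publication"):
        if publication.findtext("name") == publication_name:
            return publication.findtext("editor")
    return None


if __name__ == "__main__":
    print(find_editor(SITE_XML, "AI Expert"))
```

No language understanding is needed: the program only follows the tag structure, which is the whole point of publishing a formal description alongside the free-form text.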

This is exactly what the Semantic Web initiative of the W3C is working on.

Knowledge Representation and Reasoning Engines

An AI application typically has two components: knowledge representation and reasoning engine. The knowledge representation is the semantics used to describe the knowledge in the particular application domain. The reasoning engine then uses that knowledge for the desired result.

The more expressive the knowledge representation, the simpler the reasoning engine can be.

The Semantic Web is an attempt to standardize a flexible, extensible knowledge representation for the Web. Once such a standard is in place, a whole new world of applications for the Web will be possible.

The Semantic Web is built on layered technologies.

XML - eXtensible Markup Language

XML is the base technology. It is a more general-purpose relative of HTML: instead of a fixed set of tags, XML lets you define tags that specify the structure and components of various types of documents. The example used to start this discussion is some made-up XML, with tags of my own creation, that might describe the content on a Web site.

RDF - Resource Description Framework

The earliest AI researchers found that object-attribute-value triples were a very versatile way to represent knowledge. For example:

car:color:blue
car:doors:4
This, in a nutshell, is what RDF is. Except they don't call them object-attribute-value triples, but rather subject-predicate-object triples. So in the example, car is the subject, color is the predicate and blue is the object.
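
As a rough sketch of why triples are so versatile, the car facts can be held as plain Python tuples and queried by leaving any position open. The function name match and its wildcard convention (None means "anything") are assumptions made for this illustration, not part of RDF.

```python
# Each fact is a (subject, predicate, object) tuple, as in the car example.
TRIPLES = [
    ("car", "color", "blue"),
    ("car", "doors", 4),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


if __name__ == "__main__":
    # "What color is the car?"
    print(match(TRIPLES, subject="car", predicate="color"))
    # "What do we know about the car?"
    print(match(TRIPLES, subject="car"))
```

The same three-column shape answers "what color is the car?" and "what has four doors?" without any change to the data.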

RDF is very powerful, which means it's not quite as easy to read as the simple example that started this section. The newsletter description in RDF might look like:

<rdf:Description rdf:ID="AIXNewsletter">
   <example:title>AI Expert</example:title>
   <example:description>blah blah blah</example:description>
   <example:editor>Dennis Merritt</example:editor>
</rdf:Description>

The rdf:ID identifies the subject. In this case it would be an anchor on the DDJ Web site named "AIXNewsletter" (written without a space, because an rdf:ID has to be a valid XML name). There are three subject-predicate-object triples associated with that subject. The predicates are title, description, and editor, and the object of each triple is the value enclosed in the corresponding tag.

What does the "example:" part of the syntax refer to? It is a namespace prefix that points to a shared set of predicate definitions, and sharing those definitions is what adds universality to RDF.
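
The following Python sketch shows one way the newsletter description could be turned back into triples, and what the prefix expands to. It makes several assumptions: the fragment is wrapped in an rdf:RDF element that declares both prefixes, the URI bound to "example:" is a placeholder invented for this illustration, the ID is written without a space so that it is a valid XML name, and only simple text-valued properties are handled. It uses the standard library's ElementTree rather than a real RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The newsletter description, wrapped so that both prefixes are declared.
# The "example" namespace URI is a hypothetical placeholder.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:example="http://www.example.org/terms/">
  <rdf:Description rdf:ID="AIXNewsletter">
    <example:title>AI Expert</example:title>
    <example:description>blah blah blah</example:description>
    <example:editor>Dennis Merritt</example:editor>
  </rdf:Description>
</rdf:RDF>
"""


def description_triples(rdf_xml):
    """Yield (subject, predicate, object) for each text-valued property."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # An rdf:ID names a fragment of the enclosing document; it is kept
        # here as a relative "#fragment" reference.
        subject = "#" + desc.get(f"{{{RDF_NS}}}ID")
        for prop in desc:
            # ElementTree spells a qualified tag as "{namespace-uri}local",
            # so dropping the braces gives the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            yield (subject, predicate, prop.text)


if __name__ == "__main__":
    for triple in description_triples(RDF_XML):
        print(triple)
```

Each predicate comes out as a full URI (the namespace plus the tag name), which is what lets two unrelated sites mean the same thing by "editor".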

RDF Schemas

While the predicates in RDF can be whimsical creations of your own design, that renders them relatively useless to anyone but yourself.

RDF provides a means for organizations to create libraries of predicate definitions that anyone with suitable information to catalog can then reuse. These definitions are often called "metadata."

The Dublin Core is one such set of definitions that is similar to the properties (predicates) used in library card catalogs. We might have used them for the newsletter RDF, in which case we would use "dc:" instead of "example:". We would also have provided some additional RDF syntax that indicated we were using the Dublin Core schema and a link to it.
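
A small Python sketch of the payoff: when two unrelated sites describe their documents with the same Dublin Core predicates, a single query works across both. The site URLs and the second document are invented for this illustration. Dublin Core's basic element set has no "editor" term, so dc:creator stands in for it here as an assumption.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# Hypothetical triples harvested from two unrelated sites. Both use
# Dublin Core predicates, so they can simply be pooled.
SITE_A = [
    ("http://site-a.example/newsletter", DC + "title", "AI Expert"),
    ("http://site-a.example/newsletter", DC + "creator", "Dennis Merritt"),
]
SITE_B = [
    ("http://site-b.example/report", DC + "title", "Annual Report"),
    ("http://site-b.example/report", DC + "creator", "A. N. Author"),
]


def creators_by_title(triples):
    """Map each resource's dc:title to its dc:creator."""
    titles = {s: o for (s, p, o) in triples if p == DC + "title"}
    creators = {s: o for (s, p, o) in triples if p == DC + "creator"}
    return {titles[s]: creators[s] for s in titles if s in creators}


if __name__ == "__main__":
    print(creators_by_title(SITE_A + SITE_B))
```

Had each site invented its own predicate names, the query would need a separate case for every site.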

But schemas and RDF only go so far.

Web Ontology Language (OWL)

It is often necessary to describe the relationships between different predicates, as well as the behavior of a given predicate. (See the June issue for more on ontologies.) Documenting these relationships further extends the power of reasoning software that will use the Semantic Web.

For example, a manufacturing RDF Schema might include the predicate isPartOf. We couldn't make full use of that predicate unless we knew that if X isPartOf Y and Y isPartOf Z then X isPartOf Z. In other words, isPartOf is transitive.

OWL provides the means for adding these higher level semantic descriptions of relationships. Armed with this knowledge, an application could then answer bill of material type questions for our manufacturing site.
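
To make the bill of materials idea concrete, here is a minimal Python sketch of what a reasoner can do once it knows isPartOf is transitive. In OWL that knowledge would be stated by declaring the property an owl:TransitiveProperty; the sketch hard-codes it instead. The part names and the function all_parts_of are invented for this illustration.

```python
# Hypothetical isPartOf facts, each written as (part, whole).
IS_PART_OF = [
    ("bolt", "wheel"),
    ("wheel", "axle assembly"),
    ("axle assembly", "car"),
    ("seat", "car"),
]


def all_parts_of(whole, facts):
    """Return every X such that X isPartOf whole, directly or transitively."""
    found = set()
    frontier = [whole]
    while frontier:
        current = frontier.pop()
        for part, container in facts:
            # Follow each isPartOf link once; the "found" check also keeps
            # the loop from running forever if the data contains a cycle.
            if container == current and part not in found:
                found.add(part)
                frontier.append(part)
    return found


if __name__ == "__main__":
    print(sorted(all_parts_of("car", IS_PART_OF)))
```

Only three of the facts mention "car" or lead to it in one step, yet the bolt is reported as part of the car because the chain of isPartOf links is followed to the end.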

RDF Tools

Typing RDF/OWL is tedious business, so a number of tools are being developed to make the creation and editing of RDF/OWL documents easier. See the links for details.

Foundations in Logic

The concepts of RDF and OWL come directly from logic. One can see in the relations/predicates the same roots that led to relational databases and to logic programming languages.

The mapping serves RDF well: it has the potential to be the glue between the data published on Web sites, the relational databases stored at those sites, and the logic programming languages used to create intelligent Web robots.
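
The relational side of that mapping can be sketched in a few lines of Python with the standard sqlite3 module: triples become rows of a three-column table, and a question becomes a self-join. The table layout and the function titles_edited_by are assumptions made for this illustration. In a logic programming language the same question would be a conjunction of two goals that share the subject variable.

```python
import sqlite3

# An in-memory database holding triples as rows of one three-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("AIXNewsletter", "title", "AI Expert"),
        ("AIXNewsletter", "editor", "Dennis Merritt"),
    ],
)


def titles_edited_by(connection, editor):
    """Join the table to itself: one row gives the editor, one the title."""
    rows = connection.execute(
        """
        SELECT t.object
        FROM triples AS e
        JOIN triples AS t ON t.subject = e.subject
        WHERE e.predicate = 'editor' AND e.object = ?
          AND t.predicate = 'title'
        """,
        (editor,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(titles_edited_by(conn, "Dennis Merritt"))
```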

RDF in Action

These examples come from the RDF primer on the W3C site.

Dublin Core Initiative - Definitions of terms about documents, such as author, publisher, etc. This is a replication of the categories used in a library card catalog for deployment on the Web. Documents using the Dublin Core metadata can be searched automatically just as a human would use a card catalog in a library.

PRISM: Publishing Requirements for Industry Standard Metadata - Metadata that builds on the Dublin Core and is defined by the publishing industry to serve their needs. For example it has terms to define the rights associated with a publication that can then be used to automatically search for the rights associated with a given published item. Magazines are using PRISM to document an article as soon as it is published.

RSS: RDF Site Summary - Metadata used to describe news for a news feed. It allows the definition of a site as a "channel" and the latest news items from that channel. Each item has properties like title, description, link and date. A news service can then go to various channels, pick up the latest news items and then redisplay them or use them to answer search queries from their users. This is probably the most widely used RDF application on the Web.

CIM/XML - The Common Information Model (CIM) specifies semantics for power system resources. CIM/XML uses RDF Schema and RDF to describe those semantics and has been adopted as the standard for communication of technical information between power transmission system operators.

Gene Ontology Consortium - Created metadata for describing gene products to aid in the distribution and exchange of medical information.

Composite Capabilities/Preferences Profile (CC/PP) - Metadata for the description of components and attributes that can be used dynamically to allow the restructuring of HTML data for a particular device or browser.
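
Of the applications above, RSS is the easiest to try out. The following Python sketch shows how a news service might pick the items out of one channel. The feed is hand-written and abbreviated, with made-up URLs, and it follows the RDF-based RSS 1.0 layout, in which items sit next to the channel inside rdf:RDF; a real feed carries more, such as the channel's list of items and dates. The function name latest_items is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# A hand-written, abbreviated feed with hypothetical URLs.
FEED = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://news.example/feed">
    <title>Example News</title>
    <link>http://news.example/</link>
    <description>Hypothetical channel</description>
  </channel>
  <item rdf:about="http://news.example/story1">
    <title>First story</title>
    <link>http://news.example/story1</link>
    <description>Summary of the first story</description>
  </item>
</rdf:RDF>
"""


def latest_items(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            item.findtext("rss:title", namespaces=NS),
            item.findtext("rss:link", namespaces=NS),
        )
        for item in root.findall("rss:item", NS)
    ]


if __name__ == "__main__":
    for title, link in latest_items(FEED):
        print(title, link)
```

Running the same function over several channels and pooling the results is, in miniature, what a news aggregator does.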

Conferences

The 5th IFAC/CIGR Workshop on Artificial Intelligence in Agriculture will be held in Cairo on March 8-10, 2004. The deadline for submitting extended abstracts is Sept. 30, 2003. More information can be found at www.claes.sci.eg/aia04

Links

http://www.w3c.org/2001/sw/ - The W3C page describing work on the Semantic Web.

http://www.scientificamerican.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21&catID=2 - An excellent Scientific American article by Tim Berners-Lee, James Hendler and Ora Lassila, describing the Semantic Web

http://www.xml.com/pub/a/2001/01/24/rdf.html - Tim Bray's overview of RDF.

http://www.w3.org/TR/rdf-primer/ - A more technical primer for RDF that provides a good introduction to the syntax and meaning of RDF statements and their expression in XML.

http://owl.mindswap.org/ - The first Semantic Web site?

http://www.cs.umd.edu/projects/plus/SHOE/index.html - Simple HTML Ontology Extensions (SHOE) is a precursor to RDF and OWL, and is easier to understand. The examples in the SHOE tutorial on this Web site make it clear how the Semantic Web will work.

http://www.w3.org/TR/owl-features/ - An overview of OWL, an ontology language built on top of RDF.

http://www.w3c.org/RDF/#developers - Resources for developers, listing a number of tools for working with RDF.

http://www.swi-prolog.org/packages/semweb.html - Prolog is a natural fit for working with RDF and OWL, and the developers of SWI-Prolog have created a toolkit for using RDF and OWL, as well as tools for creating and editing RDF and OWL documents. These are part of SWI's Semantic Web Library.

Until next month,

Dennis Merritt
dennis@ddj.com

