Thanks for the great write-up.
I’ve often used the term folksonomy when introducing people to semantics. Tags are simple to understand and, more importantly, simple to use. My hope here is that once people start to use folksonomies to categorise content, the leap to using a vocabulary to “tag” more granular ideas in that content will be easier.
It’s true that for data integration, mashups don’t require semantics. This is because integration is performed manually, on concepts that are easily mapped. The most difficult part is the volume of mappings and conversions to perform, not their complexity or verification. The small number of more complex definitions are simply done manually as well, albeit in greater detail.
The pervasiveness of semantics is completely overlooked, so much so that they are seen as a black box, without any real thought given to the underlying libraries. A developer’s or a project manager’s understanding needs to stretch only as far as a simple API call, and no further.
Controlled vocabularies are being used for specific reasons, and again API consumers use only the bare minimum of what they require to get the job done. Frédérick Giasson’s recent post on ontologies explains the different types of ontologies and how they tie into the generally accepted understanding. Unfortunately, this understanding usually stops at a folksonomy, a closed-world vocabulary, or Linked Data. Anything above arbitrary semantics is not required for any of these structures.
Your point about the real contribution of Linked Data is spot on. RDF is sufficient for the majority of what people want to use it for. And really, when used without any semantics, RDF simply becomes a microformat (RDFa). I’ve discussed the limitations of this elsewhere. RDF can easily represent the relations between relational tables and columns, but it would not suffice to represent the business logic. For that, more expressive languages like OWL are required.
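To make the table-and-column point concrete, here is a minimal sketch in plain Python of how one relational row might be flattened into subject-predicate-object triples. The `http://example.org/` namespace, the `users` table and the `row_to_triples` helper are all hypothetical, made up for illustration; this is not how any particular RDF mapping tool works.

```python
# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"


def row_to_triples(table, key_column, row):
    """Turn one relational row into (subject, predicate, object) triples.

    The subject is built from the table name and the row's key,
    each remaining column becomes a predicate, and the cell is the object.
    """
    subject = f"{BASE}{table}/{row[key_column]}"
    return [
        (subject, f"{BASE}{table}#{column}", value)
        for column, value in row.items()
        if column != key_column
    ]


# A made-up row from a made-up "users" table.
row = {"id": 7, "name": "Ada", "city": "Toronto"}
for triple in row_to_triples("users", "id", row):
    print(triple)
```

The triples capture which value sits in which column of which row, and nothing more. A rule such as "a user must live in exactly one city" has no place in this structure, which is where a more expressive language comes in.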
Your point about closed-world vs. open-world is important as well. Because most systems are developed with closed-world assumptions (read: defaults), open-world semantics are not required. This is one of the key places where we are in fact asking the wrong questions and looking in the wrong places. Systems need better semantic technologies only when they step outside their application’s domain to perform mashups. Companies don’t often need to do this except in incremental steps. The steps are so incremental, in fact, that there is no need to invest resources in making the process smoother. Manual integration and verification are sufficient.
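The difference between the two assumptions can be sketched in a few lines of Python. The facts and the two helper functions below are invented for illustration, and the open-world version is deliberately simplified: a real open-world reasoner can also conclude that something is false from explicit negative statements, which this sketch does not model.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts, invented for this illustration.
FACTS = {
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
}


def holds_closed_world(triple: Triple) -> bool:
    """Closed world: anything that is not stated is assumed to be false."""
    return triple in FACTS


def holds_open_world(triple: Triple) -> Optional[bool]:
    """Open world: anything that is not stated is unknown (None), not false."""
    return True if triple in FACTS else None


query = ("carol", "worksFor", "acme")
print(holds_closed_world(query))  # False
print(holds_open_world(query))    # None
```

Inside a single application the closed-world answer is usually the one we want. It is only when data from outside the application's domain is merged in that "not stated" and "false" stop being the same thing.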
The Ontology Driven Apps post was a great introduction to this methodology brought on by Nicola Guarino and Michael Uschold. Thanks for that as well!
The fields of Semantics and Ontologies have consumed my life for the past several years. Why?
They are enormous topics with applicability in every field, not just formal logic or pure academics. They are fundamental to the way we interpret and reason about the world around us, and the amount of information about the world we are expected to absorb is staggering. Our mobile devices, our activity on the internet, and the increased digitization of our records have all contributed to the exponential growth in data. Sorting out all this data is an increasingly difficult task. And it is not just the volume of data that is growing; the data also evolves between its initial capture state and the way we eventually use it.
Fundamentally, when analyzing data you must determine 3 things:
1) What is the nature of the data? Is it:
- user records
- patient records
- news articles
- individual tweets
- sensory data from weather stations?
All must be treated differently.
2) Are there any patterns in the data?
3) How does the data relate to itself and to other information sources?
- What does the data tell us and what can we learn from it?
- Can the correlation of patient symptoms tell us something about the causation of their condition?
- How accurate are weather predictions based on past records?
Question (1) deals with questions such as (in the weather domain) “What is snow?”, “What is temperature?”, and maybe even “What is low temperature?”. Humans have no problem answering these questions, but machines run into several issues, mainly because “Snow” is just a 4-character label. This label is not enough to encapsulate all that is “Snow”. The technology behind Linked Data begins to reveal this information. At the very least, it provides a point of reference, in the form of a URI, for an intended meaning of “Snow”. This particular DBpedia link provides metadata that is associated with the concept “Snow”. Now anyone in any language can reference Snow and mean the same thing.
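As a rough sketch of the idea, the Python below maps labels in several languages onto one shared URI. The URI is an assumption that follows DBpedia's usual `http://dbpedia.org/resource/...` naming pattern, and the `LABELS` table and `same_concept` helper are hypothetical; real Linked Data would attach the labels to the resource itself rather than keep them in a local dictionary.

```python
from typing import Tuple

# Assumed URI, following DBpedia's usual resource naming pattern.
SNOW = "http://dbpedia.org/resource/Snow"

# Hypothetical lookup table: (label, language code) -> concept URI.
LABELS = {
    ("snow", "en"): SNOW,
    ("neige", "fr"): SNOW,
    ("Schnee", "de"): SNOW,
    ("śnieg", "pl"): SNOW,
}


def same_concept(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    """Two labels mean the same thing if both resolve to the same URI."""
    uri_a = LABELS.get(a)
    uri_b = LABELS.get(b)
    return uri_a is not None and uri_a == uri_b


print(same_concept(("snow", "en"), ("neige", "fr")))  # True
print(same_concept(("snow", "en"), ("rain", "en")))   # False
```

The label is no longer what carries the meaning. The URI is, and the labels are just different ways of pointing at it.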
This series of posts is all about how semantics can help us deal with information.
The fields of Semantics and Ontologies have consumed my life for the past several years. Why? They are enormous topics with applicability in every field, not just formal logic or pure academics. They are fundamental to the way we interpret and reason about the world around us, and the amount of information we are absorbing about the world is staggering. Our mobile devices, our activity on the internet, and the increased digitization of our records are all contributing to the exponential growth in data. Being able to understand this data is becoming an increasingly difficult task. More on data later; for now I present what semantics are in a general sense.
The most general and abstract definition of semantics is “the study of meaning”. In this series I will focus on semantics as applied to Logic, and in particular on its application in Computer Science.
In this context, semantics deal with relations between 2 entities. Take the plus sign “+” in these two examples.
1) 4 + 2 = 6
2) hard work + luck = success
Clearly the semantics of “+” in (1) and (2) are very similar. In (1) we add two numbers to get a number. In (2) we add two nouns to get a noun. How we arrive at the result is different, however, and depends on the two elements being added and on what “being added” actually means.
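Programming languages make the same distinction when they let a type define what “+” means. The sketch below is only an illustration in Python: the `Concept` class and the `COMBINATIONS` table are hypothetical, and the rule that hard work plus luck gives success is simply hard-coded rather than derived from anything.

```python
from dataclasses import dataclass

# Hypothetical domain knowledge: what combining two concepts yields.
COMBINATIONS = {
    frozenset({"hard work", "luck"}): "success",
}


@dataclass(frozen=True)
class Concept:
    name: str

    def __add__(self, other: "Concept") -> "Concept":
        # "+" on concepts is a lookup in the domain table, not arithmetic.
        # The frozenset key makes the order of the operands irrelevant.
        key = frozenset({self.name, other.name})
        return Concept(COMBINATIONS.get(key, "unknown"))


# Example (1): "+" on numbers is arithmetic addition.
print(4 + 2)  # 6

# Example (2): "+" on concepts is whatever the domain says it is.
print((Concept("hard work") + Concept("luck")).name)  # success
```

The symbol is the same in both cases and both results have the same kind as their inputs, but the meaning of the operation comes entirely from what is being added.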
Welcome to bartg.org, the personal blog of Bart Gajderowicz. Here you’ll find the things that interest me with the occasional rant on everything else. I’d love to hear from you so do contact me via email, by posting a comment or via social links on the right. My previous blog has been archived (see below).
Who I am …
I am married to my beautiful and lovely wife Emily with whom I live in Toronto, Canada.
I am a Software Engineer in Semantic Technologies at Eqentia Inc.
I hold a master’s degree from the Department of Computer Science at Ryerson University, focusing on semantic integration and machine learning. Until recently I was the Senior Research Assistant at the Ubiquitous and Pervasive Computing Lab in my department.
What I do …
My interests include an array of topics, but my time is usually consumed by the technical aspects of information/data/text processing, programming, logic, philosophy, and the social engagement and interaction with technology.
As part of my graduate and undergraduate research, I have studied machine learning, and the various ways it is applied to the areas of ontology matching, search, classification, pattern recognition, data mining, pattern analysis, context awareness and new media.
I enjoy philosophy and have spent the last several years consumed by ontologies, semantics, logic, reasoning, inference, and the technical issues that surround their applications. Though I am not a logician by any means, my research interests include semantic technologies, reasoning and machine learning. Recently I have focused on Description Logics, OWL 2, the now many definitions of the Semantic Web, and incorporating these areas into machine learning.
I enjoy programming and have made Ruby my language of choice. Its simple and natural syntax makes it an absolute pleasure to work with, and makes prototyping concepts dead simple and quick. Professionally I have been a Ruby on Rails developer since 2006. I also program in C/C++ and Java, especially when incorporating existing libraries, which are often written in Java. I need to contribute to Open Source projects more often. Me on GitHub.
How I do it …