Metadata is "data about data," of any sort in any medium. An item of metadata may describe an individual datum or content item, or a collection of data comprising multiple content items and hierarchical levels, for example a database schema. Metadata is used to facilitate the understanding, use, and management of data.
I’ve been thinking about one particular artifact of the folksonomy phenomenon — the folksonomy menu that serves as a sort of buzz index, providing users with a quick visualization of the most popular tags (technically, I think it’s called a weighted list). Popular tags are displayed in a larger font, and it’s relatively easy to identify hot topics at a glance. This visual representation of the popularity of any given tag is undeniably cool. However, once the coolness factor wears off, it becomes fairly obvious that these menus are also not very accessible.
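The weighted-list effect comes down to mapping each tag's frequency onto a font size. Here is a minimal sketch of how that might be computed (the function name, pixel range, and log-scaling are my own illustrative choices, not any particular site's implementation):

```python
import math

def tag_cloud_sizes(tag_counts, min_px=12, max_px=36):
    """Map raw tag frequencies to font sizes for a weighted list.

    Log-scaling keeps one runaway tag from flattening the rest
    into the minimum size.
    """
    lo = math.log(min(tag_counts.values()))
    hi = math.log(max(tag_counts.values()))
    span = (hi - lo) or 1.0  # avoid divide-by-zero when all counts match
    return {
        tag: round(min_px + (math.log(n) - lo) / span * (max_px - min_px))
        for tag, n in tag_counts.items()
    }

sizes = tag_cloud_sizes({"python": 120, "xml": 40, "rdf": 8})
```

The most popular tag lands at the maximum size, the least popular at the minimum, and everything else falls in between on a logarithmic curve.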
The affinity diagram, or KJ method (after its author, Kawakita Jiro), wasn't originally intended for quality management. Nonetheless, it has become one of the most widely used of the Japanese management and planning tools. The affinity diagram was developed to discover meaningful groups of ideas within a raw list. In doing so, it is important to let the groupings emerge naturally, using the right side of the brain, rather than according to preordained categories.
The Atom API is an emerging interface for editing content. The interface is RESTful and uses XML and HTTP to define an editing scheme that's easy to implement and extend. History, basic operation, and applications to areas outside weblogs will be covered.
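In such a scheme, creating a post amounts to composing a small XML document and sending it over HTTP. The sketch below builds an Atom entry with the standard library; it uses the modern Atom namespace (early Atom API drafts used a different one), and the endpoint and response handling described in the comments vary by implementation:

```python
# Sketch: composing an Atom entry for a RESTful editing API.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

def make_entry(title, content):
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}content").text = content
    return ET.tostring(entry, encoding="unicode")

entry_xml = make_entry("Hello", "First post")
# An editing client would POST this document to the collection URI
# and read the Location header of the 201 Created response to learn
# the new entry's address.
```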
Are you a Web publisher or Web submitter who uploads documents manually? The primary role of a Web publisher is to publish content on the Web, in file formats such as HTML pages and PDF documents. A typical process involves uploading a document, entering its metadata, and posting it to the Web — a time-consuming effort. Is there an automated tool that can help Web publishers submit several documents at once, without entering each document's metadata by hand? The answer is yes.
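The key step such tools automate is extracting the metadata already embedded in the document. A minimal sketch with the standard library, pulling the title and keywords out of an HTML page (the class name and field names are mine):

```python
# Sketch of automated metadata extraction: read the title and meta
# tags out of an HTML page so a publisher need not retype them.
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data

page = ('<html><head><title>Annual Report</title>'
        '<meta name="keywords" content="finance, 2004"></head></html>')
parser = MetaExtractor()
parser.feed(page)
```

A batch submitter would run this over every file in a folder and post document plus metadata together.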
In the information age it is widely understood that there is now too much information. Some of this newly created information will most certainly be valuable, but despite marked improvement in search tools, finding the valuable information remains slow work, like panning for gold. Perhaps in light of this situation, the W3C under the direction of Berners-Lee has begun to build the foundation for the next phase of the web. This phase, called the Semantic Web, will make information stored with this technology far easier for machines to process.
In this article I will look at the doctype in a lot more detail: what it does, how it helps you validate your HTML, and how to choose a doctype for your document. I will also cover the XML declaration, which you’ll rarely need but will sometimes come across.
Metadata is information about other information. By defining a set of metadata for a given object, we describe the object in question; we characterize it. For example, HTML lets us define metadata for a web page through its <meta> tag. That metadata (author, keywords...) characterizes the page and describes its content. Metadata, traditionally used in the library world, is proving very useful on the Web, both in information retrieval systems (back-end) and in navigation systems (front-end).
XFML (eXchangeable Faceted Metadata Language), created by Peter Van Dijck, is a language or vocabulary with XML syntax for defining, distributing, and exchanging metadata in the form of faceted taxonomies or classifications.
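XFML serializes facets and topics as XML elements; the sketch below models the same idea in plain Python rather than reproducing the XFML 1.0 schema itself, so the field names and sample data are mine:

```python
# A faceted classification: each facet is an independent dimension,
# and a page can be classified under topics from several facets at once.
facets = {
    "region": ["europe", "asia", "americas"],
    "subject": ["metadata", "search", "accessibility"],
}

pages = [
    {"url": "/a", "topics": {"europe", "metadata"}},
    {"url": "/b", "topics": {"asia", "metadata"}},
]

def pages_for(topic):
    """Return every page classified under the given topic."""
    return [p["url"] for p in pages if topic in p["topics"]]
```

Because the facets are independent, the same page is reachable by region or by subject without the classifier having to enumerate every combination in advance.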
Classification is an investment that carries costs in the short term but also yields considerable rewards in the long term (if set up correctly). Among classification systems, the faceted (or multidimensional) one is surely the most powerful and versatile (even though the schemes that have become standard in most libraries are quite distant from the faceted model).
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality. It allows for distinguishing between eight of the most relevant functional classes of Web sites. We show that a pre-classification of Web sites utilizing structural properties considerably improves a subsequent textual classification with standard techniques. We evaluate this approach on a dataset comprising more than 16,000 Web sites with about 20 million crawled and 100 million known Web pages. Our approach achieves an accuracy of 92% for the coarse-grained classification of these Web sites.
Once upon a time, we were curious and everything we encountered was new. We were excited about discovering new things and the world offered unlimited possibilities. Then we went to school and were taught to color inside the lines, that everything had its place and the world was ordered.
The Semantic Web really is an attempt to reconceptualize and reengineer AI for the Web. The piece discusses the path forward for successfully selling Semantic Web technology to industry and developing it there.
A controlled vocabulary makes a database easier to search. Since we have many different ways of describing concepts, drawing all of these terms together under a single word or phrase in a database makes searching the database more efficient, as it eliminates guesswork. However, arriving at this efficiency requires consistency on the part of the individual indexing the database and the use of predetermined terms.
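The mechanism is easy to picture as a synonym ring: every variant term an indexer or searcher might use maps to one preferred term, and the index is keyed by preferred terms only. A minimal sketch (the vocabulary entries and function names are illustrative):

```python
# A controlled vocabulary as a synonym ring: variant terms all
# resolve to a single preferred term before indexing or searching.
PREFERRED = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "film": "motion picture",
    "movie": "motion picture",
}

def normalize(term):
    return PREFERRED.get(term.lower(), term.lower())

def search(index, term):
    # The index is keyed by preferred terms only, so any variant
    # of a concept retrieves the same records.
    return index.get(normalize(term), [])

index = {"automobile": ["doc1", "doc7"]}
```

Whether a user types "car", "auto", or "automobile", the same records come back — which is exactly the guesswork the vocabulary eliminates.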
The prevalence of digital information has raised issues regarding the suitability of conventional library tools for organizing information. The multi-dimensionality of digital resources requires a more versatile and flexible representation to accommodate intelligent information representation and retrieval. Ontologies are used as a solution to such issues in many application domains, mainly due to their ability to explicitly specify semantics and relations and to express them in a computer-understandable language. Conventional knowledge organization tools such as classifications and thesauri resemble ontologies in that they define concepts and relationships in a systematic manner, but they are less expressive than ontologies when it comes to machine language. This paper uses the controlled vocabulary at the Gateway to Educational Materials (GEM) as an example to address the issues in representing digital resources. The theoretical and methodological framework in this paper serves as the rationale and guideline for converting the GEM controlled vocabulary into an ontology. Compared to the original semantic model of the GEM controlled vocabulary, the major difference between the two models lies in the value added through deeper semantics in describing digital objects, both conceptually and relationally.
You have probably heard information architects discussing the benefits of their latest taxonomy project and how you should be implementing one. But how, you might wonder, can you get started? In the next installment about Controlled Vocabularies, our authors go into detail about one methodology.
With the overall purpose of improving the information literacy skills of librarianship and information science students, an academic portal specifically centred on abstracts and abstracting resources is proposed. We take the existing literature, together with our knowledge and experience of abstract/abstracting topics and web-based technologies to conceive the research design. The research mainly consists of the selection, assessment and web-display of the most relevant abstracts on knowledge management, information representation, natural language processing, abstract/abstracting, modelling the scientific document, information retrieval and information evaluation. The resulting Cyberabstracts portal presents its products consistently and includes reference, abstract, keywords, assessment and access to the full document. Improvement opportunities for this unique subject-based gateway, representing much more than a mere subject catalogue, are uncovered as the starting point on a planned route towards excellence.
This paper outlines the assumptions, process and results of a pilot study of issues of interoperability among a set of seven existing controlled vocabulary schemes that make statements about the audience of an educational resource.
I've been thinking a lot about metadata recently, but not from the standpoint of XML or programming or helping to organize and index data. My interest is in the future of content ownership, delivery, and value. I see a future for media that looks very different from the media of today. The germ of this idea actually came from my experiences with online movie rentals.
Since BBCi launched in November 2001, its search offering has been collecting data on the way that BBC website users search both the BBC's website and, through its homepage Websearch, the whole wide web. Given such a mass of data, the easiest way to aggregate and make sense of it has been to measure the search terms that are most popular. Indeed, the BBCi homepage has a panel displaying the three most popular search terms of the moment, and an editorial and taxonomy team at the BBC constantly monitor the searches gaining high volume in order to match the correct content to them.
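The aggregation step itself is simple counting. A sketch of how such a "most popular searches" panel might be fed, with made-up sample data:

```python
# Count a log of search terms and surface the three most popular.
from collections import Counter

searches = ["eastenders", "weather", "news", "weather",
            "eastenders", "weather", "radio"]
top_three = [term for term, _ in Counter(searches).most_common(3)]
```

The editorial work the BBC team does — matching the right content to each high-volume term — starts where this counting ends.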
An XML document is considered 'well-formed' when its syntax is correct, and 'valid' when it respects a document model. While a document must be well-formed, it does not necessarily have to be valid. However, as XML is a metalanguage, there are infinitely many possible XML formats, and most XML documents should respect a particular document model, which can be defined in one of two ways: by a Document Type Definition (DTD), or by an XML Schema. In this article, we are going to look at how you should go about implementing the former, using a DTD.
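The well-formed/valid distinction can be seen directly with Python's standard library: its parser reads a document carrying an internal DTD subset and rejects broken syntax, but it does not validate against the DTD (full DTD validation needs a validating parser such as lxml — an assumption to check for your toolchain):

```python
# Well-formed vs. broken XML, checked with the standard library.
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Reader</to><body>Hello</body></note>"""

root = ET.fromstring(doc)  # parses: the document is well-formed

try:
    ET.fromstring("<note><to>Reader</note>")  # mismatched tags
    well_formed = True
except ET.ParseError:
    well_formed = False     # broken syntax is rejected outright
```

Note that a document violating the DTD's content model (say, a `<note>` with no `<body>`) would still parse here — it is well-formed but not valid, which is precisely why a separate validation step matters.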
As long as people have been collecting information together, be it in the form of a library, an institutional filing system, a collection of accounting records or whatever, they've needed to come up with ways to help them know how to properly file and retrieve documents. These systems needn't involve any high technology.