The affinity diagram, or KJ method (after its author, Kawakita Jiro), wasn't originally intended for quality management. Nonetheless, it has become one of the most widely used of the Japanese management and planning tools. The affinity diagram was developed to discover meaningful groups of ideas within a raw list. In doing so, it is important to let the groupings emerge naturally, using the right side of the brain, rather than according to preordained categories.
Metadata is information about other information. By defining a set of metadata for a given object, we describe and characterize that object. For example, HTML lets you define metadata for a web page through its <meta> tag. That metadata (author, keywords...) characterizes the page and describes its content. Metadata, traditionally used in library settings, is proving highly useful on the Web, both in Information Retrieval Systems (back-end) and in Navigation Systems (front-end).
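As a rough illustration of page-level metadata, the following Python sketch reads name/content pairs from HTML meta tags using only the standard library. The sample page, the class name `MetaTagParser`, and the helper `extract_metadata` are hypothetical and exist only for this example; real pages may also carry metadata in other forms (for instance `property` attributes) that this sketch ignores.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "meta" matches <META> too.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip meta tags without a name (e.g. charset declarations).
        if name and content is not None:
            self.metadata[name.lower()] = content


def extract_metadata(html_text):
    """Return a dict of metadata found in the given HTML text."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.metadata


# Hypothetical sample page, assumed for illustration only.
SAMPLE_PAGE = """<html><head>
<meta charset="utf-8">
<meta name="author" content="Jane Doe">
<meta name="keywords" content="metadata, classification">
<title>Sample</title>
</head><body></body></html>"""

page_metadata = extract_metadata(SAMPLE_PAGE)
```

A retrieval system might index `page_metadata["keywords"]`, while a navigation system might use the same values to build browsable categories.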
Classification is an investment that incurs costs in the short term but also yields considerable returns in the long term (if set up correctly). Among classification systems, the faceted (or multidimensional) one is certainly the most powerful and versatile (even though the schemes that have become established as standards in most libraries are quite far from the faceted approach).
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality. It allows for distinguishing between eight of the most relevant functional classes of Web sites. We show that a pre-classification of Web sites utilizing structural properties considerably improves a subsequent textual classification with standard techniques. We evaluate this approach on a dataset comprising more than 16,000 Web sites with about 20 million crawled and 100 million known Web pages. Our approach achieves an accuracy of 92% for the coarse-grained classification of these Web sites.
The prevalence of digital information has raised issues regarding the suitability of conventional library tools for organizing information. The multi-dimensionality of digital resources requires a more versatile and flexible representation to accommodate intelligent information representation and retrieval. Ontologies are used as a solution to such issues in many application domains, mainly due to their ability to specify semantics and relations explicitly and to express them in a computer-understandable language. Conventional knowledge organization tools such as classifications and thesauri resemble ontologies in that they define concepts and relationships in a systematic manner, but they are less expressive than ontologies when it comes to machine language. This paper uses the controlled vocabulary at the Gateway to Educational Materials (GEM) as an example to address the issues in representing digital resources. The theoretical and methodological framework in this paper serves as the rationale and guideline for converting the GEM controlled vocabulary into an ontology. The major difference between the resulting ontology and the original semantic model of the GEM controlled vocabulary lies in the values added through deeper semantics in describing digital objects, both conceptually and relationally.
With the overall purpose of improving the information literacy skills of librarianship and information science students, an academic portal specifically centred on abstracts and abstracting resources is proposed. We draw on the existing literature, together with our knowledge and experience of abstract/abstracting topics and web-based technologies, to conceive the research design. The research mainly consists of the selection, assessment and web-display of the most relevant abstracts on knowledge management, information representation, natural language processing, abstract/abstracting, modelling the scientific document, information retrieval and information evaluation. The resulting Cyberabstracts portal presents its products consistently and includes reference, abstract, keywords, assessment and access to the full document. Improvement opportunities for this unique subject-based gateway, representing much more than a mere subject catalogue, are uncovered as the starting point on a planned route towards excellence.
This paper outlines the assumptions, process and results of a pilot study of issues of interoperability among a set of seven existing controlled vocabulary schemes that make statements about the audience of an educational resource.
As long as people have been collecting information together, be it in the form of a library, an institutional filing system, a collection of accounting records or whatever, they've needed to come up with ways to help them know how to properly file and retrieve documents. These systems needn't involve any high technology.
This paper describes a new concept-based multi-document summarization system that employs discourse parsing, information extraction and information integration. Dissertation abstracts in the field of sociology were selected as sample documents for this study. The summarization process includes four major steps — (1) parsing dissertation abstracts into five standard sections; (2) extracting research concepts (often operationalized as research variables) and their relationships, the research methods used and the contextual relations from specific sections of the text; (3) integrating similar concepts and relationships across different abstracts; and (4) combining and organizing the different kinds of information using a variable-based framework, and presenting them in an interactive web-based interface. The accuracy of each summarization step was evaluated by comparing the system-generated output against human coding. The user evaluation carried out in the study indicated that the majority of subjects (70%) preferred the concept-based summaries generated using the system to the sentence-based summaries generated using traditional sentence extraction techniques.
In content metadata and hierarchies, you will often find a goldmine of implicit and explicit data that you can leverage to creatively contextualise content. After a brief introduction on taxonomy and metadata, this article focuses on finding and utilising such relationships in hierarchies.
The Darwin Information Typing Architecture (DITA) is a hot topic among those who author, edit, deliver and manage content. But adopting a standard architecture is an important decision that requires up-front research and knowledge of the pitfalls. Find out if DITA is right for your organization. Read this whitepaper to learn more (PDF).
What is Dublin Core? And why would you need a whole conference about it? The end of September and beginning of October brought representatives from various countries around the world to a sunny and warm Seattle, Washington, host of the 2003 Dublin Core Conference.
The 2002 Dublin Core annual conference and workshop marked the beginning of a new effort by the Dublin Core Metadata Initiative (DCMI) to involve members of the corporate world in the evolution and application of the Dublin Core standard. The first meetings of two DCMI Circles of Interest were held on Monday, October 14, 2002, followed the next day by a panel session with several members of the Circles presenting their initial observations and conclusions to the wider conference.
The Dublin Core is currently the best-developed candidate for a simple resource description model for electronic resources on the Web. It represents the results of a three-year process of consensus-building through a series of focussed, invitational workshops involving librarians, digital library researchers, and various content specialists from many countries.
This paper presents 'Distributed Active Relationships' (an extension of the Warwick Framework), a general framework for dealing with meta data issues in digital libraries and other information systems. By treating meta data as data, rather than giving it a special distinguished role, arbitrary resources are allowed to be associated with arbitrary relationships.
Metadata is information about information: more precisely, it's structured information about resources. This can be a single set of hierarchical subject labels, such as a Yahoo or Open Directory Project category. More often, the metadata has several facets: attributes in various orthogonal sets of categories. This is often stored in database record fields and tables, especially for product catalogs.
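To make the idea of faceted metadata concrete, here is a minimal Python sketch in which each record carries several independent facets and can be filtered on any combination of them. The catalog entries, facet names, and the functions `filter_by_facets` and `facet_counts` are all assumptions made up for illustration; they do not come from the article.

```python
from collections import Counter

# Hypothetical product catalog: each record has values in several
# orthogonal facets (color, size, brand).
CATALOG = [
    {"title": "Trail Shoe", "facets": {"color": "red", "size": "42", "brand": "Acme"}},
    {"title": "Road Shoe", "facets": {"color": "blue", "size": "42", "brand": "Acme"}},
    {"title": "Court Shoe", "facets": {"color": "red", "size": "44", "brand": "Zenith"}},
]


def filter_by_facets(records, **selected):
    """Keep records whose facets match every selected facet value."""
    return [
        record
        for record in records
        if all(record["facets"].get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(
        record["facets"][facet] for record in records if facet in record["facets"]
    )


red_items = filter_by_facets(CATALOG, color="red")
red_acme_items = filter_by_facets(CATALOG, color="red", brand="Acme")
size_counts = facet_counts(CATALOG, "size")
```

Because the facets are independent, a user can narrow by color first and brand second, or the other way round, and arrive at the same records. A single hierarchy would force one order.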
To better understand grassroots classification, this paper examines user-generated metadata as implemented and applied in two web services designed to share and organize digital media.
We need a word for the class of comparisons that assumes that the status quo is cost-free, so that all new work, when it can be shown to have disadvantages to the status quo, is also assumed to be inferior to the status quo.
Folksonomies are clearly compelling, supporting a serendipitous form of browsing that can be quite useful. But they don't support searching and other types of browsing nearly as well as tags from controlled vocabularies applied by professionals.
The weighted list, known popularly as a 'tag cloud', has appeared on many popular folksonomy-based web-sites. Flickr, Delicious, Technorati and many others have all featured a tag cloud at some point in their history. However, it is unclear whether the tag cloud is actually useful as an aid to finding information. We conducted an experiment, giving participants the option of using a tag cloud or a traditional search interface to answer various questions. We found that where the information-seeking task required specific information, participants preferred the search interface. Conversely, where the information-seeking task was more general, participants preferred the tag cloud. While the tag cloud is not without value, it is not sufficient as the sole means of navigation for a folksonomy-based dataset.
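For readers unfamiliar with how a weighted list is built, the following Python sketch maps tag frequencies to display sizes. Linear scaling between a minimum and maximum font size is one simple choice assumed here; the paper does not say how any particular site computes its cloud, and the function name `tag_cloud_sizes` and the size bounds are hypothetical.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=30):
    """Map each tag to a font size that grows linearly with its frequency."""
    counts = Counter(tags)
    if not counts:
        return {}
    lowest = min(counts.values())
    highest = max(counts.values())
    spread = highest - lowest
    sizes = {}
    for tag, count in counts.items():
        if spread == 0:
            # All tags are equally frequent: use the midpoint size.
            sizes[tag] = (min_size + max_size) / 2
        else:
            sizes[tag] = min_size + (count - lowest) * (max_size - min_size) / spread
    return sizes


# Hypothetical tag assignments, assumed for illustration only.
SAMPLE_TAGS = ["sunset"] * 5 + ["beach"] * 3 + ["macro"]
cloud = tag_cloud_sizes(SAMPLE_TAGS)
```

The most frequent tag gets the largest size and the rarest the smallest, which is what lets a reader scan the cloud for dominant topics.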
Work with structured abstracts--which contain sub-headings in a standard order--has suggested that such abstracts contain more information, are of a higher quality, and are easier to search and to read than are traditional abstracts. The aim of this article is to suggest that this work with structured abstracts can be extended to cover scientific articles as a whole. The article outlines a set of sub-headings--drawn from research on academic writing--that can be used to make the presentation of scientific papers easier to read and to write. Twenty published research papers are then analyzed in terms of these sub-headings. The analysis, with some reservations, supports the viability of this approach.
Metadata is now both a competitive advantage and a competitive necessity. And if we really want content to be found and audiences to be served and apps and revenue to be created, we'll give metadata — annoying as it may be — the attention it deserves.
Categories are only useful if they meet the needs of the user. I can’t imagine that the variations of what I think of as “Science Fiction books” that were listed in the category are of any use to anyone.
An interview with Kevin Shoesmith about information architecture and the challenge of organizing complicated websites. Shoesmith discusses the importance of metadata, user-driven organization, taxonomy vs. folksonomy, the Dublin Core, and the usability of web menus.