Building a Metadata-Based Website
The online world has been flooded in recent years with talk of metadata, structured authoring, and cascading style sheets. The idea of a semantic web is gaining momentum. At the confluence of these two broad categories of activity, new models of websites are emerging.
Lider, Brett and Anca Mosoiu. Boxes and Arrows (2003). Design>Web Design>Information Design>Metadata
Developing and Creatively Leveraging Hierarchical Metadata and Taxonomy
In content metadata and hierarchies, you will often find a goldmine of implicit and explicit data that you can leverage to creatively contextualize content. After a brief introduction on taxonomy and metadata, this article focuses on finding and utilizing such relationships in hierarchies.
Ricci, Christian. Boxes and Arrows (2004). Design>Web Design>Information Design>Metadata
An Evaluation of Document Keyphrase Sets 
Keywords and keyphrases have many useful roles as document surrogates and descriptors, but the manual production of keyphrase metadata for large digital library collections is at best expensive and time-consuming, and at worst logistically impossible. Algorithms for keyphrase extraction like Kea and Extractor produce a set of phrases that are associated with a document. Though these sets are often utilized as a group, keyphrase extraction is usually evaluated by measuring the quality of individual keyphrases. This paper reports an assessment that asks human assessors to rate entire sets of keyphrases produced by Kea, Extractor and document authors. The results provide further evidence that human assessors rate all three sources highly (with some caveats), but show that the relationship between the quality of the phrases in a set and the set as a whole is not always simple. Choosing the best individual phrases will not necessarily produce the best set; combinations of lesser phrases may result in better overall quality.
Jones, Steve and Gordon W. Paynter. Journal of Digital Information (2003). Design>Web Design>Information Design>Metadata
Automated classification tools can't solve today's large-scale web and intranet indexing challenges alone. Neither can humans. But solutions that integrate human expertise with software products such as Interwoven's Metatagger and Autonomy's Categorizer can provide real value and savings. After a brief introduction to automated classification, this white paper discusses the benefits and limitations of manual, automated, and hybrid approaches. It explores the opportunities for leveraging controlled vocabularies and thesauri to produce more effective indexing solutions.
Hagedorn, Kat. D-Lib Magazine (2001). Design>Web Design>Information Design>Metadata
FacetMap is both a data model and a software package, created to let users browse complex metadata while retaining a simple, familiar, menu interface.
FacetMap (2003). Design>Information Design>Metadata>Web Design
This article provides an overview of work completed at Tsinghua University Library in which a metadata framework was developed to aid in the preservation of digital resources. The metadata framework is used for the creation of metadata to describe resources, and includes an encoding standard used to store metadata and resource structures in information systems. The author points out that the Tsinghua University Library metadata framework provides a successful digital preservation solution that may be an appropriate solution for other organizations as well.
Niu, Jinfang. D-Lib Magazine (2002). Articles>Information Design>Web Design>Metadata
Towards a Core Ontology for Information Integration 
In this paper, we argue that a core ontology is one of the key building blocks necessary to enable the scalable assimilation of information from diverse sources. A complete and extensible ontology that expresses the basic concepts that are common across a variety of domains and can provide the basis for specialization into domain-specific concepts and vocabularies, is essential for well-defined mappings between domain-specific knowledge representations (i.e. metadata vocabularies) and the subsequent building of a variety of services such as cross-domain searching, browsing, data mining and knowledge extraction. This paper describes the results of a series of three workshops held in 2001 and 2002 which brought together representatives from the cultural heritage and digital library communities with the goal of harmonizing their knowledge perspectives and producing a core ontology. The knowledge perspectives of these two communities were represented by the CIDOC/CRM, an ontology for information exchange in the cultural heritage and museum community, and the ABC ontology, a model for the exchange and integration of digital library information. This paper describes the mediation process between these two different knowledge biases and the results of this mediation - the harmonization of the ABC and CIDOC/CRM ontologies, which we believe may provide a useful basis for information integration in the wider scope of the involved communities.
Doerr, Martin, Jane Hunter and Carl Lagoze. Journal of Digital Information (2003). Design>Web Design>Information Design>Metadata
I have long wondered why government web sites all over the world tend to use metadata of several different types jumbled together and overlapping. For example, pages with two description metatags or two or three title tags are common. I suspect that most of the replication and confusion has developed for historical reasons.
McAlpine, Rachel. Quality Web Content (2005). Articles>Web Design>Information Design>Metadata
Unraveling the Mysteries of Metadata and Taxonomies
Recently Boxes and Arrows caught up with Samantha Bailey, formerly at Argus and current lead IA for Wachovia Corporation's Wachovia.com website. She talks about the transition from being a consultant to an 'innie' IA, unravels the mysteries of metadata and taxonomies and shares her vision of the future of IA.
Wodtke, Christina. Boxes and Arrows (2002). Design>Web Design>Information Design>Metadata
Almost three years ago we commented that the promise of the semantic web was to turn the network into 'a self-navigable and self-understandable space.' Where are we today?
Dursteler, Juan Carlos. InfoVis (2003). (Spanish) Articles>Information Design>Web Design>Metadata
Western States Dublin Core Metadata Best Practices 
This document of best practices offers assistance in creating metadata records for digitized resources using the Dublin Core element data set.
Colorado Digitization Program (2000). Design>Web Design>Information Design>Metadata
The Folksonomy Tag Cloud: When is it Useful?
The weighted list, known popularly as a 'tag cloud', has appeared on many popular folksonomy-based web-sites. Flickr, Delicious, Technorati and many others have all featured a tag cloud at some point in their history. However, it is unclear whether the tag cloud is actually useful as an aid to finding information. We conducted an experiment, giving participants the option of using a tag cloud or a traditional search interface to answer various questions. We found that where the information-seeking task required specific information, participants preferred the search interface. Conversely, where the information-seeking task was more general, participants preferred the tag cloud. While the tag cloud is not without value, it is not sufficient as the sole means of navigation for a folksonomy-based dataset.
Sinclair, James and Michael Cardew-Hall. Journal of Information Science (2008). Articles>Web Design>Information Design>Metadata
It's Time To Get Serious About Metadata
When it comes to the Web, there is nothing more misunderstood than metadata. Technical people search vainly for a way to automate its creation. Many editors and writers want nothing to do with it. And yet without quality metadata a website cannot properly achieve its objectives. It’s time to get serious about metadata.
McGovern, Gerry. New Thinking (2004). Articles>Web Design>Information Design>Metadata
Unlike a simple hierarchical scheme, faceted classification gives the users the ability to find items based on more than one dimension. For example, some users shopping for jewelry may be most interested in browsing by particular type of jewelry (earrings, necklaces), while others are more interested in browsing by a particular material (gold, silver). “Material” and “type” are examples of facets; earrings, necklaces, gold, silver are examples of facet values.
Adkisson, Heidi P. Web Design Practices (2005). Articles>Web Design>Information Design>Metadata
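The facet idea in the entry above can be illustrated with a minimal sketch. The catalog, the item names, and the helper functions `filter_by_facets` and `facet_counts` are all hypothetical and are not taken from the cited article; only the facets ("type", "material") and their values come from the abstract.

```python
from collections import defaultdict

# Hypothetical catalog: each item carries one value per facet.
ITEMS = [
    {"name": "Hoop earrings", "type": "earrings", "material": "gold"},
    {"name": "Stud earrings", "type": "earrings", "material": "silver"},
    {"name": "Chain necklace", "type": "necklaces", "material": "gold"},
    {"name": "Pendant necklace", "type": "necklaces", "material": "silver"},
]


def filter_by_facets(items, **selected):
    """Return the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count items per value of one facet, e.g. to label a browse menu."""
    counts = defaultdict(int)
    for item in items:
        counts[item[facet]] += 1
    return dict(counts)


# Browse by one dimension, then narrow by a second one.
gold_items = filter_by_facets(ITEMS, material="gold")
gold_earrings = filter_by_facets(ITEMS, type="earrings", material="gold")
print([item["name"] for item in gold_items])
print([item["name"] for item in gold_earrings])
print(facet_counts(ITEMS, "type"))
```

Because every item stores a value for each facet independently, a user can start from either dimension and reach the same item, which a single fixed hierarchy would not allow.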
XML Transformation and Metadata Repositories Enable Information Integration
Among the popular emerging integration needs in the market today is information aggregation, normalization, and presentation from multiple back-end data sources to front-end applications. Termed Enterprise Information Integration by some vendors in the market, this type of solution relies on a centralized common object model to provide a data access interface to client applications. Applications can use this common interface to request data from one or more data sources in a single query, with the intricate details of resolving the query left to the integration tool. This session will explain the architecture of an enterprise information integration solution in general, highlight some of the vendors and their approaches in this market space, and explain the use of such a solution through a real-world example with a large financial services organization.
Gantz, Stephen. IDEAlliance (2004). Articles>Web Design>Information Design>Metadata
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality. It allows for distinguishing between eight of the most relevant functional classes of Web sites. We show that a pre-classification of Web sites utilizing structural properties considerably improves a subsequent textual classification with standard techniques. We evaluate this approach on a dataset comprising more than 16,000 Web sites with about 20 million crawled and 100 million known Web pages. Our approach achieves an accuracy of 92% for the coarse-grained classification of these Web sites.
Lindemann, Christoph and Lars Littig. WWW 2007 (2007). Articles>Web Design>Information Design>Metadata
The web is designed to be consumed by humans, and much of the rich, useful information our websites contain is inaccessible to machines. People can cope with all sorts of variations in layout, spelling, capitalization, color, position, and so on, and still absorb the intended meaning from the page. Machines, on the other hand, need some help. A new kind of web—a semantic web—would be made up of information marked up in such a way that software can also easily understand it. Before considering how we might achieve such a web, let’s look at what we might be able to do with it.
Birbeck, Mark. List Apart, A (2009). Articles>Web Design>Information Design>Metadata