Whether you’re familiar with the term or not, this exploration piece will take you through the ins and outs of metadata, and will let you know why we think it will play an important role in the future of business intelligence.
Unlike many other BI terms, metadata has a rather simple definition: it is data describing other data (Merriam-Webster dictionary).
In other words, it is information that is used to describe the data contained in something like a web page, document, or file. This context is necessary to make the information easier to find, reusable and relevant. It also helps to get a better grip on the information.
Metadata is an important part of any information architecture.
Another way to think of it is as a short explanation or summary of what the data is. For example, metadata for a document might include the author, the file size, the date the document was created, and keywords that describe the document. Metadata for a music file might include the artist’s name, the album, and the year it was released.
It represents behind-the-scenes information that’s used everywhere, by every industry, in multiple ways. It’s ubiquitous in information systems, social media, websites, software, music services, and online retailing. Metadata can be created manually to pick and choose what’s included, but it can also be generated automatically based on the data.
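To make the difference between automatic and manual metadata concrete, here is a minimal Python sketch. The function name `describe_file` and the sample values are hypothetical, and the sketch records the last-modified time rather than the creation date, because creation time is not available in a portable way across operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path, author, keywords):
    """Combine automatically derived and manually supplied metadata."""
    stats = os.stat(path)
    modified = datetime.fromtimestamp(stats.st_mtime, tz=timezone.utc)
    return {
        # Generated automatically from the file itself
        "file_name": os.path.basename(path),
        "size_bytes": stats.st_size,
        "modified": modified.isoformat(),
        # Picked and supplied manually by a person
        "author": author,
        "keywords": list(keywords),
    }


# Create a small throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello metadata")
    sample_path = handle.name

record = describe_file(sample_path, author="Jane Doe", keywords=["demo", "metadata"])
print(record)
os.remove(sample_path)
```

The file size and timestamp come from the file system without any human input, while the author and keywords have to be chosen by someone.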
Types of Metadata
Metadata comes in several types and serves a variety of broad purposes, which can be roughly categorized as business, technical, or operational. Common types include:
- Descriptive metadata properties include title, subject, genre, author, and creation date, for example.
- Rights metadata might include copyright status, rights holder, or license terms.
- Technical metadata properties include file types, size, creation date and time, and type of compression. Technical metadata is often used for digital object management and interoperability.
- Preservation metadata is used in navigation. Example preservation metadata properties include an item’s place in a hierarchy or sequence.
- Markup languages include metadata used for navigation and interoperability. Properties might include a heading, name, date, list, and paragraph.
The basis of a good metadata approach is the metadata set. This is an overview that explains what each piece of metadata is, why it is used, and in which types of information it is used.
Much metadata is presented as a selection list, or ‘controlled vocabulary’. Think of a list of the 12 months of the year, or a list of target groups. Such a fixed list helps to avoid typos and makes it quick to ‘tag’ information.
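A controlled vocabulary can be enforced in just a few lines. The sketch below is illustrative only: the target groups and the function name `tag_target_group` are made up for the example.

```python
# Hypothetical fixed list of allowed target groups.
TARGET_GROUPS = frozenset({"students", "teachers", "parents"})


def tag_target_group(record, value):
    """Return a copy of the record tagged with a value from the fixed list."""
    normalized = value.strip().lower()
    if normalized not in TARGET_GROUPS:
        allowed = ", ".join(sorted(TARGET_GROUPS))
        raise ValueError(f"{value!r} is not in the controlled vocabulary ({allowed})")
    return {**record, "target_group": normalized}


tagged = tag_target_group({"title": "Exam schedule"}, "Students")
print(tagged)

# A typo is rejected instead of silently creating a new tag.
try:
    tag_target_group({"title": "Exam schedule"}, "studnets")
except ValueError as error:
    print(error)
```

Because only listed values are accepted, a misspelled tag fails immediately rather than ending up in the data.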
If the list threatens to become too large and has different levels with main categories and subcategories, use a taxonomy. A taxonomy is a hierarchical list of nodes. Think of a division of the world into continents, countries, and cities, or a breakdown of an organization into main and sub-processes or main and sub-themes. A taxonomy helps to make information relevant and to filter search results.
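As a rough sketch of how a taxonomy can filter search results, the example below stores a small continent, country, and city hierarchy as nested dictionaries. The structure, the sample documents, and the helper `descendants` are assumptions made for illustration.

```python
# Hypothetical taxonomy: continent -> country -> city.
TAXONOMY = {
    "Europe": {
        "Netherlands": {"Amsterdam": {}, "Utrecht": {}},
        "France": {"Paris": {}},
    },
    "Asia": {"Japan": {"Tokyo": {}}},
}


def descendants(tree, node):
    """Return the node plus every term below it, or an empty set if absent."""
    for name, children in tree.items():
        if name == node:
            found = {name}
            stack = [children]
            while stack:
                current = stack.pop()
                for child, grandchildren in current.items():
                    found.add(child)
                    stack.append(grandchildren)
            return found
        result = descendants(children, node)
        if result:
            return result
    return set()


documents = [
    {"title": "Canal report", "location": "Amsterdam"},
    {"title": "Metro survey", "location": "Tokyo"},
    {"title": "Museum guide", "location": "Paris"},
]

# Searching on "Europe" also matches documents tagged with a European city.
europe_terms = descendants(TAXONOMY, "Europe")
europe_docs = [doc["title"] for doc in documents if doc["location"] in europe_terms]
print(europe_docs)
```

A document only needs to be tagged with its most specific term; the hierarchy takes care of matching it to broader searches.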
Websites often use facets. These are groups of metadata fields that are used together, mainly to help with filtering information.
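Faceted filtering, as seen on many online shops, can be sketched as follows. The product catalog, the field names, and both helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical catalog; "brand" and "color" act as facets.
products = [
    {"name": "Trail shoe", "brand": "Acme", "color": "red"},
    {"name": "Road shoe", "brand": "Acme", "color": "blue"},
    {"name": "Sandal", "brand": "Globex", "color": "red"},
]


def filter_by_facets(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of a facet."""
    return Counter(item[facet] for item in items)


red_items = filter_by_facets(products, color="red")
brand_counts = facet_counts(red_items, "brand")
print([item["name"] for item in red_items])
print(dict(brand_counts))
```

The counts per facet value are what a website would typically show next to each filter option.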
Another way to organize metadata is a thesaurus. With this, you create semantic relationships between terms, such as synonyms and ‘related terms’. A thesaurus also provides explanations of terms and indicates preferred terms. You can even define hierarchical relationships with it.
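A very small thesaurus can be modeled with two lookup tables: one that maps synonyms to a preferred term and one that lists related terms. The terms and function names below are invented for the example.

```python
# Hypothetical thesaurus data: synonym -> preferred term.
PREFERRED = {"automobile": "car", "motorcar": "car", "lorry": "truck"}
# Preferred term -> related terms.
RELATED = {"car": ["truck", "road"], "truck": ["car"]}


def preferred_term(term):
    """Map a term to its preferred form; unknown terms are returned as-is."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def expand_query(term):
    """Collect the preferred term, its synonyms, and its related terms."""
    main = preferred_term(term)
    synonyms = sorted(syn for syn, pref in PREFERRED.items() if pref == main)
    return {"preferred": main, "synonyms": synonyms, "related": RELATED.get(main, [])}


expansion = expand_query("Automobile")
print(expansion)
```

With this in place, a search for "automobile" can also find items tagged "car" or "motorcar", and can suggest the related terms.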
If the above instruments are too limited, an ontology can be a solution. This is an ecosystem of entities connected by all imaginable relationships. This makes an ontology powerful but also complex. To make ontologies you really need a specialist who uses special tools and standards.
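To give a feel for the idea, the sketch below stores an ontology as subject, relationship, object triples and queries them with a simple pattern match. This is a deliberately simplified illustration: the relationship names and the `query` helper are made up, and real ontologies are normally built with dedicated tools and standards such as RDF and OWL.

```python
# Each triple is (subject, relationship, object). Relationship names are
# hypothetical and chosen only for this example.
TRIPLES = [
    ("Amsterdam", "is_capital_of", "Netherlands"),
    ("Netherlands", "is_part_of", "Europe"),
    ("Rijksmuseum", "is_located_in", "Amsterdam"),
    ("Rijksmuseum", "is_a", "Museum"),
]


def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the given parts; None matches anything."""
    return [
        triple
        for triple in TRIPLES
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    ]


about_museum = query(subject="Rijksmuseum")
in_amsterdam = [s for s, _, _ in query(predicate="is_located_in", obj="Amsterdam")]
print(about_museum)
print(in_amsterdam)
```

Unlike a taxonomy, any entity can be linked to any other entity by any kind of relationship, which is where both the power and the complexity come from.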
When using metadata, make maximum use of existing standards that are already applied successfully. That saves a lot of work, and it makes the information easier to find, reusable, and relevant, because other organizations use the same standards.