Glossary

Metadata

Metadata is data about data. For any piece of data, there is typically lots of metadata. Some common types of metadata found on HASH include:

→ Provenance metadata – information such as the timestamp of creation, the datetime a file was last modified, the identity of a file’s creator, and other information about the data’s origin
→ Quality metadata – a margin of error, confidence interval, or probability score that indicates the likely accuracy of the data
→ Statistical metadata – information that describes the process (e.g. data pipeline) that produced data
→ Legal metadata – the copyright holder, and any licensing terms data may be available under
→ Security metadata – logs recording attempts to access data, and information regarding authorized users
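As a rough illustration of how these categories might sit alongside a piece of data, the sketch below models them as a Python record. The class and field names (`Metadata`, `Datum`, `produced_by`, and so on) are hypothetical, chosen for this example only, and are not HASH's actual data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Metadata:
    """Hypothetical container with one group of fields per category above."""
    # Provenance: where the data came from and when
    created_at: datetime
    created_by: str
    # Quality: how far the value can be trusted
    margin_of_error: Optional[float] = None
    # Statistical: the process that produced the data
    produced_by: Optional[str] = None
    # Legal: who owns the data and on what terms it may be used
    copyright_holder: Optional[str] = None
    license: Optional[str] = None
    # Security: a log of attempts to access the data
    access_log: List[str] = field(default_factory=list)


@dataclass
class Datum:
    """A value together with the metadata that describes it."""
    value: float
    metadata: Metadata


reading = Datum(
    value=21.4,
    metadata=Metadata(
        created_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
        created_by="sensor-7",
        margin_of_error=0.5,
        produced_by="hourly-temperature-pipeline",
        copyright_holder="Example Org",
        license="CC-BY-4.0",
    ),
)
reading.metadata.access_log.append("read by alice")
```

The value itself (`21.4`) says little on its own; the surrounding record is what tells a reader who produced it, how accurate it is, and who has looked at it.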

Metadata is useful because it provides context for the data we use: where it came from, how reliable it is, and who is permitted to use it.

Some metadata is attached automatically by systems such as HASH when users perform certain actions (e.g. creating a file, connecting a datasource, constructing a flow, or editing a row in a dataset).
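To make the idea of automatic attachment concrete, here is a minimal sketch of a dataset row that records provenance metadata as a side effect of being created and edited. `TrackedRow` and its metadata keys are assumptions made for illustration; they do not describe how HASH implements this internally.

```python
from datetime import datetime, timezone


class TrackedRow:
    """Hypothetical dataset row that records provenance on every edit."""

    def __init__(self, values, user, now=None):
        now = now or datetime.now(timezone.utc)
        self.values = dict(values)
        # Filled in by the system, not typed in by the user
        self.metadata = {
            "created_by": user,
            "created_at": now,
            "modified_by": user,
            "modified_at": now,
        }

    def edit(self, column, value, user, now=None):
        """Change one cell and update the modification metadata."""
        self.values[column] = value
        self.metadata["modified_by"] = user
        self.metadata["modified_at"] = now or datetime.now(timezone.utc)


row = TrackedRow({"city": "Lodnon"}, user="alice")
row.edit("city", "London", user="bob")
print(row.metadata["created_by"], row.metadata["modified_by"])  # alice bob
```

Neither user wrote any metadata by hand here; it accumulated because they acted on the data.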

Other metadata can be added manually. For example, mapping data to schemas within HASH purposefully attaches metadata that describes the types of the columns in a dataset, and the properties of the agents or events those columns represent.
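The sketch below shows one way such a mapping could look in code. The schema, the column names, and the `apply_mapping` helper are all hypothetical; the sketch only illustrates the general idea of a column-to-property mapping and does not reflect HASH's schema format.

```python
# Hypothetical schema: the properties of an "Employee" and their types
employee_schema = {
    "name": str,
    "start_year": int,
}

# The manually added metadata: which dataset column holds which property
column_mapping = {
    "Full Name": "name",
    "Yr Started": "start_year",
}


def apply_mapping(row, mapping, schema):
    """Convert a raw row into a typed entity using the column mapping."""
    entity = {}
    for column, prop in mapping.items():
        expected_type = schema[prop]
        # Coerce the raw cell to the schema's type, e.g. "2019" -> 2019
        entity[prop] = expected_type(row[column])
    return entity


raw_row = {"Full Name": "Ada Lovelace", "Yr Started": "2019"}
print(apply_mapping(raw_row, column_mapping, employee_schema))
# {'name': 'Ada Lovelace', 'start_year': 2019}
```

Without the mapping, `"Yr Started"` is just a column of strings; with it, software can tell that the column holds the integer start year of an employee.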

Many standards exist for describing the types of metadata that can, or are expected to, accompany data. Business software can use these standards to provide improved functionality or interoperability.
