Aggregation is the process of generalising data or modifying it so that individual people or groups of people are no longer identifiable.
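
As a minimal sketch of aggregation, the Python example below generalises individual-level rows into ten-year age bands per district and drops groups that are too small to publish safely. The records, the district names, the `age_band` and `aggregate` helpers and the threshold of three are all hypothetical assumptions made for illustration; they are not taken from any City of Helsinki guideline.

```python
from collections import Counter

# Hypothetical individual-level records: (age, district).
records = [
    (23, "Kallio"), (27, "Kallio"), (29, "Kallio"),
    (34, "Pasila"), (38, "Pasila"),
    (61, "Kallio"),
]

# Assumed minimum group size; a real threshold depends on the dataset and policy.
MIN_GROUP_SIZE = 3


def age_band(age: int) -> str:
    """Generalise an exact age into a ten-year band such as '20-29'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def aggregate(rows, min_size=MIN_GROUP_SIZE):
    """Count people per (age band, district) and suppress small groups."""
    counts = Counter((age_band(age), district) for age, district in rows)
    return {group: n for group, n in counts.items() if n >= min_size}


summary = aggregate(records)
print(summary)
```

Only groups with at least three people remain in `summary`; the smaller groups are left out because a very small count could still point to an individual.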


Anonymisation is the process of modifying data so that it can no longer be directly or indirectly linked to an identified or identifiable person. This can be done either by aggregation or by fragmenting individual-level data (online behaviour related to a single browsing session, for example) so that the individual to whom it pertains is no longer identifiable.


Algorithm

A series of precisely defined instructions or steps for accomplishing a specific task or process.


Analytics

A method for examining and analysing raw data to draw conclusions about the information contained in the data.


API

Short for application programming interface. An API defines the ways in which different applications can make requests and exchange information, i.e. communicate with one another.

Open data

Public data that is produced or accumulated by public administration, organisations or companies and has been opened up in a structured format for anyone to use free of charge.


Data

Information that does not necessarily have semantic meaning or an informative order in itself.

Advanced analytics

A broad range of machine learning methods. These include methods by which an algorithm learns to identify connections in data independently. In addition to learning, machine learning and AI enable the creation of predictive models.

IoT data

IoT (Internet of Things) means the automated transfer of data between machines via the internet. IoT data is the data collected using these machines, such as the measurement data collected by an on-site temperature sensor.


Dashboard

A view similar to the instrument panel of a car that is used to monitor key business variables in real time.

Machine learning

An activity in which software learns from repeated events or data and modifies its own operation towards a set goal based on feedback received. Machine learning can be used to automate the interpretation of data and expand the perception of software.


Metadata

Information about data, meaning information that describes or defines a given dataset.


Microservices

Also known as microservice architecture. A software development approach in which a large application is built out of modular components or services.


Pseudonymisation

The processing of personal data so that it cannot be directly linked to the data subject without additional identifying information. This additional information must be stored separately from the personal data, and technical and organisational measures must be implemented to ensure that the information cannot be linked to an identified or identifiable person.
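
One common way to approach this is a keyed hash, sketched below in Python with the standard `hmac` and `hashlib` modules. The direct identifier is replaced with a pseudonym that can only be re-linked by someone holding the key, so the key plays the role of the separately stored additional information. The key value, the field names and the `pseudonymise` helper are hypothetical; in practice the key would be kept in a separate, access-controlled system rather than in the code.

```python
import hashlib
import hmac

# Assumption: in real use this key lives in a separate, access-controlled store.
SECRET_KEY = b"example-key-kept-in-a-separate-system"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical record containing a direct identifier.
record = {"person_id": "user-0001", "visits": 4}

# The stored record keeps the pseudonym and drops the direct identifier.
pseudonymised = {
    "pseudonym": pseudonymise(record["person_id"]),
    "visits": record["visits"],
}
print(pseudonymised)
```

The same identifier always maps to the same pseudonym, so records can still be joined for analysis. Unlike anonymised data, pseudonymised data is still linkable to a person by anyone who has the key.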

Boundary resource

Agreement-based or other cooperative guidelines and technical software tools and interfaces. They serve as open interaction channels between a digital platform and any third party. Boundary resources are typically freely available on the internet (or available for a small fee).

Artificial intelligence

A general term for algorithms that are capable of learning without requiring a human to separately program each action or step.

Data management

The management of data processing in an information system. The creation and processing of document data is steered by a data management plan that produces metadata values and processing rules (JHS 191). The City of Helsinki’s data management plan steers the processing of both digital data and physical material.


Information

Information is created out of data by giving it semantic meaning.


Dataset

An information entity composed of documents and other corresponding information related to a specific task or service of the authorities. Source: section 2 of the Act on Information Management in Public Administration (906/2019).

Data lake

An information technology architecture solution based on the utilisation of technologies capable of processing vast amounts of data. A data lake can be used to quickly store vast amounts of both structured and unstructured data.

Data model

Describes the data fields in a dataset and how they relate to one another.
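
As a rough illustration, a data model can be written down as typed records. The Python sketch below describes two hypothetical entities, `Sensor` and `Reading`, and the relation between them through a shared `sensor_id` field. All names and values are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class Sensor:
    sensor_id: str  # unique identifier of the sensor
    location: str   # free-text description of where the sensor is installed


@dataclass
class Reading:
    sensor_id: str  # refers to Sensor.sensor_id (one sensor has many readings)
    celsius: float  # measured temperature


# Hypothetical example data that follows the model.
sensors = {"s1": Sensor("s1", "roof")}
readings = [Reading("s1", 21.5), Reading("s1", 22.0)]


def readings_for(sensor: Sensor, all_readings: list) -> list:
    """Follow the relation from a sensor to its readings."""
    return [r for r in all_readings if r.sensor_id == sensor.sensor_id]


print(readings_for(sensors["s1"], readings))
```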

Information resource

An entity composed of data that is used in carrying out the City’s tasks or other operations and is stored in one or more information systems.

Data warehouse

A separate architecture based on relational databases into which structured data scattered across different systems is collected and loaded for reporting, analytics and other purposes.

Master data

Persistently needed basic data that has been identified and shared across an organisation and that is essential for many of the organisation’s processes and functions.


Interoperability

The capacity of information systems to communicate with one another in such a way, or to such an extent, that they are able to routinely use each other’s output.

The utilisation and exchange of information between different information systems in a way that preserves the meaning and usability of the information.