Data Analysis
Data Analysis Overview
The data analysis assignments will be solved during the training course to reinforce the material presented in the theory sessions. The assignments include a discrimination detection and mitigation task carried out on the semi-synthetic datasets produced by the FINDHR project.
Data Analysis Structure
The data analysis session will cover the following topics:
- Data Pre-processing and Preparation
- Ranking process
- Fair Ranking process
- Explanation of Ranking process
This session will present solutions to an assignment based on one of the case studies, using data analysis workflows that cover the topics above.
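To make the four topics concrete, the following is a minimal Python sketch of such a flow on a toy example. Everything in it is an assumption made for illustration: the candidate records, the attribute names (`experience`, `test`, `group`), the equal-weight linear score, the logarithmic exposure measure and the round-robin re-ranking are hypothetical and are not the FINDHR datasets or methods, nor the actual course assignment.

```python
import math

# Hypothetical toy candidates; NOT taken from the FINDHR datasets.
candidates = [
    {"id": "c1", "group": "A", "experience": 9, "test": 0.90},
    {"id": "c2", "group": "A", "experience": 7, "test": 0.80},
    {"id": "c3", "group": "A", "experience": 6, "test": 0.70},
    {"id": "c4", "group": "B", "experience": 5, "test": 0.75},
    {"id": "c5", "group": "B", "experience": 4, "test": 0.60},
    {"id": "c6", "group": "B", "experience": 2, "test": 0.50},
]

def preprocess(rows):
    """Pre-processing: min-max scale 'experience' to the range [0, 1]."""
    values = [r["experience"] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero if all values are equal
    return [{**r, "experience": (r["experience"] - lo) / span} for r in rows]

def explain(row):
    """Explanation: per-feature contribution to the (assumed) linear score."""
    return {"experience": 0.5 * row["experience"], "test": 0.5 * row["test"]}

def score(row):
    """Score a candidate as the sum of its feature contributions."""
    return sum(explain(row).values())

def rank(rows):
    """Ranking: sort candidates by decreasing score."""
    return sorted(rows, key=score, reverse=True)

def exposure_by_group(ranking):
    """Detection: mean exposure 1 / log2(1 + position) received by each group."""
    totals, counts = {}, {}
    for position, r in enumerate(ranking, start=1):
        totals[r["group"]] = totals.get(r["group"], 0.0) + 1 / math.log2(1 + position)
        counts[r["group"]] = counts.get(r["group"], 0) + 1
    return {g: totals[g] / counts[g] for g in totals}

def fair_rerank(ranking):
    """Mitigation: interleave groups round-robin, keeping within-group order."""
    queues = {}
    for r in ranking:
        queues.setdefault(r["group"], []).append(r)
    result = []
    while any(queues.values()):
        for g in queues:
            if queues[g]:
                result.append(queues[g].pop(0))
    return result

def exposure_gap(ranking):
    """Difference between the most and least exposed group."""
    exposure = exposure_by_group(ranking)
    return max(exposure.values()) - min(exposure.values())

prepared = preprocess(candidates)
base_ranking = rank(prepared)
fair_ranking = fair_rerank(base_ranking)

print([r["id"] for r in base_ranking], round(exposure_gap(base_ranking), 3))
print([r["id"] for r in fair_ranking], round(exposure_gap(fair_ranking), 3))
print(explain(base_ranking[0]))
```

In this toy setting the score-based ranking places all group A candidates above group B, and the interleaved ranking narrows the exposure gap between the two groups. Round-robin interleaving is only one simple mitigation strategy, chosen here because it is easy to follow.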
Data Analysis Format
The training targets both technical and non-technical audiences. Hence, students will be separated into two tracks: track A, which is designed for non-programmers, and track B, which is designed for programmers.
Track A
Track A, designed for non-programmers, uses KNIME, an extensible, open-source visual analytics tool in which analytical workflows are built by connecting nodes, each of which performs a specific function. The platform provides many kinds of nodes: for example, nodes for reading data sources in a variety of formats, nodes for training a machine learning model, and nodes for making predictions with a trained model.
KNIME also offers explainability extensions. These are complemented by a new extension designed specifically for ranking and for algorithmic fairness in rankings.
Track B
Track B, designed for programmers, will use interactive Python notebooks together with open-source libraries for algorithmic fairness analysis and explainability.