Module 1 - Lesson 3: Probability, randomness, and the risk of de-anonymization #2

Open
@turukawa

ETHICS

Determine the implications in the collection, mining and recombination of open- and digital data.

As we use online and specialist data sources for analysis, we need to consider the risks of de-anonymization.
Examples: Netflix de-anonymization, NY Taxis, Genome recovery, fitness trackers.
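
A minimal sketch of why removing names alone does not anonymize a dataset: it counts how many records become unique once a few ordinary attributes are combined. The records, the field names (`zip`, `birth_year`, `sex`) and the helper `unique_fraction` are all made up for illustration and are not taken from any of the cases listed above.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1975, "sex": "M"},
    {"zip": "10002", "birth_year": 1990, "sex": "F"},
    {"zip": "10002", "birth_year": 1990, "sex": "M"},
]


def unique_fraction(rows, keys):
    """Return the fraction of rows whose combination of `keys` occurs exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


# One attribute on its own singles nobody out in this toy data...
print(unique_fraction(records, ["zip"]))
# ...but combining three attributes makes most records unique.
print(unique_fraction(records, ["zip", "birth_year", "sex"]))
```

A record that is unique on such a combination can be matched against any other dataset that holds the same attributes alongside a name, which is the general shape of a linkage attack.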

CURATION

Employ methods for presenting data for synthesis and usage, and for data maintenance.

Knowledge management systems, APIs, data standards and approaches to archival.
Methods for moving, tracking, cleaning, ETL, history and ownership.

ANALYSIS

Perform techniques in randomness and probability to understand distribution and likelihood.

Randomness, probability, generating datasets, tree diagrams, sampling techniques.
From software random numbers, to people (i.e. before we get to people) … (samples with / without replacement), law of large numbers / averages.
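
The sampling ideas above can be sketched with Python's standard `random` module. This is only an illustration under assumed values: the population of ten numbers, the seed, and the helper `mean_of_die_rolls` are hypothetical choices, not part of the lesson material.

```python
import random

rng = random.Random(42)  # fixed seed so repeated runs give the same draws
population = list(range(1, 11))

# Sampling WITH replacement: the same value may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Sampling WITHOUT replacement: every drawn value is distinct.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)


def mean_of_die_rolls(n_rolls, rng):
    """Average of `n_rolls` simulated rolls of a fair six-sided die."""
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)  # randint includes both endpoints
    return total / n_rolls


# Law of large numbers: the average drifts toward the expected value 3.5
# as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n, rng))
```

The small samples can land well away from 3.5, while the largest one should sit very close to it. That contrast is the point of the demonstration.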

PRESENTATION

Apply histograms, line charts and scatter plots to illustrate probability.

Using charts from previous lessons.
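
As a rough sketch of how a histogram illustrates probability, the block below simulates the sum of two dice and prints a text histogram of the observed shares. The number of rolls and the seed are arbitrary assumptions, and a plotting library would normally draw the bars. Plain text is used here only to keep the example self-contained.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the output is repeatable

# Simulate 10,000 throws of two fair dice and record each sum.
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(10_000)]
counts = Counter(rolls)

# One row per possible sum: the observed share and a proportional bar.
for total in range(2, 13):
    share = counts[total] / len(rolls)
    print(f"{total:>2} {share:6.3f} {'#' * round(share * 100)}")
```

The bars peak around 7 and fall away toward 2 and 12, matching the number of ways each sum can be formed.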


CASE STUDY

False positives / negatives from a universal breast cancer screening program, including cost and individual anxiety. Also consider risk from universal datasets of this nature.
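
A short sketch of the arithmetic behind the case study. The prevalence, sensitivity and specificity below are illustrative assumptions chosen to make the effect visible; they are not real screening statistics, and `screening_outcomes` is a hypothetical helper.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Split a screened population into true/false positives and negatives."""
    with_disease = population * prevalence
    without_disease = population - with_disease

    true_pos = with_disease * sensitivity      # correctly flagged
    false_neg = with_disease - true_pos        # missed cases
    true_neg = without_disease * specificity   # correctly cleared
    false_pos = without_disease - true_neg     # healthy but flagged

    # Probability that a positive result reflects actual disease.
    ppv = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        "ppv": ppv,
    }


# Assumed, illustrative inputs: 1% prevalence, 90% sensitivity, 91% specificity.
result = screening_outcomes(10_000, 0.01, 0.90, 0.91)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these assumed inputs, false positives outnumber true positives by roughly ten to one, so only about 9% of positive results correspond to actual disease. Because the condition is rare, even a fairly accurate test produces mostly false alarms when applied to everyone.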

Labels: Lesson (Lesson outcomes and outline)
