<p align="center">
  <a href="" rel="noopener">
 <img width=200px height=200px src="https://i.imgur.com/6wj0hh6.jpg" alt="Project logo"></a>
</p>

<h3 align="center">PROVEE - PROgressiVe Explainable Embeddings</h3>

<div align="center">


[![Status](https://img.shields.io/badge/status-active-success.svg)]()

[![GitHub Issues](https://img.shields.io/github/issues/kylelobo/The-Documentation-Compendium.svg)](https://github.com/kylelobo/The-Documentation-Compendium/issues)
![Gitlab pipeline status](https://img.shields.io/gitlab/pipeline/vig/provee/master?gitlab_url=https%3A%2F%2Fgit.science.uu.nl)
![Gitlab pipeline status](https://git.science.uu.nl/vig/provee/dummy-vue-grpc/badges/master/pipeline.svg)
![Gitlab coverage](https://git.science.uu.nl/vig/provee/dummy-vue-grpc/badges/master/coverage.svg)

[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/kylelobo/The-Documentation-Compendium.svg)](https://github.com/kylelobo/The-Documentation-Compendium/pulls)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](/LICENSE)

</div>

---

<p align="center"> 

Deep Neural Networks (DNNs), and their resulting **latent or embedding data spaces, are key to analyzing big data** in various domains such as vision, speech recognition, and natural language processing (NLP). However, embedding spaces are high-dimensional and abstract, thus not directly understandable. We aim to develop a software framework to visually explore and explain how embeddings relate to the actual data fed to the DNN. This enables both DNN developers and end-users to understand the currently black-box working of DNNs, leading to better-engineered networks, and explainable, transparent DNN systems whose behavior can be trusted by their end-users. 

Our central aims, opening DNN black boxes, making complex data understandable for data science novices, and raising trust and transparency, are core topics in VA and NLP research. PROVEE will advertise and apply VA in a wider scope, with impact across sciences (medicine, engineering, biology, physics) where researchers use big data and deep learning.
</p>

## 📝 Table of Contents

- [About](#about)
- [Getting Started](#getting_started)
- [Feature/Performance Comparison](../COMPARISON.md)
- [Deployment](#deployment)
- [Usage](#usage)
- [Built Using](#built_using)
- [TODO](../TODO.md)
- [Contributing](../CONTRIBUTING.md)
- [Authors](#authors)
- [Acknowledgments](#acknowledgement)

## 🧐 About <a name = "about"></a>

In this repository you will find PROVEE, short for Progressive Explainable Embeddings, a visual-interactive system that represents embedding data spaces as a user-friendly 2D projection. The idea behind [Progressive Analytics](https://arxiv.org/abs/1607.05162), as described for example by Fekete and Primet, is to provide a rapid data exploration pipeline with a feedback loop from the system to the analyst with a latency below about 10 seconds. Research has shown that humans performing exploratory analysis need a latency below about 10 seconds to remain focused and use their short-term memory efficiently. Therefore, PROVEE's goals are to (1) provide increasingly meaningful partial results as the algorithms execute, (2) provide visualizations that minimize distractions by not changing views excessively, (3) provide cues that indicate where analytics have found new results, and (4) provide an interface for users to specify where analytics should focus, as well as which portions of the problem space should be ignored. _Note that these goals are adapted from the aforementioned publication._

PROVEE's architecture includes (1) back-end analysis algorithms (particularly, incremental projection algorithms), (2) intuitive, web-based user interfaces/visualizations, and (3) intermediate data storage and transfer. Core to our system is an innovative, progressive analysis workflow targeting a human-algorithm feedback loop with a latency under ~10 seconds to maintain the user's efficiency during exploration tasks. PROVEE will be scalable to big data; generic (handles data from many application domains); and easy to use (requires no specialist programming from the user). Please also refer to our [performance and feature comparison](../COMPARISON.md) with available visualization and analysis tools.
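
The idea of delivering partial results while an algorithm is still running can be illustrated with a minimal sketch. It is not taken from the PROVEE code base: the names `Point2D`, `PartialResult`, `projectChunk`, and `progressiveProject` are hypothetical, and the "projection" is a placeholder that simply keeps the first two coordinates of each vector. A real back end would run an incremental projection algorithm at that step.

```typescript
// Illustrative sketch only: all names here are hypothetical, not PROVEE APIs.
type Point2D = { x: number; y: number };

interface PartialResult {
  points: Point2D[]; // every point projected so far
  processed: number; // number of input vectors consumed so far
  total: number; // total number of input vectors
  done: boolean; // true once all vectors have been projected
}

// Placeholder projection: keeps the first two coordinates of each vector.
// Assumption: a real system would use an incremental projection algorithm.
function projectChunk(chunk: number[][]): Point2D[] {
  return chunk.map((v) => ({ x: v[0] ?? 0, y: v[1] ?? 0 }));
}

// Processes the data chunk by chunk and reports a partial result after
// each chunk, so a user interface can update before the full run ends.
function progressiveProject(
  data: number[][],
  chunkSize: number,
  onPartial: (result: PartialResult) => void,
): void {
  if (chunkSize < 1) {
    throw new RangeError("chunkSize must be at least 1");
  }
  const points: Point2D[] = [];
  for (let start = 0; start < data.length; start += chunkSize) {
    const chunk = data.slice(start, start + chunkSize);
    points.push(...projectChunk(chunk));
    onPartial({
      points: points.slice(), // copy, so earlier results stay unchanged
      processed: points.length,
      total: data.length,
      done: points.length === data.length,
    });
  }
}

// Usage: three 3-dimensional embeddings, reported in chunks of two.
const embeddings: number[][] = [
  [0.1, 0.2, 0.3],
  [0.4, 0.5, 0.6],
  [0.7, 0.8, 0.9],
];
progressiveProject(embeddings, 2, (partial) => {
  console.log(`${partial.processed}/${partial.total} vectors projected`);
});
```

The chunk size is the knob that trades update frequency against overhead in this sketch: smaller chunks give the analyst feedback sooner, which is the property the latency target above is about.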


## 🏁 Getting Started <a name = "getting_started"></a>

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See [deployment](#deployment) for notes on how to deploy the project on a live system.

### Prerequisites

<!-- What things you need to install the software and how to install them. -->

```
Docker, Kubernetes
```

### Installing

To install the backend please refer to the [backend README](backend/README.md) and for the frontend to the [frontend README](frontend/README.md).

Generate gRPC service classes whenever [protos](protos/README.md) change.

Clone the latest PROVEE repository from GitLab:
```
git clone https://git.science.uu.nl/vig/provee/dummy-vue-grpc
```

Change into the backend directory of the clone:
```
cd dummy-vue-grpc/backend
```

Run docker-compose to start all backend services:
```
docker-compose up
```

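From the directory where you ran `git clone` (for example in a second terminal, since `docker-compose up` keeps running in the foreground), change into the frontend directory:
```
cd dummy-vue-grpc/frontend
```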

Make sure Node.js and yarn are installed, then install the dependencies and build the frontend:

```
yarn install
yarn build
```


## 🔧 Running the tests <a name = "tests"></a>


### Basic Unit Tests

We use [Jest](https://jestjs.io/) for JavaScript-based unit tests:

```
npm run tests
```



## 🎈 Usage <a name="usage"></a>

Notes on how to use the system are still to be written. A video is coming soon.

## 🚀 Deployment <a name = "deployment"></a>

If you want to deploy a live system refer to the [Deployment Guide](../DEPLOYMENTGUIDE.md).

## ⛏️ Built Using <a name = "built_using"></a>

- [Vue.js](https://vuejs.org/) - Web Framework
- [Node.js](https://nodejs.org/en/) - Server Environment
- [PixiJS](https://www.pixijs.com/) - Visualization
- [D3.js](https://www.d3js.org/) - Visualization

## ✍️ Authors <a name = "authors"></a>

- [Michael Behrisch](https://michael.behrisch.info) - Idea & Initial work
- [Simen van Herpt](https://github.com/IsolatedSushi) - Backend & Infrastructure
- [Jan Zak](https://zakjan.cz/) - Vis coding

See also the list of [contributors](https://git.science.uu.nl/vig/provee/dummy-vue-grpc/-/graphs/master) who participated in this project.

## 🎉 Acknowledgements <a name = "acknowledgement"></a>

- Hat tip to anyone whose code was used