# DataProfileViewer

DataProfileViewer is a data profile viewer for Jupyter Notebooks. It supports the metadata format generated by the [datamart-profiler](https://docs.auctus.vida-nyu.org/python/datamart-profiler.html#) library.


![Data summary viewer](https://github.com/soniacq/DataProfileVis/blob/master/imgs/data_summary_plot.png)

![Refine data profiler results viewer](https://github.com/soniacq/DataProfileVis/blob/master/imgs/edit_profiler_plot.png)

## Install via pip

~~~~
pip install data-profile-viewer
pip install datamart-profiler
~~~~
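
To confirm that both packages are visible to the Python environment your notebook kernel uses, a quick check like the one below can help. This is a minimal sketch: `missing_modules` is a hypothetical helper written for this README, and the import names are the ones used in the examples in this document.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names taken from the examples in this README.
missing = missing_modules(["DataProfileViewer", "datamart_profiler"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("Both packages are importable.")
```

If a package is reported as missing, the notebook kernel is probably running in a different environment from the one where `pip install` was run.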

## Demo

In a Jupyter Notebook:
~~~~
import DataProfileViewer
data = DataProfileViewer.get_lifeexpectancy_data()
DataProfileViewer.plot_data_summary(data)
~~~~

## Data Profile Exploration

~~~~
import DataProfileViewer
import datamart_profiler
~~~~

In a Jupyter Notebook, profile the dataset to generate its metadata

~~~~
data_path = 'lifeexpectancydata.csv'
metadata = datamart_profiler.process_dataset(data_path, include_sample=True, plots=True)
~~~~

and then plot it using:

~~~~
DataProfileViewer.plot_data_summary(metadata)
~~~~
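
It can also be useful to inspect the metadata directly before plotting it. The sketch below assumes the metadata is a plain `dict` with a `columns` list whose entries carry `name`, `structural_type`, and `semantic_types` keys, which should be checked against the datamart-profiler documentation. `summarize_columns` is a hypothetical helper, and `example_metadata` is a hand-written stand-in rather than real profiler output.

```python
def summarize_columns(metadata):
    """Return (name, structural_type, semantic_types) for each profiled column."""
    rows = []
    for column in metadata.get("columns", []):
        rows.append((
            column.get("name", "?"),
            column.get("structural_type", "unknown"),
            column.get("semantic_types", []),
        ))
    return rows


# Hand-written stand-in for profiler output (illustrative values, not real results).
example_metadata = {
    "nb_rows": 3,
    "columns": [
        {
            "name": "Country",
            "structural_type": "http://schema.org/Text",
            "semantic_types": [],
        },
        {
            "name": "Year",
            "structural_type": "http://schema.org/Integer",
            "semantic_types": ["http://schema.org/DateTime"],
        },
    ],
}

for name, structural_type, semantic_types in summarize_columns(example_metadata):
    print(f"{name}: {structural_type} {semantic_types}")
```

In a notebook, you would pass the `metadata` object returned by `datamart_profiler.process_dataset` instead of the stand-in.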

## Refine Data Profiler Results

You may want to correct or refine the detected type information, or add annotations to the columns. To do so, run:

~~~~
DataProfileViewer.plot_edit_profiler(metadata)
~~~~

To retrieve the updated metadata, use the code:

~~~~
updatedMetadata = DataProfileViewer.get_exported_metadata(data_path)
~~~~
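
Once you have the updated metadata, you may want to keep it on disk so the refinements survive a kernel restart. The following is a minimal sketch, assuming the returned object is a JSON-serializable `dict`. `save_metadata` and `load_metadata` are hypothetical helpers, the file name is arbitrary, and `updated` is a stand-in for the value returned by `get_exported_metadata`.

```python
import json
import os
import tempfile


def save_metadata(metadata, path):
    """Write the metadata dict to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)


def load_metadata(path):
    """Read a metadata dict back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Stand-in for the exported metadata (illustrative values only).
updated = {
    "columns": [
        {"name": "Year", "structural_type": "http://schema.org/Integer"},
    ],
}

# A temporary directory keeps the example from touching your project files.
path = os.path.join(tempfile.mkdtemp(), "lifeexpectancy_metadata.json")
save_metadata(updated, path)
restored = load_metadata(path)
print(restored == updated)
```

The restored dictionary can then be passed to `DataProfileViewer.plot_data_summary` in a later session, provided it still follows the datamart-profiler metadata format.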