Released a DataFrame summary tool for Jupyter Notebook
About the package
This is a Python version of summarytools, which generates a standardized and comprehensive summary of a pandas DataFrame in Jupyter Notebooks.
The idea originated from the summarytools R package. Only the dfSummary function is available for now. I also added two HTML widgets (a collapsible view and a tabbed view) to avoid displaying lengthy content.
Quick Start
default view
Out of the box, the dfSummary function generates an HTML-based data frame summary.
import pandas as pd
from summarytools import dfSummary
titanic = pd.read_csv('./data/titanic.csv')
dfSummary(titanic)
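To give a rough feel for the kind of per-column information a data frame summary covers, here is a minimal pandas-only sketch. It is not how dfSummary is implemented; the function name `quick_column_overview`, the toy data, and the choice of columns (dtype, valid, missing, distinct) are assumptions made purely for illustration.

```python
import pandas as pd


def quick_column_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row of basic facts per column of df."""
    rows = []
    for name in df.columns:
        col = df[name]
        rows.append({
            "variable": name,
            "dtype": str(col.dtype),
            "valid": int(col.notna().sum()),      # non-missing values
            "missing": int(col.isna().sum()),     # missing values
            "distinct": int(col.nunique(dropna=True)),
        })
    return pd.DataFrame(rows)


# Toy data standing in for a real dataset such as titanic.csv
toy = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0],
    "sex": ["male", "female", "female", "male"],
})
overview = quick_column_overview(toy)
print(overview)
```

A real summary adds much more than this plain table, so treat the sketch only as a mental model of the basics.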
If a notebook contains many data summaries, the output gets long. The following two widgets help keep it compact.
collapsible view
import pandas as pd
from summarytools import dfSummary
titanic = pd.read_csv('./data/titanic.csv')
dfSummary(titanic, is_collapsible=True)
tabbed view
import pandas as pd
from summarytools import dfSummary, tabset
titanic = pd.read_csv('./data/titanic.csv')
vaccine = pd.read_csv('./data/country_vaccinations.csv')
vaccine['date'] = pd.to_datetime(vaccine['date'])
tabset({
    'titanic': dfSummary(titanic).render(),
    'vaccine': dfSummary(vaccine).render()
})
Export as HTML
When exporting a Jupyter notebook to HTML, make sure the Export Embedded HTML extension is installed and enabled.
Use the following bash command to retain the data frame summary in the exported HTML.
jupyter nbconvert --to html_embed path/of/your/notebook.ipynb
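If you would rather trigger the export from a Python script than type the command by hand, something like the following sketch should work. It only wraps the same nbconvert command shown above; the helper names `build_export_command` and `export_notebook` are hypothetical, and it assumes `jupyter` is on your PATH with the Export Embedded HTML extension already enabled.

```python
import os
import shutil
import subprocess


def build_export_command(notebook_path):
    """Return the nbconvert command from the text as an argument list."""
    return ["jupyter", "nbconvert", "--to", "html_embed", os.fspath(notebook_path)]


def export_notebook(notebook_path):
    """Run the export; raises if jupyter is missing or the export fails."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter executable not found on PATH")
    subprocess.run(build_export_command(notebook_path), check=True)


# Only build and show the command here; call export_notebook() to run it.
command = build_export_command("path/of/your/notebook.ipynb")
print(" ".join(command))
```

Passing the command as a list avoids shell quoting problems when the notebook path contains spaces.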
Installation
Details are available at https://github.com/6chaoran/jupyter-summarytools