Replication Data for: On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data (doi:10.18419/darus-4087)

Document Description

Citation

Title:

Replication Data for: On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data

Identification Number:

doi:10.18419/darus-4087

Distributor:

DaRUS

Date of Distribution:

2024-03-14

Version:

1

Bibliographic Citation:

Alvarez Chaves, Manuel; Gupta, Hoshin; Ehret, Uwe; Guthke, Anneli, 2024, "Replication Data for: On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data", https://doi.org/10.18419/darus-4087, DaRUS, V1

Study Description

Citation

Title:

Replication Data for: On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data

Identification Number:

doi:10.18419/darus-4087

Identification Number:

swh:1:dir:84932ba0a47204a2cdbc15d4ba89d75d23cbbc9c; origin=https://github.com/manuel-alvarez-chaves/estimators-paper; visit=swh:1:snp:87bd19d6935c71c902e086a18815777d28233495; anchor=swh:1:rev:d88dffac56bfca7d5115506d56137b4f0f6ed0ad

Authoring Entity:

Alvarez Chaves, Manuel (Universität Stuttgart)

Gupta, Hoshin (The University of Arizona)

Ehret, Uwe (Karlsruhe Institute of Technology)

Guthke, Anneli (Universität Stuttgart)

Grant Number:

EXC 2075 - 390740016

Grant Number:

507884992

Distributor:

DaRUS

Access Authority:

Alvarez Chaves, Manuel

Access Authority:

Guthke, Anneli

Depositor:

Alvarez Chaves, Manuel

Date of Deposit:

2024-03-08

Holdings Information:

https://doi.org/10.18419/darus-4087

Study Scope

Keywords:

Computer and Information Science, Engineering, Mathematical Sciences, Other, Information Theory, Non-parametric Statistics

Abstract:

<h1 id="non-parametric-estimation-in-information-theory">Non-Parametric Estimation in Information Theory</h1>
<h2 id="1-introduction">1. Introduction</h2>
<p>This is the repository for our paper &quot;On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data&quot;.</p>
<p>The project is organized as follows:</p>
<pre><code>├── analysis_results\
│   ├── plots\
├── data_evaluation\
│   ├── data\
│   ├── notebooks\
│   ├── results\
│   ├── utils\
│   ├── (...) scripts
├── data_generation\
├── README.md
└── .gitignore
</code></pre>
<h2 id="2-installation">2. Installation</h2>
<p>The code was written in <code>Python 3.11.5</code> but should be compatible with later versions and with earlier versions down to <code>Python 3.6</code>. Check the <code>requirements.txt</code> file for any dependency issues.</p>
<p>We recommend cloning the repository to a local directory and setting up the required environment using <code>venv</code> and <code>pip</code>. The activation path below is the Windows layout; on Linux or macOS the script is at <code>.venv/bin/activate</code>.</p>
<pre><code class="lang-shell">python -m venv .venv
source .venv/Scripts/activate
pip install -r requirements.txt
</code></pre>
<h2 id="3-generating-data">3. Generating Data</h2>
<p>First, the data is generated with the script in the <code>data_generation/</code> directory and stored in the <code>data_evaluation/data</code> directory. The data for the experiments is stored as an HDF5 database.</p>
<p>From the root directory:</p>
<pre><code class="lang-python">python data_generation/data_generation.py
</code></pre>
<p><strong>Note</strong>: because the <code>data.hdf5</code> file is ~123 GB, we recommend generating it locally. This takes about 12 hours on an Intel Xeon E5-26280 v2 and should not vary much on any modern CPU.</p>
<h2 id="4-conducting-an-evaluation">4. Conducting an Evaluation</h2>
<p>The scripts in the <code>data_evaluation/</code> directory read the data and perform the experiments. Results are stored in the <code>results/</code> directory.</p>
<p>Again, from the root directory:</p>
<pre><code class="lang-python">python data_evaluation/eval_bin_entropy.py
</code></pre>
<p>All script names have the format <code>eval_{estimator}_{quantity}.py</code>. In total, 12 scripts must be run, three for each estimator: binning, KDE, numerical integration of KDE and <em>k</em>-NN.</p>
<p>The <code>notebooks/</code> directory serves as an archive of the development of the workflow used to test each estimator. The contents of each notebook are generally the same as the code in the scripts. Log files describe the history of the project.</p>
<h2 id="5-visualizing-results">5. Visualizing Results</h2>
<p>The <code>analysis_results</code> directory contains a notebook that creates the plots used in the paper, as well as a script that reads the log files and calculates the time per iteration of the different experiments.</p>
<p>The plots are generated from the results in the <code>data_evaluation/results</code> directory. Results are read from <code>.hdf5</code> files.</p>
<h3 id="promotion">All results were produced using the <a href="https://github.com/manuel-alvarez-chaves/unite_toolbox">UNITE Toolbox</a>.</h3>
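
The naming scheme eval_{estimator}_{quantity}.py described in the abstract can be enumerated programmatically. The following is a minimal sketch, not part of the deposited code: the abbreviations (bin, kde, ikde, knn; entropy, kld, mi) are taken from the script names listed in this record, while the helper names evaluation_scripts and build_commands are hypothetical and introduced only for illustration.

```python
from itertools import product
from pathlib import Path

# Abbreviations as they appear in the deposited script names
# (eval_bin_entropy.py, eval_ikde_kld.py, eval_knn_mi.py, ...).
ESTIMATORS = ("bin", "kde", "ikde", "knn")
QUANTITIES = ("entropy", "kld", "mi")


def evaluation_scripts(script_dir="data_evaluation"):
    """Return the 12 script paths (hypothetical helper), estimator by estimator."""
    return [
        Path(script_dir) / f"eval_{estimator}_{quantity}.py"
        for estimator, quantity in product(ESTIMATORS, QUANTITIES)
    ]


def build_commands(python="python"):
    """Build one command per script, meant to be run from the repository root."""
    return [[python, script.as_posix()] for script in evaluation_scripts()]


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    for command in build_commands():
        print(" ".join(command))
```

To actually launch the evaluations, each command could be passed to subprocess.run(command, check=True) from the repository root, once data.hdf5 has been generated.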

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Álvarez Chaves, Manuel, Gupta, Hoshin V., Ehret, Uwe and Guthke, Anneli. On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data. Entropy 2024, 26(5), 387

Identification Number:

10.3390/e26050387

Bibliographic Citation:

Álvarez Chaves, Manuel, Gupta, Hoshin V., Ehret, Uwe and Guthke, Anneli. On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data. Entropy 2024, 26(5), 387

Other Study-Related Materials

Label:

requirements.txt

Notes:

text/plain

Other Study-Related Materials

Label:

compute_time.py

Notes:

text/x-python

Other Study-Related Materials

Label:

density_uniform.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

plotting.py

Notes:

text/x-python

Other Study-Related Materials

Label:

plot_evaluation.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

plot_style.txt

Notes:

text/plain

Other Study-Related Materials

Label:

evaluation-10d-gaussian.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-4d-gaussian.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-bivariate-normal-mixture.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-bivariate-normal.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-gexp.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-normal-mixture.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-normal.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

evaluation-uniform.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

eval_bin_entropy.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_bin_kld.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_bin_mi.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_ikde_entropy.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_ikde_kld.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_ikde_mi.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_kde_entropy.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_kde_kld.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_kde_mi.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_knn_entropy.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_knn_kld.py

Notes:

text/x-python

Other Study-Related Materials

Label:

eval_knn_mi.py

Notes:

text/x-python

Other Study-Related Materials

Label:

data.hdf5

Text:

Sample data file.

Notes:

application/x-hdf5

Other Study-Related Materials

Label:

ikde_entropy_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

ikde_kld_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

ikde_mi_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

knn_entropy_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

knn_kld_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

knn_mi_dev.ipynb

Notes:

application/x-ipynb+json

Other Study-Related Materials

Label:

bin.hdf5

Notes:

text/x-hdf5

Other Study-Related Materials

Label:

bin_entropy.log

Notes:

text/plain

Other Study-Related Materials

Label:

bin_kld.log

Notes:

text/plain

Other Study-Related Materials

Label:

bin_mi.log

Notes:

text/plain

Other Study-Related Materials

Label:

ikde.hdf5

Notes:

text/x-hdf5

Other Study-Related Materials

Label:

ikde_entropy.log

Notes:

text/plain

Other Study-Related Materials

Label:

ikde_kld.log

Notes:

text/plain

Other Study-Related Materials

Label:

ikde_mi.log

Notes:

text/plain

Other Study-Related Materials

Label:

kde.hdf5

Notes:

text/x-hdf5

Other Study-Related Materials

Label:

kde_entropy.log

Notes:

text/plain

Other Study-Related Materials

Label:

kde_kld.log

Notes:

text/plain

Other Study-Related Materials

Label:

kde_mi.log

Notes:

text/plain

Other Study-Related Materials

Label:

knn.hdf5

Notes:

text/x-hdf5

Other Study-Related Materials

Label:

knn_entropy.log

Notes:

text/plain

Other Study-Related Materials

Label:

knn_kld.log

Notes:

text/plain

Other Study-Related Materials

Label:

knn_mi.log

Notes:

text/plain

Other Study-Related Materials

Label:

base_evaluator.py

Notes:

text/x-python

Other Study-Related Materials

Label:

bin_evaluators.py

Notes:

text/x-python

Other Study-Related Materials

Label:

kde_evaluators.py

Notes:

text/x-python

Other Study-Related Materials

Label:

knn_evaluators.py

Notes:

text/x-python

Other Study-Related Materials

Label:

tools.py

Notes:

text/x-python

Other Study-Related Materials

Label:

__init__.py

Notes:

text/x-python

Other Study-Related Materials

Label:

data_generation.log

Notes:

text/plain

Other Study-Related Materials

Label:

data_generation.py

Notes:

text/x-python

Other Study-Related Materials

Label:

utils.py

Notes:

text/x-python