The Categorical Data Map - Replication Data (doi:10.18419/darus-3372)

Document Description

Citation

Title:

The Categorical Data Map - Replication Data

Identification Number:

doi:10.18419/darus-3372

Distributor:

DaRUS

Date of Distribution:

2024-01-26

Version:

1

Bibliographic Citation:

Dennig, Frederik L.; Joos, Lucas; Paetzold, Patrick; Blumberg, Daniela; Deussen, Oliver; Keim, Daniel; Fischer, Maximilian T., 2024, "The Categorical Data Map - Replication Data", https://doi.org/10.18419/DARUS-3372, DaRUS, V1, UNF:6:4NrkBxJKpeeQqsRmi8XRPw== [fileUNF]

Study Description

Citation

Title:

The Categorical Data Map - Replication Data

Identification Number:

doi:10.18419/darus-3372

Authoring Entity:

Dennig, Frederik L. (Universität Konstanz)

Joos, Lucas (Universität Konstanz)

Paetzold, Patrick (Universität Konstanz)

Blumberg, Daniela (Universität Konstanz)

Deussen, Oliver (Universität Konstanz)

Keim, Daniel (Universität Konstanz)

Fischer, Maximilian T. (Universität Konstanz)

Grant Number:

251654672

Distributor:

DaRUS

Access Authority:

Dennig, Frederik L.

Access Authority:

Keim, Daniel

Depositor:

Dennig, Frederik L.

Date of Deposit:

2023-03-04

Holdings Information:

https://doi.org/10.18419/DARUS-3372

Study Scope

Keywords:

Computer and Information Science, Human-Centered Computing, Visualization Design and Evaluation Methods

Abstract:

The source code and datasets used in our experiments are shared for replication purposes alongside our publication "The Categorical Data Map". We describe each of the six datasets individually on a per-file basis. All datasets are purely nominal.

Methodology and Processing

Sources Statement

Data Sources:

Dawson R. J. M.: The "unusual episode" data revisited, 1995. http://jse.amstat.org/v3n3/datasets.dawson.html, last accessed 2020-09-18.

Sabri Hassan and Günther Pernul. (2014). Efficiently Managing the Security and Costs of Big Data Storage using Visual Analytics. In Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (iiWAS '14). Association for Computing Machinery, New York, NY, USA, 180-184. https://doi.org/10.1145/2684200.2684333

L. C. Koh, A. Slingsby, J. Dykes and T. S. Kam. (2011). Developing and Applying a User-Centered Model for the Design and Implementation of Information Visualization Tools. In 2011 15th International Conference on Information Visualisation, pp. 90-95. https://doi.org/10.1109/IV.2011.32

G. Lincoff and N. A. Society. (1981). National Audubon Society field guide to North American mushrooms, ser. Audubon Society field guide series. Knopf: Distributed by Random House, New York.

K. Rogers, J. Wiles, S. Heath, K. Hensby and J. Taufatofua. (2016). Discovering patterns of touch: A case study for visualization-driven analysis in Human-Robot Interaction. In 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 499-500. https://doi.org/10.1109/HRI.2016.7451825

Y. Yano, R. G. Kula, T. Ishio and Katsuro Inoue. (2015). VerXCombo: An interactive data visualization of popular library version combinations. In Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension, pp. 291-294. https://doi.org/10.1109/ICPC.2015.43

Data Access

Other Study Description Materials

File Description--f275662

File: lincoff.tab

  • Number of cases: 8124

  • No. of variables per record: 23

  • Type of File: text/tab-separated-values

Notes:

UNF:6:rClruQR9kTSyr0NbJisRNA==

A mushroom dataset with 23 attributes and 8124 category combinations by G. Lincoff and N. A. Society. (1981). National Audubon Society field guide to North American mushrooms, ser. Audubon Society field guide series. Knopf: Distributed by Random House, New York.

File Description--f198837

File: dawson.tab

  • Number of cases: 2201

  • No. of variables per record: 4

  • Type of File: text/tab-separated-values

Notes:

UNF:6:n0hJE5dgUmOgW4TXbPH6jw==

The well-known Titanic dataset from Dawson (1995) [Daw95]. [Daw95] Dawson R. J. M.: The "unusual episode" data revisited, 1995. http://jse.amstat.org/v3n3/datasets.dawson.html, last accessed 2020-09-18.

File Description--f198838

File: hassan.tab

  • Number of cases: 647

  • No. of variables per record: 4

  • Type of File: text/tab-separated-values

Notes:

UNF:6:452w0FICojAEWwAMO7rPdQ==

A categorical dataset reconstructed from Hassan and Pernul [HP14] describing data storage security and cost data. It is manually reconstructed from the Parallel Sets visualization in Figure 2 of the publication. The reconstruction method is described in the Process Metadata. The publication does not provide a source for the underlying data. [HP14] Sabri Hassan and Günther Pernul. 2014. Efficiently Managing the Security and Costs of Big Data Storage using Visual Analytics. In Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (iiWAS '14). Association for Computing Machinery, New York, NY, USA, 180–184. https://doi.org/10.1145/2684200.2684333

File Description--f198839

File: koh.tab

  • Number of cases: 1077

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:jm91tihAXgNNRcyGYtStlA==

A categorical dataset reconstructed from Koh et al. [KSDK11] describing property sales information from Singapore. It is manually reconstructed from the Parallel Sets visualization in Figure X of the publication by Koh et al. The reconstruction method is described in the Process Metadata. The publication does not provide a source for the underlying data. [KSDK11] L. C. Koh, A. Slingsby, J. Dykes and T. S. Kam, "Developing and Applying a User-Centered Model for the Design and Implementation of Information Visualization Tools," 2011 15th International Conference on Information Visualisation, 2011, pp. 90-95, doi: 10.1109/IV.2011.32

File Description--f198845

File: rogers-1.tab

  • Number of cases: 1013

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:feip2wKBPgQTWR5JZ3oK8g==

The first categorical dataset reconstructed from Rogers et al. [RWH*16], describing the results of an HCI study. It is manually reconstructed from the Parallel Sets visualization in Figure 1 (a) of the publication. The reconstruction method is described in the Process Metadata. The publication does not provide a source for the underlying data. [RWH*16] K. Rogers, J. Wiles, S. Heath, K. Hensby and J. Taufatofua, "Discovering patterns of touch: A case study for visualization-driven analysis in Human-Robot Interaction," 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2016, pp. 499-500, doi: 10.1109/HRI.2016.7451825.

File Description--f198844

File: rogers-2.tab

  • Number of cases: 1012

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:UPfzwP4EP7X8E9Uk4KxFbQ==

The second categorical dataset reconstructed from Rogers et al. [RWH*16], describing the results of an HCI study. It is manually reconstructed from the Parallel Sets visualization in Figure 1 (b) of the publication. The reconstruction method is described in the Process Metadata. The publication does not provide a source for the underlying data. [RWH*16] K. Rogers, J. Wiles, S. Heath, K. Hensby and J. Taufatofua, "Discovering patterns of touch: A case study for visualization-driven analysis in Human-Robot Interaction," 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2016, pp. 499-500, doi: 10.1109/HRI.2016.7451825.
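
The case and variable counts in the file descriptions above can be used to sanity-check a downloaded copy. The following is a minimal Python sketch, not part of the replication package: the helper names `read_tab` and `check_shape` are hypothetical, the first line of each file is assumed to hold the variable names, and the sample rows are placeholders rather than real records.

```python
import csv
import io

# (number of cases, variables per record) as listed in the file descriptions.
EXPECTED_SHAPE = {
    "lincoff.tab": (8124, 23),
    "dawson.tab": (2201, 4),
    "hassan.tab": (647, 4),
    "koh.tab": (1077, 3),
    "rogers-1.tab": (1013, 3),
    "rogers-2.tab": (1012, 3),
}


def read_tab(handle):
    """Read tab-separated text; the first line is assumed to be the header."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    return header, rows


def check_shape(filename, header, rows):
    """Return a list of mismatches between the data and the codebook counts."""
    cases, variables = EXPECTED_SHAPE[filename]
    problems = []
    if len(header) != variables:
        problems.append(f"{filename}: {len(header)} variables, expected {variables}")
    if len(rows) != cases:
        problems.append(f"{filename}: {len(rows)} cases, expected {cases}")
    ragged = sum(1 for row in rows if len(row) != len(header))
    if ragged:
        problems.append(f"{filename}: {ragged} rows differ in length from the header")
    return problems


# Placeholder rows (NOT taken from the dataset) under the dawson.tab variable names.
sample = io.StringIO("Class\tAge\tSex\tSurvived\na\tb\tc\td\ne\tf\tg\th\n")
header, rows = read_tab(sample)
problems = check_shape("dawson.tab", header, rows)
print(problems)  # the two-row sample is reported as having too few cases
```

For a real file, the in-memory sample would be replaced by something like `open("dawson.tab", newline="", encoding="utf-8")`; an empty list from `check_shape` would then indicate that the copy matches the counts in this codebook.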

Variable Description

List of Variables:

Variables

class

f275662 Location:

Variable Format: character

Notes: UNF:6:xc5j/dXovlqFbU0VFzNTaw==

cap-shape

f275662 Location:

Variable Format: character

Notes: UNF:6:iH4Kx+0XAfk4toDYY3ejkw==

cap-surface

f275662 Location:

Variable Format: character

Notes: UNF:6:Qmi5ZLNpmhjodgnw1ur2Sg==

cap-color

f275662 Location:

Variable Format: character

Notes: UNF:6:TSsxtS/t+SEeMAQysaAAGA==

bruises

f275662 Location:

Variable Format: character

Notes: UNF:6:DSs5k/FYCZ3E/xdSUBk7dg==

odor

f275662 Location:

Variable Format: character

Notes: UNF:6:8lmK/DBZECV24gnyudZsjg==

gill-attachment

f275662 Location:

Variable Format: character

Notes: UNF:6:urrQiM8HGWtp3KaDFTZlcQ==

gill-spacing

f275662 Location:

Variable Format: character

Notes: UNF:6:okzuXXlGf1GVjzv+JBJ6gQ==

gill-size

f275662 Location:

Variable Format: character

Notes: UNF:6:xsyKIciAbSTI7LLCNhymTA==

gill-color

f275662 Location:

Variable Format: character

Notes: UNF:6:avE50+9tq/xpRd5c2pek4Q==

stalk-shape

f275662 Location:

Variable Format: character

Notes: UNF:6:InBnRhcKmpiqYVMRRsblMg==

stalk-root

f275662 Location:

Variable Format: character

Notes: UNF:6:3C519ckOvHNteASl2HvieQ==

stalk-surface-above-ring

f275662 Location:

Variable Format: character

Notes: UNF:6:3DzJKo+wM/5mO+QSAkYhMQ==

stalk-surface-below-ring

f275662 Location:

Variable Format: character

Notes: UNF:6:xYwQcLCeMwj1jHBFTmFbQQ==

stalk-color-above-ring

f275662 Location:

Variable Format: character

Notes: UNF:6:KGpkSv/stLWAC+KLGbJZww==

stalk-color-below-ring

f275662 Location:

Variable Format: character

Notes: UNF:6:Rl44imugARjzxIv+umFrVg==

veil-type

f275662 Location:

Variable Format: character

Notes: UNF:6:KANc9/u/ic6etxMu7ZE/sw==

veil-color

f275662 Location:

Variable Format: character

Notes: UNF:6:+R0A4fmkHlew9WT78yX1/g==

ring-number

f275662 Location:

Variable Format: character

Notes: UNF:6:tQOUH6do2GT0vnGc6gd3RQ==

ring-type

f275662 Location:

Variable Format: character

Notes: UNF:6:9OOHqzSpMFUDllNmTJstHg==

spore-print-color

f275662 Location:

Variable Format: character

Notes: UNF:6:iqcPRMAaMIQ+DKRF2hgrjg==

population

f275662 Location:

Variable Format: character

Notes: UNF:6:9PybZtl7UA1Fc8Ip/o/z/A==

habitat

f275662 Location:

Variable Format: character

Notes: UNF:6:ANy/nkIFpX8t1gGBd9U11Q==

Class

f198837 Location:

Variable Format: character

Notes: UNF:6:fL3ximNCrflgdWeb5vmz6Q==

Age

f198837 Location:

Variable Format: character

Notes: UNF:6:d6bmC2ZhVc5MR9w24dL2aQ==

Sex

f198837 Location:

Variable Format: character

Notes: UNF:6:8vs3CwXbLockFHgxC3yhFA==

Survived

f198837 Location:

Variable Format: character

Notes: UNF:6:o5Az31Iv5ED+BZJoanr/MA==

Value

f198838 Location:

Variable Format: character

Notes: UNF:6:27JSU1UGN/b8r8suvK77xA==

Sensitivity

f198838 Location:

Variable Format: character

Notes: UNF:6:ipWmLNSHe+BtwMwmrJb+lw==

Region

f198838 Location:

Variable Format: character

Notes: UNF:6:cqxpqUSwoQlIDeRgVu/cOg==

Costs

f198838 Location:

Variable Format: character

Notes: UNF:6:Fir+x7n5C8LRLyec3W1heQ==

Purchaser Currently Living In

f198839 Location:

Variable Format: character

Notes: UNF:6:TUT/sKpWxHeUMr2eMkEJ8w==

Property Type Purchased

f198839 Location:

Variable Format: character

Notes: UNF:6:7ZHUak/73XyjNkaBMvmEjw==

Location of Purchased Property

f198839 Location:

Variable Format: character

Notes: UNF:6:SiLlk9ClmCQWLN/zcZyUpA==

Participant

f198845 Location:

Variable Format: character

Notes: UNF:6:rpqa3M5983IAT1IzzqbaMA==

Origin

f198845 Location:

Variable Format: character

Notes: UNF:6:3O2S26vfiU+hwJtjoedlnw==

Touch Location

f198845 Location:

Variable Format: character

Notes: UNF:6:FTRzVb7iIRbvr4OLCdqRsg==

Participant

f198844 Location:

Variable Format: character

Notes: UNF:6:yR5yha1iRo918S6knLTctw==

Origin

f198844 Location:

Variable Format: character

Notes: UNF:6:O9l9ubkE3MbYj7dz+DiAyA==

Touch Location

f198844 Location:

Variable Format: character

Notes: UNF:6:v46uKjM7LwVh7SQAK57XyQ==
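
Every variable listed above is stored as character data, in line with the abstract's statement that the datasets are purely nominal. A simple way to get an overview of such data is to count how often each category occurs per variable. The sketch below assumes the rows have already been parsed into lists of strings; the function name `category_counts` and the sample values are hypothetical and only illustrate the idea.

```python
from collections import Counter


def category_counts(header, rows):
    """Count category frequencies for each nominal variable."""
    counts = {name: Counter() for name in header}
    for row in rows:
        for name, value in zip(header, row):
            counts[name][value] += 1
    return counts


# Variable names as listed for rogers-1.tab; the values are placeholders,
# not real records from the dataset.
header = ["Participant", "Origin", "Touch Location"]
rows = [
    ["p1", "o1", "t1"],
    ["p1", "o2", "t1"],
    ["p2", "o1", "t2"],
]

counts = category_counts(header, rows)
for name in header:
    print(name, len(counts[name]), "categories:", dict(counts[name]))
```

The number of distinct categories per variable gives a quick impression of how many groups a visualization of the file would have to show.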

Other Study-Related Materials

Label:

categorical-data-map.zip

Text:

Code of the research prototype for "The Categorical Data Map". All required source code and data are packaged in `categorical-data-map.zip` and ready for development and deployment. Software Dependencies: The code in this component has been tested with *Docker version 24.0.5, build 24.0.5-0ubuntu1~22.04.1* and *docker-compose version 1.29.2*. All other dependencies are handled by *Docker*. Development: To run the development build, execute `./deploy-dev` and navigate to http://127.0.0.1:3000 in your browser. Deployment: To deploy, run `./deploy-prod` and navigate to http://127.0.0.1:3000 in your browser.

Notes:

application/zip
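
After starting the prototype from `categorical-data-map.zip` as described above, the containers may need some time before the page at http://127.0.0.1:3000 responds. The following Python sketch polls that address until it answers. It is an illustration under stated assumptions, not part of the package: the helper names `is_up` and `wait_until_up`, the number of attempts, and the delay are all hypothetical choices.

```python
import time
import urllib.error
import urllib.request

PROTOTYPE_URL = "http://127.0.0.1:3000"  # address named in the instructions


def is_up(url, timeout=2.0):
    """Return True if an HTTP request to url receives any response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, or similar


def wait_until_up(url, attempts=30, delay=2.0, probe=is_up, sleep=time.sleep):
    """Probe url up to `attempts` times; return the attempt that succeeded or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)
    return None
```

Calling `wait_until_up(PROTOTYPE_URL)` after launching the deploy script returns the number of the first successful attempt, or `None` if the page never responded. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.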

Other Study-Related Materials

Label:

Instructional_Video.mp4

Text:

An instructional video demonstrating the prototype.

Notes:

video/mp4

Other Study-Related Materials

Label:

yano.csv

Text:

A categorical dataset reconstructed from Yano et al. [YKII15] describing software dependencies. It is manually reconstructed from the Parallel Sets visualization in Figure 4 of the publication by Yano et al. The reconstruction method is described in the Process Metadata. The publication does not provide a source for the underlying data. [YKII15] Y. Yano, R. G. Kula, T. Ishio and Katsuro Inoue. (2015). VerXCombo: An interactive data visualization of popular library version combinations. In Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension, pp. 291-294. doi: 10.1109/ICPC.2015.43

Notes:

text/csv