CONTRIBUTE.md
+35 -1 (35 additions & 1 deletion)
@@ -46,6 +46,40 @@ SOCR/SOCRAT-issues#1
46
46
47
47
## SOCR Datasets for testing
48
48
49
-
**Iris** - famous dataset in machine learning [1]. To be able to use it locally for testing, download [CSV file](https://drive.google.com/file/d/0BzJubeARG-hsdTdRTC03RFdhRTg/view?usp=sharing) and place it under ``_build/datasets/iris.csv``
49
+
To use any dataset locally for testing, download its CSV file from the provided link and place it under ``_build/datasets/`` (for example, ``_build/datasets/iris.csv``). See the SOCR Data description page for details.
50
+
51
+
**Iris** - The data set contains 3 classes (Iris types), each containing 50 observations (for a total of 150 observations). One class is linearly separable from the other two, but the latter are not linearly separable from each other. The class of the iris plant may be used as the predicted variable. There are 4 variables which can be used as predictive (explanatory) attributes of the Iris class [1] | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/iris.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_052511_IrisSepalPetalClasses)
52
+
53
+
**Simulated SOCR Knee Pain Centroid Location Data** - This simulated data represents the centroid locations for the hypothetical knee-pain locations for 8666 subjects. The data includes the X and Y coordinates of the centroids and a label for the view (left/right and front/back of the knee) [2] | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/knee_pain_data.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_KneePainData_041409)
54
+
55
+
**Neuroimaging study of 27 Alzheimer's disease (AD) subjects, 35 normal controls (NC), and 42 mild cognitive impairment subjects (MCI)** - This is a large neuroimaging study using automated volumetric data processing to obtain different shape and volume measures of local anatomy [3]. The subject population is derived from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database and includes 27 Alzheimer's disease (AD) subjects, 35 normal controls (NC), and 42 mild cognitive impairment subjects (MCI) | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Global_Cortical_Surface_Curvedness_AD_NC_MCI.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_July2009_ID_NI)
56
+
57
+
**Neuroimaging study of Prefrontal Cortex Volume across Species** - The prefrontal cortex is the anterior part of the frontal lobes of the brain in front of the premotor areas. Prefrontal cortex includes cytoarchitectonic layer IV and includes three regions: orbitofrontal (OFC), dorsolateral prefrontal cortex (PFC), anterior and ventral cingulate cortex. Human brains are quite distinct from the brains of other primates and apes, specifically in the prefrontal cortex. These structural differences induce significant functional abilities which may account for the significant associating, planning and strategic thinking in humans, compared to other primates [4] | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Prefrontal_Cortex_Volume_across_Species.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_April2009_ID_NI)
58
+
59
+
**Turkiye Student Evaluation Dataset** - A study at Gazi University, Ankara, Turkey (Turkiye) collected data consisting of 5,820 evaluation scores provided by Gazi University students. Each record consists of 5 meta-data attributes and 28 course specific questions (Q1-Q28) [1] | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Turkiye_Student_Evaluation_Data_Set.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_TurkiyeStudentEvalData)
60
+
61
+
**Antarctic Ice Thickness Dataset** - Source: Australian Antarctic Data Centers. Regular measurements of the thickness of the fast ice, and of the snow cover that forms on it, are made through drilled holes at several sites near Mawson, Casey and Davis. The dataset contains 1636 data points | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Antarctic_Ice_Thickness.csv) | [SOCR Data description page](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_042108_Antarctic_IceThicknessMawson)
62
+
63
+
**California Ozone Data** - This dataset includes the ozone levels at 20 geographic locations within the State of California for the period from 1980 to 2006 | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/California_Ozone.csv) | [SOCR Data description page](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_121608_OzoneData)
64
+
65
+
**California Ozone Pollution Data** - This dataset includes the ozone levels within the State of California. It contains data (N=175) about the date of ozone measurement, site-location identifier, latitude, longitude, and the ozone measure (O3) | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/California_Ozone_Pollution.csv) | [SOCR Data description page](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_121608_CA_US_OzoneData)
66
+
67
+
**US Ozone Pollution Data** - This data includes the ozone levels for various locations throughout the US. It contains data (N=175) about the date of Ozone measurement, site-location identifier, latitude, longitude, and the ozone measure (O3) within the United States. | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/US_Ozone_Pollution.csv) | [SOCR Data description page](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_121608_CA_US_OzoneData)
68
+
69
+
**Baseball Players Dataset** - Human height and weight are mostly heritable, but lifestyle (e.g., regular strenuous physical exercise), diet, health and environmental factors also play a role in determining an individual's physical characteristics. The dataset contains 1035 records of heights and weights for some current and recent Major League Baseball (MLB) players. These data were obtained from different resources (e.g., IBM Many Eyes and many other references [5][6]). See also the 25,000 records of adolescent height and weight [7] | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Baseball_Players.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_MLB_HeightsWeights)
70
+
71
+
**Countries Rankings Dataset** - These data represent commonly accepted measures for ranking countries on a variety of factors which affect the internal and external (international) perception of a country's rank relative to the rest of the world | [CSV file](http://socr-dev.nursing.umich.edu:3000/datasets/Countries_Rankings.csv) | [SOCR Data description page](http://wiki.socr.umich.edu/index.php/SOCR_Data_2008_World_CountriesRankings)
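The download step for the datasets listed above can be sketched as a small shell helper. This is only an illustration, not part of the SOCRAT tooling: the function names `dataset_dest` and `fetch_dataset` are hypothetical, and it assumes `curl` is installed and that the script runs from the repository root.

```shell
#!/bin/sh
# Hypothetical helper (not part of SOCRAT): place a dataset CSV under _build/datasets/.
DATASET_DIR="_build/datasets"

# Print the local destination path for a dataset URL.
dataset_dest() {
  # ${1##*/} strips everything up to the last "/", leaving the file name.
  printf '%s/%s\n' "$DATASET_DIR" "${1##*/}"
}

# Download the CSV at the given URL into the datasets directory.
fetch_dataset() {
  mkdir -p "$DATASET_DIR"
  # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
  curl -fsSL "$1" -o "$(dataset_dest "$1")"
}
```

For example, `fetch_dataset "http://socr-dev.nursing.umich.edu:3000/datasets/iris.csv"` would save the file as `_build/datasets/iris.csv`, provided the host is reachable.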
50
72
51
73
[1] Lichman, M. (2013). [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml). Irvine, CA: University of California, School of Information and Computer Science.
74
+
75
+
[2] SOCR Data KneePainData 041409. http://wiki.socr.umich.edu/index.php/SOCR_Data_KneePainData_041409
76
+
77
+
[3] Dinov ID, Van Horn JD, Lozev KM, Magsipoc R, Petrosyan P, Liu Z, MacKenzie-Graham A, Eggert P, Parker DS and Toga AW (2009) Efficient, Distributed and Interactive Neuroimaging Data Analysis using the LONI Pipeline. Front. Neuroinform. 3:22. [doi:10.3389/neuro.11.022.2009](https://doi.org/10.3389/neuro.11.022.2009), published online: 20 July 2009.
78
+
79
+
[4] Schoenemann, PT., Sheehan, MJ., Glotzer, DL. (2005) Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nature Neuroscience, 8, 242–252. [doi:10.1038/nn1394](https://doi.org/10.1038/nn1394)
80
+
81
+
[5] Jarron M. Saint Onge, Patrick M. Krueger, Richard G. Rogers. (2008) Historical trends in height, weight, and body mass: Data from U.S. Major League Baseball players, 1869-1983, Economics & Human Biology, Volume 6, Issue 3, Symposium on the Economics of Obesity, December 2008, Pages 482-488, ISSN 1570-677X, DOI: 10.1016/j.ehb.2008.06.008.
82
+
83
+
[6] Jarron M. Saint Onge, Richard G. Rogers, Patrick M. Krueger. (2008) Major League Baseball Players' Life Expectancies, Southwestern Social Science Association, Volume 89, Issue 3, pages 817–830, DOI: 10.1111/j.1540-6237.2008.00562.x.
84
+
85
+
[7] 25,000 records of adolescent height and weight. http://wiki.socr.umich.edu/index.php/SOCR_Data_Dinov_020108_HeightsWeights
**Note: the project is under development: new features are in pending Pull Requests, unit tests are currently not passing, and bugs are possible.**
12
10
13
11
Installation
14
12
------------
15
-
In case you wish to create your own module or contribute to the project, follow these steps to setup your environment.
13
+
In case you wish to run SOCRAT locally, or create your own module, or contribute
14
+
to the project, follow these steps to set up your environment.
16
15
17
16
First, install [Node.js](http://nodejs.org/) if you haven't yet. `npm` is the package manager for `Node.js` and comes bundled with it.
18
17
@@ -24,42 +23,56 @@ Clone the repository:
24
23
25
24
$> git clone https://github.com/SOCR/SOCRAT.git
26
25
$> cd SOCRAT
27
-
28
-
If you're interested in latest changes or want to contribute to the project, switch to the `dev` branch:
26
+
27
+
Switch to the `dev` branch to see the latest changes or to contribute to the project:
29
28
30
29
$> git checkout dev
31
30
$> git pull
32
31
33
-
Now, lets install all the dependencies:
32
+
Now, install all the dependencies:
34
33
35
34
$> npm install
36
35
37
-
This will install all the dependencies mentioned in package.json files.
36
+
After that, build the project and start the web server:
37
+
38
+
$> npm run build
39
+
$> node server.js
40
+
41
+
Now you should be able to access SOCRAT at `localhost:3000`.
38
42
39
-
Start the development server and see the application running at `localhost:8080`:
43
+
Start the development server with:
40
44
41
45
$> npm run serve
42
46
47
+
You will see the application running at `localhost:8080` and the page will live
48
+
reload on saved changes in source code.
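The two ways of running SOCRAT described above use different ports, so a quick reachability check can help confirm which one is up. The following is a minimal sketch, assuming `curl` is available; the function names `socrat_url` and `check_socrat` are hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical helper (not part of SOCRAT): map a run mode to its local URL.
socrat_url() {
  case "$1" in
    build) echo "http://localhost:3000" ;;  # npm run build + node server.js
    serve) echo "http://localhost:8080" ;;  # npm run serve (live reload)
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Report whether anything answers at the URL for the given mode.
check_socrat() {
  url=$(socrat_url "$1") || return 1
  if curl -fsS -o /dev/null "$url"; then
    echo "SOCRAT is reachable at $url"
  else
    echo "nothing is responding at $url" >&2
    return 1
  fi
}
```

Running `check_socrat serve` after `npm run serve` would then report whether the development server is responding.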
49
+
Also see how to [add test datasets](https://github.com/SOCR/SOCRAT/blob/dev/CONTRIBUTE.md#socr-datasets-for-testing) and general [contribution instructions](https://github.com/SOCR/SOCRAT/blob/dev/CONTRIBUTE.md).
43
50
44
-
Motivation
51
+
Motivation
45
52
--------------
46
-
[SOCR](http://socr.umich.edu), Statistics Online Computational Resource has a huge user base who constantly access the educational data present and java tools which use these data to aid in understanding statistics.
47
-
As far as the technology is concerned, currently all the applications are written in `java` and are presented as java applets. The reach of these applications is bottlenecked by technology.
48
-
49
-
Goal
53
+
The modern web is a successful platform for large-scale interactive web applications, including visualizations. Statistics Online Computational Resource ([SOCR](http://socr.umich.edu)) provides a
54
+
web-based collection of tools for interactive modeling and visual data analysis that has a large user base. However, most SOCR applets eventually became practically unavailable to end users as new versions of browsers disabled Java by default in response to numerous vulnerability reports.
55
+
Thus, we designed an open-source platform to build Statistics Online Computational Resource
56
+
Analytical Toolbox (SOCRAT). The platform design defines: (1) a specification for an architecture for building visual analytics (VA) applications with multi-level modularity, and (2) methods for optimizing module
57
+
interaction, reuse, and extension. SOCRAT relies on this platform to integrate a number of data management, analysis, and visualization modules into an easily customizable web application, including interfaces for merging third-party components. This ability allows SOCRAT to balance expressive, interactive and processing capabilities, efficiency, compatibility, and accessibility. Multi-level modularity and declarative specifications enable easy customization of the application, for instance, for a specific project. The online demo shows how SOCRAT can be used for data input, display, and storage, with interactive visualization and analysis.
58
+
For more details see the publication list below.
59
+
60
+
Publications
50
61
------
51
-
The world is going the HTML5 way. Browsers are becoming more powerful.
52
-
We intend to create a toolbox which will serve users on all platforms. We are primarily using `CoffeeScript` (compiles to `JavaScript`) for all the computations and presentation. Given the fact that today’s browsers have powerful javaScript engines (`v8`, `SpiderMonkey`), we perform all the calculations on the browser with no server dependency. File management, database, computation will be performed inside the browser.
53
62
54
-
Technologies/Packages
63
+
If you find our work useful, please cite our paper:
64
+
65
+
Alexandr A. Kalinin, Selvam Palanimalai, and Ivo D. Dinov. 2017. SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications. In Proceedings of HILDA’17, Chicago, IL, USA, May 14, 2017, 6 pages. [DOI:10.1145/3077257.3077262](http://dx.doi.org/10.1145/3077257.3077262)
66
+
67
+
Technologies/Packages
55
68
----------------
56
69
[`CoffeeScript`](http://coffeescript.org/)
57
70
[`Jade`](http://jade-lang.com/)
58
71
[`Less`](http://lesscss.org/)
59
-
[`Webpack`](https://webpack.github.io/)
72
+
[`Webpack`](https://webpack.github.io/)
60
73
[`Node.js`](http://nodejs.org/)
61
74
62
-
Dependencies
75
+
Dependencies
63
76
--------------
64
77
[`Bootstrap`](http://getbootstrap.com/)
65
78
[`jQuery`](https://jquery.com/)
@@ -69,12 +82,12 @@ We intend to create a toolbox which will serve users on all platforms. We are pr