@@ -35,50 +35,50 @@ See [`CONTRIBUTING.md`][org-contrib].
 
 
 ### The three phases of generating a report
-- **1-Fetch**: This phase involves collecting data from a particular source
-  using its API. Before writing any code, we plan the analyses we want to
-  perform by asking meaningful questions about the data. We also consider API
-  limitations (such as query limits) and design a query strategy to work within
-  these limitations. Then we write a python script that gets the data, it is
-  quite important to follow the format of the scripts existing in the project
-  and use the modules and functions where applicable. It ensures consistency in
-  the scripts and we can easily debug issues might arise.
-  - **Meaningful questions**
-    - The reports generated by this project (and the data fetched and processed
-      to support it) seeks to be meaningful. We hope this project will provide
-      data and analysis that helps inform discussions about the commons--the
-      collection of works that are openly licensed or in the public domain.
-
-      The goal of this project is to help answer questions like:
-      - How has the world's use of the commons changed over time?
-      - How is the knowledge and culture of the commons distributed?
-      - Who has access (and how much) to the commons?
-      - What significant trends can be observed in the commons?
-      - Which public domain dedication or licenses are the most popular?
-      - What are the correlations between public domain dedication or licenses
-        and region, language, domain/endeavor, etc.?
-  - **Limitations of an API**
-    - Some data sources provide APIs with query limits (it can be daily or
-      hourly) depending on what is given in the documentation. This restricts
-      how many requests that can be made in the specified period of time. It is
-      important to plan a query strategy and schedule fetch jobs to stay within
-      the allowed limits.
-  - **Headings of data in 1-fetch**
-    - [Tool identifier][tool-identifier]: A unique identifier used to
-      distinguish each Creative Commons legal tool within the dataset. This
-      helps ensure consistency when tracking tools across different data
-      sources.
-    - [SPDX identifier][spdx-identifier]: A standardized identifier maintained
-      by the Software Package Data Exchange (SPDX) project. It provides a
-      consistent way to reference licenses in applications.
-- **2-Process**: In this phase, the fetched data is transformed into a
-  structured and standardized format for analysis. The data is then analyzed
-  and categorized based on defined criteria to extract insights that answer the
-  meaningful questions identified during the 1-fetch phase.
-- **3-report**: This phase focuses on presenting the results of the analysis. We generate graphs and summaries that clearly show trends, patterns, and
-  distributions in the data. These reports help communicate key insights about
-  the size, diversity, and characteristics of openly licensed and public-domain
-  works.
+
+1. **1-Fetch**: This phase involves collecting data from a particular source
+   using its API. Before writing any code, we plan the analyses we want to
+   perform by asking meaningful questions about the data. We also consider API
+   limitations (such as query limits) and design a query strategy to work
+   within these limitations. Then we write a Python script that gets the
+   data. It is important to follow the format of the existing scripts in the
+   project and to use its modules and functions where applicable. This keeps
+   the scripts consistent and makes any issues that arise easier to debug.
+   - **Meaningful questions**
+     - The reports generated by this project (and the data fetched and
+       processed to support them) seek to be meaningful. We hope this project
+       will provide data and analysis that help inform discussions about the
+       commons. The goal of this project is to help answer questions like:
+       - How has the world's use of the commons changed over time?
+       - How are the knowledge and culture of the commons distributed?
+       - Who has access (and how much) to the commons?
+       - What significant trends can be observed in the commons?
+       - Which public domain dedication or licenses are the most popular?
+       - What are the correlations between public domain dedication or licenses
+         and region, language, domain/endeavor, etc.?
+   - **Limitations of an API**
+     - Some data sources provide APIs with query limits (daily or hourly,
+       depending on the source's documentation). These limits restrict how
+       many requests can be made in the specified period of time. It
+       is important to plan a query strategy and schedule fetch jobs to stay
+       within the allowed limits.
+   - **Headings of data in 1-fetch**
+     - [Tool identifier][tool-identifier]: A unique identifier used to
+       distinguish each Creative Commons legal tool within the dataset. This
+       helps ensure consistency when tracking tools across different data
+       sources.
+     - [SPDX identifier][spdx-identifier]: A standardized identifier maintained
+       by the Software Package Data Exchange (SPDX) project. It provides a
+       consistent way to reference licenses in applications.
+2. **2-Process**: In this phase, the fetched data is transformed into a
+   structured and standardized format for analysis. The data is then analyzed
+   and categorized based on defined criteria to extract insights that answer
+   the meaningful questions identified during the 1-fetch phase.
+3. **3-report**: This phase focuses on presenting the results of the analysis.
+   We generate graphs and summaries that clearly show trends, patterns, and
+   distributions in the data. These reports help communicate key insights about
+   the size, diversity, and characteristics of openly licensed and public
+   domain works.
 
 [tool-identifier]: https://creativecommons.org/share-your-work/cclicenses/
 [spdx-identifier]: https://spdx.org/licenses/
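
The "Limitations of an API" item in this change asks contributors to plan a query strategy that stays within a source's query limits. One simple way to do that is to space requests evenly across the limit period. The sketch below is only an illustration of that idea: `min_interval` and `paced` are hypothetical helpers, not functions from this project, and the limit values are made-up examples rather than figures from any real API.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def min_interval(limit: int, period_seconds: float) -> float:
    """Return the even spacing (in seconds) for `limit` requests per period."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return period_seconds / limit


def paced(
    items: Iterable[T],
    limit: int,
    period_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield items no faster than one per `period_seconds / limit` seconds.

    `sleep` and `clock` are injectable so the pacing can be tested
    without actually waiting.
    """
    interval = min_interval(limit, period_seconds)
    next_allowed = clock()
    for item in items:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        # Schedule the next item one interval after this one is released.
        next_allowed = max(next_allowed, clock()) + interval
        yield item
```

A fetch script could wrap its list of queries as `for query in paced(queries, limit=100, period_seconds=86400): ...` and make one request per iteration. Even spacing is a conservative choice; a source whose documentation allows bursts within the period might be better served by a different strategy.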