
CSR report

l-gou edited this page Jan 14, 2025 · 31 revisions

This report outlines the corporate social responsibility (CSR) impact of our A.P.RI.L. project. The project enables users to efficiently gather and analyze web data through a user-friendly interface. We have ensured that the project adheres to ethical standards of data collection, minimizes environmental impact, and provides social value through transparency and accessibility.


CSR Strategy and Goals

Ethical Data Use

  • Compliance with Data Privacy Regulations: The project adheres to data privacy laws, such as the General Data Protection Regulation (GDPR), focusing exclusively on publicly available information. No personal or sensitive data is collected, processed, or stored, ensuring privacy and legal compliance. The project aligns with French legal frameworks confirming the legitimacy of text and data mining for research purposes.

    Evidence of Compliance:

    • Scraping only non-personal, publicly accessible data.
    • Conducting regular audits to ensure adherence to applicable laws and website policies.
    • Implementing data anonymization and strict access controls.
  • Transparent Data Collection: All data collection practices are clearly documented, ensuring stakeholders are informed about sources and methods. The team respects intellectual property rights and avoids using proprietary data without authorization. This approach is consistent with the legal recognition of text and data mining in France (source).

    Best Practice:

    • Keeping comprehensive logs of websites and data sources used.
    • Documenting the entire data collection process to ensure transparency and accountability.
  • Consent and Permission: Although web scraping often involves gathering publicly available data, the project carefully respects each website’s robots.txt file and scraping policies. If scraping is explicitly prohibited, data is not collected from that site. In cases where terms are unclear, the project seeks explicit permission to scrape.

  • Ethical Web Scraping:

    • Compliance with Legal Standards: Scraping pipelines must respect privacy laws like GDPR, ensuring no personally identifiable information (PII) is processed or stored without consent.
    • Minimizing Negative Impact: Ethical scraping involves limiting requests to avoid overloading servers, respecting rate limits, and maintaining data integrity to prevent malicious use.
    • Transparency and Collaboration: Credit websites or data sources when possible to foster respect and transparency.
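
The robots.txt and rate-limiting practices described above can be illustrated with a minimal Python sketch. It is not the project's actual scraping code: the class name `PoliteFetchPolicy`, the user-agent string `april-research-bot`, and the one-second default delay are all assumptions made for illustration. The sketch relies only on the standard library's `urllib.robotparser` and takes the robots.txt content as a string, so it performs no network access itself.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent name; the project's real identifier is not given in this report.
USER_AGENT = "april-research-bot"


class PoliteFetchPolicy:
    """Decides whether a URL may be fetched and how long to wait between requests."""

    def __init__(self, robots_txt, user_agent=USER_AGENT, default_delay=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        # Parse the robots.txt text that the caller has already downloaded.
        self._parser = RobotFileParser()
        self._parser.parse(robots_txt.splitlines())
        self._user_agent = user_agent

        # Use the site's Crawl-delay if it declares one, otherwise an assumed default.
        declared = self._parser.crawl_delay(user_agent)
        self.delay = float(declared) if declared is not None else default_delay

        # clock and sleep are injectable so the policy can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._last_request = None

    def is_allowed(self, url):
        """Return True only if robots.txt permits this user agent to fetch the URL."""
        return self._parser.can_fetch(self._user_agent, url)

    def wait_for_slot(self):
        """Pause until at least `delay` seconds have passed since the previous request.

        Returns the number of seconds actually waited.
        """
        waited = 0.0
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            waited = max(0.0, self.delay - elapsed)
            if waited > 0:
                self._sleep(waited)
        self._last_request = self._clock()
        return waited
```

A scraper built on this sketch would call `is_allowed(url)` first, skip the URL entirely when it returns `False`, and call `wait_for_slot()` immediately before each permitted request. Cases where a site's terms are unclear still need the manual permission step described above; nothing in this code replaces that.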

Environmental Impact

The project's environmental impact is covered in a separate, detailed report: CSR Report on GitHub.

Social and Economic Impact

  • Access to Data for Social Good: The data collected and analyzed through this project has the potential to drive positive social outcomes. By offering insights into key topics, it can support research aimed at addressing critical societal challenges. This data can be instrumental in public research, informing policy-making, and raising awareness of issues like the consequences of global warming.

  • Promoting Equity in Research: The project democratizes access to valuable data that might otherwise be difficult to obtain, particularly for researchers, nonprofits, and institutions with limited resources. By making data and insights widely available, the project supports more equitable participation in research, allowing underrepresented groups and regions to contribute to and benefit from important findings. This can lead to more diverse perspectives and inclusive solutions to global issues.
