Commit 8d3f9a1

Author: SOCIALSCIENCEai
Parents: 2e636ab + 4be4c43

2 files changed: 6 additions, 15 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@
 [![License](https://img.shields.io/badge/License-CC--BY--NC--SA--4.0-green.svg?style=flat-square)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
 [![Github Stars](https://img.shields.io/github/stars/socius-org/sentibank?style=flat-square&logo=github)](https://github.com/socius-org/sentibank)
 [![Github Watchers](https://img.shields.io/github/watchers/socius-org/sentibank?style=flat-square&logo=github)](https://github.com/socius-org/sentibank)
-![PyPI - Downloads](https://img.shields.io/pypi/dm/sentibank?style=flat-square&logo=python)
+[![PyPI - Downloads](https://img.shields.io/pypi/dm/sentibank?style=flat-square&logo=python)](https://pypistats.org/packages/sentibank)
 
 **`sentibank`** is a comprehensive, open database of expert-curated sentiment dictionaries and lexicons to power sentiment analysis.
 
@@ -63,7 +63,7 @@ See below for the available predefined lexicon identifier.
 |**MASTER** <br> (Loughran and McDonland, 2011; Bodnaruk, Loughran and McDonald, 2015)| Financial lexicons covering expressions common in business writing. |Regulatory Filings (10-K)|Finance| `MASTER_v2022`|
 |**Norms of Valence, Arousal and Dominance (NoVAD)** <br> (Warriner, Kuperman and Brysbaert, 2013; Warriner and Kuperman, 2014)| A lexicon of 14,000 common English lemmas across valence, arousal, and dominance dimensions. | Vernacular (Day-to-Day Expression) | General, Psychology | `NoVAD_v2013_adjusted`, `NoVAD_v2013_bidimensional`|
 |**OpinionLexicon** <br> (Hu and Liu, 2004)| Opinion words tailored for sentiment analysis of product reviews.|Product Reviews|Consumer Products|`OpinionLexicon_v2004`|
-|**SenticNet** <br> (Cambria et al., 2010; Cambria, Havasi and Hussain, 2012; Cambria, Olsher and Rajagopal, 2014; Cambria et al., 2016; Cambria et al., 2018; Cambria et al., 2020; Cambria et al., 2022) | Conceptual lexicon providing multidimensional sentiment analysis for commonsense concepts and expressions. | General | General | `SenticNet_v2010`, `SenticNet_v2012`, `SenticNet_v2012_attributes`, `SenticNet_v2012_semantics`, `SenticNet_v2014`, `SenticNet_v2014_attributes`, `SenticNet_v2014_semantics`, `SenticNet_v2016`, `SenticNet_v2016_attributes`, `SenticNet_v2016_mood`, `SenticNet_v2016_semantics`, `SenticNet_v2018`, `SenticNet_v2018_attributes`, `SenticNet_v2018_mood`, `SenticNet_v2018_semantics`, `SenticNet_v2020`, `SenticNet_v2020_attributes`, `SenticNet_v2020_mood`, `SenticNet_v2020_semantics`, `SenticNet_v2022`, `SenticNet_v2022_attributes`, `SenticNet_v2022_mood`, `SenticNet_v2022_semantics` |
+|**SenticNet** <br> (Cambria et al., 2010; Cambria, Havasi and Hussain, 2012; Cambria, Olsher and Rajagopal, 2014; Cambria et al., 2016, 2018, 2020, 2022) | Conceptual lexicon providing multidimensional sentiment analysis for commonsense concepts and expressions. | General | General | `SenticNet_v2010`, `SenticNet_v2012`, `SenticNet_v2012_attributes`, `SenticNet_v2012_semantics`, `SenticNet_v2014`, `SenticNet_v2014_attributes`, `SenticNet_v2014_semantics`, `SenticNet_v2016`, `SenticNet_v2016_attributes`, `SenticNet_v2016_mood`, `SenticNet_v2016_semantics`, `SenticNet_v2018`, `SenticNet_v2018_attributes`, `SenticNet_v2018_mood`, `SenticNet_v2018_semantics`, `SenticNet_v2020`, `SenticNet_v2020_attributes`, `SenticNet_v2020_mood`, `SenticNet_v2020_semantics`, `SenticNet_v2022`, `SenticNet_v2022_attributes`, `SenticNet_v2022_mood`, `SenticNet_v2022_semantics` |
 |**SentiWordNet** <br> (Esuli and Sebastiani, 2006; Baccianella, Esuli and Sebastiani, 2010)| Lexicon associating WordNet synsets with positive, negative, and objective scores. |General|General| `SentiWordNet_v2010_logtransform`, `SentiWordNet_v2010_simple`|
 | **SO-CAL** <br> (Taboada et al., 2011) | Lexicon designed for domain-independent sentiment analysis. | General | General | `SO-CAL_v2011` |
 |**VADER** <br> (Hutto and Gilbert, 2014)| General purpose lexicon optimised for social media and microblogs. |Social Media|General| `VADER_v2014`|

doc/CONTRIBUTING.md

Lines changed: 4 additions & 13 deletions
@@ -10,7 +10,7 @@ Contributions should meet the following criteria.
 - Determine the domain (i.e Psychology, Economics, Politics) or genre (i.e News, Social Media, Academic)
 - Refer to existing lexicons to identify gaps to fill
 
-> ``ℹ️`` Before dedicating resources to creating new dictionary, please open an [issue](https://github.com/socius-org/sentibank/issues) outlining your proposed lexicon. This can serve as a transparent public record on how lexicons are created, processed and enhanced. We expect this record to: Allow all sentibank users to better understand the composition and quality of included dictionaries; Enable substantive community discussion; and Provide a learning opportunity for those developing lexicons in the future.
+> ``ℹ️`` Before dedicating resources to creating new dictionary, please open an [issue](https://github.com/socius-org/sentibank/issues) outlining your proposed lexicon. This can serve as a transparent public record on how lexicons are created, processed and enhanced. We expect this record to: (i) Allow all sentibank users to better understand the composition and quality of included dictionaries; (ii) Enable substantive community discussion; and (iii) Provide a learning opportunity for those developing lexicons in the future.
 
 ## 📝 Composition
 - Contains a minimum of 100 labeled sentiment terms/phrases. Smaller sample sizes tend not over-index specific sources.
@@ -30,23 +30,14 @@ Contributions should meet the following criteria.
 
 ## ⚖️ Licensing
 - Lexicon is under a permissive open license compatible with CC-BY-SA 4.0 or public domain.
-- Sources allow free re-use, re-distribution, and commercial use. Proprietary datasets cannot be accepted.
+- Sources allow free re-use and re-distribution for non-commercial use. Proprietary datasets cannot be accepted.
 
 ## 🗂️ Format
 - Submitted as a CSV file with column headers specifying key attributes of each entry.
 - Uses consistent text encoding (ideally UTF-8) and escapes special characters or markup.
 - Contains no personally identifiable or otherwise confidential information.
 
 
-> ``ℹ️`` We understand sentiment annotation can be new territory for many. Please don't hesitate to reach out - we're happy to provide guidance each step of the way.
+> ``ℹ️`` We understand sentiment annotation can be new territory for many. Please don't hesitate to reach out - we're happy to provide guidance each step of the way. Our team has experience developing guidelines, training annotators, measuring agreement, and related tasks. We can lend our knowledge to ensure your project goes smoothly.
 >
-> Our team has experience developing guidelines, training annotators, measuring agreement, and related tasks. We can lend our knowledge to ensure your project goes smoothly.
->
-> If any part of the process seem unclear, let us know. We're glad to advise on effective instructions, corpus creation, quality control, and any other annotation needs.
->
-> Feel free to reach out through research@socius.org to discuss best practices tailored to your goals. We look forward to hearing your ideas and providing customised support to make them a reality.
->
-> Our hope is for the criteria to inspire creative solutions, not deter participation. We're here to lower barriers and see new lexicons brought to life through thoughtful partnership.
-
-
-By pooling our talents, we can transform sentibank into an even more powerful tool. Join us in unlocking sentibank's full potential through shared knowledge!
+> If any part of the process seem unclear, let us know. We're glad to advise on effective instructions, corpus creation, quality control, and any other annotation needs. Feel free to reach out through research@socius.org to discuss best practices tailored to your goals. We look forward to hearing your ideas and providing customised support to make them a reality.
