content/en/blog/back-to-the-future-retrospectively-harmonizing-questionnaire-data.md (1 addition & 1 deletion)
@@ -29,7 +29,7 @@ How can researchers harmonise such complex measures? One option would be to stan
An alternative approach is to apply retrospective harmonisation at the item level. Although questionnaires can differ considerably in the number and nature of questions asked, there is often substantial overlap at the [semantic](https://harmonydata.ac.uk/semantic-text-matching-with-deep-learning-transformer-models)/content level. Let’s return to our earlier example of depression. Although many different questionnaires can be used to assess this experience, they often ask the same types of questions. Below is an example of content overlap in two of the most common measures of psychological [distress](https://harmonydata.ac.uk/how-far-can-we-go-with-harmony-testing-on-kufungisisa-a-cultural-concept-of-distress-from-zimbabwe) used in children: the Revised Children’s [Anxiety](/harmonisation-validation/patient-reported-outcome-measure-information-system-promis-anxiety-subscale) and Depression Scale (RCADS) and the Mood and Feelings Questionnaire (MFQ).
By identifying, recoding, and testing the equivalence of subsets of [items](/item-harmonisation/harmony-a-free-ai-tool-for-longitudinal-study-in-psychology) from different questionnaires (for guidelines see our previous report), researchers can derive harmonised sub-scales that are directly comparable across studies. Our group has previously used this approach to study [trends in mental health](/ai-in-mental-health/) across different generations (Gondek et al., 2021), and examine how socio-economic deprivation impacted adolescent mental health across different [cohorts](/item-harmonisation/harmony-a-free-ai-tool-for-cross-cohort-research) (McElroy et al., 2022).
content/en/blog/data-harmonisation-tools-frameworks.md (1 addition & 1 deletion)
@@ -21,7 +21,7 @@ So, if the [data is not harmonised](/data-harmonisation/) using the proper tools
If you collected data using two questionnaires, such as the GAD-7 and the Beck Anxiety Inventory as in the image below, you would need to harmonise the datasets by identifying correspondences between variables (the arrow in the image).
content/en/blog/harmony-going-forward-5-things-implementation-science-has-taught-us-to-focus-on.md (2 additions & 2 deletions)
@@ -19,7 +19,7 @@ As our team embarks on the prototyping journey, I am reflecting on how we can ma
The rate of successful implementation and [sustainability](/making-harmony-sustainable-long-term/) of digital products developed through research grant funding has been shockingly low. We have seen this especially in the digital mental health field, where thousands of apps and platforms have been developed but only very few have been implemented and sustained in the wild. From this line of [research](https://www.psychiatrist.com/jcp/psychiatry/implementing-digital-mental-health-interventions/#ref16) and [my own work with colleagues](https://www.jmir.org/2022/11/e40347), I know that innovation and effectiveness alone are not sufficient to secure real-world adoption.
So how can we maximise the uptake and implementation of Harmony to give it a longer shelf-life? I’ll draw on the field of implementation science[[i\]](https://harmonydata.ac.uk/harmony-going-forward-5-things-implementation-science-has-taught-us-to-focus-on/#_edn1), which provides useful insights and [frameworks](https://implementationscience.biomedcentral.com/articles/10.1186/1748-5908-8-139#Abs1), to share some reflections on how this could be done and what we and our fellow teams may want to focus on at this stage.
@@ -54,7 +54,7 @@ So how can we maximise the uptake and implementation of Harmony to give it a lon
We wish all remaining [Wellcome](/ai-in-mental-health/radio-podcast-about-wellcome-data-prize) Mental Health Data Prize teams a great start to the next stage. We are beyond excited.
[[i\]](https://harmonydata.ac.uk/harmony-going-forward-5-things-implementation-science-has-taught-us-to-focus-on/#_ednref1) defined as the study of methods to promote the systematic uptake of research findings and evidence-based practices
content/en/blog/harmony-multilingual.md (3 additions & 3 deletions)
@@ -18,11 +18,11 @@ We're happy to share some exciting news with you. Harmony now supports at least
I evaluated Harmony's ability to match the [GAD-7](https://adaa.org/sites/default/files/GAD-7_Anxiety-updated_0.pdf) in 11 languages to the English version. I found that Harmony was able to achieve >95% AUC for 7 of the 11 non-English languages.
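For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen true match receives a higher similarity score than a randomly chosen non-match. The following is a minimal sketch of that calculation, not Harmony's evaluation code: the function name `auc` and the scores and labels are hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen true match scores higher
    than a randomly chosen non-match, counting ties as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one match and one non-match")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores for candidate question pairs
# (label 1 = the pair really is the same item, 0 = it is not).
example_scores = [0.91, 0.85, 0.40, 0.78, 0.30, 0.55]
example_labels = [1, 1, 0, 1, 0, 0]

print(auc(example_scores, example_labels))
```

In this toy data every true match outscores every non-match, so the AUC is at its maximum; mixing the two groups pulls the value down towards 0.5, which is what random guessing would give.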
@@ -44,6 +44,6 @@ By supporting [multiple languages](https://fastdatascience.com/multilingual-natu
If you are interested in using Harmony or learning more about it, please visit [the Harmony website](https://harmonydata.ac.uk) or [contact us](/contact). We would love to hear from you and [get your feedback](/open-source-for-social-science/what-features-would-you-like-to-see-in-harmony/) on our [tool](/psychology-ai-tool/).
-{{< image src="images/reiwa.svg" alt="Reiwa in Japanese" title="Reiwa in Japanese" >}}
+{{< image src="/images/reiwa.svg" alt="Reiwa in Japanese" title="Reiwa in Japanese" >}}
_The Japanese characters above are pronounced "reiwa" and mean "beautiful harmony". [Reiwa](https://en.wikipedia.org/wiki/Reiwa_era) is the name of the current era in the Japanese official calendar, corresponding to Emperor Naruhito's reign as 126th Emperor of Japan, which began in 2019. The second character, [和](https://en.wiktionary.org/wiki/%E5%92%8C), signifies "peaceful" or "harmonious" in both Chinese and Japanese. In Chinese it's pronounced "hé", and in Japanese "wa", among many other pronunciations._
content/en/blog/harmony_social_media.md (2 additions & 2 deletions)
@@ -10,11 +10,11 @@ Now you can share your harmonisations with your colleagues with a simple share l
We've added Firebase authentication to Harmony. You can log in with Google, GitHub or Twitter, and then you can see all your previous harmonisation work.
There are a number of approaches to quantify the [similarity](https://fastdatascience.com/finding-similar-documents-nlp) between strings of text. The simplest approach is known as the Bag-of-Words approach. This is *not* how Harmony currently works, but it is one of the first things we tried!
@@ -50,27 +50,27 @@ The obvious drawbacks of the Jaccard method are that
- It won’t notice negation (*I was not happy* and *I was very happy* both equally match *you were happy*).
- Most crucially, our remit for the Harmony [project](https://fastdatascience.com/starting-a-data-science-project) is that we want to harmonise data from different [languages](/psychology-ai-tool/harmony-many-languages/), such as Portuguese and English. Clearly the bag-of-words approach would not work when the texts are in different languages, unless you translated them first.
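The negation and cross-language drawbacks are easy to reproduce. Here is a minimal sketch of a bag-of-words Jaccard score, assuming simple whitespace tokenisation; the helper names `tokenize` and `jaccard` and the Portuguese example sentence are illustrative assumptions, not code or data from Harmony.

```python
def tokenize(text):
    """Lower-case the text and split it on whitespace into a set of words."""
    return set(text.lower().split())


def jaccard(text_a, text_b):
    """Jaccard similarity: shared words divided by total distinct words."""
    words_a, words_b = tokenize(text_a), tokenize(text_b)
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Negation is invisible: both sentences share only "happy" with the target.
negated = jaccard("I was not happy", "you were happy")
affirmed = jaccard("I was very happy", "you were happy")

# Hypothetical English/Portuguese pair: no shared words, so no match at all.
cross_language = jaccard("I feel nervous", "Sinto-me nervoso")

print(negated, affirmed, cross_language)
```

The two English comparisons receive identical scores even though one sentence is negated, and the cross-language pair scores zero because the two texts have no tokens in common.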
The next approach that we tried was a vector space model.
Vector space models allow us to represent words and concepts as numbers or points on a graph. For example, *anxious* could be (2, 3), *worried* (3, 4) and *relax* (8, 2). The coordinates of each [concept](https://harmonydata.ac.uk/how-far-can-we-go-with-harmony-testing-on-kufungisisa-a-cultural-concept-of-distress-from-zimbabwe) are themselves meaningless, but if we calculate the distance between them we would see that *anxious* and *worried* are closer to each other than either is to *relax*.
It’s important to note that the values of the vectors are completely arbitrary. There’s no meaning at all to where a concept is assigned on the *x* or *y* axes, but there is meaning in the distances.
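The distances for the three example points can be checked directly. This is a small sketch using the toy coordinates from the text and straight-line (Euclidean) distance; the `points` dictionary and `distance` helper are names introduced here for illustration.

```python
import math

# Toy 2D coordinates taken from the example in the text.
points = {"anxious": (2, 3), "worried": (3, 4), "relax": (8, 2)}


def distance(word_a, word_b):
    """Euclidean (straight-line) distance between two words' points."""
    return math.dist(points[word_a], points[word_b])


for pair in [("anxious", "worried"), ("anxious", "relax"), ("worried", "relax")]:
    print(pair, round(distance(*pair), 3))
```

The *anxious*/*worried* pair comes out at roughly 1.4 units apart, while both are more than 5 units from *relax*, which is the only information the coordinates are meant to carry.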
Now we have a way to handle synonyms. This approach is called *word vector embeddings*.
-{{< image src="images/blog/image.png" >}}
+{{< image src="/images/blog/image.png" >}}
Some real word vector values for terms occurring in our data. Typically the vectors are large, potentially up to 500 dimensions.
Word vector embeddings became popular in 2013 after the Czech computer scientist Tomáš Mikolov [proposed a way that an AI can generate vectors](https://arxiv.org/abs/1310.4546) for every word in the English language simply from a huge set of documents.
To visualise the word vectors, we can squash them down into two or three dimensions. This is a 2D visualisation of the terms in the table above. I used an algorithm called [t-SNE](https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding) to squash them into a flat surface.
@@ -80,7 +80,7 @@ If you want to use word vector embeddings to find synonyms, you could calculate
With the Harmony data, I found that the vector space models did not correctly identify the relationship between *child bullies others* and *child is bullied by others* – which are clearly very different questions and should not be harmonised together.
@@ -94,7 +94,7 @@ Vector representations of the [GAD-7](/compare-harmonise-instruments/gad-7-vs-ph
As an aside, transformers can also be used for machine translation (in fact Google Translate now uses transformers), and this attention enables a noun+adjective phrase to be translated to another language with the correct gender.
The word *red* could be translated in various different ways into Portuguese depending on the gender and the noun to be modified. Transformer models are adept at taking these clues into context and outputting the correct translation of a phrase.
@@ -104,7 +104,7 @@ GPT-2 converts the text of each question into a vector in 1600 dimensions.
The distance between any two questions is measured according to the cosine similarity metric between the two vectors. Two questions which are similar in meaning, even if worded differently or in different languages, will have a high degree of similarity between their vector representations. Questions which are very different tend to be far apart in the vector space.
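Cosine similarity itself is a short formula: the dot product of the two vectors divided by the product of their lengths. Below is a minimal sketch with made-up three-dimensional vectors standing in for real question embeddings, which have far more dimensions; the vector values and variable names are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up 3D vectors standing in for question embeddings.
question_a = [0.9, 0.1, 0.3]   # e.g. a question about feeling nervous
question_b = [0.8, 0.2, 0.4]   # e.g. a similarly worded question
question_c = [-0.1, 0.9, -0.5]  # e.g. an unrelated question

print(cosine_similarity(question_a, question_b))
print(cosine_similarity(question_a, question_c))
```

Because the score depends only on the angle between the vectors and not on their length, two vectors pointing the same way score close to 1 regardless of magnitude.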
@@ -114,7 +114,7 @@ We then find the closest matches and link them together in a graph.
Because this approach is potentially error-prone, we have provided the facility for a user to edit the network graph and add and remove edges if they disagree with Harmony’s decisions.
-{{< image src="images/blog/image-2.png" >}}
+{{< image src="/images/blog/image-2.png" >}}
The user has an option to add or remove edges from the graph.
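One way to picture the editable graph is as a set of undirected edges between questions. The sketch below is a simplified illustration, not Harmony's implementation: the similarity threshold, the helper names and the scores are all hypothetical.

```python
def build_match_graph(similarities, threshold):
    """Keep an undirected edge for every pair scoring at or above threshold."""
    return {frozenset(pair) for pair, score in similarities.items()
            if score >= threshold}


def add_edge(graph, question_a, question_b):
    """Manually link two questions the automatic matching missed."""
    graph.add(frozenset((question_a, question_b)))


def remove_edge(graph, question_a, question_b):
    """Manually unlink two questions that were matched by mistake."""
    graph.discard(frozenset((question_a, question_b)))


# Hypothetical similarity scores between question pairs.
similarities = {
    ("child bullies others", "child is bullied by others"): 0.93,
    ("feeling nervous", "feeling anxious"): 0.88,
    ("feeling nervous", "trouble sleeping"): 0.21,
}

graph = build_match_graph(similarities, threshold=0.8)

# The user disagrees with one automatic match and adds one of their own.
remove_edge(graph, "child bullies others", "child is bullied by others")
add_edge(graph, "feeling nervous", "trouble sleeping")

print(sorted(sorted(edge) for edge in graph))
```

Storing each edge as a `frozenset` makes the link undirected, so the order in which the user names the two questions does not matter.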
content/en/blog/how-far-can-we-go-with-harmony-testing-on-kufungisisa-a-cultural-concept-of-distress-from-zimbabwe.md (3 additions & 3 deletions)
@@ -30,7 +30,7 @@ Can we use Harmony to harmonise mental health instruments designed for different
> Zvaita sei kuti chembere yorasika, bere rorutsa imvi? (How is it that the old woman is missing and the hyena is vomiting grey hairs?)
> Shona proverb (similar to English “there’s no smoke without fire”)
@@ -54,7 +54,7 @@ I tried using Harmony to see how it would harmonise “kufungisisa” (thinking
Although English is the best-resourced language for [natural language processing](https://naturallanguageprocessing.com/), [multilingual NLP techniques](https://fastdatascience.com/multilingual-natural-language-processing/) are catching up even for lower-resourced [languages](/psychology-ai-tool/harmony-many-languages/). There exist some [NLP](https://fastdatascience.com/portfolio/nlp-consultant/) [models](https://harmonydata.ac.uk/semantic-text-matching-with-deep-learning-transformer-models) for Shona. I used the sentence [transformer](https://harmonydata.ac.uk/how-does-harmony-work) model `Davlan/xlm-roberta-base-finetuned-shona`, which is a modification of RoBERTa trained on Shona texts[7]. I plugged it into Harmony and tried to match the [Shona symptom questionnaire for the detection of depression and anxiety](https://depts.washington.edu/edgh/zw/hit/web/project-resources/shona_symptom_questionnaire.pdf), which is used in Zimbabwe[6].
Harmony and the Shona transformer model matched the question about “kufungisisa” to GHQ-12 question 1 “been able to concentrate on whatever you’re doing?” which seems approximately OK. However, I would need a Shona native speaker to validate my results.