Update internal links to new section-based URL paths; fix pydata page URL

woodthom2 · woodthom2 · commit 22f68f19fc01 · 2024-10-07T11:14:06.000+01:00
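
Two of the added lines in this commit (in `content/en/blog/aidl-meetup.md` and `content/en/blog/data-harmonisation-healthcare.md`) end up with a link target split across a line break, with the closing `)` on the following line. A minimal sketch of a check that would flag such links is below. The helper name `find_split_links` and the regular expression are illustrative assumptions, not part of the Harmony repository, and the pattern only handles simple inline links whose targets contain no parentheses.

```python
import re

# Simple inline Markdown link: [text](target). The negated character class
# also matches newlines, so a target that spills onto the next line is captured.
# Assumption: targets contain no ")" characters.
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)]*)\)")


def find_split_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, target) pairs whose target contains a line break."""
    return [
        (text, target.strip())
        for text, target in LINK_RE.findall(markdown)
        if "\n" in target
    ]


# Sample text taken from the added lines in aidl-meetup.md.
sample = (
    "* [Harmony at MethodsCon Futures](/ai-in-mental-health/harmony-at-methodscon-futures/\n"
    ") in Manchester\n"
    "* [Harmony Hackathon](/hackathon/) at UCL\n"
)

for text, target in find_split_links(sample):
    print(f"split link: [{text}]({target})")
```

Run over the content files before committing, a check like this would list each link that needs its closing parenthesis moved back onto the same line as the target.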
diff --git a/content/en/ada.md b/content/en/ada.md
@@ -13,7 +13,7 @@ The [Australian Data Archive (ADA)](https://ada.edu.au/) is a national service f
 
 The ADA provides data access through the [ADA Dataverse](https://dataverse.ada.edu.au/). The collection includes polls on housing conditions in Australian states, political views over time across the country, questions about employment or health, and other datasets that the ADA has collected over the years (such as the Australian election study).
 
-In 2023, the ADA embarked on a project to harmonise a vast collection of survey questions, seeking a solution that could effectively identify and group similar items across different studies. Researchers at the ADA found Harmony, a [data harmonisation](/data-harmonisation-unifying-data-for-deeper-insights/) tool powered by [natural language processing](https://naturallanguageprocessing.com/) (NLP), and the ADA recognised its potential to streamline this process.
+In 2023, the ADA embarked on a project to harmonise a vast collection of survey questions, seeking a solution that could effectively identify and group similar items across different studies. Researchers at the ADA found Harmony, a [data harmonisation](/data-harmonisation/) tool powered by [natural language processing](https://naturallanguageprocessing.com/) (NLP), and the ADA recognised its potential to streamline this process.
 
 ## Challenges
 
@@ -25,7 +25,7 @@ The ADA faces several challenges in managing its extensive questionnaire data:
 ## Integrating Harmony into the ADA’s workflow
 
 The ADA may integrate Harmony into its processes, using its powerful NLP capabilities to address the challenges and expedite questionnaire harmonisation:
-1. Automated item comparison: Harmony's NLP algorithms [automatically compared and grouped questionnaire items based on their semantic similarity](/how-does-harmony-work/), eliminating the need for manual effort.
+1. Automated item comparison: Harmony's NLP algorithms [automatically compared and grouped questionnaire items based on their semantic similarity](/nlp-semantic-text-matching/how-does-harmony-work/), eliminating the need for manual effort.
 2. Enhanced consistency: Harmony's intelligent approach ensured consistent categorisation of questionnaire items, reducing inconsistencies and improving data integrity.
 
 ## Impact of Harmony on ADA's Operations
diff --git a/content/en/blog/aidl-meetup.md b/content/en/blog/aidl-meetup.md
@@ -79,8 +79,9 @@ Our session will explore the transformative potential of Generative AI, focusing
 
 ## See also our past events
 
-* 11 and 12 September 2024: [Harmony at MethodsCon Futures](/harmony-at-methodscon-futures-in-manchester/) in Manchester
-* 2 July 2024: [Harmony: NLP and generative models for psychology research](/pydata)  at Pydata London
+* 11 and 12 September 2024: [Harmony at MethodsCon Futures](/ai-in-mental-health/harmony-at-methodscon-futures/
+) in Manchester
+* 2 July 2024: [Harmony: NLP and generative models for psychology research](/psychology-ai-tool/pydata-meetup/)  at Pydata London
 * 3 June 2024: [Harmony Hackathon](/hackathon/) at UCL
 * 5 May 2024: [Harmony: A global platform for harmonisation, translation and cooperation in mental health](/harmony-at-lifecourse-seminar/) at  Melbourne Children’s LifeCourse Initiative seminar series.
 * 27 March 2024: [Harmony at AI Camp](/upcoming-tech-talk-at-aicamp-meetup/)
diff --git a/content/en/blog/back-to-the-future-retrospectively-harmonizing-questionnaire-data.md b/content/en/blog/back-to-the-future-retrospectively-harmonizing-questionnaire-data.md
@@ -19,7 +19,7 @@ Now more than ever, the international research community are keen to determine w
 
 As an alternative to direct replication, researchers may choose to reach out to others in the field who either have access to, or are in the process of collecting, comparable data. Indeed many researchers, particularly those in the life and social sciences, routinely make use of large, ongoing studies that collect a variety of data for multiple purposes (e.g. [longitudinal](/item-harmonisation/harmony-a-free-ai-tool-to-merge-longitudinal-studies) population studies). In practice however, much of our research is designed and carried out in silos – with different research groups tackling similar research questions using widely different designs and measures. Even if a researcher is successful in identifying data that are similar to their original work, minor differences in the design or measures may limit the comparability. What are researchers to do in such situations?
 
-One increasingly popular option is retrospective [harmonisation](/data-harmonisation). This involves taking existing data from two or more disparate sources, and transforming the data in some way in order to make it directly comparable across sources. Let’s look at a simple, hypothetical example. Say a researcher wants to examine the relationship between level of [education](/data-harmonisation-in-education) and [depression](/harmonisation-validation/promis-depression-subscale), and whether this varies across two datasets, each from a different country. In dataset A, participants were asked to report their highest qualification out of a list of 10 options ranging from “no formal education” to “doctoral education”, whereas in dataset B there was a simple question that asked participants whether they completed a Bachelor’s degree (yes/no). The 10-option question in dataset A could be recoded to match the variable in dataset B, by collapsing all of the categories above and below Bachelor’s level. In many cases, retrospective harmonisation can be applied on an ad-hoc basis, using simple, logical recoding strategies such as this.
+One increasingly popular option is retrospective [harmonisation](/data-harmonisation). This involves taking existing data from two or more disparate sources, and transforming the data in some way in order to make it directly comparable across sources. Let’s look at a simple, hypothetical example. Say a researcher wants to examine the relationship between level of [education](/data-harmonisation/data-harmonisation-in-education/) and [depression](/harmonisation-validation/promis-depression-subscale), and whether this varies across two datasets, each from a different country. In dataset A, participants were asked to report their highest qualification out of a list of 10 options ranging from “no formal education” to “doctoral education”, whereas in dataset B there was a simple question that asked participants whether they completed a Bachelor’s degree (yes/no). The 10-option question in dataset A could be recoded to match the variable in dataset B, by collapsing all of the categories above and below Bachelor’s level. In many cases, retrospective harmonisation can be applied on an ad-hoc basis, using simple, logical recoding strategies such as this.
 
 However, not all constructs can be measured with such simple, categorical questions. Take the above outcome variable (depression) for instance. Depression is a complex, heterogeneous experience, characterized by a multitude of symptoms that can be experienced to various degrees and in different combinations. In large-scale surveys, depression is typically measured with standardized questionnaires – participants are asked to report on a range of symptoms, their responses are assigned numerical values, and these are summed to form a “total depression score” for each individual. Although this remains the most viable and plausible strategy for measuring something as complex as depression, there is no “gold standard” questionnaire that is universally adopted by researchers. Instead, there are well over 200 established depression scales. In a [recent review](https://www.closer.ac.uk/wp-content/uploads/210715-Harmonisation-measurement-properties-mental-health-measures-british-cohorts.pdf) (McElroy et al., 2020), we noted that the content of these questionnaires can differ markedly, e.g. different symptoms are assessed, or different response options are used.
 
@@ -31,7 +31,7 @@ An alternative approach is to apply retrospective harmonisation at the item-leve
 
 By identifying, recoding, and testing the equivalence of subsets of [items](/item-harmonisation/harmony-a-free-ai-tool-for-longitudinal-study-in-psychology) from different questionnaires (for guidelines see our previous report), researchers can derive harmonised sub-scales that are directly comparable across studies. Our group has previously used this approach to study trends in mental health across different generations (Gondek et al., 2021), and examine how socio-economic deprivation impacted adolescent mental health across different [cohorts](/item-harmonisation/harmony-a-free-ai-tool-for-cross-cohort-research) (McElroy et al., 2022).
 
-One of the main challenges to retrospectively harmonising questionnaire data is identifying the specific items that are comparable across the measures. In the above example, we used expert opinion to match candidate items based on their content, and used psychometric tests to determine how plausible it was to assume that matched items were directly comparable. Although our results were promising, this process was time-consuming, and the reliance on expert opinion introduces an element of human [bias](https://fastdatascience.com/how-can-we-eliminate-bias-from-ai-algorithms-the-pen-testing-manifesto) – i.e. different experts may disagree on which items match. As such, we are currently working on a [project](https://fastdatascience.com/starting-a-data-science-project) supported by [Wellcome](/radio-podcast-about-wellcome-data-prize), in which we aim to develop an online tool, ‘Hamony’, that uses machine learning to help researchers match items from different questionnaires based on their underlying meaning. Our overall aim is to streamline and add consistency and replicability to the harmonisation process. We plan to test the utility of this tool by using it to harmonise measures of mental health and social connectedness across two cohort of young people from the UK and and Brazil.
+One of the main challenges to retrospectively harmonising questionnaire data is identifying the specific items that are comparable across the measures. In the above example, we used expert opinion to match candidate items based on their content, and used psychometric tests to determine how plausible it was to assume that matched items were directly comparable. Although our results were promising, this process was time-consuming, and the reliance on expert opinion introduces an element of human [bias](https://fastdatascience.com/how-can-we-eliminate-bias-from-ai-algorithms-the-pen-testing-manifesto) – i.e. different experts may disagree on which items match. As such, we are currently working on a [project](https://fastdatascience.com/starting-a-data-science-project) supported by [Wellcome](/ai-in-mental-health/radio-podcast-about-wellcome-data-prize/), in which we aim to develop an online tool, ‘Hamony’, that uses machine learning to help researchers match items from different questionnaires based on their underlying meaning. Our overall aim is to streamline and add consistency and replicability to the harmonisation process. We plan to test the utility of this tool by using it to harmonise measures of mental health and social connectedness across two cohort of young people from the UK and and Brazil.
 
 Follow this blog for updates on our Harmony project!
 
diff --git a/content/en/blog/contribute-open-source-project.md b/content/en/blog/contribute-open-source-project.md
@@ -57,7 +57,7 @@ You might find this guide helpful: https://opensource.guide/how-to-contribute as
 
 Read our [guide to contributing to Harmony](/contributing-to-harmony/).
 
-Harmony is a powerful [data harmonisation tool](/data-harmonisation-unifying-data-for-deeper-insights/) which uses [natural language processing](https://naturallanguageprocessing.com/) (NLP) to [bridge the gap between diverse research studies](/ppie-for-secondary-data-analysis/), automatically comparing and grouping similar items across datasets.  Here are a few ways you can get involved in the project:
+Harmony is a powerful [data harmonisation tool](/data-harmonisation/) which uses [natural language processing](https://naturallanguageprocessing.com/) (NLP) to [bridge the gap between diverse research studies](/ai-in-mental-health/ppie-for-secondary-data-analysis/), automatically comparing and grouping similar items across datasets.  Here are a few ways you can get involved in the project:
 
 ### 1. Get coding
 
diff --git a/content/en/blog/data-harmonisation-healthcare.md b/content/en/blog/data-harmonisation-healthcare.md
@@ -34,7 +34,8 @@ Data harmonisation is a critical endeavor in healthcare, underpinning efforts to
 ---
 ### Methodological Considerations
 
-Harmonisation methods in data science and healthcare research aim to standardize disparate data sources to ensure consistency, comparability, and reliability across datasets. These methods are critical in the context of big data and the increasing reliance on electronic health records (EHRs), where data is often collected from various sources with different standards and formats. Harmonisation can be approached retrospectively, after data collection, or prospectively, before data collection begins. The choice between these approaches depends on the constraints of the existing datasets and the theoretical [frameworks](/data-harmonisation-tools-frameworks) guiding the research or clinical objectives.
+Harmonisation methods in data science and healthcare research aim to standardize disparate data sources to ensure consistency, comparability, and reliability across datasets. These methods are critical in the context of big data and the increasing reliance on electronic health records (EHRs), where data is often collected from various sources with different standards and formats. Harmonisation can be approached retrospectively, after data collection, or prospectively, before data collection begins. The choice between these approaches depends on the constraints of the existing datasets and the theoretical [frameworks](/data-harmonisation/data-harmonisation-tools-frameworks/
+) guiding the research or clinical objectives.
 
 ---
 ## Key Strategies for Harmonisation
@@ -105,7 +106,7 @@ Despite these challenges, the benefits of harmonising health data are substantia
 ---
 ## Implementing Data Harmonisation
 
-Data harmonisation can be implemented retrospectively, after data collection, or prospectively, before data is collected. [Retrospective](/back-to-the-future-retrospectively-harmonising-questionnaire-data) harmonisation, also known as ex-post or output harmonisation, aligns existing datasets. Prospective harmonisation, or ex-ante/input harmonisation, involves planning data collection methods and standards in advance to ensure compatibility. Each approach has its merits, and the choice between them often depends on the goals of the harmonisation effort and the nature of the data involved.
+Data harmonisation can be implemented retrospectively, after data collection, or prospectively, before data is collected. [Retrospective harmonisation](/data-harmonisation/back-to-the-future-retrospectively-harmonising-questionnaire-data/), also known as ex-post or output harmonisation, aligns existing datasets. Prospective harmonisation, or ex-ante/input harmonisation, involves planning data collection methods and standards in advance to ensure compatibility. Each approach has its merits, and the choice between them often depends on the goals of the harmonisation effort and the nature of the data involved.
 
 The process involves defining the scope of harmonisation, identifying relevant data sources, standardizing data formats and terminologies, and employing technologies such as natural language processing (NLP) to ensure data quality and consistency. Numerous initiatives support data harmonisation efforts, such as the Common Data Model Harmonisation project, which aims to enhance data utility and interoperability across healthcare networks. Tools and technologies like CDASH and the NIH's Common Data Elements facilitate registry interoperability.
 
diff --git a/content/en/blog/hackathon.md b/content/en/blog/hackathon.md
@@ -14,7 +14,7 @@ The Hackathon event will be held at Chandler House (UCL), providing a vibrant an
 
 **This is an in-person hackathon happening in London on 3 June 2024.**
 
-Make sure to also join our [community](/community) on Discord, check out the [ideas list](/ideas) and try our [Kaggle competition](/kaggle)!
+Make sure to also join our [community](/community) on Discord, check out the [ideas list](/ideas) and try our [Kaggle competition](/psychology-ai-tool/kaggle/)!
 
 {{< card heading="Register for the Harmony hackathon" copy="Sign up on Eventbrite" url="https://www.eventbrite.com/e/harmony-hackathon-tickets-887795278577" >}}
 
diff --git a/content/en/blog/measuring-the-performance-of-nlp-algorithms.md b/content/en/blog/measuring-the-performance-of-nlp-algorithms.md
@@ -15,7 +15,7 @@ _Harmony was able to reconstruct the matches of the questionnaire harmonisation
 
 The content of this blog post has been written up as a [preprint for publication on OSF](https://osf.io/9x5ej).
 
-Harmony is a tool for comparing questions in natural language from different surveys or instruments. In order to develop the tool, we had to be able to quantify how good it is at recognising equivalent or similar questions. You can read about how Harmony works [in my earlier blog post](/how-does-harmony-work/).
+Harmony is a tool for comparing questions in natural language from different surveys or instruments. In order to develop the tool, we had to be able to quantify how good it is at recognising equivalent or similar questions. You can read about how Harmony works [in my earlier blog post](/nlp-semantic-text-matching/how-does-harmony-work/).
 
 For example, we might consider _Tries to Stop Quarrels_ is equivalent to _Is helpful if someone is hurt, upset or feeling ill_, even though there are no words in common between the two texts. But this is subjective, and if we are using AI to make this kind of matches, how can we put a number on our AI’s performance?
 
diff --git a/content/en/blog/pydata.md b/content/en/blog/pydata.md
@@ -6,7 +6,7 @@ image: "/images/thomas-wood-pydata.jpg"
 
 aliases:
   - "/pydata/"
-url: "/psychology-ai-tool/aidl-meetup/"
+url: "/psychology-ai-tool/pydata-meetup/"
 ---
 
 ## Harmony at PyData London - 86th Meetup
diff --git a/content/en/blog/semantic-text-matching-with-deep-learning-transformer-models.md b/content/en/blog/semantic-text-matching-with-deep-learning-transformer-models.md
@@ -27,7 +27,7 @@ In the case of Harmony, we want to measure the similarity of every item in a que
 
 Recent advancements in deep learning have enabled a new type of semantic text matching technique through [Transformer models](https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)), such as [BERT](https://en.wikipedia.org/wiki/BERT_%28language_model%29), [GPT-3](https://openai.com/api/), and the recently announced [Google BARD](https://blog.google/technology/ai/bard-google-ai-search-updates/).
 
-[Transformer](/how-does-harmony-work) models operate on sequences of words, and transform entire sentences in many languages into a vector representation in high-dimensional space. Then we can quantify the similarity between sentences with a simple metric such as Euclidean or cosine distance. This enables us to measure the similarity between words.
+[Transformer](/nlp-semantic-text-matching/how-does-harmony-work/) models operate on sequences of words, and transform entire sentences in many languages into a vector representation in high-dimensional space. Then we can quantify the similarity between sentences with a simple metric such as Euclidean or cosine distance. This enables us to measure the similarity between words.
 
 In developing Harmony, the [most performant algorithm](/nlp-semantic-text-matching/measuring-the-performance-of-nlp-algorithms/) tested so far was GPT-3, however, as the field is evolving so rapidly, this is likely to be out of date very soon. So please watch our blog, and in the meantime you can [test out Harmony](https://harmonydata.ac.uk/app/) on your data.
 
diff --git a/content/en/frequently-asked-questions.md b/content/en/frequently-asked-questions.md
@@ -18,7 +18,7 @@ Harmony is a tool that helps researchers automate the process of harmonisation u
 
 ## How do I cite Harmony?
 
-If you would like to cite our [validation study](/bmc-psychiatry-paper/), published in BMC Psychiatry, you can cite:
+If you would like to cite our [validation study](/ai-in-mental-health/bmc-psychiatry-paper/), published in BMC Psychiatry, you can cite:
 
 * McElroy, E., Wood, T.A., Bond, R., Mulvenna M., Shevlin M., Ploubidis G., Scopel Hoffmann M., Moltrecht B., [Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data](https://bmcpsychiatry.biomedcentral.com/articles/10.1186/s12888-024-05954-2#citeas). BMC Psychiatry 24, 530 (2024). https://doi.org/10.1186/s12888-024-05954-2