In Alzheimer's & Dementia: The Journal of the Alzheimer's Association
BACKGROUND: Alzheimer's disease (AD) tends to affect the parietal lobe, which supports language functions. One significant impairment associated with AD is therefore language impairment, which is why patients with AD exhibit deficits at the word level (vocabulary size in speech can be an early sign of cognitive impairment [1]), the sentence level, and the discourse level of their language. Language disorders can thus serve as markers for diagnosing AD at its earliest stage [2]. Recent progress in machine learning (ML) has revolutionized early AD detection, in particular through the development and deployment of ML-based language assessment (MLLA) methods for detecting AD at the mild cognitive impairment stage [3]. The ML community needs to develop fair MLLA methods for detecting individuals with AD.
METHOD: To develop a fair MLLA method, we should first consider: how fairness can be formulated for an MLLA method; how potential bias in language data can be identified and quantified; how text preprocessing, data augmentation, and word embedding techniques can be applied without introducing bias into MLLA; how protected and unprotected linguistic features can be identified; and, finally, how underserved groups can be protected when MLLA methods are deployed.
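As one concrete illustration of quantifying bias, a common starting point is the demographic-parity gap: the difference in positive-prediction rates between sensitive groups. The sketch below is a minimal, hypothetical example (the function name and toy data are our own, not from any specific MLLA system):

```python
import numpy as np

def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest positive-prediction rates
    across sensitive groups (0 means parity).

    y_pred : binary predictions (e.g., AD vs. healthy) from an MLLA model
    groups : sensitive-attribute label per sample (hypothetical, e.g., a
             linguistic-diversity group correlated with race or gender)
    """
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Toy example: group "b" is flagged as AD far more often than group "a"
preds  = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_difference(preds, groups)  # 0.75 - 0.25 = 0.5
```

A large gap does not by itself prove unfairness (base rates may differ), but it flags a disparity that warrants inspection before deployment.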
RESULT: We suggest a fair ML pipeline that includes 1) pre-processing (i.e., eliminating sources of bias in data collection and data sharing; text augmentation and text preprocessing; and identifying unknown sensitive features, including linguistic diversity, that are highly correlated with sensitive attributes such as race and gender; note that sensitive attributes can be used to alleviate bias); 2) in-processing (i.e., adjusting the machine learning process, e.g., using a regularization approach); and 3) post-processing (i.e., adjusting the trained model).
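A minimal sketch of the post-processing step: choosing a per-group decision threshold so that each group's positive-prediction rate matches a common target. The function name and toy scores below are hypothetical, assuming the trained MLLA model outputs a continuous risk score:

```python
import numpy as np

def group_thresholds(scores, groups, target_rate):
    """Pick a per-group decision threshold so that roughly target_rate of
    each group's scores fall above it (a simple post-processing adjustment
    of a trained model's decision rule)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    thresholds = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        # The (1 - target_rate) quantile puts ~target_rate of scores above it
        thresholds[g] = np.quantile(s, 1.0 - target_rate)
    return thresholds

# Toy risk scores: group "b" scores systematically higher than group "a"
scores = [0.9, 0.2, 0.4, 0.1, 0.8, 0.7, 0.6, 0.3]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
th = group_thresholds(scores, groups, target_rate=0.5)
preds = [int(s >= th[g]) for s, g in zip(scores, groups)]
# Both groups now have the same positive-prediction rate (0.5)
```

Equalizing positive rates is only one possible criterion; in a clinical setting one might instead equalize false-negative rates, since missed AD diagnoses carry the greater harm.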
CONCLUSION: The development of fair MLLA methods is crucial to ensure that all groups are treated fairly and that no results unfairly harm any subgroup of the population. Our suggested pipeline can foster clinical services' confidence in MLLA methods and motivate patients to accept their results.
Parsapoor Mahboobeh
2022-Dec