Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements. / Ekstrøm, Claus Thorn; Carstensen, Bendix.

In: The International Journal of Biostatistics, 2024.

Harvard

Ekstrøm, CT & Carstensen, B 2024, 'Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements', The International Journal of Biostatistics. https://doi.org/10.1515/ijb-2023-0037

APA

Ekstrøm, C. T., & Carstensen, B. (2024). Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements. The International Journal of Biostatistics. https://doi.org/10.1515/ijb-2023-0037

Vancouver

Ekstrøm CT, Carstensen B. Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements. The International Journal of Biostatistics. 2024. https://doi.org/10.1515/ijb-2023-0037

Author

Ekstrøm, Claus Thorn ; Carstensen, Bendix. / Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements. In: The International Journal of Biostatistics. 2024.

Bibtex

@article{699863b6988c45508d2584d30e9a1e4e,

title = "Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements",

abstract = "Agreement between methods for quantitative measurements are typically assessed by computing limits of agreement between pairs of methods and/or by illustration through Bland-Altman plots. We consider the situation where the observed measurement methods are considered a random sample from a population of possible methods, and discuss how the underlying linear mixed effects model can be extended to this situation. This is relevant when, for example, the methods represent raters/judges that are used to score specific individuals or items. In the case of random methods, we are not interested in estimates pertaining to the specific methods, but are instead interested in quantifying the variation between the methods actually involved making measurements, and accommodating this as an extra source of variation when generalizing to the clinical performance of a method. In the model we allow raters to have individual precision/skill and permit linked replicates (i.e., when the numbering, labeling or ordering of the replicates within items is important). Applications involving estimation of the limits of agreement for two datasets are shown: A dataset of spatial perception among a group of students as well as a dataset on consumer preference of French chocolate. The models are implemented in the MethComp package for R [Carstensen B, Gurrin L, Ekstr{\o}m CT, Figurski M. MethComp: functions for analysis of agreement in method comparison studies; 2013. R package version 1.22, R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2012].",

author = "Ekstr{\o}m, {Claus Thorn} and Carstensen, Bendix",

note = "{\textcopyright} 2024 Walter de Gruyter GmbH, Berlin/Boston.",

year = "2024",

doi = "10.1515/ijb-2023-0037",

language = "English",

journal = "International Journal of Biostatistics",

issn = "1557-4679",

publisher = "De Gruyter",

}

RIS

TY - JOUR

T1 - Statistical models for assessing agreement for quantitative data with heterogeneous random raters and replicate measurements

AU - Ekstrøm, Claus Thorn

AU - Carstensen, Bendix

PY - 2024

Y1 - 2024

N2 - Agreement between methods for quantitative measurements are typically assessed by computing limits of agreement between pairs of methods and/or by illustration through Bland-Altman plots. We consider the situation where the observed measurement methods are considered a random sample from a population of possible methods, and discuss how the underlying linear mixed effects model can be extended to this situation. This is relevant when, for example, the methods represent raters/judges that are used to score specific individuals or items. In the case of random methods, we are not interested in estimates pertaining to the specific methods, but are instead interested in quantifying the variation between the methods actually involved making measurements, and accommodating this as an extra source of variation when generalizing to the clinical performance of a method. In the model we allow raters to have individual precision/skill and permit linked replicates (i.e., when the numbering, labeling or ordering of the replicates within items is important). Applications involving estimation of the limits of agreement for two datasets are shown: A dataset of spatial perception among a group of students as well as a dataset on consumer preference of French chocolate. The models are implemented in the MethComp package for R [Carstensen B, Gurrin L, Ekstrøm CT, Figurski M. MethComp: functions for analysis of agreement in method comparison studies; 2013. R package version 1.22, R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2012].

AB - Agreement between methods for quantitative measurements are typically assessed by computing limits of agreement between pairs of methods and/or by illustration through Bland-Altman plots. We consider the situation where the observed measurement methods are considered a random sample from a population of possible methods, and discuss how the underlying linear mixed effects model can be extended to this situation. This is relevant when, for example, the methods represent raters/judges that are used to score specific individuals or items. In the case of random methods, we are not interested in estimates pertaining to the specific methods, but are instead interested in quantifying the variation between the methods actually involved making measurements, and accommodating this as an extra source of variation when generalizing to the clinical performance of a method. In the model we allow raters to have individual precision/skill and permit linked replicates (i.e., when the numbering, labeling or ordering of the replicates within items is important). Applications involving estimation of the limits of agreement for two datasets are shown: A dataset of spatial perception among a group of students as well as a dataset on consumer preference of French chocolate. The models are implemented in the MethComp package for R [Carstensen B, Gurrin L, Ekstrøm CT, Figurski M. MethComp: functions for analysis of agreement in method comparison studies; 2013. R package version 1.22, R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2012].

U2 - 10.1515/ijb-2023-0037

DO - 10.1515/ijb-2023-0037

M3 - Journal article

C2 - 38379532

JO - International Journal of Biostatistics

JF - International Journal of Biostatistics

SN - 1557-4679

ER -
