JPL Technical Report Server

Salvaging Data Records with Missing Data: Data Imputation using the Multivariate t Distribution

dc.contributor.author Hooke, Melissa A
dc.contributor.author Mrozinski, Joseph
dc.contributor.author DiNicola, Michael
dc.date.accessioned 2022-03-01T00:44:28Z
dc.date.available 2022-03-01T00:44:28Z
dc.date.issued 2021-03-06
dc.identifier.citation 2021 IEEE Aerospace Conference, Big Sky, Montana, March 6-13, 2021
dc.identifier.clearanceno CL#21-0068
dc.description.abstract When doing multivariate data analysis, one common obstacle is the presence of incomplete observations, i.e., observations for which one or more key fields are blank. Missing data is often countered by deleting entire observations that contain missing data. The negative effects of deleting entire observations are multiple: deleting observations reduces sample size and can also result in biased inferences even if data is missing at random. In addition, knowledge contained within incomplete observations is knowledge lost when they are deleted, and the effort spent collecting that knowledge is effort wasted. Data imputation methods, or methods of statistically “filling-in” missing data, can help combat small sample sizes by using the existing information in partially complete observations with the end goal of producing less biased and higher confidence inferences. When a sample from a multivariate normal population is only partially complete, and the missing data meets appropriate assumptions (missing at random), robust data imputation of the missing data can be implemented with monotone data augmentation (MDA) using the multivariate t distribution. Missing data imputation is applied to data from the NASA Instrument Cost Model (NICM) using the MDA algorithm under the assumption of having a multivariate t distribution with fixed degrees of freedom. A sensitivity analysis to the degrees of freedom parameter is presented to demonstrate robustness of the multivariate t distribution when dealing with small samples as compared to the multivariate normal distribution.
dc.description.sponsorship NASA/JPL en_US
dc.language.iso en_US
dc.publisher Pasadena, CA: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2021
dc.title Salvaging Data Records with Missing Data: Data Imputation using the Multivariate t Distribution
dc.type Preprint
