JPL Technical Report Server

Salvaging Data Records with Missing Data: Data Imputation using the Multivariate t Distribution

dc.contributor.author Hooke, Melissa A
dc.contributor.author Mrozinski, Joseph
dc.contributor.author DiNicola, Michael
dc.date.accessioned 2022-03-01T00:44:28Z
dc.date.available 2022-03-01T00:44:28Z
dc.date.issued 2021-03-06
dc.identifier.citation 2021 IEEE Aerospace Conference, Big Sky, Montana, March 6-13, 2021
dc.identifier.clearanceno CL#21-0068
dc.identifier.uri http://hdl.handle.net/2014/54252
dc.description.abstract When doing multivariate data analysis, one common obstacle is the presence of incomplete observations, i.e., observations for which one or more key fields are blank. Missing data is often countered by deleting entire observations that contain missing data. The negative effects of deleting entire observations are multiple: deleting observations reduces sample size and can also result in biased inferences even if data is missing at random. In addition, knowledge contained within incomplete observations is knowledge lost when they are deleted, and the effort spent collecting that knowledge is effort wasted. Data imputation methods, or methods of statistically “filling-in” missing data, can help combat small sample sizes by using the existing information in partially complete observations with the end goal of producing less biased and higher confidence inferences. When a sample from a multivariate normal population is only partially complete, and the missing data meets appropriate assumptions (missing at random), robust data imputation of the missing data can be implemented with monotone data augmentation (MDA) using the multivariate t distribution. Missing data imputation is applied to data from the NASA Instrument Cost Model (NICM) using the MDA algorithm under the assumption of having a multivariate t distribution with fixed degrees of freedom. A sensitivity analysis to the degrees of freedom parameter is presented to demonstrate robustness of the multivariate t distribution when dealing with small samples as compared to the multivariate normal distribution.
dc.description.sponsorship NASA/JPL en_US
dc.language.iso en_US
dc.publisher Pasadena, CA: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2021
dc.title Salvaging Data Records with Missing Data: Data Imputation using the Multivariate t Distribution
dc.type Preprint

