dc.contributor.author |
Hooke, Melissa A |
|
dc.contributor.author |
Mrozinski, Joseph |
|
dc.contributor.author |
DiNicola, Michael |
|
dc.date.accessioned |
2022-03-01T00:44:28Z |
|
dc.date.available |
2022-03-01T00:44:28Z |
|
dc.date.issued |
2021-03-06 |
|
dc.identifier.citation |
2021 IEEE Aerospace Conference, Big Sky, Montana, March 6-13, 2021 |
|
dc.identifier.clearanceno |
CL#21-0068 |
|
dc.identifier.uri |
http://hdl.handle.net/2014/54252 |
|
dc.description.abstract |
When doing multivariate data analysis, one common obstacle is the presence of incomplete observations, i.e., observations for which one or more key fields are blank. Missing data is often countered by deleting entire observations that contain missing data. The negative effects of deleting entire observations are multiple: deleting observations reduces sample size and can also result in biased inferences even if data is missing at random. In addition, knowledge contained within incomplete observations is knowledge lost when they are deleted, and the effort spent collecting that knowledge is effort wasted. Data imputation methods, or methods of statistically “filling-in” missing data, can help combat small sample sizes by using the existing information in partially complete observations with the end goal of producing less biased and higher confidence inferences. When a sample from a multivariate normal population is only partially complete, and the missing data meets appropriate assumptions (missing at random), robust data imputation of the missing data can be implemented with monotone data augmentation (MDA) using the multivariate t distribution. Missing data imputation is applied to data from the NASA Instrument Cost Model (NICM) using the MDA algorithm under the assumption of having a multivariate t distribution with fixed degrees of freedom. A sensitivity analysis to the degrees of freedom parameter is presented to demonstrate robustness of the multivariate t distribution when dealing with small samples as compared to the multivariate normal distribution. |
|
dc.description.sponsorship |
NASA/JPL |
en_US |
dc.language.iso |
en_US |
|
dc.publisher |
Pasadena, CA: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2021 |
|
dc.title |
Salvaging Data Records with Missing Data: Data Imputation using the Multivariate t Distribution |
|
dc.type |
Preprint |
|