Function that imputes missing values using one of 9 possible algorithms: missForest, kNN, tkNN, corkNN, LLS, SVD, BPCA, PPCA, RegImpute.
impute.counts(
  DEprot.object,
  method = "missForest",
  use.normalized.data = TRUE,
  overwrite.imputation = FALSE,
  missForest.max.iterations = 100,
  missForest.variable.wise.OOBerror = TRUE,
  missForest.cores = 1,
  missForest.parallel.mode = "variables",
  kNN.n.nearest.neighbours = 10,
  LLS.k = 2,
  pcaMethods.nPCs.to.test = 5,
  RegImpute.max.iterations = 10,
  RegImpute.fillmethod = "row_mean",
  seed = NULL,
  verbose = FALSE
)
DEprot.object: A DEprot object, as generated by load.counts.
method: String indicating the imputation method to use. One among: 'missForest', 'kNN' (VIM), 'tkNN' (imputomics), 'corkNN' (imputomics), 'LLS' (pcaMethods), 'SVD' (a.k.a. svdImpute, pcaMethods), 'BPCA' (pcaMethods), 'PPCA' (pcaMethods), 'RegImpute' (DreamAI). Default: "missForest".
use.normalized.data: Logical value indicating whether the imputation should be performed on the normalized data. Default: TRUE.
overwrite.imputation: Logical value indicating whether the table of imputed counts should be overwritten if it already exists. Default: FALSE.
missForest.max.iterations: Maximum number of iterations for the missForest algorithm. Default: 100.
missForest.variable.wise.OOBerror: Logical value defining whether the out-of-bag (OOB) error is returned for each variable separately. Default: TRUE.
missForest.cores: Number of cores used to run the missForest algorithm. If missForest.cores is greater than 1, the imputation will be run in parallel. Two modes are possible and can be defined by the parameter missForest.parallel.mode. Default: 1.
missForest.parallel.mode: Mode to use for the parallelization; ignored when missForest.cores is 1 (or lower). One among: 'variables', 'forests'. Default: "variables". See also the documentation of the missForest function.
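As a hedged illustration of the two missForest parallelization settings, the call below requests more than one core together with an explicit parallel mode. It only uses arguments listed in the usage above and the test object from the package example; the choice of 2 cores, the "forests" mode and the seed value are arbitrary, and the call is skipped when DEprot is not installed.

```r
# Settings chosen for illustration only; any value > 1 requests parallel execution
n.cores <- 2
parallel.mode <- "forests"   # the other documented option is "variables"

if (requireNamespace("DEprot", quietly = TRUE)) {
  dpo <- DEprot::impute.counts(
    DEprot.object = DEprot::test.toolbox$dpo.norm,
    method = "missForest",
    missForest.cores = n.cores,
    missForest.parallel.mode = parallel.mode,
    seed = 123,              # arbitrary seed, fixed for reproducibility
    verbose = TRUE
  )
}
```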
kNN.n.nearest.neighbours: Numeric value indicating the number of nearest neighbours to use for the kNN imputation. Default: 10.
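To illustrate what the number of nearest neighbours controls, here is a minimal base-R sketch of k-nearest-neighbour imputation on a small numeric matrix. It is not the VIM implementation used for method = 'kNN': the helper name knn_impute_sketch, the distance computed on shared columns, and the plain mean over the neighbours are all assumptions made for this example.

```r
# Hypothetical helper (not VIM::kNN): replace each NA with the mean of that
# column's values in the k rows closest to the incomplete row.
knn_impute_sketch <- function(x, k = 10) {
  stopifnot(is.matrix(x), is.numeric(x), k >= 1)
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    # Root-mean-square distance to every other row, over columns observed in both
    d <- vapply(seq_len(nrow(x)), function(j) {
      shared <- !is.na(x[i, ]) & !is.na(x[j, ])
      if (j == i || !any(shared)) return(Inf)
      sqrt(mean((x[i, shared] - x[j, shared])^2))
    }, numeric(1))
    for (col in which(is.na(x[i, ]))) {
      # Candidate donors: comparable rows with an observed value in this column
      donors <- which(is.finite(d) & !is.na(x[, col]))
      if (length(donors) == 0) next
      nearest <- donors[order(d[donors])][seq_len(min(k, length(donors)))]
      out[i, col] <- mean(x[nearest, col])
    }
  }
  out
}

m <- rbind(c(1,   2,   3),
           c(1,   2,   NA),
           c(9,   9,   9),
           c(1.1, 2.1, 3.2))
imputed <- knn_impute_sketch(m, k = 2)
imputed
```

With k = 2 the missing value in the second row is filled from the first and fourth rows, which are the closest on the observed columns; a larger k would also pull in the distant third row.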
LLS.k: Cluster size, i.e. the number of similar genes used for regression. Default: 2.
pcaMethods.nPCs.to.test: Numeric value indicating the number of Principal Components (PCs) to test in order to find the optimal number of PCs to use in the imputation methods from the pcaMethods package. This includes: 'LLS', 'SVD' (a.k.a. 'svdImpute'), 'BPCA', and 'PPCA'. Default: 5.
RegImpute.max.iterations: Numeric value indicating the maximum number of iterations for the imputation with RegImpute (from DreamAI). Default: 10.
RegImpute.fillmethod: String identifying the fill method to be used in the RegImpute method (from DreamAI). One among "row_mean" and "zeros". Default: "row_mean". A warning is thrown if "row_median" is used.
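The two fill options can be pictured with a small base-R sketch. It assumes the fill is a simple first-pass replacement of the missing values before the regression-based refinement; the helper name initial_fill_sketch is hypothetical and this is not the DreamAI code.

```r
# Hypothetical helper: mimics the two documented fill options by name only
initial_fill_sketch <- function(x, fillmethod = c("row_mean", "zeros")) {
  fillmethod <- match.arg(fillmethod)
  filled <- x
  na_pos <- is.na(x)
  if (fillmethod == "zeros") {
    filled[na_pos] <- 0
  } else {
    # Mean of the observed values of each row, assigned to that row's NAs
    row_means <- rowMeans(x, na.rm = TRUE)
    filled[na_pos] <- row_means[row(x)[na_pos]]
  }
  filled
}

m <- rbind(c(2,  NA, 4),
           c(NA, 5,  7))
filled_mean  <- initial_fill_sketch(m, "row_mean")
filled_zeros <- initial_fill_sketch(m, "zeros")
```

Because match.arg() only accepts the two listed values, passing "row_median" to this sketch raises an error rather than the warning described above.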
seed: Numeric value indicating the seed to use for the randomization. Default: NULL, in which case a seed is generated automatically (and saved in the seed element of the final imputation method list).
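Why storing the seed matters can be shown with a small base-R sketch. The function random_fill_sketch is a hypothetical stand-in for any stochastic imputation; how impute.counts actually generates its seed is not documented here, so the sample.int() call is an assumption.

```r
# Stand-in for a stochastic imputation: fill NAs by sampling observed values
random_fill_sketch <- function(x, seed = NULL) {
  # Assumed behaviour: draw a seed when none is supplied
  if (is.null(seed)) seed <- sample.int(.Machine$integer.max, 1)
  set.seed(seed)
  x[is.na(x)] <- sample(x[!is.na(x)], sum(is.na(x)), replace = TRUE)
  list(values = x, seed = seed)  # keep the seed so the run can be repeated
}

v <- c(4.2, NA, 5.1, NA, 3.9)
first  <- random_fill_sketch(v)                     # seed generated automatically
second <- random_fill_sketch(v, seed = first$seed)  # rerun with the stored seed
identical(first$values, second$values)
```

Reusing the stored seed reproduces the first run exactly, which is the reason for keeping it alongside the other imputation parameters.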
verbose: Logical value indicating whether processing messages should be printed. Default: FALSE.
A DEprot object. The boxplot showing the distribution of the protein intensities is regenerated and added to the boxplot.imputed slot. A list with the parameters and other information about the imputation is also added, in the imputation slot.
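A short sketch of how the returned object might be inspected. The slot names come from the description above; S4-style access through methods::slot() is an assumption about how the DEprot class is implemented, and the method and seed values are arbitrary. The block does nothing when DEprot is not installed.

```r
# Slot names as given in the description of the returned object
slots.to.inspect <- c("imputation", "boxplot.imputed")

if (requireNamespace("DEprot", quietly = TRUE)) {
  dpo <- DEprot::impute.counts(
    DEprot.object = DEprot::test.toolbox$dpo.norm,
    method = "kNN",
    seed = 123               # arbitrary seed
  )
  # Assumed S4 access: parameter list first, then the regenerated boxplot
  str(methods::slot(dpo, slots.to.inspect[1]), max.level = 1)
  print(methods::slot(dpo, slots.to.inspect[2]))
}
```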
missForest, VIM, pcaMethods, DreamAI and imputomics R packages.
dpo <- impute.counts(DEprot.object = DEprot::test.toolbox$dpo.norm,
                     method = "bPCA")