Package 'trafo' reference manual

Title:	Estimation, Comparison and Selection of Transformations
Description:	Estimation, selection and comparison of several families of transformations. The package includes the following families: Bickel-Doksum (Bickel and Doksum 1981 <doi:10.2307/2287831>), Box-Cox, Dual (Yang 2006 <doi:10.1016/j.econlet.2006.01.011>), Glog (Durbin et al. 2002 <doi:10.1093/bioinformatics/18.suppl_1.S105>), gpower (Kelmansky et al. 2013 <doi:10.1515/sagmb-2012-0030>), Log, Log-shift opt (Feng et al. 2016 <doi:10.1002/sta4.104>), Manly, modulus (John and Draper 1980 <doi:10.2307/2986305>), Neglog (Whittaker et al. 2005 <doi:10.1111/j.1467-9876.2005.00520.x>), Reciprocal and Yeo-Johnson. The package makes it simple to compare linear models with an untransformed and a transformed dependent variable, as well as linear models whose dependent variable is transformed with different transformations. Furthermore, the package employs maximum likelihood approaches, moments optimization and divergence minimization to estimate the optimal transformation parameter.
Authors:	Lily Medina [aut, cre], Piedad Castro [aut], Ann-Kristin Kreutzmann [aut], Natalia Rojas-Perilla [aut]
Maintainer:	Lily Medina <[email protected]>
License:	GPL-2
Version:	1.0.3
Built:	2025-03-24 03:52:29 UTC
Source:	https://github.com/akreutzmann/trafo

Data frame with transformed variables

Description

The returned data frame contains the variables used in the model and, in addition, a column with the transformed dependent variable. The name of this column is the name of the dependent variable with a t appended (for "transformed").

Usage

## S3 method for class 'trafo'
as.data.frame(x, row.names = NULL, optional = FALSE,
  std = FALSE, ...)

Arguments

`x`	an object of type `trafo`.
`row.names`	NULL or a character vector giving the row names for the data frame. Missing values are not allowed.
`optional`	logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. Note that all of R's base package as.data.frame() methods use optional only for the treatment of column names, basically with the meaning of data.frame(*, check.names = !optional).
`std`	logical. If `TRUE`, the data is transformed by the standardized/scaled transformation. Defaults to `FALSE`.
`...`	other parameters that can be passed to the function.

Value

A data frame with the original variables and the transformed variable.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using divergence minimization following
# Kolmogorov-Smirnov
logshiftopt_trafo <- logshiftopt(object = lm_cars, method = "div.ks",
                                 plotit = FALSE)

# Get a data frame with the added transformed variable
as.data.frame(logshiftopt_trafo)
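
The `std` argument can be exercised in the same way. The following is a minimal sketch, assuming the trafo package is installed; it only uses calls documented in this manual, and the object names `bc_trafo`, `df_plain` and `df_std` are illustrative.

```r
# Sketch only: runs when the trafo package is available.
if (requireNamespace("trafo", quietly = TRUE)) {
  library(trafo)

  data("cars", package = "datasets")
  lm_cars <- lm(dist ~ speed, data = cars)

  # Box-Cox transformation with the default maximum likelihood estimation
  bc_trafo <- boxcox(object = lm_cars, plotit = FALSE)

  # Unstandardized and standardized/scaled versions of the transformed response
  df_plain <- as.data.frame(bc_trafo)
  df_std   <- as.data.frame(bc_trafo, std = TRUE)

  # Column(s) added on top of the original variables
  print(setdiff(names(df_plain), names(cars)))
  print(head(df_std))
}
```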

First check of assumptions to find suitable transformations

Description

Gives a first overview of whether a transformation is useful and which transformation is most promising for fulfilling the model assumptions of normality, homoscedasticity and linearity.

Usage

assumptions(object, method = "ml", std = FALSE, ...)

Arguments

`object`	an object of type `lm`.
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`std`	logical. If `TRUE`, the transformed model is returned based on the standardized/scaled transformation. Defaults to `FALSE`.
`...`	other parameters that can be passed to the function, e.g. other lambda ranges. A self-defined lambda range is passed as an argument whose name combines the name of the transformation with the suffix `_lr`; its value must be a numeric vector of length 2. For instance, changing the lambda range for the Manly transformation means adding the argument `manly_lr = c(0.000005, 0.00005)`. For the default lambda ranges, see the documentation of the individual transformations.

Value

A table with tests for normality and homoscedasticity. Furthermore, scatterplots are returned to check the linearity assumption.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

assumptions(lm_cars)
assumptions(lm_cars, method = "skew", manly_lr = c(0.000005,0.00005))

Bickel-Doksum transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Bickel-Doksum transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

bickeldoksum(object, lambda = "estim", method = "ml",
  lambdarange = c(1e-11, 2), plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. The Bickel-Doksum transformation is only defined for positive values of lambda. Defaults to `c(1e-11, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Bickel PJ, Doksum KA (1981). An analysis of transformations revisited. Journal of the American Statistical Association, 76, 296-311.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using a maximum likelihood approach
bickeldoksum(object = lm_cars, plotit = FALSE)
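
For intuition, the transformation itself can be sketched in base R. The helper `bickeldoksum_sketch` is hypothetical and not part of trafo; it assumes the textbook form from the reference above, (sign(y) * |y|^lambda - 1) / lambda for lambda > 0, which may differ in detail from the package internals.

```r
# Hypothetical helper, not part of trafo: textbook Bickel-Doksum transform.
bickeldoksum_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), is.numeric(lambda), length(lambda) == 1, lambda > 0)
  (sign(y) * abs(y)^lambda - 1) / lambda
}

y <- c(-4, -1, 0, 1, 4)
bickeldoksum_sketch(y, lambda = 1)    # same as y - 1
bickeldoksum_sketch(y, lambda = 0.5)  # negative values are handled via sign(y)
```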

Box-Cox transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Box-Cox transformation. The transformation parameter can either be estimated using different estimation methods or given. The Box-Cox transformation is only defined for positive response values. In case the response contains zero or negative values a shift is automatically added such that y + shift > 0.

Usage

boxcox(object, lambda = "estim", method = "ml", lambdarange = c(-2,
  2), plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `c(-2, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Box GEP, Cox DR (1964). An Analysis of Transformations. Journal of the Royal Statistical Society B, 26(2), 211-252.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using skewness minimization
boxcox(object = lm_cars, method = "skew", plotit = FALSE)
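
The role of lambda and of the shift can be illustrated with a small base R sketch. It assumes the textbook Box-Cox definition from Box and Cox (1964); `boxcox_sketch` and its `shift` argument are hypothetical, and the way trafo determines its automatic shift is not reproduced here.

```r
# Hypothetical helper, not part of trafo: textbook Box-Cox transform.
# `shift` is an illustrative argument supplied by the caller.
boxcox_sketch <- function(y, lambda, shift = 0) {
  y <- y + shift
  stopifnot(is.numeric(y), all(y > 0), length(lambda) == 1)
  if (abs(lambda) < 1e-12) log(y) else (y^lambda - 1) / lambda
}

y <- c(1, 2, 4)
boxcox_sketch(y, lambda = 1)  # same as y - 1
boxcox_sketch(y, lambda = 0)  # same as log(y)

# A response with zero or negative values needs a shift with y + shift > 0
boxcox_sketch(c(-1, 0, 2), lambda = 0.5, shift = 2)
```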

Diagnostics for fitted models

Description

Returns information about the transformation and selected diagnostics to check model assumptions.

Usage

diagnostics(object, ...)

Arguments

`object`	an object that contains two models that should be compared.
`...`	other parameters that can be passed to the function.

Value

The return value depends on the class of the argument. See the documentation of the individual methods for details on what each method returns.

Diagnostics for two differently transformed models

Description

Returns information about the applied transformations and selected diagnostics to check model assumptions. Two models are compared where the dependent variable is transformed by different transformations.

Usage

## S3 method for class 'trafo_compare'
diagnostics(object, ...)

Arguments

`object`	an object of type `trafo_compare`
`...`	additional arguments that are not used in this method

Value

An object of class diagnostics.trafo_compare. The method print.diagnostics.trafo_compare can be used for this class.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform with Bickel-Doksum transformation
bd_trafo <- bickeldoksum(object = lm_cars, plotit = FALSE)

# Transform with Box-Cox transformation
bc_trafo <- boxcox(object = lm_cars, method = "skew", plotit = FALSE)

# Compare transformed models
compare <- trafo_compare(object = lm_cars, trafos = list(bd_trafo, bc_trafo))

# Get diagnostics
diagnostics(compare)

Diagnostics for an untransformed and a transformed model

Description

Returns information about the applied transformation and selected diagnostics to check model assumptions. The return helps to compare the untransformed and the transformed model with regard to model assumptions.

Usage

## S3 method for class 'trafo_lm'
diagnostics(object, ...)

Arguments

`object`	an object of type `trafo_lm`
`...`	additional arguments that are not used in this method

Value

An object of class diagnostics.trafo_lm. The method print.diagnostics.trafo_lm can be used for this class.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Compare the untransformed model with the transformed model
BD_lm <- trafo_lm(object = lm_cars, trafo = "bickeldoksum",
                  method = "skew", lambdarange = c(1e-11, 2))

# Get diagnostics
diagnostics(BD_lm)

Dual transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Dual transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

dual(object, lambda = "estim", method = "ml", lambdarange = c(0, 2),
  plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. The Dual transformation is not defined for negative values of lambda. Defaults to `c(0, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Yang Z (2006). A modified family of power transformations. Economics Letters, 92(1), 14-19.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using divergence minimization following
# Cramer-von-Mises
dual(object = lm_cars, method = "div.cvm", plotit = TRUE)
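
As a rough illustration of what the Dual transformation computes, here is a base R sketch under the assumption that it follows Yang (2006): (y^lambda - y^(-lambda)) / (2 * lambda) for lambda > 0 and log(y) for lambda = 0, for a positive response. The name `dual_sketch` is hypothetical and not part of trafo.

```r
# Hypothetical helper, not part of trafo: dual power transform (Yang 2006).
dual_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), all(y > 0), length(lambda) == 1, lambda >= 0)
  if (lambda == 0) log(y) else (y^lambda - y^(-lambda)) / (2 * lambda)
}

y <- c(0.5, 1, 2)
dual_sketch(y, lambda = 1)  # (y - 1 / y) / 2
dual_sketch(y, lambda = 0)  # log(y)
```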

Glog transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Glog transformation.

Usage

glog(object)

Arguments

`object`	an object of type `lm`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Durbin BP, Hardin JS, Hawkins DM, Rocke DM (2002). A Variance-stabilizing Transformation for Gene-expression Microarray Data. Bioinformatics, 18, 105-110.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable 
glog(object = lm_cars)
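
The following base R sketch shows one common parameter-free form of the generalized log, log(y + sqrt(y^2 + 1)). That the package uses exactly this form is an assumption, and `glog_sketch` is a hypothetical name, not a trafo function.

```r
# Hypothetical helper, not part of trafo: parameter-free generalized log.
glog_sketch <- function(y) {
  stopifnot(is.numeric(y))
  log(y + sqrt(y^2 + 1))
}

y <- c(-3, 0, 2, 120)
glog_sketch(y)

# This form coincides with the inverse hyperbolic sine from base R
all.equal(glog_sketch(y), asinh(y))
```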

Gpower transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Gpower transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

gpower(object, lambda = "estim", method = "ml", lambdarange = c(-2,
  2), plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `c(-2, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Kelmansky DM, Martinez EJ, Leiva V (2013). A New Variance Stabilizing Transformation for Gene Expression Data Analysis. Statistical applications in genetics and molecular biology, 12(6), 653-666.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using divergence minimization following 
# Kullback-Leibler
gpower(object = lm_cars, method = "div.kl", plotit = FALSE)
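
A base R sketch of the underlying formula may help when choosing a lambda range. It assumes the form given by Kelmansky et al. (2013), ((y + sqrt(y^2 + 1))^lambda - 1) / lambda for lambda != 0 and log(y + sqrt(y^2 + 1)) for lambda = 0; `gpower_sketch` is a hypothetical name and not part of trafo.

```r
# Hypothetical helper, not part of trafo: gpower transform.
gpower_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), length(lambda) == 1)
  g <- y + sqrt(y^2 + 1)  # strictly positive for every real y
  if (abs(lambda) < 1e-12) log(g) else (g^lambda - 1) / lambda
}

y <- c(-2, 0, 0.75, 10)
gpower_sketch(y, lambda = 1)   # g - 1
gpower_sketch(y, lambda = 0)   # limiting case, equal to asinh(y)
gpower_sketch(y, lambda = -1)  # negative lambda is allowed as well
```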

Log shift opt transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Log shift opt transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

logshiftopt(object, lambda = "estim", method = "ml",
  lambdarange = NULL, plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `NULL`. In this case the lambdarange is set to the range of the data. In case the lowest value is negative the absolute value of the lowest value plus 1 is the lower bound for the range.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using divergence minimization following
# Kolmogorov-Smirnov
logshiftopt(object = lm_cars, method = "div.ks", plotit = FALSE)
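
The lower bound described for `lambdarange` follows from the requirement that the shifted response stays positive. The sketch below assumes the transformation has the form log(y + lambda); `logshift_sketch` is a hypothetical base R helper, not a trafo function.

```r
# Hypothetical helper, not part of trafo: log transformation with shift lambda.
logshift_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), length(lambda) == 1, all(y + lambda > 0))
  log(y + lambda)
}

y <- c(-2, 0, 3)

# Lowest value is negative: |min(y)| + 1 keeps every shifted value positive
lower_bound <- abs(min(y)) + 1
logshift_sketch(y, lambda = lower_bound)  # log(c(1, 3, 6))
```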

Log transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Log transformation. The Log transformation is only defined for positive response values. In case the response contains zero or negative values a shift is automatically added such that y + shift > 0.

Usage

logtrafo(object)

Arguments

`object`	an object of type `lm`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Box GEP, Cox DR (1964). An Analysis of Transformations. Journal of the Royal Statistical Society B, 26(2), 211-252.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable 
logtrafo(object = lm_cars)

Manly transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Manly transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

manly(object, lambda = "estim", method = "ml", lambdarange = c(-2,
  2), plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `c(-2, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Manly BFJ (1976). Exponential data transformations. Journal of the Royal Statistical Society: Series D, 25, 37-42.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using a maximum likelihood approach
manly(object = lm_cars, plotit = FALSE)
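
Because the Manly transformation is exponential, it is defined for negative responses without any shift. A minimal base R sketch, assuming the form from Manly (1976), (exp(lambda * y) - 1) / lambda for lambda != 0 and y for lambda = 0; `manly_sketch` is a hypothetical name and not part of trafo.

```r
# Hypothetical helper, not part of trafo: Manly exponential transform.
manly_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), length(lambda) == 1)
  if (abs(lambda) < 1e-12) y else (exp(lambda * y) - 1) / lambda
}

y <- c(-1, 0, 2)
manly_sketch(y, lambda = 0)     # returns y unchanged
manly_sketch(y, lambda = 0.5)   # works for negative y as well
manly_sketch(y, lambda = -0.5)
```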

Modulus transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Modulus transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

modulus(object, lambda = "estim", method = "ml", lambdarange = c(-2,
  2), plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `c(-2, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

John JA, Draper NR (1980). An alternative family of transformations. Journal of the Royal Statistical Society: Series C, 29, 190-197.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable with fixed lambda 
modulus(object = lm_cars, lambda = 0.8, plotit = FALSE)
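
To see what a fixed lambda does, the transformation can be sketched in base R. This assumes the definition from John and Draper (1980), sign(y) * ((|y| + 1)^lambda - 1) / lambda for lambda != 0 and sign(y) * log(|y| + 1) for lambda = 0; `modulus_sketch` is a hypothetical name, not a trafo function.

```r
# Hypothetical helper, not part of trafo: modulus transform.
modulus_sketch <- function(y, lambda) {
  stopifnot(is.numeric(y), length(lambda) == 1)
  if (abs(lambda) < 1e-12) {
    sign(y) * log(abs(y) + 1)
  } else {
    sign(y) * ((abs(y) + 1)^lambda - 1) / lambda
  }
}

y <- c(-3, 0, 3)
modulus_sketch(y, lambda = 1)    # returns y unchanged
modulus_sketch(y, lambda = 0.8)  # symmetric around zero
```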

Neg log transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Neg log transformation.

Usage

neglog(object)

Arguments

`object`	an object of type `lm`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

References

Whittaker J, Whitehead C, Somers M (2005). The neglog transformation and quantile regression for the analysis of a large credit scoring database. Journal of the Royal Statistical Society. Series C (Applied Statistics), 54(4), 863-878.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable 
neglog(object = lm_cars)
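
The transformation has no parameter to estimate. A short base R sketch, assuming the definition from Whittaker et al. (2005), sign(y) * log(|y| + 1); `neglog_sketch` is a hypothetical name and not part of trafo.

```r
# Hypothetical helper, not part of trafo: neglog transform.
neglog_sketch <- function(y) {
  stopifnot(is.numeric(y))
  sign(y) * log(abs(y) + 1)
}

y <- c(-10, -1, 0, 1, 10)
neglog_sketch(y)  # odd function: neglog(-y) equals -neglog(y), and neglog(0) is 0
```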

Plots for linear models with transformed dependent variable

Description

For the two transformed models a range of plots is returned in order to check model assumptions graphically.

Usage

## S3 method for class 'trafo_compare'
plot(x, ...)

Arguments

`x`	an object of type `trafo_compare`
`...`	additional arguments that are not used in this method

Plot for regression models with untransformed and transformed dependent variable

Description

For the untransformed and transformed model a range of plots is returned in order to check model assumptions graphically.

Usage

## S3 method for class 'trafo_lm'
plot(x, which = "all", ...)

Arguments

`x`	an object of type `trafo_lm`
`which`	one element character that determines the return of plots. For single plot returns the possible values are: "qqplot", "hist", "residVsFitted", "residVsObs", "scatter", "cooks", "scaleLoc", "residLev". The default is set to "all".
`...`	additional arguments that are not used in this method
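
A single plot can be requested through `which`. The following is a minimal sketch, assuming the trafo package is installed; it reuses the `trafo_lm` call shown in the diagnostics examples of this manual and one of the `which` values listed above.

```r
# Sketch only: runs when the trafo package is available.
if (requireNamespace("trafo", quietly = TRUE)) {
  library(trafo)

  data("cars", package = "datasets")
  lm_cars <- lm(dist ~ speed, data = cars)

  # Untransformed model versus Bickel-Doksum transformed model
  BD_lm <- trafo_lm(object = lm_cars, trafo = "bickeldoksum",
                    method = "skew", lambdarange = c(1e-11, 2))

  # Request only the Q-Q plots instead of the default "all"
  plot(BD_lm, which = "qqplot")
}
```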

References

For the panel.cor function used in the scatter plot: Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G (2006). XCMS: Processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching and identification. Analytical Chemistry, 78, 779-787. See also: https://github.com/sneumann/xcms/blame/master/R/functions-xcmsSet.R#L654

Prints diagnostics of two trafo objects

Description

Prints diagnostics of two trafo objects.

Usage

## S3 method for class 'diagnostics.trafo_compare'
print(x, ...)

Arguments

`x`	an object of type `diagnostics.trafo_compare`
`...`	additional arguments that are not used in this method

Prints diagnostics of an untransformed and a transformed model

Description

Prints diagnostics of an untransformed and a transformed model.

Usage

## S3 method for class 'diagnostics.trafo_lm'
print(x, ...)

Arguments

`x`	an object of type `diagnostics.trafo_lm`
`...`	additional arguments that are not used in this method

Prints summary of trafo_compare objects

Description

Prints objects to be shown in the summary function for objects of type trafo_compare.

Usage

## S3 method for class 'summary.trafo_compare'
print(x, ...)

Arguments

`x`	an object of type `summary.trafo_compare`
`...`	additional arguments that are not used in this method

Prints summary of trafo_lm objects

Description

Prints objects to be shown in the summary function for objects of type trafo_lm.

Usage

## S3 method for class 'summary.trafo_lm'
print(x, ...)

Arguments

`x`	an object of type `summary.trafo_lm`
`...`	additional arguments that are not used in this method

Prints object of type trafo

Description

Prints object of type trafo

Usage

## S3 method for class 'trafo'
print(x, ...)

Arguments

`x`	an object of type `trafo`.
`...`	other parameters that can be passed to the function.

Prints object of type trafo_compare

Description

Prints object of type trafo_compare

Usage

## S3 method for class 'trafo_compare'
print(x, ...)

Arguments

`x`	an object of type trafo_compare.
`...`	other parameters that can be passed to the function.

Prints object of type trafo_lm

Description

Prints object of type trafo_lm

Usage

## S3 method for class 'trafo_lm'
print(x, ...)

Arguments

`x`	an object of type `trafo_lm`.
`...`	other parameters that can be passed to the function.

Reciprocal transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Reciprocal transformation.

Usage

reciprocal(object)

Arguments

`object`	an object of type `lm`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable 
reciprocal(object = lm_cars)

Square-root shift transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Square-root shift transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

sqrtshift(object, lambda = "estim", method = "ml",
  lambdarange = NULL, plotit = TRUE)

Arguments

`object`	an object of type `lm`.
`lambda`	either the character string "estim" if the optimal transformation parameter should be estimated, or a numeric value giving a fixed transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining the interval that is searched for the optimal transformation parameter. Defaults to `NULL`, in which case the interval is set to the range of the data. If the lowest value is negative, the lower bound of the interval is the absolute value of the lowest value plus 1.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.
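The estimation idea can be sketched in base R. The code below is not the package's implementation: it assumes the transformation has the form sqrt(y + lambda) and that the "skew" criterion is the absolute skewness of the model residuals, and it replaces the package's optimizer with a plain grid search. The helpers `sqrt_shift` and `skewness` and the grid bounds are hypothetical.

```r
# Hypothetical helper: shifted square root, assuming the form sqrt(y + lambda)
sqrt_shift <- function(y, lambda) sqrt(y + lambda)

# Sample skewness (moment estimator) of a numeric vector
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}

data("cars", package = "datasets")

# Candidate shifts; the grid is an arbitrary choice for illustration
lambdas <- seq(0, 50, by = 0.5)

# Absolute residual skewness of the refitted model for each candidate shift
abs_skew <- vapply(lambdas, function(l) {
  fit <- lm(sqrt_shift(dist, l) ~ speed, data = cars)
  abs(skewness(residuals(fit)))
}, numeric(1))

# Shift with the smallest absolute residual skewness on the grid
lambda_hat <- lambdas[which.min(abs_skew)]
```

A grid search is used only because it is easy to read; a one-dimensional optimizer such as `optimize()` over the same interval would be the more usual choice.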

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using a maximum likelihood approach
sqrtshift(object = lm_cars, plotit = TRUE)

Summary for two differently transformed models

Description

The summary method for class trafo_compare returns a summary of two differently transformed models. It is based on the summary for objects of type lm.

Usage

## S3 method for class 'trafo_compare'
summary(object, ...)

Arguments

`object`	an object of type `trafo_compare`
`...`	additional arguments that are not used in this method

Value

An object of class summary.trafo_compare. The method print.summary.trafo_compare can be used for this class.

Summary for linear models with untransformed and transformed dependent variable

Description

The summary method for class trafo_lm returns a summary of an untransformed and a transformed model. It is based on the summary for objects of type lm.

Usage

## S3 method for class 'trafo_lm'
summary(object, ...)

Arguments

`object`	an object of type `trafo_lm`
`...`	additional arguments that are not used in this method

Value

An object of class summary.trafo_lm. The method print.summary.trafo_lm can be used for this class.

An R package supporting the selection of a suitable transformation

Description

Estimation, selection and comparison of several families of transformations. The families of transformations included in the package are the following: Bickel-Doksum, Box-Cox, Dual, Glog, Gpower, Log, Log-shift opt, Manly, Modulus, Neglog, Reciprocal and Yeo-Johnson. The package makes it easier to compare linear models with an untransformed and a transformed dependent variable, as well as linear models where the dependent variable is transformed with different transformations. Furthermore, the package employs maximum likelihood approaches, skewness minimization and divergence minimization to estimate the optimal transformation parameter.

Details

An overview of all currently provided functions can be listed with library(help = trafo).

Back-transforms vectors with used transformation in trafo_lm

Description

The function naively back-transforms vectors, such as predictions or confidence intervals, to the original scale, using the transformation that was applied in trafo_lm.

Usage

trafo_back(object, prediction)

Arguments

`object`	an object of type `trafo_lm`.
`prediction`	the return of the predict.lm method.

Value

The backtransformed prediction.
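The idea of a naive back-transformation can be sketched in base R without the package. The example below uses a plain log transformation fitted by hand, and it reads "naive" as applying the inverse function element-wise with no bias correction; that reading is an assumption, not a statement about the internals of trafo_back.

```r
# Fit a linear model on the log scale (transformation applied by hand)
data("cars", package = "datasets")
lm_log <- lm(log(dist) ~ speed, data = cars)

# Predictions and confidence limits on the transformed (log) scale;
# predict.lm returns a matrix with columns fit, lwr and upr
pred_log <- predict(lm_log, interval = "confidence")

# Naive back-transformation: apply the inverse of log to every entry
pred_back <- exp(pred_log)

head(pred_back)
```

Because the inverse is monotone, the ordering lwr <= fit <= upr is preserved, but the back-transformed fit is in general not the conditional mean on the original scale.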

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Compare untransformed and transformed model
trafo_cars <- trafo_lm(object = lm_cars, trafo = "bickeldoksum", method = "skew",
                       lambdarange = c(1e-11, 2))

# Back-transform prediction and confidence interval
trafo_back(trafo_cars, predict(trafo_cars$trafo_mod, interval = "confidence"))


Compares linear models with transformed dependent variable

Description

Function trafo_compare compares linear models where the dependent variable is transformed by different transformations.

Usage

trafo_compare(object, trafos, std = FALSE)

Arguments

`object`	an object of type lm
`trafos`	a list of two `trafo` objects based on the same model given in object.
`std`	logical. If `TRUE`, the transformed models are returned based on the standardized/scaled transformation. Defaults to `FALSE`.

Value

An object of class trafo_compare. Methods such as diagnostics.trafo_compare, print.trafo_compare, plot.trafo_compare and summary.trafo_compare can be used for this class.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform with Bickel-Doksum transformation
bd_trafo <- bickeldoksum(object = lm_cars, plotit = FALSE)

# Transform with Box-Cox transformation
bc_trafo <- boxcox(object = lm_cars, method = "skew", plotit = FALSE)

# Compare transformed models
trafo_compare(object = lm_cars, trafos = list(bd_trafo, bc_trafo))

Fits transformed linear models

Description

Function trafo_lm fits linear models with a transformed dependent variable. The main return values are two lm objects: the untransformed linear model and the transformed linear model.

Usage

trafo_lm(object, trafo = "boxcox", lambda = "estim", method = "ml",
  lambdarange = NULL, std = FALSE, custom_trafo = NULL)

Arguments

`object`	an object of type `lm`.
`trafo`	a character string. Different transformations can be used for transforming the dependent variable in a linear model: (i) "bickeldoksum", (ii) "boxcox", (iii) "dual", (iv) "glog", (v) "gpower", (vi) "log", (vii) "logshiftopt", (viii) "manly", (ix) "modulus", (x) "neglog", (xi) "reciprocal", (xii) "yeojohnson". Defaults to "boxcox".
`lambda`	either the character string "estim", if the optimal transformation parameter should be estimated, or a numeric value that fixes the transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `NULL` which means that the default value of the chosen transformation is used.
`std`	logical. If `TRUE`, the transformed model is returned based on the standardized/scaled transformation. Defaults to `FALSE`.
`custom_trafo`	a list. The list has two elements where the first element is a function specifying the desired transformation and the second element is a function specifying the corresponding standardized transformation. Defaults to `NULL`.

Value

An object of class trafo_lm. Methods such as diagnostics.trafo_lm, print.trafo_lm, plot.trafo_lm and summary.trafo_lm can be used for this class.
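To make the default combination (trafo = "boxcox", method = "ml") more concrete, the following base R sketch estimates the Box-Cox parameter by maximizing the profile log-likelihood over a grid and then refits the model. It is an illustration under stated assumptions, not the package's code: the helper `box_cox`, the grid and the grid search itself are hypothetical stand-ins for whatever optimizer the package uses.

```r
# Hypothetical helper: Box-Cox transformation for positive y
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

data("cars", package = "datasets")
n <- nrow(cars)
sum_log_y <- sum(log(cars$dist))

# Profile log-likelihood (up to a constant) of the linear model
# for the transformed response at a given lambda
profile_loglik <- function(lambda) {
  fit <- lm(box_cox(dist, lambda) ~ speed, data = cars)
  rss <- sum(residuals(fit)^2)
  -n / 2 * log(rss / n) + (lambda - 1) * sum_log_y
}

# Grid search over an illustrative interval
lambdas <- seq(-2, 2, by = 0.01)
loglik <- vapply(lambdas, profile_loglik, numeric(1))
lambda_hat <- lambdas[which.max(loglik)]

# Untransformed and transformed model, side by side
lm_orig <- lm(dist ~ speed, data = cars)
lm_trafo <- lm(box_cox(dist, lambda_hat) ~ speed, data = cars)
```

The two fitted objects play the role of the untransformed and the transformed model described above and can be compared with `summary()` or residual plots.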

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Compare untransformed and transformed model
trafo_lm(object = lm_cars, trafo = "bickeldoksum", method = "skew",
         lambdarange = c(1e-11, 2))

Yeo-Johnson transformation for linear models

Description

The function transforms the dependent variable of a linear model using the Yeo-Johnson transformation. The transformation parameter can either be estimated using different estimation methods or given.

Usage

yeojohnson(object, lambda = "estim", method = "ml",
  lambdarange = c(-2, 2), plotit = TRUE)

Arguments

`object`	an object of type lm.
`lambda`	either the character string "estim", if the optimal transformation parameter should be estimated, or a numeric value that fixes the transformation parameter. Defaults to "estim".
`method`	a character string. Different estimation methods can be used for the estimation of the optimal transformation parameter: (i) Maximum likelihood approach ("ml"), (ii) Skewness minimization ("skew"), (iii) Kurtosis optimization ("kurt"), (iv) Divergence minimization by Kolmogorov-Smirnov ("div.ks"), by Cramer-von-Mises ("div.cvm") or by Kullback-Leibler ("div.kl"). Defaults to "ml".
`lambdarange`	a numeric vector with two elements defining an interval that is used for the estimation of the optimal transformation parameter. Defaults to `c(-2, 2)`.
`plotit`	logical. If `TRUE`, a plot that illustrates the optimal transformation parameter or the given transformation parameter is returned. Defaults to `TRUE`.

Value

An object of class trafo. Methods such as as.data.frame.trafo and print.trafo can be used for this class.
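For orientation, the transformation itself can be written out in a few lines of base R. The sketch below follows the piecewise definition from the Yeo and Johnson (2000) paper; the helper name `yeo_johnson` is hypothetical, and the code does not reproduce the package's estimation or standardization steps.

```r
# Hypothetical helper: Yeo-Johnson transformation of a numeric vector
yeo_johnson <- function(y, lambda) {
  out <- numeric(length(y))
  pos <- y >= 0

  # Non-negative values: Box-Cox type transformation of y + 1
  if (abs(lambda) > 1e-8) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log(y[pos] + 1)
  }

  # Negative values: mirrored transformation of 1 - y with power 2 - lambda
  if (abs(lambda - 2) > 1e-8) {
    out[!pos] <- -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  } else {
    out[!pos] <- -log(1 - y[!pos])
  }
  out
}

# Works for negative, zero and positive values alike
y <- c(-2, -0.5, 0, 1, 3)
yeo_johnson(y, lambda = 0.5)
```

With lambda = 1 the function returns its input unchanged, which is a convenient sanity check.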

References

Yeo IK, Johnson RA (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87, 954-959.

Examples

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using a maximum likelihood approach
yeojohnson(object = lm_cars, plotit = FALSE)

Package 'trafo'

Help Index

Data frame with transformed variables
First check of assumptions to find suitable transformations
Bickel-Doksum transformation for linear models
Box-Cox transformation for linear models
Diagnostics for fitted models
Diagnostics for two differently transformed models
Diagnostics for an untransformed and a transformed model
Dual transformation for linear models
Glog transformation for linear models
Gpower transformation for linear models
Log shift opt transformation for linear models
Log transformation for linear models

Description

Usage

Arguments

Value