Title: | Additive Heredity Model: Method for the Mixture-of-Mixtures Experiments |
---|---|
Description: | An implementation of the additive heredity model for the mixture-of-mixtures experiments of Shen et al. (2019) in Technometrics <doi:10.1080/00401706.2019.1630010>. The additive heredity model considers an additive structure to inherently connect the major components with the minor components. The additive heredity model has a meaningful interpretation for the estimated model because of the hierarchical and heredity principles applied and the nonnegative garrote technique used for variable selection. |
Authors: | Sumin Shen [aut, cre], Lulu Kang [aut], Xinwei Deng [aut] |
Maintainer: | Sumin Shen <[email protected]> |
License: | GPL-3 |
Version: | 1.0.1 |
Built: | 2025-02-14 05:12:55 UTC |
Source: | https://github.com/cran/AHM |
This is one of the main functions. The function ahm computes the proposed additive heredity model.
ahm(y, x, num_major = 3, dist_minor = c(2, 2, 1), type = "weak", alpha = 0, lambda_seq = seq(0, 5, 0.01), nfolds = NULL, mapping_type = c("power"), powerh = 0, rep_gcv = 100)
ahm(y, x, num_major = 3, dist_minor = c(2, 2, 1), type = "weak", alpha = 0, lambda_seq = seq(0, 5, 0.01), nfolds = NULL, mapping_type = c("power"), powerh = 0, rep_gcv = 100)
y |
numeric vector |
x |
data.frame Note the column names of the x should be in the order of major components, minor components, and no interactions are needed. |
num_major |
number of major components |
dist_minor |
the allocation of number of minor components nested under major components |
type |
heredity type, weak heredity is the current support type |
alpha |
0 is for the ridge in glmnet https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html |
lambda_seq |
a numeric vector for the options of lambda used in ridge regression for estimating the initials |
nfolds |
used in cv.glmnet for initial value of parameters in the non-negative garrote method |
mapping_type |
the form of the coefficient function of major components in front of corresponding minor terms. Currently only support "power" |
powerh |
the power parameter used for the power function |
rep_gcv |
the number of choices of tuning parameter used in the GCV selection |
Return a list
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) summary(out)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) summary(out)
Check column correlations
check_col_correlation(dat)
check_col_correlation(dat)
dat |
data.frame |
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] check_col_correlation (dat=x)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] check_col_correlation (dat=x)
Photoresist-coating experiment data
data(coating)
data(coating)
data.frame
Cornell, J.A. and Ramsey, P.J. (1998). A Generalized mixture model for categorized-components problems with an application to a photoresist-coating experiment. Technometrics, 40(1), 48-61. (tandfonline)
data(coating) print(coating)
data(coating) print(coating)
Coefficient method for the fitted ahm object
## S3 method for class 'ahm' coef(object, ...)
## S3 method for class 'ahm' coef(object, ...)
object |
ahm object |
... |
not used |
a numerical vector
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) coef(out)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) coef(out)
Coefficient method for the fitted cv.ahm object
## S3 method for class 'cv.ahm' coef(object, metric = "mse", ...)
## S3 method for class 'cv.ahm' coef(object, metric = "mse", ...)
object |
cv.ahm object |
metric |
"mse" or "aicc" |
... |
not used |
a numerical vector
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) coefficients = coef(res)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) coefficients = coef(res)
compute AICc
compute_aicc(rss, n, p, type = "AICc")
compute_aicc(rss, n, p, type = "AICc")
rss |
residual sum of squares |
n |
number of observation |
p |
number of nonzero parameters |
type |
character "AICc" |
Calculating AIC “by hand” in R in Stack Overflow
compute_aicc (rss=10, n=30, p=6, type = "AICc")
compute_aicc (rss=10, n=30, p=6, type = "AICc")
This is one of the main functions. It perform the cross validation on ahm models to select the optimal setting of hyper parameter h
cv.ahm(y, x, powerh_path = NULL, metric = c("mse", "AICc"), num_major = 3, dist_minor = c(2, 2, 1), type = "weak", alpha = 0, lambda_seq = seq(0, 5, 0.01), nfolds = NULL, mapping_type = c("power"), rep_gcv = 100)
cv.ahm(y, x, powerh_path = NULL, metric = c("mse", "AICc"), num_major = 3, dist_minor = c(2, 2, 1), type = "weak", alpha = 0, lambda_seq = seq(0, 5, 0.01), nfolds = NULL, mapping_type = c("power"), rep_gcv = 100)
y |
numeric vector |
x |
data.frame Note the column names of the x should be in the order of major components, minor components, and no interactions between major or minor components are needed. |
powerh_path |
if NULL, then the default is the vector: round(seq(0.001,2,length.out =15),3) |
metric |
"mse" or "AICc" the metric used in cross validtion where the minimum is selected as the optimal |
num_major |
number of major components |
dist_minor |
the allocation of number of minor components nested under major components |
type |
heredity type, weak heredity is the current support type |
alpha |
0 is for the ridge in glmnet https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html |
lambda_seq |
a numeric vector for the options of lambda used in ridge regression for estimating the initials |
nfolds |
used in cv.glmnet for initial value of parameters in the non-negative garrote method |
mapping_type |
the form of the coefficient function of major components in front of corresponding minor terms. Currently only support "power" |
rep_gcv |
the number of choices of tuning parameter used in the GCV selection |
Return a list
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) object = res$metric_mse
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) object = res$metric_mse
Design points for the simplex centroid design with 3 components
data(design_simplex_centroid_design_3_major_component)
data(design_simplex_centroid_design_3_major_component)
data.frame
data(design_simplex_centroid_design_3_major_component) print(design_simplex_centroid_design_3_major_component)
data(design_simplex_centroid_design_3_major_component) print(design_simplex_centroid_design_3_major_component)
Create a list
enlist(...)
enlist(...)
... |
object to be included as elements in the list |
item = c(1:10) enlist(item)
item = c(1:10) enlist(item)
Expand the interaction terms for each subset group, say x11, x12, or c1, c2, c3
expand_interactions(dat, sel_names)
expand_interactions(dat, sel_names)
dat |
data frame |
sel_names |
characters |
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] expand_interactions (dat=x, sel_names=c("c1", "c2", "c3"))
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] expand_interactions (dat=x, sel_names=c("c1", "c2", "c3"))
Compute the conditional number of design matrix
find_condition_num(x)
find_condition_num(x)
x |
matrix to be used in svd |
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] find_condition_num (x)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] find_condition_num (x)
Mapping_function is a function to add the functional coefficients of major components in front of minor components terms
mapping_function(x, num_major = 3, dist_minor = C(2, 2, 1), mapping_type = c("power"), powerh = 0)
mapping_function(x, num_major = 3, dist_minor = C(2, 2, 1), mapping_type = c("power"), powerh = 0)
x |
data.frame Note the column names of the x should be in the order of major components, minor components, and no interactions are needed. |
num_major |
number of major components |
dist_minor |
the allocation of number of minor components nested under major components |
mapping_type |
the form of the coefficient function of major components in front of corresponding minor terms. Currently only support "power" |
powerh |
the power parameter used for the power function |
data frame
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] mapping_function(x=x, num_major=3, dist_minor=c(2,2,1), mapping_type = c("power"), powerh=0)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] mapping_function(x=x, num_major=3, dist_minor=c(2,2,1), mapping_type = c("power"), powerh=0)
This method is modified based on Prof. Bobby Gramacy's Computer Experiment lecture at Virginia Tech. Prof. Gramacy's lecture website
mymaximin(pool, n = 50, m = 3, iter = 1e+05, Xorig = NULL)
mymaximin(pool, n = 50, m = 3, iter = 1e+05, Xorig = NULL)
pool |
partition the base design points provided to the function |
n |
numeric, sample size |
m |
numeric, 3 stands for 3 components, i.e. c1, c2, and c3 |
iter |
numeric, iterations used in the stochastic sampling |
Xorig |
matrix, initial design points |
Return a matrixt of the design points for the major components
# The case of unconstrainted experiments library(mixexp) num_size = 8 # num points in the design for the major component Xorig = as.matrix(SCD(3)) # all possible combinations sum to 1 pool_3d =partitions::compositions(1000, 3,include.zero = TRUE)/1000 res_C = mymaximin(pool=pool_3d, n=num_size, m=3, iter=1e5, Xorig=Xorig) DesignPoints(res_C,cornerlabs = c("c3","c2","c1"),axislabs=c("c1","c2","c3")) # The case of constrainted experiments library(mixexp) num_size = 8 # num points in the design for the major component # all possible combinations sum to 1 pool_3d =partitions::compositions(1000, 3,include.zero = TRUE)/1000 c1_min=0.2 c1_max=0.45 c2_min=0.4 c2_max=0.6 c3_min=0.1 c3_max=0.25 tmp = Xvert(nfac=3,lc=c(c1_min,c2_min,c3_min),uc =c(c1_max,c2_max,c3_max),ndm=1,pseudo=FALSE) Xorig=tmp[c(1:6,13),c(1:3)] colnames(Xorig)=c("V1","V2","V3") pool_3d = t(dplyr::filter(as.data.frame(t(as.matrix(pool_3d))),t(pool_3d)[,1] > c1_min & t(pool_3d)[,1] <= c1_max & t(pool_3d)[,2] > c2_min & t(pool_3d)[,2] <= c2_max & t(pool_3d)[,3] > c3_min & t(pool_3d)[,3] <= c3_max ) ) res_C = mymaximin(pool=pool_3d, n=num_size, m=3, iter=1e5, Xorig=Xorig) DesignPoints(res_C,cornerlabs = c("c3","c2","c1"),axislabs=c("c1","c2","c3") ,x1lower=c1_min,x2lower=c2_min,x3lower=c3_min ,x1upper=c1_max,x2upper=c2_max,x3upper=c3_max, pseudo=FALSE)
# The case of unconstrainted experiments library(mixexp) num_size = 8 # num points in the design for the major component Xorig = as.matrix(SCD(3)) # all possible combinations sum to 1 pool_3d =partitions::compositions(1000, 3,include.zero = TRUE)/1000 res_C = mymaximin(pool=pool_3d, n=num_size, m=3, iter=1e5, Xorig=Xorig) DesignPoints(res_C,cornerlabs = c("c3","c2","c1"),axislabs=c("c1","c2","c3")) # The case of constrainted experiments library(mixexp) num_size = 8 # num points in the design for the major component # all possible combinations sum to 1 pool_3d =partitions::compositions(1000, 3,include.zero = TRUE)/1000 c1_min=0.2 c1_max=0.45 c2_min=0.4 c2_max=0.6 c3_min=0.1 c3_max=0.25 tmp = Xvert(nfac=3,lc=c(c1_min,c2_min,c3_min),uc =c(c1_max,c2_max,c3_max),ndm=1,pseudo=FALSE) Xorig=tmp[c(1:6,13),c(1:3)] colnames(Xorig)=c("V1","V2","V3") pool_3d = t(dplyr::filter(as.data.frame(t(as.matrix(pool_3d))),t(pool_3d)[,1] > c1_min & t(pool_3d)[,1] <= c1_max & t(pool_3d)[,2] > c2_min & t(pool_3d)[,2] <= c2_max & t(pool_3d)[,3] > c3_min & t(pool_3d)[,3] <= c3_max ) ) res_C = mymaximin(pool=pool_3d, n=num_size, m=3, iter=1e5, Xorig=Xorig) DesignPoints(res_C,cornerlabs = c("c3","c2","c1"),axislabs=c("c1","c2","c3") ,x1lower=c1_min,x2lower=c2_min,x3lower=c3_min ,x1upper=c1_max,x2upper=c2_max,x3upper=c3_max, pseudo=FALSE)
Predict method for the fitted ahm object
## S3 method for class 'ahm' predict(object, newx, ...)
## S3 method for class 'ahm' predict(object, newx, ...)
object |
ahm object |
newx |
Matrix of new values for x at which predictions are to be made. |
... |
not used |
predicted value(s) at newx
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) predict(out)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) predict(out)
Predict method for the fitted cv.ahm object
## S3 method for class 'cv.ahm' predict(object, newx, metric = "mse", ...)
## S3 method for class 'cv.ahm' predict(object, newx, metric = "mse", ...)
object |
cv.ahm object |
newx |
Matrix of new values for x at which predictions are to be made. |
metric |
"mse" or "aicc" |
... |
not used |
Return a list
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) pred = predict(res)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] powerh_path = round(seq(0.001,2,length.out =15),3) num_major = 3; dist_minor = c(2,2,1) res = cv.ahm (y, x, powerh_path=powerh_path, metric = "mse", num_major, dist_minor, type = "weak" , alpha=0, lambda_seq=seq(0,5,0.01), nfolds=NULL, mapping_type = c("power"), rep_gcv=100) pred = predict(res)
The candidate search points in the nonlinear optimization for the optimal value in the Pringles experiment
data(pringles_candidates2search)
data(pringles_candidates2search)
matrix
data(pringles_candidates2search) print(pringles_candidates2search)
data(pringles_candidates2search) print(pringles_candidates2search)
Pringles experiment data set with the percent of Fat as the response
data(pringles_fat)
data(pringles_fat)
data.frame
Kang, L., Joseph, V.R. and Brenneman, W.A. (2011). Design and modeling strategies for mixture-of-mixtures experiments. Technometrics, 53(2), 125–36. (tandfonline)
data(pringles_fat) print(pringles_fat)
data(pringles_fat) print(pringles_fat)
Pringles experiment data set with the Hardness as the response
data(pringles_hardness)
data(pringles_hardness)
data.frame
Kang, L., Joseph, V.R. and Brenneman, W.A. (2011). Design and modeling strategies for mixture-of-mixtures experiments. Technometrics, 53(2), 125–36. (tandfonline)
data(pringles_hardness) print(pringles_hardness)
data(pringles_hardness) print(pringles_hardness)
Summary method for the fitted ahm object
## S3 method for class 'ahm' summary(object, ...)
## S3 method for class 'ahm' summary(object, ...)
object |
fitted ahm object |
... |
not used |
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) summary(out)
data("pringles_fat") data_fat = pringles_fat h_tmp = 1.3 x = data_fat[,c("c1","c2","c3","x11","x12","x21","x22")] y = data_fat[,1] out = ahm (y, x, num_major = 3, dist_minor = c(2,2,1), type = "weak", alpha=0, lambda_seq=seq(0,5,0.01), nfold = NULL, mapping_type = c("power"), powerh = h_tmp, rep_gcv=100) summary(out)