Title: | Assessing Risk Predictions for Clustered Data |
---|---|
Description: | Assessing and comparing risk prediction rules for clustered data. The method is based on the paper: Rosner B, Qiu W, and Lee MLT (2013) <doi:10.1007/s10985-012-9240-6>. |
Authors: | Bernard Rosner [aut, ctb], Weiliang Qiu [aut, cre], Meiling T. Lee [aut, ctb] |
Maintainer: | Weiliang Qiu <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.2.6 |
Built: | 2024-11-04 21:43:27 UTC |
Source: | https://github.com/cran/riskPredictClustData |
Generate simulated data from logistic mixed effects model based on the AMD data.
genSimDataGLMEM(nSubj = 131, beta0 = -6, sd.beta0i = 1.58, beta1 = 1.58,
  beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51, beta6 = 1.47,
  beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44, p.inieye32 = 0.42,
  p.inieye41 = 0.12, p.inieye42 = 0.11, sd.lncalorc = 0.33)
nSubj |
integer. Number of subjects. Each subject has data for 2 eyes. |
beta0 |
mean of the random intercept
sd.beta0i |
standard deviation of the random intercept
beta1 |
slope for the binary covariate smkcur
beta2 |
slope for the continuous mean-centered covariate lncalorc
beta3 |
slope for the binary covariate inieye3
beta4 |
slope for the binary covariate inieye4
beta5 |
slope for the dummy variable indicating the second level of rtotfat
beta6 |
slope for the dummy variable indicating the third level of rtotfat
beta7 |
slope for the dummy variable indicating the fourth level of rtotfat
p.smkcur |
proportion of current smokers. |
p.inieye31 |
proportion of left eyes having initial grade equal to 3.
p.inieye32 |
proportion of right eyes having initial grade equal to 3.
p.inieye41 |
proportion of left eyes having initial grade equal to 4.
p.inieye42 |
proportion of right eyes having initial grade equal to 4.
sd.lncalorc |
standard deviation of the continuous covariate lncalorc
We generate the simulated data set from the following generalized linear mixed effects model:

logit( Pr(prog_ij = 1) ) = beta_0i + beta1*smkcur_ij + beta2*lncalorc_ij + beta3*inieye3_ij + beta4*inieye4_ij + beta5*rtotfat2_ij + beta6*rtotfat3_ij + beta7*rtotfat4_ij,

where i indexes subjects, j = 1, 2 indexes eyes, beta_0i ~ N(beta0, sd.beta0i^2) is a subject-specific random intercept, and rtotfat2, rtotfat3, and rtotfat4 are dummy indicators for the levels of the factor rtotfat.
A data frame with 8 columns: cid, subuid, prog, smkcur, lncalorc, inieye3, inieye4, and rtotfat,
where cid is the subject id, subuid is the unit id, and prog is the progression status.
prog = 1 indicates the eye is progressed; prog = 0 indicates the eye is not progressed.
There are nSubj*2 rows. The first nSubj rows are for the left eyes and the second nSubj rows are for the right eyes.
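The clustered structure of this generator can be illustrated with a minimal sketch, one subject at a time: a single random intercept is drawn per subject and shared by both eyes, then each eye gets an independent Bernoulli draw given that intercept. The helper `simulateSubject` and its covariate vectors are hypothetical illustrations, not the package's internal code:

```r
# Sketch of one subject (cluster) in a logistic mixed effects model.
# `slopes`, `x.left`, and `x.right` are hypothetical illustrations.
simulateSubject <- function(beta0, sd.beta0i, slopes, x.left, x.right) {
  b0i <- rnorm(1, mean = beta0, sd = sd.beta0i)   # shared random intercept
  eta <- b0i + c(sum(slopes * x.left), sum(slopes * x.right))
  p <- 1 / (1 + exp(-eta))                        # inverse logit
  rbinom(2, size = 1, prob = p)                   # one draw per eye
}

set.seed(1234567)
simulateSubject(beta0 = -6, sd.beta0i = 1.58,
                slopes = c(1.58, -3.95),
                x.left = c(1, 0.2), x.right = c(1, 0.2))
```

Because the two eyes share b0i, their progression statuses are positively correlated within a subject, which is what the clustered methods below account for.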
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
datFrame = genSimDataGLMEM(nSubj = 30, beta0 = -6, sd.beta0i = 1.58,
  beta1 = 1.58, beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51,
  beta6 = 1.47, beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44,
  p.inieye32 = 0.42, p.inieye41 = 0.12, p.inieye42 = 0.11,
  sd.lncalorc = 0.33)
print(dim(datFrame))
print(datFrame[1:2,])
Get the data frame required by the function riskPredict.
getScore(fmla, cidVar, subuidVar, statusVar, datFrame,
  mycorstr = "exchangeable", verbose = FALSE)
fmla |
A formula object for the function gee. |
cidVar |
character. Phenotype variable name for the cluster id. |
subuidVar |
character. Phenotype variable name for the unit id. |
statusVar |
character. Phenotype variable name for the progression status. |
datFrame |
A data frame containing at least the 3 columns specified by cidVar, subuidVar, and statusVar, together with the covariates used in fmla. |
mycorstr |
character. Indicates the working correlation structure; see the manual of the function gee. |
verbose |
logical. Indicates whether a summary of the gee fit should be printed. |
A list with two elements: frame and gee.obj.
frame is a data frame with at least 4 columns: cid, subuid, status, and score,
where cid indicates the cluster id; subuid indicates the unit id within a cluster;
status = 1 indicates an eye is progressed; status = 0 indicates an eye is not progressed;
and score represents the risk score.
gee.obj is the object returned by the gee function.
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
datFrame = genSimDataGLMEM(nSubj = 30, beta0 = -6, sd.beta0i = 1.58,
  beta1 = 1.58, beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51,
  beta6 = 1.47, beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44,
  p.inieye32 = 0.42, p.inieye41 = 0.12, p.inieye42 = 0.11,
  sd.lncalorc = 0.33)
print(dim(datFrame))
print(datFrame[1:2,])

tt1 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4+factor(rtotfat),
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe1 = tt1$frame
gee.obj = tt1$gee.obj
print(summary(gee.obj))
print(dim(myframe1))
print(myframe1[1:3,])
Calculate the power for testing H0: Delta = 0, where Delta is the difference of the expected values of the extended Mann-Whitney U statistics of two prediction rules.
powerCal(nSubj, mu1, triangle, rho, rho11, rho22, rho12, p11, p10, p01,
  alpha = 0.05)
nSubj |
integer. Number of subjects to be generated. Each subject is assumed to have two observations. |
mu1 |
the difference of the mean probit-transformed risk scores between progressed and non-progressed units under prediction rule 1. |
triangle |
the difference of the expected values of the extended Mann-Whitney U statistics between the two prediction rules. |
rho |
the correlation between the probit-transformed risk scores of the two prediction rules for the same unit. |
rho11 |
the correlation between the probit-transformed risk scores of prediction rule 1 for the two units of the same cluster. |
rho22 |
the correlation between the probit-transformed risk scores of prediction rule 2 for the two units of the same cluster. |
rho12 |
the correlation between the probit-transformed risk score of prediction rule 1 for one unit and that of prediction rule 2 for the other unit of the same cluster. |
p11 |
the probability that both units of a cluster are progressed. |
p10 |
the probability that only the first unit of a cluster is progressed. |
p01 |
the probability that only the second unit of a cluster is progressed. |
alpha |
type I error rate |
the power
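powerCal implements the clustered-data variance derived in the referenced paper. For intuition only, the normal-approximation skeleton that such a power calculation builds on can be sketched as follows (a generic two-sided z-test, not the package's formula):

```r
# Generic two-sided z-test power under a normal approximation:
# probability of rejecting H0: Delta = 0 when the true difference is
# `delta` and the estimator has standard error `se`.
zTestPower <- function(delta, se, alpha = 0.05) {
  z.alpha <- qnorm(1 - alpha / 2)
  pnorm(delta / se - z.alpha) + pnorm(-delta / se - z.alpha)
}

zTestPower(delta = 0.05, se = 0.02, alpha = 0.05)
```

At delta = 0 this reduces to the type I error rate alpha; the clustered-data correlations (rho, rho11, rho22, rho12) enter powerCal through the standard error of the estimated difference.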
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
mu1 = 0.8
power = powerCal(nSubj = 30, mu1 = mu1, triangle = 0.05, rho = 0.93,
  rho11 = 0.59, rho22 = 0.56, rho12 = 0.52, p11 = 0.115, p10 = 0.142,
  p01 = 0.130, alpha = 0.05)
print(power)
Calculate the power for testing H0: Delta = 0 based on a dataset.
powerCalData(nSubj, triangle, frame, alpha = 0.05)
nSubj |
integer. Number of subjects to be generated. Each subject is assumed to have two observations. |
triangle |
the difference of the expected values of the extended Mann-Whitney U statistics between the two prediction rules. |
frame |
A data frame with 5 columns: cid, subuid, status, score1, and score2, where score1 and score2 are the risk scores produced by prediction rules 1 and 2, respectively. |
alpha |
type I error rate |
A list with 11 elements.
power |
the estimated power. |
rho |
the estimated correlation between the probit-transformed risk scores of the two prediction rules for the same unit. |
rho11 |
the estimated correlation between the probit-transformed risk scores of prediction rule 1 for the two units of the same cluster. |
rho22 |
the estimated correlation between the probit-transformed risk scores of prediction rule 2 for the two units of the same cluster. |
rho12 |
the estimated correlation between the probit-transformed risk score of prediction rule 1 for one unit and that of prediction rule 2 for the other unit of the same cluster. |
p11 |
the estimated probability that both units of a cluster are progressed. |
p10 |
the estimated probability that only the first unit of a cluster is progressed. |
p01 |
the estimated probability that only the second unit of a cluster is progressed. |
p00 |
the estimated probability that neither unit of a cluster is progressed. |
mu1 |
the estimated mu1 for prediction rule 1 (see the argument mu1 of powerCal). |
mu2 |
the corresponding estimate for prediction rule 2. |
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
datFrame = genSimDataGLMEM(nSubj = 30, beta0 = -6, sd.beta0i = 1.58,
  beta1 = 1.58, beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51,
  beta6 = 1.47, beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44,
  p.inieye32 = 0.42, p.inieye41 = 0.12, p.inieye42 = 0.11,
  sd.lncalorc = 0.33)
print(dim(datFrame))
print(datFrame[1:2,])

# prediction rule 1
tt1 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4+factor(rtotfat),
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe1 = tt1$frame
print(dim(myframe1))
print(myframe1[1:3,])

# prediction rule 2
tt2 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4,
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe2 = tt2$frame
print(dim(myframe2))
print(myframe2[1:3,])

# combine scores from the two prediction rules
myframe12 = myframe1[, c("cid", "subuid", "status")]
myframe12$score1 = myframe1$score
myframe12$score2 = myframe2$score
print(dim(myframe12))
print(myframe12[1:3,])

res = powerCalData(nSubj = 30, triangle = 0.05, frame = myframe12,
  alpha = 0.05)
print(res)
Assessing risk prediction performance for clustered data.
riskPredict(frame, alpha = 0.05)
frame |
A data frame with 4 columns: cid, subuid, status, and score, where score is the risk score for each unit. |
alpha |
numeric. Significance level; a 100*(1 - alpha)% confidence interval is reported. |
To obtain the 95% confidence interval of theta, we first obtain the 95% confidence interval for mu, then transform its limits back to the theta scale.
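This back-transformation step can be sketched as follows. The binormal-type relation theta = pnorm(mu / sqrt(2)) used in the sketch is an assumption for illustration; the exact mapping between mu and theta is given in the referenced paper.

```r
# Sketch: confidence interval computed on the mu (probit) scale, then
# mapped to the theta scale. The relation theta = pnorm(mu / sqrt(2))
# is assumed here for illustration only.
thetaCI <- function(mu.hat, se.mu, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  mu.limits <- mu.hat + c(-1, 1) * z * se.mu
  pnorm(mu.limits / sqrt(2))
}

thetaCI(mu.hat = 0.8, se.mu = 0.2)
```

Working on the mu scale first keeps the back-transformed limits inside (0, 1), which a symmetric interval built directly around theta.hat would not guarantee.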
A list with the following elements:
stat |
the test statistic. |
se.stat |
standard error of the test statistic under the null hypothesis. |
z |
z score. |
pval |
p-value of the test. |
rho |
the estimated correlation between the probit-transformed risk scores of the two units within a cluster. |
mu.hat |
the estimate of mu. |
theta.hat |
the estimate of theta. |
theta.c.hat |
the estimate of theta_c. |
E.stat.Ha |
expectation of the test statistic under the alternative hypothesis. |
se.stat.Ha |
standard error of the test statistic under the alternative hypothesis. |
CIlow |
lower confidence limit for theta. |
CIupp |
upper confidence limit for theta. |
datHk |
An nSubj by 2 matrix of probit-transformed risk scores, using only the first 2 observations of each subject. |
ci |
the vector of the per-subject quantities c_i used in computing the test statistic. |
di |
the vector of the per-subject quantities d_i used in computing the test statistic. |
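For intuition, the ordinary (unclustered) Mann-Whitney statistic that the extended statistic generalizes is the proportion of (progressed, non-progressed) pairs in which the progressed unit has the higher risk score, with ties counted as 1/2. A minimal sketch, for intuition only since riskPredict uses the extended, clustered form:

```r
# Ordinary Mann-Whitney estimate of Pr(score of a progressed unit >
# score of a non-progressed unit); ties count 1/2. Unclustered sketch,
# not the extended statistic used by riskPredict.
mannWhitney <- function(score.prog, score.nonprog) {
  diffs <- outer(score.prog, score.nonprog, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

mannWhitney(c(0.9, 0.7), c(0.1, 0.7))  # 0.875
```

A value of 0.5 means the rule discriminates no better than chance; 1 means every progressed unit scores above every non-progressed unit.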
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
datFrame = genSimDataGLMEM(nSubj = 30, beta0 = -6, sd.beta0i = 1.58,
  beta1 = 1.58, beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51,
  beta6 = 1.47, beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44,
  p.inieye32 = 0.42, p.inieye41 = 0.12, p.inieye42 = 0.11,
  sd.lncalorc = 0.33)
print(dim(datFrame))
print(datFrame[1:2,])

tt1 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4+factor(rtotfat),
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe1 = tt1$frame
print(dim(myframe1))
print(myframe1[1:3,])

res1 = riskPredict(myframe1)
print(names(res1))
print(res1)
Difference of two risk prediction rules for clustered data.
riskPredictDiff(frame, alpha = 0.05)
frame |
A data frame with 5 columns: cid, subuid, status, score1, and score2, where score1 and score2 are the risk scores produced by prediction rules 1 and 2, respectively. |
alpha |
numeric. The confidence level is 100*(1 - alpha)%. |
A list with the following elements:
diff |
the difference of the test statistics of the two prediction rules. |
se.diff |
standard error of the difference under the null hypothesis. |
z |
z score. |
pval |
p-value of the test. |
res1 |
output object of the function riskPredict for prediction rule 1. |
res2 |
output object of the function riskPredict for prediction rule 2. |
rhoVec |
A vector of the 4 correlations rho, rho11, rho22, and rho12 (see powerCal). |
E.diff.Ha |
expectation of the difference under the alternative hypothesis. |
se.diff.Ha |
standard error of the difference under the alternative hypothesis. |
CIlow.diff |
Lower confidence limit of the difference. |
CIup.diff |
Upper confidence limit of the difference. |
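The role of the correlations in rhoVec can be seen from the generic formula for the variance of a difference of two correlated estimators; the sketch below shows that skeleton, not the package's exact variance expression:

```r
# Standard error of stat1 - stat2 when the two estimates are correlated.
# Positive correlation between the two rules' statistics shrinks the
# standard error of their difference.
seOfDifference <- function(se1, se2, rho) {
  sqrt(se1^2 + se2^2 - 2 * rho * se1 * se2)
}

seOfDifference(se1 = 0.04, se2 = 0.05, rho = 0.9)
```

Because the two rules are evaluated on the same clusters, their statistics are usually highly correlated, which makes the paired comparison more powerful than comparing the rules on independent samples.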
Bernard Rosner <[email protected]>, Weiliang Qiu <[email protected]>, Meiling Ting Lee <[email protected]>
Rosner B, Qiu W, and Lee MLT. Assessing Discrimination of Risk Prediction Rules in a Clustered Data Setting. Lifetime Data Anal. 2013 Apr; 19(2): 242-256.
set.seed(1234567)
datFrame = genSimDataGLMEM(nSubj = 30, beta0 = -6, sd.beta0i = 1.58,
  beta1 = 1.58, beta2 = -3.95, beta3 = 3.15, beta4 = 2.06, beta5 = 0.51,
  beta6 = 1.47, beta7 = 3.11, p.smkcur = 0.08, p.inieye31 = 0.44,
  p.inieye32 = 0.42, p.inieye41 = 0.12, p.inieye42 = 0.11,
  sd.lncalorc = 0.33)
print(dim(datFrame))
print(datFrame[1:2,])

# prediction rule 1
tt1 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4+factor(rtotfat),
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe1 = tt1$frame
print(dim(myframe1))
print(myframe1[1:3,])

# prediction rule 2
tt2 = getScore(fmla = prog~smkcur+lncalorc+inieye3+inieye4,
  cidVar = "cid", subuidVar = "subuid", statusVar = "prog",
  datFrame = datFrame, mycorstr = "exchangeable", verbose = FALSE)
myframe2 = tt2$frame
print(dim(myframe2))
print(myframe2[1:3,])

# combine scores from the two prediction rules
myframe12 = myframe1[, c("cid", "subuid", "status")]
myframe12$score1 = myframe1$score
myframe12$score2 = myframe2$score
print(dim(myframe12))
print(myframe12[1:3,])

resDiff = riskPredictDiff(frame = myframe12)
print(names(resDiff))
print(resDiff)