Step 1: fitting the null Poisson mixed model
- The example scripts are located in the ./extdata folder https://github.com/weizhou0/qtl/tree/main/extdata
#go to the folder
cd extdata
#check the help info for step 1
Rscript step1_fitNULLGLMM_qtl.R --help
Fit the null model
With the following settings / assumptions:
- the data set does not have sample relatedness –useSparseGRMtoFitNULL=FALSE and –useGRMtoFitNULL=FALSE
- fit a Poisson mixed model –traitType=count
- specify covariates in the model using –covarColList=
- specify covariates for samples, which have the same values for all cells from the same individual, using –sampleCovarColList=. Note that covariates in –sampleCovarColList= need to also be specified in –covarColList=
Rscript step1_fitNULLGLMM_qtl.R \
--useSparseGRMtoFitNULL=FALSE \
--useGRMtoFitNULL=FALSE \
--phenoFile=./input/seed_1_100_nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_Poisson.txt \
--phenoCol=gene_1 \
--covarColList=X1,X2,pf1,pf2 \
--sampleCovarColList=X1,X2 \
--sampleIDColinphenoFile=IND_ID \
--traitType=count \
--outputPrefix=./output/nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_gene_1 \
--skipVarianceRatioEstimation=FALSE \
--isRemoveZerosinPheno=FALSE \
--isCovariateOffset=FALSE \
--isCovariateTransform=TRUE \
--skipModelFitting=FALSE \
--tol=0.00001 \
--plinkFile=./input/n.indep_100_n.cell_1_01.step1 \
--IsOverwriteVarianceRatioFile=TRUE
Input files
- (Required) Phenotype file (contains covariates if any, such as gender and age). The file can be either space or tab-delimited with a header. It is required that the file contains one column for sample IDs and one column for the phenotype. It may contain columns for covariates.
```
less -S ./input/seed_1_100_nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_Poisson.txt
```
- (Required) Plink file
Estimating variance ratio with random markers.
- Use –skipVarianceRatioEstimation=FALSE
- Specify the plink file containing markers that will be used for variance ratio estimation with –plinkFile. e.g. –plinkFile=./input/n.indep_100_n.cell_1_01.step1
- Include 1000 random markers with MAC >= 20 in the plink file
Output files
-
model file (input for Step 2)
#load the model file in R load("./output/nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_gene_1.rda") names(modglmm) modglmm$theta
-
variance ratio file (if variance ratio is estimated in Step 1, this will be an input for Step 2)
less -S ./output/nindep_100_ncell_100_lambda_2_tauIntraSample_0.5_gene_1.varianceRatio.txt