Extract variants common to all GWAS studies
Source:R/extract_common_variants.R
extract_common_variants.RdOpens multiple parquet files, filters to variants with minor allele
frequency above maf_threshold, and returns only variants present in
every file. Optionally selects one variant per megabase window per
chromosome for LD-thinned analyses.
Usage
extract_common_variants(
parquet_files,
trait_type,
maf_threshold = 0.05,
select_per_mb_window = FALSE,
window_size_mb = 1,
seed = NULL,
eaf_col = EAF,
chr_col = CHR,
pos_col = POS_38,
rsid_col = RSID,
ea_col = EffectAllele,
oa_col = OtherAllele,
n_col = N
)Arguments
- parquet_files
Character vector of paths to parquet files.
- trait_type
"binary"or"quantitative"(required).- maf_threshold
Minimum minor allele frequency. Default
0.05.- select_per_mb_window
Logical; if
TRUE, select one variant perwindow_size_mbMb window per chromosome. DefaultFALSE.- window_size_mb
Window size in megabases for thinning. Default
1.- seed
Random seed for reproducible window selection. Default
NULL.- eaf_col
Bare column name for EAF. Default
EAF.- chr_col
Bare column name for chromosome. Default
CHR.- pos_col
Bare column name for position. Default
POS_38.- rsid_col
Bare column name for rsid. Default
RSID.- ea_col
Bare column name for effect allele. Default
EffectAllele.- oa_col
Bare column name for other allele. Default
OtherAllele.- n_col
Bare column name for N. Default
N.
Value
A tibble::tibble() with columns file, chromosome, position,
rsid, effect allele, other allele, and minor_af. One row per
variant × study combination.
Examples
if (FALSE) { # \dontrun{
common <- extract_common_variants(
parquet_files = c("study1.parquet", "study2.parquet"),
trait_type = "binary"
)
} # }