Skip to content

SNP weights #104

@Tannervanorden

Description

@Tannervanorden

Hello, I am really struggling to get individual weights of SNPs from a pca analysis. I can get the weights from each individual, but I would really like to get the results from each SNP. I am also wondering why of 1.1 million SNPs in my plink file why the pca plot says it can only use 600,000. It says because they are not all on autosomes, but they are. They are in a species with 34 chromosomes. Any help would be appreciated. I am attaching my code. # Load the SNPRelate package
library(SNPRelate)

Set the file paths

bed.fn <- "/Users/tannervanorden/Desktop/Plink/Original_edited_2.bed"
fam.fn <- "/Users/tannervanorden/Desktop/Plink/Original_edited_2.fam"
bim.fn <- "/Users/tannervanorden/Desktop/Plink/Original_edited_2.bim"

Define the output GDS file

Specify a new GDS file name

new_gds_fn <- "/Users/tannervanorden/Desktop/Plink/PCA.gds"

Convert PLINK BED file to GDS file with cvt.chr="char"

snpgdsBED2GDS(bed.fn, fam.fn, bim.fn, new_gds_fn)

Perform PCA

genofile <- snpgdsOpen(new_gds_fn)
pca_result <- snpgdsPCA(genofile)

Extract population labels

Hendrys <- rep("Hendry's Creek", 42)
PineR <- rep('Pine & Ridge Creeks', 29)
Mill <- rep("Mill Creek", 18)
Trout = rep("Trout Creek", 4)
Mountain_Dell = rep("Mountain Dell", 2)
Birch = rep("Birch Creek", 3)
Willard = rep("Willard Creek", 30)

population_labels <- c(Hendrys, PineR, Mill, Trout, Mountain_Dell, Birch, Willard)

Create a data frame for ggplot

pca_plot_data <- data.frame(
LD1 = pca_result$eigenvect[, 2],
LD2 = pca_result$eigenvect[, 1],
Group = population_labels
)

Define population colors

population_colors <- c("darkorange", "blue", 'goldenrod1', 'red', 'chartreuse1', 'purple', 'pink', "cornflowerblue", 'bisque2', "black", "cyan", "brown", "lightgrey", "olivedrab3")

Create ggplot

pca_plot <- ggplot(pca_plot_data, aes(x = LD1, y = LD2, color = Group)) +
geom_point() +
scale_color_manual(values = population_colors) +
labs(x = "LD1", y = "LD2", title = "PCA Plot") + # Adjust title as needed
theme_linedraw()

Display the plot

print(pca_plot)

Close the GDS file

snpgdsClose(genofile)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions