Description
I am confused about how the data were transformed between the full matrix and the variable genes matrix provided. The variable genes matrix appears to be normalized and log-transformed, but when I apply that transformation to the full matrix myself, the result does not correlate perfectly with the variable genes matrix. Could you provide the code for that preprocessing?
import numpy as np
import pandas as pd
import scanpy as sc
import matplotlib.pyplot as plt

PATH = '../../git/wot/notebooks/data/'
CELL_DAYS_PATH = 'data/cell_days.txt'
FULL_DS_PATH = 'data/ExprMatrix.h5ad'
VAR_DS_PATH = 'data/ExprMatrix.var.genes.h5ad'
FLE_COORDS_PATH = 'data/fle_coords.txt'
coord_df = pd.read_csv(PATH+FLE_COORDS_PATH, index_col='id', sep='\t')
days_df = pd.read_csv(PATH+CELL_DAYS_PATH, index_col='id', sep='\t')
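# Keep only cells that have a day annotation. This assumes the rows of the
# full matrix are in the same order as coord_df.
mask = coord_df.index.isin(days_df.index)
adataf = sc.read_h5ad(PATH + FULL_DS_PATH)[mask]
adata = sc.read_h5ad(PATH + VAR_DS_PATH)
# Compare the first 1000 cells of each matrix
df = adata[:1000, :].to_df()
dff = adataf[:1000, :].to_df()
# Restrict the full matrix to the variable genes
dff_hvg = dff[df.columns]
assert np.all(df.index == dff.index)
# Scale each cell to 10,000 total counts, then log1p:
# once over the variable genes only, once over all genes
dff_norm_hvg = np.log1p(10000 * dff_hvg.div(dff_hvg.sum(axis=1), axis=0))
dff_norm = np.log1p(10000 * dff.div(dff.sum(axis=1), axis=0))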
# Correlation is not one when normalizing over all genes
plt.scatter(x=dff_norm['Fam150a'], y=df['Fam150a'], s=1)
# Correlation is also not one when normalizing over the variable genes only
plt.scatter(x=dff_norm_hvg['Fam150a'], y=df['Fam150a'], s=1)
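To put a number on the mismatch instead of judging it from scatter plots, a per-gene Pearson correlation and a maximum absolute difference could be computed. The sketch below is only an illustration on synthetic Poisson counts, not the authors' preprocessing. The helper names `per_gene_pearson` and `lognorm` are hypothetical, and the alternative scale factor of 1e6 is an arbitrary choice. It shows that a different log base leaves the correlation at one while the values still differ, whereas a different scale factor lowers the correlation.

```python
import numpy as np


def per_gene_pearson(a, b):
    """Pearson correlation between matching columns of two (cells x genes) arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)
    denom = np.sqrt((a_c ** 2).sum(axis=0) * (b_c ** 2).sum(axis=0))
    # Constant columns have zero variance; report NaN instead of dividing by zero.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(denom > 0, (a_c * b_c).sum(axis=0) / denom, np.nan)


def lognorm(counts, scale=1e4):
    """Scale each row (cell) to `scale` total counts, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    return np.log1p(scale * counts / counts.sum(axis=1, keepdims=True))


# Synthetic counts standing in for the real matrices (assumption: 200 cells x 50 genes).
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(200, 50))

reference = lognorm(counts)               # plays the role of the provided matrix
same = lognorm(counts)                    # identical preprocessing
other_base = reference / np.log(2)        # log2(1 + x) instead of ln(1 + x)
other_scale = lognorm(counts, scale=1e6)  # different scale factor

r_same = per_gene_pearson(same, reference)
r_base = per_gene_pearson(other_base, reference)
r_scale = per_gene_pearson(other_scale, reference)

# A change of log base is a linear rescaling, so the correlation stays at one
# even though the values themselves differ.
max_diff_base = np.abs(other_base - reference).max()

print("same preprocessing, min r:", r_same.min())
print("different log base, min r:", r_base.min(), "max abs diff:", max_diff_base)
print("different scale factor, max r:", r_scale.max())
```

With the real data, the same check would be `per_gene_pearson(dff_norm_hvg.to_numpy(), df.to_numpy())`. A correlation of exactly one with a nonzero difference would point to a linear rescaling such as a different log base, while a correlation below one suggests a different scale factor or an extra step such as filtering or a different normalization.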