Running NormalizeData() & FindVariableFeatures() prior to SketchData(), then SCTransform on the sketched data #9400

tjm-sci · 2024-10-15T10:37:12Z

Hi,
I have a theoretical question about whether it is valid to run NormalizeData() and FindVariableFeatures() before creating a small sketch of my data and then, once I have built the representative sketch, to proceed with the SCTransform workflow on it.

I can see from @yuhanH's answers to #7336 that SCTransform is not optimised for large data. However, after reading the SCTransform vignette, I am keen to use it on my small sketch of the data to reap the benefits of this method of normalization.

Given that the data will already have been normalized, is this a valid thing to do?

The code looks like this:

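library(Seurat)
library(dplyr)  # provides the %>% pipe

# log-normalize and select variable features before sketching
allen_250k_bpc <- NormalizeData(allen_250k_bpc) %>%
  FindVariableFeatures()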

allen_250k_bpc <- SketchData(
  object = allen_250k_bpc,
  ncells = 100000,
  method = "LeverageScore",
  sketched.assay = "sketch"
)

allen_250k_bpc

# switch to use the sketched dataset for subsequent analysis
DefaultAssay(allen_250k_bpc) <- "sketch"
# Perform clustering on the sketched dataset (SCTransform);
# the sketch assay is named explicitly rather than relying on DefaultAssay()
allen_250k_bpc <- SCTransform(allen_250k_bpc, assay = "sketch") %>%
  FindVariableFeatures() %>%
  RunPCA() %>%
  RunUMAP(dims = 1:50) %>%
  FindNeighbors(dims = 1:50) %>%
  FindClusters()
