Additional Labels for Immgen Data #68
Comments
I actually think better curation of the entire ImmGen data set may be helpful. For example, there exists:
in the fine label category of ImmGenData. I don't think most people using this package have much use for time points (8 wk vs 5 wk), and a label such as "Ep.8wk.MEChi" is too obscure for me to decipher and relate to my own dataset. An intermediate data layer, or better curation of one level, would be extremely helpful. That said, this issue is most apparent to me for ImmGenData. MonacoImmuneData, for example, has much more helpful fine categories (but it is not mouse, so it doesn't help me). |
@anoronh4 Funny you say that, because - thanks to the efforts of @j-andrews7 - the latest version of SingleR has Cell Ontology mappings returned for all labels in |
@anoronh4 -- the colData()$label.ont has Cell Ontology mappings. You can check the Slack discussion around https://community-bioc.slack.com/archives/CE8AB163W/p1580737521140000 to see some relevant concepts. I haven't seen any uptake of the subset_descendants and common_classes methods discussed there, so I have not pursued it further; I need to update the ontoProc vignette to deal with the |
Right. Having poked around, I think To restate the problem: the user has a bunch of terms near the tips of the ontology DAG and wants to scale back the granularity of these terms to something broader. I propose the following workflow:
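As a rough illustration of the problem restated above (scaling tip-level terms back to broader ones), here is a minimal base R sketch on a made-up toy hierarchy. It is not the workflow proposed in this comment and it does not use the ontoProc API; the term names, the `parents` map, and the helpers `ancestors()` and `roll_up()` are all hypothetical.

```r
# Toy parent map (child -> parents). Term names are invented for illustration
# and are not real Cell Ontology identifiers.
parents <- list(
  "naive CD4 T cell"  = "CD4 T cell",
  "memory CD4 T cell" = "CD4 T cell",
  "CD4 T cell"        = "T cell",
  "CD8 T cell"        = "T cell",
  "T cell"            = "lymphocyte",
  "B cell"            = "lymphocyte",
  "lymphocyte"        = character(0)
)

# Hypothetical helper: the term itself plus all of its ancestors,
# ordered from nearest to farthest (breadth-first walk up the DAG).
ancestors <- function(term) {
  seen <- character(0)
  queue <- term
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (!current %in% seen) {
      seen <- c(seen, current)
      queue <- c(queue, parents[[current]])
    }
  }
  seen
}

# Hypothetical helper: map each term to the nearest broader term in `targets`,
# or NA when none of its ancestors is a target.
roll_up <- function(terms, targets) {
  vapply(terms, function(term) {
    hit <- intersect(ancestors(term), targets)
    if (length(hit) > 0) hit[1] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

broad <- roll_up(c("naive CD4 T cell", "CD8 T cell", "B cell"),
                 targets = c("T cell", "B cell"))
print(broad)
```

The user still has to choose the set of broader target terms; the sketch only shows that, once chosen, collapsing the fine terms is a simple walk up the graph.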
I see -- by the way, I didn't know that MRCA = most recent common ancestor. These are the lines from
Once we have the |
Having tried this, I don't think it's reasonable to expect people to poke through the plot:

```r
library(ontoProc)
library(ontologyPlot)
library(SingleR)

cl <- getCellOnto()   # load the Cell Ontology
imm <- ImmGenData()   # ImmGen reference dataset
# Plot the ontology graph for the Cell Ontology terms mapped to ImmGen labels
pl <- ontologyPlot::onto_plot(cl, imm$label.ont)
```

The graph is too large, the words are too small, and you can't easily copy and paste the terms. I think the plot would be all right to look at for an overview but not as the frontline tool for the details. After some more thought, one possible option is to have a function that takes a set of terms and then simply prints out a |
Along the lines of the comments here, working with ImmGen has been troublesome because of naming conventions like "Ep.8wk.MEChi", which add unnecessary detail (the 8 wk time point) to the base "Epithelial cells" annotation. I have code that cleans up all of the ImmGen labels into meaningful, easy-to-comprehend annotations. I am happy to share the code or create a pull request if there is interest. |
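For readers who want a starting point, one way to drop time-point tokens such as "8wk" from dot-separated labels might look like the sketch below. This is not the cleanup code offered in the comment above; the helper `strip_timepoints()` is hypothetical, the `<digits>wk` pattern is an assumption about how time points are written, and "Ep.5wk.MEChi" is an illustrative label rather than one confirmed to exist in ImmGenData.

```r
# Hypothetical helper: remove tokens that look like week time points
# (e.g. "8wk", "5wk") from dot-separated labels.
strip_timepoints <- function(labels) {
  tokens_per_label <- strsplit(labels, ".", fixed = TRUE)
  vapply(tokens_per_label, function(tokens) {
    keep <- !grepl("^[0-9]+wk$", tokens)  # assumed time-point format
    paste(tokens[keep], collapse = ".")
  }, character(1), USE.NAMES = FALSE)
}

# Illustrative labels; only "Ep.8wk.MEChi" is quoted in this thread.
labels <- c("Ep.8wk.MEChi", "Ep.5wk.MEChi")
cleaned <- strip_timepoints(labels)
print(cleaned)
```

Labels that differ only by time point collapse to the same string, which is the point of the exercise; other kinds of obscure tokens would need their own rules.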
@namit-k Hi! I am using the ImmGen labels. Could you please provide the code to clean up the labels and extract a meaningful annotation? Many thanks! |
Proposing an additional layer of annotations for the T cell population. Splitting on CD4 vs CD8 would be first and foremost (it looks like that is already mostly doable by grepping for T.4 or T.8 in the fine labels). Adding the different subsets would be next; again, most of this can be derived from the name already.
In case this is useful, I am leaving it here for consideration. Obviously there are far too many ways to cut this data to have them all in a single object and make everybody happy.
To add this to the current ImmGen dataset, below is some code. Please excuse the tidyverse coding.
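The grep-based CD4/CD8 split mentioned above could be sketched in base R roughly as follows. The label vector is made up for illustration (it is modeled on the T.4/T.8 naming pattern, not copied from the real ImmGenData object), and `t_lineage()` is a hypothetical helper, not part of SingleR.

```r
# Illustrative fine labels only; not taken from the actual ImmGenData object.
fine <- c("T.4Nve.Sp", "T.8Nve.Sp", "T.4Mem.LN", "T.8Eff.Sp", "Tgd.Sp", "B.Fo.Sp")

# Hypothetical helper: assign a coarse T cell lineage from the fine label.
# Labels matching neither pattern are left as NA.
t_lineage <- function(labels) {
  out <- rep(NA_character_, length(labels))
  out[grepl("T\\.4", labels)] <- "CD4 T cells"  # dot escaped so it matches literally
  out[grepl("T\\.8", labels)] <- "CD8 T cells"
  out
}

lineage <- t_lineage(fine)
print(data.frame(fine, lineage))
```

Escaping the dot matters here: an unescaped `T.4` would also match any label containing "T", one arbitrary character, and "4".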