Create Datasets_BBGLAB.md #188
21 changes: 21 additions & 0 deletions docs/Datasets/Datasets_BBGLAB.md
@@ -0,0 +1,21 @@
# Datasets - BBGLab

In the BBG Lab we have a [Google spreadsheet](https://docs.google.com/spreadsheets/d/10eVPI8X9dObmSdypmcID0DTxO1XW8h3AGHbm_IT0El8/edit?usp=sharing) that contains all the information about the datasets we use.

## Description
This spreadsheet contains four sheets:
- **External**: This table contains information about all the external datasets (related to sample sequencing data) used in the BBG Lab. It includes downloaded datasets (with the path on the cluster where you can find each one) as well as datasets that we access through a cloud or external connection (e.g. Genomics England, UK Biobank).
- **Internal**: This table contains information about all the internal datasets sequenced at the BBG Lab (e.g. samples from patients from Sant Joan de Déu).
- **Form External**: This table contains new external datasets submitted by BBG users. You can [add new external datasets using this form](https://forms.gle/dBAJD3wZV2MyvVx79). Each submission is added as a new row.
- **Form Internal**: This table contains new internal datasets submitted by BBG users. You can [add new internal datasets using this form](https://forms.gle/HExJEwgjRvW7angNA). Each submission is added as a new row.

The idea is to keep only the minimum information necessary to identify each dataset, so that the table is realistically maintainable. For more detail, you can go to the path where the data is located on the cluster, look at the original online repository, or simply ask the BBG user associated with that dataset.

This spreadsheet can only be modified by Mònica, Martina and Paula. If you know about a dataset that is not included in the spreadsheet, you can use the forms to add it. Mònica, Martina and Paula will review the new datasets and add them to the main tables.

To add a new dataset, the only mandatory fields are the name of the dataset (as descriptive as possible) and the user's e-mail. Mònica, Martina and Paula will try to find the rest of the information.

## Reference
Mònica Sánchez Guxé\
Martina Gasull\
Paula Gomis