Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pass dest project id #9023

Open
wants to merge 4 commits into
base: ah_var_store
Choose a base branch
from
Open

pass dest project id #9023

wants to merge 4 commits into from

Conversation

yonghaoy
Copy link

@yonghaoy yonghaoy commented Oct 29, 2024

The current IsUsingCompressedReferences
It does not pass project id that hosted dataset.
When the project_id is missing in a BigQuery SQL query, the bq command will use the --project_id flag specified in the command as the default project for resolving dataset and table references.
Add additional parameter to allow passing dest project.

In our case
Error we saw in GCP console:

Access Denied: Table terra-vpc-sc-dev-7ee328ad:1kg_wgs_2022q1.INFORMATION_SCHEMA.COLUMNS: User does not have permission to query table terra-vpc-sc-dev-7ee328ad:1kg_wgs_2022q1.INFORMATION_SCHEMA.COLUMNS, or perhaps it does not exist.

terra-vpc-sc-dev-7ee328ad:1kg_wgs_2022q1.INFORMATION_SCHEMA.COLUMNS is wrong - terra-vpc-sc-dev-7ee328ad is the user workspace
It should be fc-aou-cdr-synth-test-2.1kg_wgs_2022q1 - fc-aou-cdr-synth-test-2 is the project that contains CDR data.

@Qi77Qi Qi77Qi requested a review from mcovarr October 29, 2024 20:15
@@ -78,7 +78,8 @@ workflow GvsExtractAvroFilesForHail {

call Utils.IsUsingCompressedReferences {
input:
project_id = project_id,
query_project_id = project_id,
dest_project_id = project_id,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see Hail contains dest project and I don't know the use case. So I just pass the same project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make one of the project_ids optional then? I dont imagine we will ever extract avro files into another project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this is fine--let's just get it working!

@@ -811,7 +812,7 @@ task IsUsingCompressedReferences {
SELECT
column_name
FROM
`~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
`~{dest_project_id}.~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lovely

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I don't know wdl. Is this syntax correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is query_project_id being introduced here if it's never used? Possibly related, line 811 is still referencing project_id which no longer exists.

womtool validate will check the syntactical correctness of WDL for you, and is among the actions that execute automatically against ah_var_store PRs in this repo. I'm not sure why it didn't flag this error. 🤔

Copy link
Author

@yonghaoy yonghaoy Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good finding! It is line 811

@@ -811,7 +812,7 @@ task IsUsingCompressedReferences {
SELECT
column_name
FROM
`~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
`~{dest_project_id}.~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is query_project_id being introduced here if it's never used? Possibly related, line 811 is still referencing project_id which no longer exists.

womtool validate will check the syntactical correctness of WDL for you, and is among the actions that execute automatically against ah_var_store PRs in this repo. I'm not sure why it didn't flag this error. 🤔

@yonghaoy yonghaoy requested a review from mcovarr October 30, 2024 13:04
Copy link
Collaborator

@mcovarr mcovarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've manually kicked off our integration tests on this branch. If that passes (it should take a couple of hours), I'll give this a thumb.

@yonghaoy
Copy link
Author

I've manually kicked off our integration tests on this branch. If that passes (it should take a couple of hours), I'll give this a thumb.

SG! I also pushed this to Agora and will test it out using our genomic extraction workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants