pass dest project id #9023
base: ah_var_store
@@ -78,7 +78,8 @@ workflow GvsExtractAvroFilesForHail {

     call Utils.IsUsingCompressedReferences {
         input:
-            project_id = project_id,
+            query_project_id = project_id,
+            dest_project_id = project_id,
I did not see a dest project in the Hail workflow and I don't know the use case, so I just pass the same project for both.
Can we make one of the project_ids optional then? I don't imagine we will ever extract avro files into another project.
actually this is fine--let's just get it working!
@@ -811,7 +812,7 @@ task IsUsingCompressedReferences {
     SELECT
       column_name
     FROM
-      `~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
+      `~{dest_project_id}.~{dataset_name}.INFORMATION_SCHEMA.COLUMNS`
lovely
Actually, I don't know WDL. Is this syntax correct?
Why is query_project_id being introduced here if it's never used? Possibly related, line 811 is still referencing project_id, which no longer exists.
womtool validate will check the syntactical correctness of WDL for you, and is among the actions that execute automatically against ah_var_store PRs in this repo. I'm not sure why it didn't flag this error. 🤔
Ah, good find! It is line 811.
I've manually kicked off our integration tests on this branch. If that passes (it should take a couple of hours), I'll give this a thumb.
SG! I also pushed this to Agora and will test it out using our genomic extraction workflow.
The current IsUsingCompressedReferences task does not pass the ID of the project that hosts the dataset.
When the project ID is missing from a table reference in a BigQuery SQL query, the bq command uses the --project_id flag specified in the command as the default project for resolving dataset and table references.
This PR adds a parameter to allow passing the dest project.
In our case, the error we saw in the GCP console:
terra-vpc-sc-dev-7ee328ad:1kg_wgs_2022q1.INFORMATION_SCHEMA.COLUMNS is wrong; terra-vpc-sc-dev-7ee328ad is the user workspace.
It should be fc-aou-cdr-synth-test-2.1kg_wgs_2022q1; fc-aou-cdr-synth-test-2 is the project that contains CDR data.
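To illustrate the behavior described above, here is a minimal shell sketch of how the query could separate the two projects. It is not the actual task body: the helper name build_columns_query and the RUN_BQ guard are hypothetical, and the project and dataset values are simply the ones quoted in this description. The idea is that --project_id names the project that runs the query, while the table reference itself is fully qualified with the project that hosts the dataset.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Values taken from the PR description; the variable names mirror the new WDL inputs.
query_project_id="terra-vpc-sc-dev-7ee328ad"  # project that runs the query
dest_project_id="fc-aou-cdr-synth-test-2"     # project that hosts the dataset
dataset_name="1kg_wgs_2022q1"

# Hypothetical helper: builds the INFORMATION_SCHEMA query with a fully
# qualified reference, so bq does not fall back to --project_id to resolve it.
build_columns_query() {
  local project="$1" dataset="$2"
  printf 'SELECT column_name FROM `%s.%s.INFORMATION_SCHEMA.COLUMNS`' "$project" "$dataset"
}

query="$(build_columns_query "$dest_project_id" "$dataset_name")"
echo "$query"

# Only call bq when explicitly requested; it needs the Google Cloud SDK and credentials.
if [[ "${RUN_BQ:-0}" == "1" ]]; then
  bq --project_id="$query_project_id" --format=csv query --use_legacy_sql=false "$query"
fi
```

Without the project prefix in the table reference, the same command would look for 1kg_wgs_2022q1 in the project given by --project_id, which matches the wrong reference we saw in the console.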