4.1.6.0

droazen released this 25 Mar 16:35
· 610 commits to master since this release
bdb2e15

Download release: gatk-4.1.6.0.zip
Docker image: https://hub.docker.com/r/broadinstitute/gatk/

Highlights of the 4.1.6.0 release:

  • Funcotator now supports ENSEMBL GTF files (and non-human species)

  • A beta port of the GATK3 tool DepthOfCoverage, a tool to assess sequence coverage by a wide array of metrics, partitioned by sample, read group, library, or gene (#5913)

  • Several important bug fixes and enhancements to HaplotypeCaller and Mutect2, including:

    • A fix for an often-reported issue where HaplotypeCaller could produce reads starting with deletions during the realignment step and error out.
    • A fix for another often-reported issue where Mutect2 could emit MNPs despite --max-mnp-distance being 0, causing downstream errors in GenomicsDB about MNPs not being supported.

Full list of changes:

  • New Tools

    • A beta port of the GATK3 tool DepthOfCoverage, a tool to assess sequence coverage by a wide array of metrics, partitioned by sample, read group, library, or gene (#5913)
      • This port fixes several bugs and changes some behavior present in the GATK3 version:
        • Fixed a longstanding bug in GATK3 DepthOfCoverage where using multiple partition types caused the column headers and the body lines to be ordered differently, producing incorrect output.
        • The GATK3 version merged adjacent and overlapping intervals when generating interval summary files. In GATK4, adjacent and overlapping intervals are instead tabulated as separate lines in the output. This also applies to gene lists, which were previously merged as well.
        • Gene list coverage no longer counts introns when generating interval summaries for gene lists.
        • Added support for RefSeqGeneList files as optional gene list input.
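The interval-summary change above can be illustrated with a small sketch. This is not GATK code: the function names, the tuple representation, and the example coordinates are assumptions made for illustration only. It contrasts merging adjacent and overlapping intervals (the GATK3 behavior described above) with keeping one entry per input interval (the GATK4 behavior).

```python
from typing import List, Tuple

# Hypothetical representation: (contig, start, end), 1-based inclusive.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse adjacent and overlapping intervals, as the GATK3 summary did."""
    merged: List[Interval] = []
    for contig, start, end in sorted(intervals):
        # Adjacent means the new start is directly after the previous end.
        if merged and merged[-1][0] == contig and start <= merged[-1][2] + 1:
            prev = merged[-1]
            merged[-1] = (contig, prev[1], max(prev[2], end))
        else:
            merged.append((contig, start, end))
    return merged


def separate_intervals(intervals: List[Interval]) -> List[Interval]:
    """Keep one entry per input interval, as the GATK4 summary does."""
    return list(intervals)


intervals = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),  # overlaps the first interval
    ("chr1", 251, 300),  # adjacent to the second interval
    ("chr2", 10, 20),
]

print(merge_intervals(intervals))     # two entries
print(separate_intervals(intervals))  # four entries
```

With these inputs, the merged view reports two summary entries while the separate view reports four, so downstream scripts that assumed one line per merged region may need adjusting.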
  • HaplotypeCaller

    • Fixed a bug where single-base intervals led to no calls (#6507)
      • This fixes the issue reported in #6495 "HaplotypeCaller doesn't detect alternate alleles with 1 bp intervals"
    • Clean leading deletions from reads realigned to best haplotypes (#6498)
      • This fixes the issue reported in #6490 "HaplotypeCaller might be producing bogus reads with deletions at their alignment start during realignment to best haplotype step"
    • Fixed an edge case when haplotypes have a leading insertion after trimming (#6518)
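To make the "leading deletions" fix more concrete, here is a minimal sketch of what cleaning such a read could look like. It is not the HaplotypeCaller implementation: the function names and the string-based CIGAR handling are assumptions. The idea is that a deletion before any aligned base consumes reference positions but no read bases, so it can be dropped if the alignment start is shifted forward by its length.

```python
import re
from typing import List, Tuple

CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string such as '3D10M' into (length, operator) pairs."""
    return [(int(length), op) for length, op in CIGAR_ELEMENT.findall(cigar)]


def clean_leading_deletions(start: int, cigar: str) -> Tuple[int, str]:
    """Drop deletions that precede every non-clip element of the alignment.

    Returns the adjusted alignment start and the cleaned CIGAR string.
    """
    cleaned: List[Tuple[int, str]] = []
    seen_non_clip = False
    for length, op in parse_cigar(cigar):
        if not seen_non_clip and op == "D":
            # The deleted reference bases are skipped by moving the start.
            start += length
            continue
        if op not in ("S", "H"):
            seen_non_clip = True
        cleaned.append((length, op))
    return start, "".join(f"{length}{op}" for length, op in cleaned)


print(clean_leading_deletions(100, "3D10M"))        # (103, '10M')
print(clean_leading_deletions(100, "5S3D10M2D5M"))  # (103, '5S10M2D5M')
```

Interior deletions are left untouched; only deletions at the very start of the alignment (optionally after clipping) are removed.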
  • Mutect2

    • Mutect2 can now filter MNVs with orientation bias (#6486)
    • Added an experimental pileup-based read error corrector, which in our evaluations reduces false positives and improves speed at no cost to sensitivity (#6470)
    • Switched CigarBuilder's order for adjacent indels to be deletion first (#6510)
      • Fixes #6473 "Mutect2 (GATK 4.1.5.0) emitting MNPs despite max-mnp-distance 0"
      • This also resolves downstream errors in GenomicsDB about not supporting MNPs
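For callsets produced with an affected version, a quick check for MNP records before import into GenomicsDB may be useful. The sketch below is an illustration, not part of GATK: the function names are hypothetical, and it treats any record whose REF and ALT are plain base strings of the same length greater than one as an MNP. It reads uncompressed, tab-delimited VCF lines.

```python
from typing import Iterable, List

BASES = set("ACGTN")


def is_mnp(ref: str, alt: str) -> bool:
    """True when REF and ALT are equal-length base strings longer than 1."""
    return (
        len(ref) > 1
        and len(ref) == len(alt)
        and set(ref.upper()) <= BASES
        and set(alt.upper()) <= BASES
    )


def find_mnp_records(vcf_lines: Iterable[str]) -> List[str]:
    """Return 'contig:position' for every record with at least one MNP allele."""
    hits: List[str] = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], fields[1], fields[3], fields[4]
        if any(is_mnp(ref, alt) for alt in alts.split(",")):
            hits.append(f"{chrom}:{pos}")
    return hits


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tT\t.\tPASS\t.",
    "chr1\t200\t.\tAC\tGT\t.\tPASS\t.",
    "chr1\t300\t.\tACG\tA\t.\tPASS\t.",
    "chr1\t400\t.\tAC\tA,TG\t.\tPASS\t.",
]

print(find_mnp_records(example))  # ['chr1:200', 'chr1:400']
```

An empty result on a VCF called with --max-mnp-distance 0 is the expected outcome after this fix.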
    • Fixed several bugs involving getReadCoordinateForReferenceCoordinate() (#6485)
      • Fixes #6342 "Mutect2 occasionally writes nonsense / invalid values for MPOS info tag"
      • Fixes #6314 "GATK4.1.3.0 Mutect2 enable-all-annotations option error"
      • Fixes #6294 "ReadPosRankSumTest with leading insertions"
      • Fixes #5492 "ReadPosRankSumTest doesn't work for two deletions with one base in between"
  • Funcotator

    • Funcotator now supports ENSEMBL GTF files (and non-human species) (#6477) (#6492)
      • Users can now create datasources for any species for which ENSEMBL has an annotated GTF file and the corresponding coding region FASTA file
      • When creating new data sources, the user must still use gencode as the parent folder for the GTF data source subfolders. For example, for E. coli MG1655:
        • DATASOURCES
          • gencode
            • ASM584v2
              • Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.44.gtf
              • Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cds.all.fa
              • gencode.config
      • For more information on creating data sources see the Funcotator tutorial on the GATK Forums.
      • An example datasource for E. coli MG1655 can be found in the large test files for Funcotator
      • For ENSEMBL datasources for vertebrates: ftp://ftp.ensembl.org/pub/
      • For ENSEMBL datasources for other species: ftp://ftp.ensemblgenomes.org/pub/
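The folder layout described above for the E. coli MG1655 example can be prepared with a short script. This is a sketch under assumptions: the helper names are hypothetical, the three file names are copied from the example above, and the script only creates the folders and reports which files are still missing. It does not download anything or generate the contents of gencode.config.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# File names taken from the E. coli MG1655 example layout.
EXPECTED_FILES = [
    "Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.44.gtf",
    "Escherichia_coli_str_k_12_substr_mg1655.ASM584v2.cds.all.fa",
    "gencode.config",
]


def prepare_datasource_dir(datasources_root: Path, subfolder_name: str) -> Path:
    """Create <root>/gencode/<subfolder_name>; 'gencode' is the required parent."""
    target = datasources_root / "gencode" / subfolder_name
    target.mkdir(parents=True, exist_ok=True)
    return target


def missing_files(target: Path, expected: Iterable[str]) -> List[str]:
    """List expected files that are not yet present in the target folder."""
    return [name for name in expected if not (target / name).is_file()]


# Demonstration in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = prepare_datasource_dir(Path(tmp) / "DATASOURCES", "ASM584v2")
    print(target.relative_to(tmp))
    for name in missing_files(target, EXPECTED_FILES):
        print("still needed:", name)
```

In real use, the root would be the data sources folder passed to Funcotator, and the GTF and FASTA files would come from the ENSEMBL FTP sites listed above.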
  • CNV Calling

    • Upgrade CNV WDLs to 1.0 spec (#6506)
    • Fixed an off-by-one error in a segmentation argument in ModelSegments (#6497)
  • Miscellaneous Changes

    • Simplified cigar and clipping code; added tests and fixed a few bugs including #6130 (#6403)
    • Refactored and enhanced ArgumentsBuilder (#6474)
    • Allow all GATKSparkTools to set the SBI index granularity (#6458)
    • Delete NioBam and related classes (#6479)
    • Clean up old interval code (#6465)
    • Remove duplicate copy of the NIO prefetching code (#6464)
    • Fix ignored test in GATKReadAdaptersUnitTest (#6471)
    • Fix alternate spellings of De Bruijn in the codebase (#6472)
  • Documentation

    • Fix a broken set of javadoc references in FeatureDataSource (#6478)