MIND prediction

Merge gene predictions of M AKER (Maker-final.gff3) with gene predictions IN ferred D irectly (DI-final.gff3).

See the scripts used for MIND here .

Note: See the details to generate these two predictions in maker and DirectInf .

Input files for MIND

BRAKER prediction: maker-final.gff3
Direct Inference prediction: DI-final.gff3
reference genome fasta file: TAIR10_chr_all.fas
splice junction file: junctions.bed # from DirectInf step
list of input prediction: list_MIND.txt

Consolidate all the transcripts from MAKER and DirInf, and predict potential protein coding sequence

  1. Make a configure file and prepare transcripts:

    You should prepare a list_MIND.txt as below to include gtf path (1st column), gtf abbrev (2nd column), stranded-specific or not (3rd column):

    1maker-final.gff3    mk    False
    2DI-final.gff3 DI     False
    

    Then run the script as below:

    1./01_runMikado_round1.sh TAIR10_chr_all.fas junctions.bed list_MIND.txt MIND
    

    This will generate MIND_prepared.fasta file that will be used for predicting ORFs in the next step.

    Note: junctions.bed is the same file generate from DirectInf step.

2. Predict potential CDS from transcripts:
1./02_runTransDecoder.sh MIND_prepared.fasta

We will use MIND_prepared.fasta.transdecoder.bed in the next step.

Note: Here we only kept complete CDS for next step. You can revise 02_runTransDecoder.sh to use both incomplete and complete CDS if you need.

3. Pick best transcripts for each locus and annotate them as gene:
1./03_runMikado_round2.sh MIND_prepared.fasta.transdecoder.bed MIND

This will generate:

1mikado.metrics.tsv
2mikado.scores.tsv
3MIND.loci.gff3

Optional: Filter out transcripts with redundant CDS

1./04_rm_redundance.sh MIND.loci.gff3 TAIR10_chr_all.fas

Optional: Filter out transcripts whose predicted proteins mapped to transposon elements

Note: filter.pep.fa is an output from previous step for removing redundant CDSs. You can also use all protein sequence if you don’t want to remove redundant CDSs.