
The Daphnia pulex official gene set is dpulex_jgi060905_JGI_V11

      Name                                                Last modified       Size  Description

[DIR] Parent Directory                                    09-Apr-2019 09:22      -
[TXT] About.txt                                           09-Dec-2009 15:47     6k
[TXT] dpulex1_JGI_V11_annotatedgene.description_count.txt 07-Nov-2007 14:32    71k
[TXT] dpulex1_JGI_V11_annotatedgene.function_count.txt    07-Nov-2007 14:32    42k
[   ] dpulex1_JGI_V11_annotatedgene.gff.gz                07-Nov-2007 14:05   1.3M
[TXT] dpulex1_JGI_V11_annotatedgene.head                  30-Aug-2007 18:05     1k
[TXT] dpulex1_gnomon_annotatedgene.description_count.txt  28-Aug-2007 14:28    37k
[   ] dpulex1_gnomon_annotatedgene.flat.gz                02-Sep-2007 19:41   2.9M
[TXT] dpulex1_gnomon_annotatedgene.function_count.txt     28-Aug-2007 14:28   127k
[   ] dpulex1_gnomon_annotatedgene.gff.gz                 02-Sep-2007 17:27   2.9M
[TXT] dpulex1_gnomon_annotatedgene.head                   02-Sep-2007 19:40     2k
[   ] dpulex1_gnomon_annotatedgene.ugp.xml.gz             02-Sep-2007 23:53   3.8M
[TXT] dpulex1_gnomon_go.tab                               08-Aug-2007 15:04   197k
[TXT] dpulex1_gnomon_paralog.tab                          08-Aug-2007 15:05   371k
[TXT] dpulex1_gnomon_paralog_mcl2ids.tab                  08-Aug-2007 15:19   355k
[TXT] dpulex1_gnomon_uniprot.tab                          08-Aug-2007 15:05   751k
[   ] dpulex_jgi060905_DGIL_SNO.aa.gz                     26-Sep-2006 10:34   8.2M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_DGIL_SNO.gff.gz                    26-Sep-2006 10:34   3.4M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_DGIL_SNO.hmm.gz                    25-Sep-2006 11:49    15k  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_DGIL_SNO.tr.gz                     26-Sep-2006 10:34  12.8M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_Gnomon.aa.gz                       24-May-2007 16:10   7.1M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_Gnomon.gff.gz                      24-May-2007 16:07   3.9M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_Gnomon.tr.gz                       24-May-2007 16:10  11.7M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_FM5.aa.gz                      07-Apr-2007 16:08   5.6M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_FM5.gff.gz                     07-Apr-2007 16:02   3.1M  2006.09 assembly annotations, EST and data
[TXT] dpulex_jgi060905_JGI_FM5.info                       07-Apr-2007 16:19     1k  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_FM5.tr.gz                      07-Apr-2007 16:08   9.2M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_V11.aa.gz                      27-Jul-2007 16:31   6.0M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_V11.cds.gz                     18-Jan-2013 12:52   9.7M  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_V11.gff.gz                     27-Jul-2007 16:24   3.1M  2006.09 assembly annotations, EST and data
[TXT] dpulex_jgi060905_JGI_V11.head                       27-Jul-2007 16:24     1k  2006.09 assembly annotations, EST and data
[   ] dpulex_jgi060905_JGI_V11.tr.gz                      27-Jul-2007 16:31  10.0M  2006.09 assembly annotations, EST and data
[TXT] dpulex_jgiV11_annot2oneline.perl                    07-Nov-2007 14:05    11k
[   ] dpulex_jgiV11_annotsubset.gff.gz                    06-Oct-2007 14:29   279k
[   ] dpulex_jgiV11_annotsubset.txt.gz                    06-Oct-2007 14:19   549k
[   ] dpulex_jgiV11_annotsubset.xls.gz                    20-Sep-2007 11:54   2.5M
[TXT] functiontable.perl                                  28-Aug-2007 14:28     4k
[   ] gnomon-uniprot-match.tab.gz                         08-Aug-2007 15:47    65k
[   ] gnomon-uniprot-records.swiss.gz                     08-Aug-2007 15:48   9.4M
[   ] gnomon-uniprot-summmary.tab.gz                      08-Aug-2007 15:47   373k
[TXT] gnomonstitch.pl                                     24-May-2007 16:08    19k



   Annotated gene prediction sets:
dpulex1_JGI_V11_annotatedgene.gff.gz : gene (mRNA-only) features from JGI V11 official predictions,
        with added annotations from homology, GO functions, expression and cross-references to Gnomon.

dpulex1_gnomon_annotatedgene.gff.gz  : gene (mRNA-only) features from Gnomon predictions,
        with added UniProt, GO and Pfam IDs and descriptions, using the Gnomon best protein_hit ID.
        JGI v1.1 IDs are included (jgi= perfect match, jgiov= overlap matches).
        Protein gene duplicates (paralog=) and tandem genes (tandy=) are also identified.
dpulex1_gnomon_annotatedgene.flat.gz : same data as the .gff, in flat-file format with "key: value" lines
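
As a rough sketch of how the jgi=, jgiov=, paralog= and tandy= tags might be read from the annotated .gff: the Python below assumes the attribute column uses the common GFF3 "key=value;key=value" layout, which this README does not confirm. The sample line is hypothetical, assembled from the example values in the annotation key further down, not copied from the data file.

```python
def parse_gff_attributes(column9):
    """Split a GFF attribute column into a dict (assumes key=value;key=value)."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" not in field:
            continue  # skip empty or malformed fields
        key, value = field.split("=", 1)
        attrs[key.strip()] = value.strip()
    return attrs


def gene_tags(gff_line):
    """Return the cross-reference tags for one tab-separated GFF feature line."""
    columns = gff_line.rstrip("\n").split("\t")
    if len(columns) < 9:
        raise ValueError("expected 9 tab-separated GFF columns")
    attrs = parse_gff_attributes(columns[8])
    paralog_id, paralog_count = None, None
    if "paralog" in attrs:
        # paralog= holds "ID,count", e.g. Omcl83,24
        paralog_id, _, count = attrs["paralog"].partition(",")
        paralog_count = int(count) if count else None
    return {
        "id": attrs.get("ID"),
        "jgi": attrs.get("jgi"),        # perfect JGI v1.1 match, if any
        "jgiov": attrs.get("jgiov"),    # overlapping JGI v1.1 gene(s), if any
        "paralog_id": paralog_id,
        "paralog_count": paralog_count,
        "tandy": attrs.get("tandy"),
    }


# Hypothetical feature line built from the README's example values.
SAMPLE = "\t".join([
    "scaffold_1", "NCBI_GNO", "mRNA", "173179", "177588", "280.246", "+", ".",
    "ID=NCBI_GNO_336014;jgiov=JGI_V11_231974;paralog=Omcl83,24;tandy=td_s1c2g0",
])
tags = gene_tags(SAMPLE)
print(tags)
```

Tags that are absent from a line come back as None, so a gene with an overlap match but no perfect match has jgiov set and jgi empty.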

Summary annotation tables:
dpulex1_gnomon_go.tab      : list of GO and Pfam IDs with gene counts
dpulex1_gnomon_paralog.tab : list of paralog (OrthoMCL, p<=1e-40) IDs with UniProt descriptions
dpulex1_gnomon_uniprot.tab : list of UniProt IDs and any associated paralogs
dpulex1_gnomon_paralog_mcl2ids.tab : table of each Gnomon ID with its OrthoMCL ID (paralog= in the gff)
UniProt source records: gnomon-uniprot-records.swiss.gz and gnomon-uniprot-summmary.tab.gz.
        gnomon-uniprot-match.tab.gz has the uniprot.org lookup results, including some no_match
        cases (7841 found, 242 missed).

Annotation key for  dpulex1_gnomon_annotatedgene.flat (and .gff) files:
#   ID: NCBI_GNO_336014         == mRNA gene prediction ID from dpulex_jgi060905_Gnomon.gff 
#   Location: scaffold_1:173179-177588:+  == location from gff
#   Type: mRNA:NCBI_GNO         ==  Type:Source field from gff
#   Score: 280.246              ==  Gene quality score from Gnomon
#   Dbxref:                     ==  All database IDs from Gnomon, EST and protein matches
#     NP_000115.1,CAH70291.1,NP_001074690.1,NP_080815.1,WFes0149391,WFes0149392,WFes0162361
#   Note:                       ==  Uniprot ID/Accessions/Species/Description from protein_hit
#     ERCC6_HUMAN/Q03468,Q5W0L9/Homo sapiens/DNA excision repair protein ERCC-6,ATP-dependent ...
#   Ontology_term:              == Uniprot GO and Pfam cross-refs
#     GO:0003678/F:DNA helicase activity,GO:0005515/F:protein binding,GO:0003702/F:RNA polyme... 
#   Parent: gene336014          == NCBI gene parent ID (a few alt-transcripts make this non-trivial)
#   flags: EST,Prot,Start,Stop  == NCBI prediction flags 
#                               (only genes with Start+Stop codons are included)
#   jgiov: JGI_V11_231974       == JGI v1.1 gene overlap IDs 
#                               (not perfect match; FIXME: includes trivial overlaps)
#   jgi: JGI_V11_nnnn           == JGI v1.1 perfect gene match
#   paralog: Omcl83,24          == Paralog ID,count from OrthoMCL of blastp of all Gnomon proteins
#   maxCDS: 173179 177401       == Gnomon value (?)
#   protCDS: 173368 177389      == Gnomon value (?)
#   protein_hit: gi|4557565|ref|NP_000115.1|  == best protein match from Gnomon pipeline
#   tandy: td_s1c2g0            == Tandem genes id (FIXME: needs near/far duplicate flag)
#   tilex: 39484                == Genome tiling expression maximum score 
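
A minimal sketch of reading the flat file follows. It assumes each annotation sits on a single "key: value" line and that every record starts with an "ID:" line; neither detail is confirmed above, so adjust after looking at the real file. The first record reuses values from the key; the second record is invented for illustration.

```python
import re

# Location has the form scaffold:start-end:strand, e.g. scaffold_1:173179-177588:+
LOCATION = re.compile(
    r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+):(?P<strand>[+-])$"
)


def parse_location(text):
    """Return (scaffold, start, end, strand) from a Location value."""
    m = LOCATION.match(text.strip())
    if m is None:
        raise ValueError("unrecognised Location: %r" % text)
    return m["scaffold"], int(m["start"]), int(m["end"]), m["strand"]


def read_flat_records(lines):
    """Yield one dict per record; a new 'ID' key starts a new record (assumed)."""
    record = {}
    for raw in lines:
        line = raw.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, since values may contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "ID" and record:
            yield record
            record = {}
        record[key] = value
    if record:
        yield record


SAMPLE = """\
ID: NCBI_GNO_336014
Location: scaffold_1:173179-177588:+
Type: mRNA:NCBI_GNO
Score: 280.246
flags: EST,Prot,Start,Stop
paralog: Omcl83,24
ID: NCBI_GNO_EXAMPLE2
Location: scaffold_2:100-900:-
"""

records = list(read_flat_records(SAMPLE.splitlines()))
for rec in records:
    print(rec["ID"], parse_location(rec["Location"]))
```

For the real data, the same generator could be fed from gzip.open("dpulex1_gnomon_annotatedgene.flat.gz", "rt") instead of the sample string.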


   Gene prediction set: dpulex_jgi060905_JGI_V11
   .gff = feature annotations and locations
   .aa  = amino acid translation (protein)
   .tr  = transcript
   dpulex_jgi060905_JGI_V11_annotgene.gff = annotation of gene features with the matching Gnomon gene,
        the matching JGI_FM5 gene, and tiling expression
#species: Daphnia_pulex
#assembly-id: Dappu1 , dpulex_jgi060905
#annotation-group-id: JGI_V11
#algorithm: Filtered gene models as consensus (V11, version 1.1) with supporting evidence (EST, homology)
#   of several gene predictors with versions: fgenesh, SNAP, NCBI Gnomon, GeneWise
#   and curated gene models
#source: ftp://ftp.jgi-psf.org/pub/JGI_data/Daphnia_pulex/v1.0/FrozenGeneCatalog_2007_07_03.gff.gz
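
The "#key: value" header blocks shown for each gene set can be read into a dictionary. The sketch below assumes that a line with whitespace right after the "#" continues the previous key, and that a bare "#" ends the key/value block; both rules are inferred from the headers on this page rather than from a format specification.

```python
def parse_header(lines):
    """Collect '#key: value' header lines into a dict, joining continuations."""
    meta = {}
    last_key = None
    for raw in lines:
        if not raw.startswith("#"):
            continue
        text = raw[1:].rstrip()
        if not text.strip():
            last_key = None  # a bare "#" ends the key/value block
            continue
        if text[0].isspace():
            # Continuation line; some headers write it as "#     : more text".
            if last_key is not None:
                meta[last_key] += " " + text.strip().lstrip(":").strip()
            continue
        key, sep, value = text.partition(":")
        if not sep:
            continue
        last_key = key.strip()
        meta[last_key] = value.strip()
    return meta


HEADER = """\
#species: Daphnia_pulex
#assembly-id: Dappu1 , dpulex_jgi060905
#annotation-group-id: JGI_V11
#algorithm: Filtered gene models as consensus (V11, version 1.1) with supporting evidence (EST, homology)
#   of several gene predictors with versions: fgenesh, SNAP, NCBI Gnomon, GeneWise
#   and curated gene models
#source: ftp://ftp.jgi-psf.org/pub/JGI_data/Daphnia_pulex/v1.0/FrozenGeneCatalog_2007_07_03.gff.gz
"""

meta = parse_header(HEADER.splitlines())
assembly_ids = [s.strip() for s in meta["assembly-id"].split(",")]
print(meta["annotation-group-id"], assembly_ids)
```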


   Gene prediction set: dpulex_jgi060905_DGIL_SNO
#species: Daphnia_pulex
#assembly-id: dpulex_jgi060905
#annotation-group-id: DGIL_SNO
#algorithm: DGIL_SNO = SNAP gene predictor + protein homology guidance, version 2006-05-18, 
#         : SNAP ref=http://www.biomedcentral.com/1471-2105/5/59/abstract
#         : bootstrapped HMM predictor from Dmelanogaster.hmm on Daphnia_pulex assembly DNA
#         : and guided with -xdef Drosophila, Mouse and C.elegans protein gene matches (tblastn)
#authors: gilbertd AT indiana.edu
#more-info: http://wfleabase.org/docs/
#date: 20060926

   Gene prediction set: dpulex_jgi060905_JGI_FM5
#species: Daphnia_pulex
#assembly-id: Dappu1 , dpulex_jgi060905
#annotation-group-id: JGI_FM5
#algorithm: Filtered gene models as consensus (FM5) with supporting evidence (EST, homology)
#   of several gene predictors with versions: fgenesh, SNAP, GeneWise
#authors:  Jeff Boore, Igor Grigoriev, Andrea Aerts and the Joint Genome Institute
#more-info:  http://shake.jgi-psf.org/Dappu1/
#date: 20070212
#
# FM5 gene model predictor counts:
# 7063 fgenesh_pg
# 7003 SNAP
# 3727 estExt_fgenesh_pg
# 2472 e_gw
# 1915 estExt_GenewisePlus
# 1357 estExt_Genewise
# 1220 gw
#  295 estExt_fgenesh_pm
#  262 estExt_fgenesh_kg
#  131 fgenesh_pm
#   32 fgenesh_kg
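
The predictor counts above can be tallied with a few lines of Python. This is only an illustration of the arithmetic: the total is the sum of the listed lines, and this README does not say whether that sum equals the number of gene models in the FM5 set.

```python
import re

FM5_COUNTS = """\
# FM5 gene model predictor counts:
# 7063 fgenesh_pg
# 7003 SNAP
# 3727 estExt_fgenesh_pg
# 2472 e_gw
# 1915 estExt_GenewisePlus
# 1357 estExt_Genewise
# 1220 gw
#  295 estExt_fgenesh_pm
#  262 estExt_fgenesh_kg
#  131 fgenesh_pm
#   32 fgenesh_kg
"""


def parse_counts(text):
    """Return {predictor: count} from '# <count> <name>' comment lines."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"^#\s*(\d+)\s+(\S+)\s*$", line)
        if m:  # the title line has no leading number and is skipped
            counts[m.group(2)] = int(m.group(1))
    return counts


counts = parse_counts(FM5_COUNTS)
total = sum(counts.values())

# Print each predictor with its share of the listed models, largest first.
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:22s} {n:6d} {100 * n / total:5.1f}%")
print(f"{'total':22s} {total:6d}")
```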


   Gene prediction set: dpulex_jgi060905_Gnomon
#species: Daphnia_pulex
#assembly-id: Dappu1,dpulex_jgi060905
#annotation-group-id: NCBI_GNO
#annotation group: NCBI
#authors: Alexander Souvorov, Yuri Kapustin, Boris Kiryutin, Vyacheslav Chetvernin,
#         Tatiana Tatusova, Paul Kitts, Victor Sapojnikov and Jim Ostell
#algorithm: Gnomon, http://www.ncbi.nlm.nih.gov/genome/guide/gnomon.html
#date: 20070522
# Dbxref from support=nnn links in gnomon/aligns.gff.gz gnomon/chains_for_annotation.gff.gz
#   37329 gene
#   37466 mRNA