Miyakogusa Predicted Gene

Lj4g3v3114020.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj4g3v3114020.1 Non Characterized Hit- tr|I1KJ25|I1KJ25_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,76.33,0,LEA_2,Late
embryogenesis abundant protein, LEA-14; seg,NULL,CUFF.52372.1
         (229 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   185   3e-47
AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   166   8e-42
AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   137   4e-33
AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...   136   1e-32
AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late embry...   113   9e-26
AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein, g...   104   4e-23

>AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
           hits to 242 proteins in 39 species: Archae - 0; Bacteria
           - 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
           17; Other Eukaryotes - 8 (source: NCBI BLink). |
           chr1:17191502-17192870 FORWARD LENGTH=342
          Length = 342

 Score =  185 bits (469), Expect = 3e-47,   Method: Compositional matrix adjust.
 Identities = 105/219 (47%), Positives = 126/219 (57%), Gaps = 21/219 (9%)

Query: 31  KPWKDIDVIEEEGLLQSQDHDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIF 90
           K WK+  VIEEEGLL   D D    RR                     +GA++PMKPKI 
Sbjct: 102 KQWKECAVIEEEGLLDDGDRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKIT 161

Query: 91  VKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVI 150
           VKSI FE L++QAG DA GV TDMITMN+T+R  YRNTGTFFGVHV+STP+DLS+S+I I
Sbjct: 162 VKSITFETLKIQAGQDAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKI 221

Query: 151 ATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXX---------------------XXXX 189
            +G+VK+FY             +G KIPLY                              
Sbjct: 222 GSGSVKKFYQGRKSERTVLVHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPD 281

Query: 190 XXMPTVPVPLKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
              P  PVP+ LSFV+RSRAYVLGKLV+PK+YK+IEC I
Sbjct: 282 PPAPPAPVPMTLSFVVRSRAYVLGKLVQPKFYKKIECDI 320


>AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
           hits to 1807 proteins in 277 species: Archae - 0;
           Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
           Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
          Length = 320

 Score =  166 bits (421), Expect = 8e-42,   Method: Compositional matrix adjust.
 Identities = 95/216 (43%), Positives = 120/216 (55%), Gaps = 21/216 (9%)

Query: 34  KDIDVIEEEGLLQSQDHDY-TRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIFVK 92
           K   +IEEEGLL   D +     RR                     + A++P KPKI VK
Sbjct: 83  KQFAMIEEEGLLDDGDREQEALPRRCYVLAFIVGFSLLFAFFSLILYAAAKPQKPKISVK 142

Query: 93  SIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVIAT 152
           SI FE L+VQAG DA G+ TDMITMN+T+R  YRNTGTFFGVHV+S+P+DLS+S+I I +
Sbjct: 143 SITFEQLKVQAGQDAGGIGTDMITMNATLRMLYRNTGTFFGVHVTSSPIDLSFSQITIGS 202

Query: 153 GNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXX--------------------XXXXM 192
           G++K+FY             +G KIPLY                                
Sbjct: 203 GSIKKFYQSRKSQRTVVVNVLGDKIPLYGSGSTLVPPPPPAPIPKPKKKKGPIVIVEPPA 262

Query: 193 PTVPVPLKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
           P  PVP++L+F +RSRAYVLGKLV+PK+YKRI C I
Sbjct: 263 PPAPVPMRLNFTVRSRAYVLGKLVQPKFYKRIVCLI 298


>AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
           hits to 34131 proteins in 2444 species: Archae - 798;
           Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
           531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
           BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
          Length = 248

 Score =  137 bits (346), Expect = 4e-33,   Method: Compositional matrix adjust.
 Identities = 69/125 (55%), Positives = 83/125 (66%)

Query: 31  KPWKDIDVIEEEGLLQSQDHDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIF 90
           K WK+  VIEEEGLL   D D    RR                     +GA++PMKPKI 
Sbjct: 102 KQWKECAVIEEEGLLDDGDRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKIT 161

Query: 91  VKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVI 150
           VKSI FE L++QAG DA GV TDMITMN+T+R  YRNTGTFFGVHV+STP+DLS+S+I I
Sbjct: 162 VKSITFETLKIQAGQDAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKI 221

Query: 151 ATGNV 155
            +G+V
Sbjct: 222 GSGSV 226


>AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:16736839-16738186 FORWARD LENGTH=299
          Length = 299

 Score =  136 bits (342), Expect = 1e-32,   Method: Compositional matrix adjust.
 Identities = 67/152 (44%), Positives = 93/152 (61%), Gaps = 1/152 (0%)

Query: 79  WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
           WG S+   P   +K +  E+L VQ+G+D +GV TDM+T+NSTVR  YRN  TFF VHV+S
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188

Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMP-TVPV 197
            PL LSYS++++A+G + +F               G +IPLY            P  V +
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248

Query: 198 PLKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 229
           PL L+F +R+RAYVLG+LVK  ++  I+CSIT
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSIT 280


>AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late
           embryogenesis abundant protein, group 2
           (InterPro:IPR004864); BEST Arabidopsis thaliana protein
           match is: Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family
           (TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
           LENGTH=297
          Length = 297

 Score =  113 bits (283), Expect = 9e-26,   Method: Compositional matrix adjust.
 Identities = 63/151 (41%), Positives = 92/151 (60%), Gaps = 5/151 (3%)

Query: 79  WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
           WGAS+   PK+ VK +    L +QAG+D +GV TDM+++NSTVR  YRN  TFF VHV++
Sbjct: 134 WGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLNSTVRIYYRNPSTFFAVHVTA 193

Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 198
           +PL L YS +++++G + +F               G +IPLY           + T+ +P
Sbjct: 194 SPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIPLY-----GGVSFHLDTLSLP 248

Query: 199 LKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 229
           L L+ V+ S+AY+LG+LV  K+Y RI CS T
Sbjct: 249 LNLTIVLHSKAYILGRLVTSKFYTRIICSFT 279


>AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein,
           group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
          Length = 506

 Score =  104 bits (260), Expect = 4e-23,   Method: Compositional matrix adjust.
 Identities = 56/150 (37%), Positives = 79/150 (52%), Gaps = 2/150 (1%)

Query: 79  WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
           WGAS P  P + VKS+         G D TGVAT +++ NS+V+ T  +   +FG+HVSS
Sbjct: 336 WGASHPFSPIVSVKSVDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSS 395

Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 198
           +   L++S + +ATG +K +Y              G+++PLY              VPV 
Sbjct: 396 STFKLTFSALTLATGQLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGKVPV- 454

Query: 199 LKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
            KL F IRSR  +LGKLVK K+   + CS 
Sbjct: 455 -KLEFEIRSRGNLLGKLVKSKHENHVSCSF 483



 Score = 64.3 bits (155), Expect = 6e-11,   Method: Compositional matrix adjust.
 Identities = 37/120 (30%), Positives = 56/120 (46%), Gaps = 1/120 (0%)

Query: 79  WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
           +GAS+   P +++K +         GSD TGV T ++ +  +V  T  N  T FG+HVSS
Sbjct: 131 FGASQSSPPIVYIKGVNVRSFYYGEGSDNTGVPTKIMNVKCSVVITTHNPSTLFGIHVSS 190

Query: 139 TPLDLSYS-EIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPV 197
           T + L YS +  +A   +K ++             +GSK+PLY              VPV
Sbjct: 191 TAVSLIYSRQFTLANARLKSYHQPKQSNHTSRINLIGSKVPLYGAGAELVASDNSGGVPV 250