Miyakogusa Predicted Gene
- Lj4g3v3114020.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj4g3v3114020.1 Non Characterized Hit- tr|I1KJ25|I1KJ25_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,76.33,0,LEA_2,Late
embryogenesis abundant protein, LEA-14; seg,NULL,CUFF.52372.1
(229 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                      Score     E
Sequences producing significant alignments:           (bits)  Value
AT1G45688.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul... 185 3e-47
AT5G42860.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul... 166 8e-42
AT1G45688.2 | Symbols: | unknown protein; FUNCTIONS IN: molecul... 137 4e-33
AT4G35170.1 | Symbols: | Late embryogenesis abundant (LEA) hydr... 136 1e-32
AT2G41990.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Late embry... 113 9e-26
AT3G24600.1 | Symbols: | Late embryogenesis abundant protein, g... 104 4e-23
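
Each one-line summary above ends with the bit score followed by the E-value, so the two numbers can be recovered by splitting the line from the right. The following is a minimal Python sketch under that assumption. `parse_hit_line` is a hypothetical helper name, not part of BLAST, and the handling of E-values printed without a leading digit (such as `e-105`) is an assumption about legacy report formatting that does not occur in this report.

```python
def parse_hit_line(line):
    """Return (hit_id, bit_score, e_value) for one summary line, or None.

    Assumes the layout used in this report: the hit identifier is the first
    whitespace-separated token, and the last two tokens are the bit score
    and the E-value.
    """
    parts = line.rsplit(None, 2)  # split off the two trailing numeric fields
    if len(parts) != 3:
        return None
    head, bits, evalue = parts
    # Assumption: some legacy reports abbreviate 1e-105 as "e-105".
    if evalue.startswith("e"):
        evalue = "1" + evalue
    try:
        return head.split()[0], float(bits), float(evalue)
    except ValueError:
        # Not a hit line (for example, the column header).
        return None


if __name__ == "__main__":
    example = ("AT4G35170.1 | Symbols: | Late embryogenesis abundant "
               "(LEA) hydr... 136 1e-32")
    print(parse_hit_line(example))
```

Splitting from the right avoids depending on the truncated description text, which may itself contain digits and vertical bars.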
>AT1G45688.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
hits to 242 proteins in 39 species: Archae - 0; Bacteria
- 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
17; Other Eukaryotes - 8 (source: NCBI BLink). |
chr1:17191502-17192870 FORWARD LENGTH=342
Length = 342
Score = 185 bits (469), Expect = 3e-47, Method: Compositional matrix adjust.
Identities = 105/219 (47%), Positives = 126/219 (57%), Gaps = 21/219 (9%)
Query: 31 KPWKDIDVIEEEGLLQSQDHDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIF 90
K WK+ VIEEEGLL D D RR +GA++PMKPKI
Sbjct: 102 KQWKECAVIEEEGLLDDGDRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKIT 161
Query: 91 VKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVI 150
VKSI FE L++QAG DA GV TDMITMN+T+R YRNTGTFFGVHV+STP+DLS+S+I I
Sbjct: 162 VKSITFETLKIQAGQDAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKI 221
Query: 151 ATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXX---------------------XXXX 189
+G+VK+FY +G KIPLY
Sbjct: 222 GSGSVKKFYQGRKSERTVLVHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPD 281
Query: 190 XXMPTVPVPLKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
P PVP+ LSFV+RSRAYVLGKLV+PK+YK+IEC I
Sbjct: 282 PPAPPAPVPMTLSFVVRSRAYVLGKLVQPKFYKKIECDI 320
>AT5G42860.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
Length = 320
Score = 166 bits (421), Expect = 8e-42, Method: Compositional matrix adjust.
Identities = 95/216 (43%), Positives = 120/216 (55%), Gaps = 21/216 (9%)
Query: 34 KDIDVIEEEGLLQSQDHDY-TRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIFVK 92
K +IEEEGLL D + RR + A++P KPKI VK
Sbjct: 83 KQFAMIEEEGLLDDGDREQEALPRRCYVLAFIVGFSLLFAFFSLILYAAAKPQKPKISVK 142
Query: 93 SIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVIAT 152
SI FE L+VQAG DA G+ TDMITMN+T+R YRNTGTFFGVHV+S+P+DLS+S+I I +
Sbjct: 143 SITFEQLKVQAGQDAGGIGTDMITMNATLRMLYRNTGTFFGVHVTSSPIDLSFSQITIGS 202
Query: 153 GNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXX--------------------XXXXM 192
G++K+FY +G KIPLY
Sbjct: 203 GSIKKFYQSRKSQRTVVVNVLGDKIPLYGSGSTLVPPPPPAPIPKPKKKKGPIVIVEPPA 262
Query: 193 PTVPVPLKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
P PVP++L+F +RSRAYVLGKLV+PK+YKRI C I
Sbjct: 263 PPAPVPMRLNFTVRSRAYVLGKLVQPKFYKRIVCLI 298
>AT1G45688.2 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
hits to 34131 proteins in 2444 species: Archae - 798;
Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
Length = 248
Score = 137 bits (346), Expect = 4e-33, Method: Compositional matrix adjust.
Identities = 69/125 (55%), Positives = 83/125 (66%)
Query: 31 KPWKDIDVIEEEGLLQSQDHDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIF 90
K WK+ VIEEEGLL D D RR +GA++PMKPKI
Sbjct: 102 KQWKECAVIEEEGLLDDGDRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKIT 161
Query: 91 VKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVI 150
VKSI FE L++QAG DA GV TDMITMN+T+R YRNTGTFFGVHV+STP+DLS+S+I I
Sbjct: 162 VKSITFETLKIQAGQDAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKI 221
Query: 151 ATGNV 155
+G+V
Sbjct: 222 GSGSV 226
>AT4G35170.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr4:16736839-16738186 FORWARD LENGTH=299
Length = 299
Score = 136 bits (342), Expect = 1e-32, Method: Compositional matrix adjust.
Identities = 67/152 (44%), Positives = 93/152 (61%), Gaps = 1/152 (0%)
Query: 79 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
WG S+ P +K + E+L VQ+G+D +GV TDM+T+NSTVR YRN TFF VHV+S
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188
Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMP-TVPV 197
PL LSYS++++A+G + +F G +IPLY P V +
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248
Query: 198 PLKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 229
PL L+F +R+RAYVLG+LVK ++ I+CSIT
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSIT 280
>AT2G41990.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Late
embryogenesis abundant protein, group 2
(InterPro:IPR004864); BEST Arabidopsis thaliana protein
match is: Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family
(TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
LENGTH=297
Length = 297
Score = 113 bits (283), Expect = 9e-26, Method: Compositional matrix adjust.
Identities = 63/151 (41%), Positives = 92/151 (60%), Gaps = 5/151 (3%)
Query: 79 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
WGAS+ PK+ VK + L +QAG+D +GV TDM+++NSTVR YRN TFF VHV++
Sbjct: 134 WGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLNSTVRIYYRNPSTFFAVHVTA 193
Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 198
+PL L YS +++++G + +F G +IPLY + T+ +P
Sbjct: 194 SPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIPLY-----GGVSFHLDTLSLP 248
Query: 199 LKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 229
L L+ V+ S+AY+LG+LV K+Y RI CS T
Sbjct: 249 LNLTIVLHSKAYILGRLVTSKFYTRIICSFT 279
>AT3G24600.1 | Symbols: | Late embryogenesis abundant protein,
group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
Length = 506
Score = 104 bits (260), Expect = 4e-23, Method: Compositional matrix adjust.
Identities = 56/150 (37%), Positives = 79/150 (52%), Gaps = 2/150 (1%)
Query: 79 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
WGAS P P + VKS+ G D TGVAT +++ NS+V+ T + +FG+HVSS
Sbjct: 336 WGASHPFSPIVSVKSVDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSS 395
Query: 139 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 198
+ L++S + +ATG +K +Y G+++PLY VPV
Sbjct: 396 STFKLTFSALTLATGQLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGKVPV- 454
Query: 199 LKLSFVIRSRAYVLGKLVKPKYYKRIECSI 228
KL F IRSR +LGKLVK K+ + CS
Sbjct: 455 -KLEFEIRSRGNLLGKLVKSKHENHVSCSF 483
Score = 64.3 bits (155), Expect = 6e-11, Method: Compositional matrix adjust.
Identities = 37/120 (30%), Positives = 56/120 (46%), Gaps = 1/120 (0%)
Query: 79 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 138
+GAS+ P +++K + GSD TGV T ++ + +V T N T FG+HVSS
Sbjct: 131 FGASQSSPPIVYIKGVNVRSFYYGEGSDNTGVPTKIMNVKCSVVITTHNPSTLFGIHVSS 190
Query: 139 TPLDLSYS-EIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPV 197
T + L YS + +A +K ++ +GSK+PLY VPV
Sbjct: 191 TAVSLIYSRQFTLANARLKSYHQPKQSNHTSRINLIGSKVPLYGAGAELVASDNSGGVPV 250