Miyakogusa Predicted Gene
- Lj0g3v0312559.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0312559.1 Non Characterized Hit- tr|I1K248|I1K248_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,75.24,0,LEA_2,Late
embryogenesis abundant protein, LEA-14; seg,NULL,CUFF.21089.1
(322 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score   E
Sequences producing significant alignments:                         (bits)  Value
AT5G42860.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    223   2e-58
AT1G45688.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    203   1e-52
AT1G45688.2 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    145   4e-35
AT4G35170.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...    142   2e-34
AT2G41990.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Late embry...    130   1e-30
AT3G24600.1 | Symbols: | Late embryogenesis abundant protein, g...    116   3e-26
AT3G08490.1 | Symbols: | BEST Arabidopsis thaliana protein matc...     74   1e-13
>AT5G42860.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
Length = 320
Score = 223 bits (567), Expect = 2e-58, Method: Compositional matrix adjust.
Identities = 130/324 (40%), Positives = 176/324 (54%), Gaps = 30/324 (9%)
Query: 15 AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXXXXXX 74
AKTDSEV+SL+ SSP RS PRR AYFVQSPSR+S +D EKT SF
Sbjct: 3 AKTDSEVTSLSASSPTRS-PRRPAYFVQSPSRDS---HDGEKTATSFHSTPVLTSPMGSP 58
Query: 75 XXXXXXXVGLHSRESASTRYSRKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPRRCYFP 134
S+ + S R ++ IEEEGLLD D Q PRRCY
Sbjct: 59 PHSHSSSSRF-SKINGSKRKGHAGEKQFAM------IEEEGLLDDGDREQEALPRRCYVL 111
Query: 135 XXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTV 194
+ A++PQKP I +KSITF++ +QAG D G+ T +++MN+T+
Sbjct: 112 AFIVGFSLLFAFFSLILYAAAKPQKPKISVKSITFEQLKVQAGQDAGGIGTDMITMNATL 171
Query: 195 KLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYX 254
++++RNT TFFGVHVTS+P+D+S+ Q+T+G+G++ KFYQSRKSQR++ V V G IPLY
Sbjct: 172 RMLYRNTGTFFGVHVTSSPIDLSFSQITIGSGSIKKFYQSRKSQRTVVVNVLGDKIPLYG 231
Query: 255 XXXX-------------------XXXXXXXXXXEALPLKLRVMVRSRGYVLGKLVKPKFN 295
+P++L VRSR YVLGKLV+PKF
Sbjct: 232 SGSTLVPPPPPAPIPKPKKKKGPIVIVEPPAPPAPVPMRLNFTVRSRAYVLGKLVQPKFY 291
Query: 296 KKIECSVVMDPKKMGAPVSLVNKC 319
K+I C + + KK+ + + N C
Sbjct: 292 KRIVCLINFEHKKLSKHIPITNNC 315
>AT1G45688.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
hits to 242 proteins in 39 species: Archae - 0; Bacteria
- 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
17; Other Eukaryotes - 8 (source: NCBI BLink). |
chr1:17191502-17192870 FORWARD LENGTH=342
Length = 342
Score = 203 bits (516), Expect = 1e-52, Method: Compositional matrix adjust.
Identities = 128/341 (37%), Positives = 173/341 (50%), Gaps = 42/341 (12%)
Query: 15 AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXX---------- 64
AKTDSEV+SL SSP RS PRR Y+VQSPSR+S +D EKT SF
Sbjct: 3 AKTDSEVTSLAASSPARS-PRRPVYYVQSPSRDS---HDGEKTATSFHSTPVLSPMGSPP 58
Query: 65 ----XXXXXXXXXXXXXXXXXVGLHSRESASTRYSRKTAR--KTPWRPRRDPIEEEGLLD 118
+ SR+ S++ + W+ IEEEGLLD
Sbjct: 59 HSHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWK-ECAVIEEEGLLD 117
Query: 119 PHDEAQLGFPRRCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGA 178
D G PRRCY +GA++P KP I +KSITF+ +QAG
Sbjct: 118 DGDRDG-GVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQ 176
Query: 179 DMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQ 238
D GV T +++MN+T+++++RNT TFFGVHVTSTP+D+S+ Q+ +G+G++ KFYQ RKS+
Sbjct: 177 DAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVKKFYQGRKSE 236
Query: 239 RSIKVTVKGSHIPLYXXXXX--------------------XXXXXXXXXXEALPLKLRVM 278
R++ V V G IPLY +P+ L +
Sbjct: 237 RTVLVHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPDPPAPPAPVPMTLSFV 296
Query: 279 VRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNKC 319
VRSR YVLGKLV+PKF KKIEC + + K + + + C
Sbjct: 297 VRSRAYVLGKLVQPKFYKKIECDINFEHKNLNKHIVITKNC 337
>AT1G45688.2 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
hits to 34131 proteins in 2444 species: Archae - 798;
Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
Length = 248
Score = 145 bits (366), Expect = 4e-35, Method: Compositional matrix adjust.
Identities = 93/242 (38%), Positives = 128/242 (52%), Gaps = 26/242 (10%)
Query: 15 AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXX---------- 64
AKTDSEV+SL SSP RS PRR Y+VQSPSR+S +D EKT SF
Sbjct: 3 AKTDSEVTSLAASSPARS-PRRPVYYVQSPSRDS---HDGEKTATSFHSTPVLSPMGSPP 58
Query: 65 ----XXXXXXXXXXXXXXXXXVGLHSRESASTRYSRKTAR--KTPWRPRRDPIEEEGLLD 118
+ SR+ S++ + W+ IEEEGLLD
Sbjct: 59 HSHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWK-ECAVIEEEGLLD 117
Query: 119 PHDEAQLGFPRRCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGA 178
D G PRRCY +GA++P KP I +KSITF+ +QAG
Sbjct: 118 DGDRDG-GVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQ 176
Query: 179 DMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGN----MPKFYQS 234
D GV T +++MN+T+++++RNT TFFGVHVTSTP+D+S+ Q+ +G+G+ + K Y+
Sbjct: 177 DAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVSLPIQKLYRM 236
Query: 235 RK 236
R+
Sbjct: 237 RE 238
>AT4G35170.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr4:16736839-16738186 FORWARD LENGTH=299
Length = 299
Score = 142 bits (359), Expect = 2e-34, Method: Compositional matrix adjust.
Identities = 71/168 (42%), Positives = 97/168 (57%)
Query: 152 WGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTS 211
WG S+ P LK + + +Q+G D SGV T ++++NSTV++++RN ATFF VHVTS
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188
Query: 212 TPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEAL 271
PL +SY QL L +G M +F Q RKS+R I+ V G IPLY L
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248
Query: 272 PLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNKC 319
PL L +R+R YVLG+LVK F+ I+CS+ K+G + L C
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSITFYGDKLGKTLDLSKSC 296
>AT2G41990.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Late
embryogenesis abundant protein, group 2
(InterPro:IPR004864); BEST Arabidopsis thaliana protein
match is: Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family
(TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
LENGTH=297
Length = 297
Score = 130 bits (328), Expect = 1e-30, Method: Compositional matrix adjust.
Identities = 101/309 (32%), Positives = 148/309 (47%), Gaps = 18/309 (5%)
Query: 15 AKTDSEVSSLTQS--SPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXXXX 72
AKTDSE +S+ + SP RS+ R Y+VQSPS ++D EK SF
Sbjct: 3 AKTDSEATSIDAAALSPPRSA-IRPLYYVQSPS-----NHDVEKM--SFGSGCSLMGSPT 54
Query: 73 XXXXXXXXXVGLHSRESASTRYS-RKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPRRC 131
+ HSRES+++R+S R R RR I + + F
Sbjct: 55 HPHYYHCSPIH-HSRESSTSRFSDRALLSYKSIRERRRYINDGDDKTDGGDDDDPFRNVR 113
Query: 132 YFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMN 191
+ WGAS+ P + +K + LQAG D+SGV T ++S+N
Sbjct: 114 LYVWLLLSVIFLFTVFSLILWGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLN 173
Query: 192 STVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIP 251
STV++ +RN +TFF VHVT++PL + Y L L +G M KF R + ++ V+G IP
Sbjct: 174 STVRIYYRNPSTFFAVHVTASPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIP 233
Query: 252 LYXXXXXXXXXXXXXXXEALPLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGA 311
LY +LPL L +++ S+ Y+LG+LV KF +I CS +D +
Sbjct: 234 LYGGVSFHLDTL------SLPLNLTIVLHSKAYILGRLVTSKFYTRIICSFTLDANHLPK 287
Query: 312 PVSLVNKCI 320
+SL+ CI
Sbjct: 288 SISLLRSCI 296
>AT3G24600.1 | Symbols: | Late embryogenesis abundant protein,
group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
Length = 506
Score = 116 bits (290), Expect = 3e-26, Method: Compositional matrix adjust.
Identities = 60/167 (35%), Positives = 90/167 (53%), Gaps = 3/167 (1%)
Query: 152 WGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTS 211
WGAS P P + +KS+ F G D +GVAT ++S NS+VK+ + A +FG+HV+S
Sbjct: 336 WGASHPFSPIVSVKSVDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSS 395
Query: 212 TPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEAL 271
+ +++ LTL TG + +YQ RKS+ V + G+ +PLY +
Sbjct: 396 STFKLTFSALTLATGQLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGK---V 452
Query: 272 PLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNK 318
P+KL +RSRG +LGKLVK K + CS + K P+ +K
Sbjct: 453 PVKLEFEIRSRGNLLGKLVKSKHENHVSCSFFISSSKTSKPIEFTHK 499
Score = 79.3 bits (194), Expect = 3e-15, Method: Compositional matrix adjust.
Identities = 65/258 (25%), Positives = 107/258 (41%), Gaps = 40/258 (15%)
Query: 11 MSSLAKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXX 70
M K+DS+V+SL SSP +R Y+VQSPSR+S + TT+
Sbjct: 1 MKMYPKSDSDVTSLDLSSP-----KRPTYYVQSPSRDSDKSSSVALTTHQTTPTESP--- 52
Query: 71 XXXXXXXXXXXVGLHSRESASTRYSRKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPR- 129
S S ++R S W+ RR G+ P D+ + G R
Sbjct: 53 ---------------SHPSIASRVSNGGGGGFRWKGRRK--YHGGIWWPADKEEGGDGRY 95
Query: 130 -------------RCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQA 176
C +GAS+ P + +K + F
Sbjct: 96 EDLYEDNRGVSIVTCRLILGVVATLSIFFLLCSVLFGASQSSPPIVYIKGVNVRSFYYGE 155
Query: 177 GADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYY-QLTLGTGNMPKFYQSR 235
G+D +GV T ++++ +V + N +T FG+HV+ST + + Y Q TL + ++Q +
Sbjct: 156 GSDNTGVPTKIMNVKCSVVITTHNPSTLFGIHVSSTAVSLIYSRQFTLANARLKSYHQPK 215
Query: 236 KSQRSIKVTVKGSHIPLY 253
+S + ++ + GS +PLY
Sbjct: 216 QSNHTSRINLIGSKVPLY 233
>AT3G08490.1 | Symbols: | BEST Arabidopsis thaliana protein match
is: Late embryogenesis abundant protein, group 2
(TAIR:AT3G24600.1); Has 161 Blast hits to 158 proteins
in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 161; Viruses - 0; Other Eukaryotes -
0 (source: NCBI BLink). | chr3:2574105-2575125 REVERSE
LENGTH=271
Length = 271
Score = 74.3 bits (181), Expect = 1e-13, Method: Compositional matrix adjust.
Identities = 41/167 (24%), Positives = 80/167 (47%), Gaps = 3/167 (1%)
Query: 154 ASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTP 213
A++P P+I + F++F+L+ G D GV+T ++ N + KLI N + FG+H+
Sbjct: 103 ATQPPHPNISFRIGRFNQFMLEEGVDSHGVSTKFLTFNCSTKLIIDNKSNVFGLHIHPPS 162
Query: 214 LDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEALPL 273
+ + L PK Y + ++ + ++ +Y LPL
Sbjct: 163 IKFFFGPLNFAKAQGPKLYGLSHESTTFQLYIATTNRAMYGAGTEMNDMLLSRA--GLPL 220
Query: 274 KLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLV-NKC 319
LR + S V+ ++ PK++ K+EC +++ K+ + V+++ KC
Sbjct: 221 ILRTSIISDYRVVWNIINPKYHHKVECLLLLADKERHSHVTMIREKC 267