Miyakogusa Predicted Gene
- Lj3g3v2982850.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj3g3v2982850.1 Non Characterized Hit- tr|I1LKN8|I1LKN8_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.20147 PE,71.54,0,FAMILY
NOT NAMED,NULL; seg,NULL; coiled-coil,NULL,CUFF.45046.1
(482 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score     E
Sequences producing significant alignments:                         (bits) Value

AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma me...     245 4e-65
AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN: biologic...     142 4e-34
AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis thal...     129 3e-30
AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN: biologic...     121 1e-27
AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...     102 6e-22
AT1G64180.1 | Symbols: | intracellular protein transport protei...      97 3e-20
AT2G46250.1 | Symbols: | myosin heavy chain-related | chr2:1899...      70 3e-12
>AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma
membrane; EXPRESSED IN: 22 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT5G22310.1);
Has 22320 Blast hits to 15179 proteins in 1213 species:
Archae - 372; Bacteria - 2307; Metazoa - 10906; Fungi -
1700; Plants - 1146; Viruses - 65; Other Eukaryotes -
5824 (source: NCBI BLink). | chr3:3660628-3663537
FORWARD LENGTH=622
Length = 622
Score = 245 bits (626), Expect = 4e-65, Method: Compositional matrix adjust.
Identities = 156/378 (41%), Positives = 225/378 (59%), Gaps = 18/378 (4%)
Query: 6 SLFGQRMKGFEGDGHKRRVSGLSHQLQSGDYLLEGLDSCSSARL--------IEKNCGKC 57
S +RM+ +RR S +L+ GD + D +S +E G
Sbjct: 157 SPVSERMERSGTGSRQRRASSTVQKLRLGDCNVGARDPINSGSFMDIETRSRVETPTG-S 215
Query: 58 TDGVKSRLKEARSGLSTSKKLLKVLNQVCIRE-HQSSTKSLILALGSELDRVCHQIDLLI 116
T GVK+RLK+ + L+TSK+LLK++N++ ++ SS+ SL+ AL SEL+R Q++ LI
Sbjct: 216 TVGVKTRLKDCSNALTTSKELLKIINRMWGQDDRPSSSMSLVSALHSELERARLQVNQLI 275
Query: 117 HEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKI 176
HE + ND+ Y++K FAEEKA WKS E+E + A+++ A EL+VE+KLRR+ E LNKK+
Sbjct: 276 HEHKPENNDISYLMKRFAEEKAVWKSNEQEVVEAAIESVAGELEVERKLRRRFESLNKKL 335
Query: 177 AKEMANVKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXX 236
KE+A K++ +KA KE+E EKRA+ ++E++CDELA I ED+A+VEELKRES
Sbjct: 336 GKELAETKSALMKAVKEIENEKRARVMVEKVCDELARDISEDKAEVEELKRESFKVKEEV 395
Query: 237 XXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKDEENGVVSPE 296
MLQLAD LREERVQMKLSEAK+Q EEKN ++KLRN+L+ +++ K + P
Sbjct: 396 EKEREMLQLADALREERVQMKLSEAKHQLEEKNAAVDKLRNQLQTYLKAKRCKEKTREPP 455
Query: 297 CEKIKDLEA--YFN-NIYGGLRNAEKXXXXXXXXXXXXXXXXXXXXXHSIELSMDNRGCM 353
++ + EA Y N +I G N E HSIEL++DN+
Sbjct: 456 QTQLHNEEAGDYLNHHISFGSYNIE-----DGEVENGNEEGSGESDLHSIELNIDNKSYK 510
Query: 354 WSYAFEDATQDDSKRVSV 371
W Y E+ + + R S+
Sbjct: 511 WPYGEENRGRKSTPRKSL 528
>AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: chloroplast;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT3G20350.1); Has 21445 Blast
hits to 15134 proteins in 1325 species: Archae - 461;
Bacteria - 2309; Metazoa - 11052; Fungi - 1737; Plants -
1035; Viruses - 42; Other Eukaryotes - 4809 (source:
NCBI BLink). | chr1:18771386-18774385 FORWARD LENGTH=725
Length = 725
Score = 142 bits (359), Expect = 4e-34, Method: Compositional matrix adjust.
Identities = 80/216 (37%), Positives = 138/216 (63%)
Query: 72 LSTSKKLLKVLNQVCIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRSNQNDMEYVIK 131
L T +++ ++ + + + Q + SL+ +L +EL+ +I+ L E RS++ +E ++
Sbjct: 213 LDTMEEVHQIYSNMKRIDQQVNAVSLVSSLEAELEEAHARIEDLESEKRSHKKKLEQFLR 272
Query: 132 CFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMANVKASHLKAS 191
+EE+AAW+SRE EK+ + + ++ EKK R++ E +N K+ E+A+ K + +
Sbjct: 273 KVSEERAAWRSREHEKVRAIIDDMKTDMNREKKTRQRLEIVNHKLVNELADSKLAVKRYM 332
Query: 192 KELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXMLQLADVLRE 251
++ E+E++A+E++E++CDELA IGED+A++E LKRES MLQ+A+V RE
Sbjct: 333 QDYEKERKARELIEEVCDELAKEIGEDKAEIEALKRESMSLREEVDDERRMLQMAEVWRE 392
Query: 252 ERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKD 287
ERVQMKL +AK EE+ + KL +LE F+R++D
Sbjct: 393 ERVQMKLIDAKVALEERYSQMNKLVGDLESFLRSRD 428
>AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G11590.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:7383742-7385345 REVERSE LENGTH=481
Length = 481
Score = 129 bits (325), Expect = 3e-30, Method: Compositional matrix adjust.
Identities = 90/245 (36%), Positives = 141/245 (57%), Gaps = 41/245 (16%)
Query: 58 TDGVKSRLKEARSGLSTSKKLLKVLNQV--CIREHQSSTKSLILALGSELDRVCHQIDLL 115
+ VK+R K GL+TSK+L+KVL ++ +H++++ LI AL ELDR + L
Sbjct: 172 ANSVKTRFKNVSDGLTTSKELVKVLKRIGELGDDHKTASNRLISALLCELDRARSSLKHL 231
Query: 116 IHEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKK 175
+ E + + +I+ EE VE+KLRR+TE++N++
Sbjct: 232 MSELDEEEEEKRRLIESLQEEAM----------------------VERKLRRRTEKMNRR 269
Query: 176 IAKEMANVKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXX 235
+ +E+ K + K +E++REKRAK++LE++CDEL GIG+D+ ++E+ +RE
Sbjct: 270 LGRELTEAKETERKMKEEMKREKRAKDVLEEVCDELTKGIGDDKKEMEK-ERE------- 321
Query: 236 XXXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKDEENGVVSP 295
M+ +ADVLREERVQMKL+EAK++FE+K +E+L+ EL R D E G S
Sbjct: 322 ------MMHIADVLREERVQMKLTEAKFEFEDKYAAVERLKKELR---RVLDGEEGKGSS 372
Query: 296 ECEKI 300
E +I
Sbjct: 373 EIRRI 377
>AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: cotyledon; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT1G50660.1);
Has 15095 Blast hits to 11224 proteins in 1051 species:
Archae - 223; Bacteria - 1586; Metazoa - 7000; Fungi -
1255; Plants - 746; Viruses - 40; Other Eukaryotes -
4245 (source: NCBI BLink). | chr3:7096602-7099372
FORWARD LENGTH=673
Length = 673
Score = 121 bits (304), Expect = 1e-27, Method: Compositional matrix adjust.
Identities = 70/181 (38%), Positives = 112/181 (61%), Gaps = 2/181 (1%)
Query: 107 RVCHQIDLLIHEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLR 166
R C I L E RS + +E +K +EE+AAW+SRE EK+ + + ++ EKK R
Sbjct: 227 RAC--IKDLESEKRSQKKKLEQFLKKVSEERAAWRSREHEKVRAIIDDMKADMNQEKKTR 284
Query: 167 RQTERLNKKIAKEMANVKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELK 226
++ E +N K+ E+A+ K + + + ++E++A+E++E++CDELA I ED+A++E LK
Sbjct: 285 QRLEIVNSKLVNELADSKLAVKRYMHDYQQERKARELIEEVCDELAKEIEEDKAEIEALK 344
Query: 227 RESAXXXXXXXXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTK 286
ES MLQ+A+V REERVQMKL +AK EEK + KL ++E F+ ++
Sbjct: 345 SESMNLREEVDDERRMLQMAEVWREERVQMKLIDAKVTLEEKYSQMNKLVGDMEAFLSSR 404
Query: 287 D 287
+
Sbjct: 405 N 405
>AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast, plasma membrane; EXPRESSED IN: 9 plant
structures; EXPRESSED DURING: 6 growth stages; BEST
Arabidopsis thaliana protein match is: intracellular
protein transport protein USO1-related
(TAIR:AT1G64180.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:16646330-16648776 FORWARD LENGTH=623
Length = 623
Score = 102 bits (254), Expect = 6e-22, Method: Compositional matrix adjust.
Identities = 73/227 (32%), Positives = 133/227 (58%), Gaps = 3/227 (1%)
Query: 66 KEARSGLSTSKKLLKVLNQV-CIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRSNQN 124
+E L TS +LLKVLN++ + E S SLI AL +E+ +I L+ +++++
Sbjct: 187 REPHYNLKTSTELLKVLNRIWSLEEQHVSNISLIKALKTEVAHSRVRIKELLRYQQADRH 246
Query: 125 DMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMANVK 184
+++ V+K AEEK K++E E++ A+++ + L+ E+KLR+++E L++K+A+E++ VK
Sbjct: 247 ELDSVVKQLAEEKLLSKNKEVERMSSAVQSVRKALEDERKLRKRSESLHRKMARELSEVK 306
Query: 185 ASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXX--XXXXXXXXXM 242
+S KELER ++ +++E +CDE A GI ++ LK+++ +
Sbjct: 307 SSLSNCVKELERGSKSNKMMELLCDEFAKGIKSYEEEIHGLKKKNLDKDWAGRGGGDQLV 366
Query: 243 LQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKDEE 289
L +A+ +ER+QM+L + VL+KL E+E F++ K E
Sbjct: 367 LHIAESWLDERMQMRLEGGDTLNGKNRSVLDKLEVEIETFLQEKRNE 413
>AT1G64180.1 | Symbols: | intracellular protein transport protein
USO1-related | chr1:23821640-23824193 FORWARD LENGTH=593
Length = 593
Score = 96.7 bits (239), Expect = 3e-20, Method: Compositional matrix adjust.
Identities = 77/224 (34%), Positives = 125/224 (55%), Gaps = 16/224 (7%)
Query: 64 RLKEARSGLSTSKKLLKVLNQVCIREHQSSTK-SLILALGSELDRVCHQIDLLIHEDRSN 122
R E + + TS +LLKVLN++ I E Q S SLI +L +EL +I L+ +++
Sbjct: 177 RAGEPNNNIKTSTELLKVLNRIWILEEQHSANISLIKSLKTELAHSRARIKDLLRCKQAD 236
Query: 123 QNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMAN 182
+ DM+ +K AEEK + ++E HD L +A + L+ E+KLR+++E L +K+A+E++
Sbjct: 237 KRDMDDFVKQLAEEKLSKGTKE----HDRLSSAVQSLEDERKLRKRSESLYRKLAQELSE 292
Query: 183 VKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXM 242
VK++ KE+ER +K+ILE++CDE A GI ++ LK++ M
Sbjct: 293 VKSTLSNCVKEMERGTESKKILERLCDEFAKGIKSYEREIHGLKQKLDKNWKGWDEQDHM 352
Query: 243 -LQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRT 285
L +A+ +ER+Q A LEKL E+E F++T
Sbjct: 353 ILCIAESWLDERIQSGNGSA----------LEKLEFEIETFLKT 386
>AT2G46250.1 | Symbols: | myosin heavy chain-related |
chr2:18991386-18993201 FORWARD LENGTH=468
Length = 468
Score = 70.1 bits (170), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 61/204 (29%), Positives = 109/204 (53%), Gaps = 20/204 (9%)
Query: 63 SRLKEARSGLSTSKKLLKVLNQV-CIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRS 121
S +K A GL +S KLLKVLN++ + E ++ SL+ AL ELD +I + + R
Sbjct: 150 SAVKSASYGLGSSTKLLKVLNRIWSLEEQNTANMSLVRALKMELDECRAEIKEV--QQRK 207
Query: 122 NQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMA 181
+D + K +E E++ D ++ EL E+K+R+++E L++K+ +E+
Sbjct: 208 KLSD-----------RPLRKKKEEEEVKDVFRSIKRELDDERKVRKESETLHRKLTRELC 256
Query: 182 NVKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXX 241
K KA K+LE+E + + ++E +CDE A + + +V + ++S
Sbjct: 257 EAKHCLSKALKDLEKETQERVVVENLCDEFAKAVKDYEDKVRRIGKKSP------VSDKV 310
Query: 242 MLQLADVLREERVQMKLSEAKYQF 265
++Q+A+V ++R+QMKL E F
Sbjct: 311 IVQIAEVWSDQRLQMKLEEDDKTF 334
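
The per-hit statistics in a report like this one can be pulled out with a short script. The following is a minimal sketch, not part of the BLAST distribution: the function name parse_hits and the dictionary keys are hypothetical, and it assumes the legacy plain-text layout shown above, with one "Score = ..." block per subject sequence (a later block for the same subject would overwrite the earlier one).

```python
import re

# Patterns for the two statistics lines of legacy BLASTP text output.
STATS_RE = re.compile(r"Score =\s*([\d.]+) bits \((\d+)\), Expect =\s*([\d.eE+-]+)")
IDENT_RE = re.compile(r"Identities = (\d+)/(\d+) \((\d+)%\)")


def parse_hits(report):
    """Return one dict per '>' subject entry (hypothetical helper)."""
    hits = []
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            # The subject identifier is the text before the first '|'.
            current = {"subject": line[1:].split("|")[0].strip()}
            hits.append(current)
            continue
        if current is None:
            continue  # still in the report header
        m = STATS_RE.search(line)
        if m:
            evalue = m.group(3)
            if evalue.startswith("e"):
                evalue = "1" + evalue  # tolerate values written as 'e-100'
            current["bits"] = float(m.group(1))
            current["raw_score"] = int(m.group(2))
            current["evalue"] = float(evalue)
        m = IDENT_RE.search(line)
        if m:
            current["identical"] = int(m.group(1))
            current["aligned"] = int(m.group(2))
            current["pct_identity_reported"] = int(m.group(3))
    return hits


# Two entries copied from the report above, trimmed to the lines that matter.
sample = """\
>AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma
Length = 622
Score = 245 bits (626), Expect = 4e-65, Method: Compositional matrix adjust.
Identities = 156/378 (41%), Positives = 225/378 (59%), Gaps = 18/378 (4%)
>AT1G64180.1 | Symbols: | intracellular protein transport protein
Length = 593
Score = 96.7 bits (239), Expect = 3e-20, Method: Compositional matrix adjust.
Identities = 77/224 (34%), Positives = 125/224 (55%), Gaps = 16/224 (7%)
"""

hits = parse_hits(sample)
for hit in hits:
    print(hit["subject"], hit["bits"], hit["evalue"],
          f'{hit["identical"]}/{hit["aligned"]}')
```

For anything beyond a quick look, a maintained parser or BLAST's tabular or XML output is likely a safer choice than regular expressions over the text report.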