Miyakogusa Predicted Gene
- Lj3g3v2982850.2
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj3g3v2982850.2 Non Characterized Hit- tr|I1LKN8|I1LKN8_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.20147 PE,68.4,0,FAMILY
NOT NAMED,NULL; seg,NULL; coiled-coil,NULL,CUFF.45046.2
(614 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                   Score     E
Sequences producing significant alignments:                       (bits)   Value

AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma me...   334   1e-91
AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis thal...   177   2e-44
AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN: biologic...   145   1e-34
AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN: biologic...   123   4e-28
AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...   106   5e-23
AT1G64180.1 | Symbols: | intracellular protein transport protei...   106   5e-23
AT2G46250.1 | Symbols: | myosin heavy chain-related | chr2:1899...    74   3e-13
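The summary lines above can be read programmatically. The following is a minimal sketch, not part of the BLAST output: it assumes each summary line ends with the bit score followed by the E-value, separated by whitespace, and that the hit identifier is the first token. The function name `parse_hit_line` is hypothetical.

```python
def parse_hit_line(line):
    """Split one summary line into (hit_id, bit_score, e_value).

    Assumption: the last two whitespace-separated fields are the bit
    score and the E-value; everything before them is the description.
    """
    description, bits, evalue = line.rsplit(None, 2)
    # Some BLAST versions print very small E-values as "e-100" with no
    # leading digit; prefix a "1" so float() can read it.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    hit_id = description.split()[0]
    return hit_id, float(bits), float(evalue)


if __name__ == "__main__":
    sample = ("AT3G11590.1 | Symbols: | unknown protein; "
              "LOCATED IN: plasma me... 334 1e-91")
    print(parse_hit_line(sample))
```

Splitting from the right keeps the parser independent of how the description is truncated or how many spaces pad the columns.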
>AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma
membrane; EXPRESSED IN: 22 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT5G22310.1);
Has 22320 Blast hits to 15179 proteins in 1213 species:
Archae - 372; Bacteria - 2307; Metazoa - 10906; Fungi -
1700; Plants - 1146; Viruses - 65; Other Eukaryotes -
5824 (source: NCBI BLink). | chr3:3660628-3663537
FORWARD LENGTH=622
Length = 622
Score = 334 bits (856), Expect = 1e-91, Method: Compositional matrix adjust.
Identities = 219/535 (40%), Positives = 306/535 (57%), Gaps = 45/535 (8%)
Query: 20 IVNKIRKRGCXXXXXXX---LVRKYRFKRAILVGKKAGSTTPVPLWKXXXXXXXXXXXXL 76
++ KIRKRGC L YRFKRAI+VGK+ GSTTPVP W+
Sbjct: 13 LLGKIRKRGCSSPTSSTSSILREGYRFKRAIVVGKRGGSTTPVPTWRLMGRSPSPRASGA 72
Query: 77 HHSQLPP------KDKELS----VSARKLAATLWEINDLPS------ADPVRGRDKAANF 120
H+ P K ++S VSARKLAATLWE+N++PS A P+ + +
Sbjct: 73 LHAAASPSSHCGSKTGKVSAPAPVSARKLAATLWEMNEMPSPRVVEEAAPMIRKSRKERI 132
Query: 121 -------SSSRSGLLRPHMSDPSQSPLLERMKGFEGDGHKRRVSGLSHQLQSGDYLLEGL 173
SS SG L PH+SDPS SP+ ERM+ +RR S +L+ GD +
Sbjct: 133 APLPPPRSSVHSGSLPPHLSDPSHSPVSERMERSGTGSRQRRASSTVQKLRLGDCNVGAR 192
Query: 174 DSCSSARL--------IEKNCGKCTDGVKSRLKEARSGLSTSKKLLKVLNQVCIREHQ-S 224
D +S +E G T GVK+RLK+ + L+TSK+LLK++N++ ++ + S
Sbjct: 193 DPINSGSFMDIETRSRVETPTG-STVGVKTRLKDCSNALTTSKELLKIINRMWGQDDRPS 251
Query: 225 STKSLILALGSELDRVCHQIDLLIHEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDAL 284
S+ SL+ AL SEL+R Q++ LIHE + ND+ Y++K FAEEKA WKS E+E + A+
Sbjct: 252 SSMSLVSALHSELERARLQVNQLIHEHKPENNDISYLMKRFAEEKAVWKSNEQEVVEAAI 311
Query: 285 KNAAEELKVEKKLRRQTERLNKKIAKEMANVKASHLKASKELEREKRAKEILEQICDELA 344
++ A EL+VE+KLRR+ E LNKK+ KE+A K++ +KA KE+E EKRA+ ++E++CDELA
Sbjct: 312 ESVAGELEVERKLRRRFESLNKKLGKELAETKSALMKAVKEIENEKRARVMVEKVCDELA 371
Query: 345 NGIGEDRAQVEELKRESAXXXXXXXXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVL 404
I ED+A+VEELKRES MLQLAD LREERVQMKLSEAK+Q EEKN +
Sbjct: 372 RDISEDKAEVEELKRESFKVKEEVEKEREMLQLADALREERVQMKLSEAKHQLEEKNAAV 431
Query: 405 EKLRNELECFMRTKDEENGVVSPECEKIKDLEA--YFN-NIYGGLRNAEKXXXXXXXXXX 461
+KLRN+L+ +++ K + P ++ + EA Y N +I G N E
Sbjct: 432 DKLRNQLQTYLKAKRCKEKTREPPQTQLHNEEAGDYLNHHISFGSYNIE-----DGEVEN 486
Query: 462 XXXXXXXXXXXHSIELSMDNRGCMWSYAFEDATQDDSKRVSVDSIGRKSLSGIQW 516
HSIEL++DN+ W Y E+ + + R S+ S+ R + W
Sbjct: 487 GNEEGSGESDLHSIELNIDNKSYKWPYGEENRGRKSTPRKSL-SLQRSISDCVDW 540
>AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G11590.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:7383742-7385345 REVERSE LENGTH=481
Length = 481
Score = 177 bits (449), Expect = 2e-44, Method: Compositional matrix adjust.
Identities = 154/433 (35%), Positives = 213/433 (49%), Gaps = 87/433 (20%)
Query: 23 KIRKRGCXXXXXXXLVRKYRFKRAILVGKKA-----GSTTPVPLWKXXXXXXXXXXXXLH 77
KIRKRG L R+ RFKRAI GK+A GS TPV K
Sbjct: 9 KIRKRGGSSSSSSSLARRNRFKRAIFAGKRAAQDDGGSGTPV---KSITAAKTPVLLSFS 65
Query: 78 HSQLPPKDKELS---VSARKLAATLWEIND------------LPSADPVRGR-DKAANFS 121
LP +L VSARKLAATLWEIND L S P R R K+ FS
Sbjct: 66 PENLPIDHHQLQKSCVSARKLAATLWEINDDADPPVNSDKDCLRSKKPSRYRAKKSTEFS 125
Query: 122 SSRSGLLRPHMSDPSQSPLLERMKGFEGDGHKRRVSGLSHQLQSGDYLLEGLDSCSSARL 181
S P SDP ER+ D RR S +L +Y + G +S
Sbjct: 126 SID---FPPRSSDPISRLSSERIDLC--DDMIRRRSTNPQKLNPIEYKIIGANS------ 174
Query: 182 IEKNCGKCTDGVKSRLKEARSGLSTSKKLLKVLNQV--CIREHQSSTKSLILALGSELDR 239
VK+R K GL+TSK+L+KVL ++ +H++++ LI AL ELDR
Sbjct: 175 -----------VKTRFKNVSDGLTTSKELVKVLKRIGELGDDHKTASNRLISALLCELDR 223
Query: 240 VCHQIDLLIHEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRR 299
+ L+ E + + +I+ EE VE+KLRR
Sbjct: 224 ARSSLKHLMSELDEEEEEKRRLIESLQEEAM----------------------VERKLRR 261
Query: 300 QTERLNKKIAKEMANVKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKR 359
+TE++N+++ +E+ K + K +E++REKRAK++LE++CDEL GIG+D+ ++E+ +R
Sbjct: 262 RTEKMNRRLGRELTEAKETERKMKEEMKREKRAKDVLEEVCDELTKGIGDDKKEMEK-ER 320
Query: 360 ESAXXXXXXXXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKD 419
E M+ +ADVLREERVQMKL+EAK++FE+K +E+L+ EL R D
Sbjct: 321 E-------------MMHIADVLREERVQMKLTEAKFEFEDKYAAVERLKKELR---RVLD 364
Query: 420 EENGVVSPECEKI 432
E G S E +I
Sbjct: 365 GEEGKGSSEIRRI 377
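The "Identities / Positives / Gaps" line of each alignment can be parsed and its percentages re-derived from the counts. The sketch below is an assumption-laden illustration, not part of the BLAST output: the regular expression targets the line layout seen in this report, and the function name `parse_alignment_stats` is hypothetical. The percentages in this report appear to be truncated rather than rounded (for example, 180/340 is about 52.9% but is shown as 52%), so the check uses integer floor division.

```python
import re

# Matches fields such as "Identities = 154/433 (35%)".
STAT_FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_alignment_stats(line):
    """Return {field: (count, aligned_length, reported_percent)}.

    Fields missing from the line (e.g. no "Gaps" for an ungapped
    alignment) are simply absent from the result.
    """
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT_FIELD.findall(line)
    }


if __name__ == "__main__":
    line = ("Identities = 154/433 (35%), Positives = 213/433 (49%), "
            "Gaps = 87/433 (20%)")
    for name, (count, length, percent) in parse_alignment_stats(line).items():
        # Floor division reproduces the truncated percentage.
        print(name, count, length, percent, count * 100 // length == percent)
```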
>AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: chloroplast;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT3G20350.1); Has 21445 Blast
hits to 15134 proteins in 1325 species: Archae - 461;
Bacteria - 2309; Metazoa - 11052; Fungi - 1737; Plants -
1035; Viruses - 42; Other Eukaryotes - 4809 (source:
NCBI BLink). | chr1:18771386-18774385 FORWARD LENGTH=725
Length = 725
Score = 145 bits (365), Expect = 1e-34, Method: Compositional matrix adjust.
Identities = 106/340 (31%), Positives = 180/340 (52%), Gaps = 35/340 (10%)
Query: 90 VSARKLAATLWEINDLPSADPVRGRDKAANFSSSRSGLLRPHMSDPSQSPLLERMKGFEG 149
VS RKLAA LW + +P A G K + GL GF+G
Sbjct: 114 VSVRKLAAGLWRLQ-VPDASSSGGERKG------KEGL------------------GFQG 148
Query: 150 DGHKRRVSGLSHQLQ--SG---DYLLEGLDSCSSAR-----LIEKNCGKCTDGVKSRLKE 199
+G V L H SG + + + + ++ + +E + ++ K
Sbjct: 149 NGGYMGVPYLYHHSDKPSGGQSNKIRQNPSTIATTKNGFLCKLEPSMPFPHSAMEGATKW 208
Query: 200 ARSGLSTSKKLLKVLNQVCIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRSNQNDME 259
L T +++ ++ + + + Q + SL+ +L +EL+ +I+ L E RS++ +E
Sbjct: 209 DPVCLDTMEEVHQIYSNMKRIDQQVNAVSLVSSLEAELEEAHARIEDLESEKRSHKKKLE 268
Query: 260 YVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMANVKASH 319
++ +EE+AAW+SRE EK+ + + ++ EKK R++ E +N K+ E+A+ K +
Sbjct: 269 QFLRKVSEERAAWRSREHEKVRAIIDDMKTDMNREKKTRQRLEIVNHKLVNELADSKLAV 328
Query: 320 LKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXMLQLAD 379
+ ++ E+E++A+E++E++CDELA IGED+A++E LKRES MLQ+A+
Sbjct: 329 KRYMQDYEKERKARELIEEVCDELAKEIGEDKAEIEALKRESMSLREEVDDERRMLQMAE 388
Query: 380 VLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKD 419
V REERVQMKL +AK EE+ + KL +LE F+R++D
Sbjct: 389 VWREERVQMKLIDAKVALEERYSQMNKLVGDLESFLRSRD 428
>AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: cotyledon; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT1G50660.1);
Has 15095 Blast hits to 11224 proteins in 1051 species:
Archae - 223; Bacteria - 1586; Metazoa - 7000; Fungi -
1255; Plants - 746; Viruses - 40; Other Eukaryotes -
4245 (source: NCBI BLink). | chr3:7096602-7099372
FORWARD LENGTH=673
Length = 673
Score = 123 bits (308), Expect = 4e-28, Method: Compositional matrix adjust.
Identities = 75/216 (34%), Positives = 124/216 (57%)
Query: 204 LSTSKKLLKVLNQVCIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRSNQNDMEYVIK 263
L T + ++ V Q + SL ++ +L I L E RS + +E +K
Sbjct: 190 LDTRDDVHQIYTNVKWNNQQVNDVSLASSIELKLQEARACIKDLESEKRSQKKKLEQFLK 249
Query: 264 CFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMANVKASHLKAS 323
+EE+AAW+SRE EK+ + + ++ EKK R++ E +N K+ E+A+ K + +
Sbjct: 250 KVSEERAAWRSREHEKVRAIIDDMKADMNQEKKTRQRLEIVNSKLVNELADSKLAVKRYM 309
Query: 324 KELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXMLQLADVLRE 383
+ ++E++A+E++E++CDELA I ED+A++E LK ES MLQ+A+V RE
Sbjct: 310 HDYQQERKARELIEEVCDELAKEIEEDKAEIEALKSESMNLREEVDDERRMLQMAEVWRE 369
Query: 384 ERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRTKD 419
ERVQMKL +AK EEK + KL ++E F+ +++
Sbjct: 370 ERVQMKLIDAKVTLEEKYSQMNKLVGDMEAFLSSRN 405
>AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast, plasma membrane; EXPRESSED IN: 9 plant
structures; EXPRESSED DURING: 6 growth stages; BEST
Arabidopsis thaliana protein match is: intracellular
protein transport protein USO1-related
(TAIR:AT1G64180.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:16646330-16648776 FORWARD LENGTH=623
Length = 623
Score = 106 bits (265), Expect = 5e-23, Method: Compositional matrix adjust.
Identities = 106/372 (28%), Positives = 178/372 (47%), Gaps = 40/372 (10%)
Query: 88 LSVSARKLAATLWEINDLPSADPV---------------RGRDKAANFSSSR-------- 124
+ VS+RKLAA WE + D RG + A SS R
Sbjct: 44 IVVSSRKLAAAFWEFHQYHYKDEEDCSYSYLSSASAKMHRGPNGFAGASSRRQRHGKAVA 103
Query: 125 --------SGLLRPHMSD--PSQSPLLERMKGFEGDGHKRRVSGLSHQLQSGDYLLEG-- 172
S LR D P + L R G H + + +H LQ G
Sbjct: 104 VKENGLDLSQFLRDPSPDHQPDSAGSLRRQIGQMLIKHHQSIDRNNHALQPVSPASYGSS 163
Query: 173 LDSCSSARLIEKNCGKCTDGVKSRLKEARSGLSTSKKLLKVLNQV-CIREHQSSTKSLIL 231
L+ + + + + G SR E L TS +LLKVLN++ + E S SLI
Sbjct: 164 LEVTTYNKAVTPSSSLEFRGRPSR--EPHYNLKTSTELLKVLNRIWSLEEQHVSNISLIK 221
Query: 232 ALGSELDRVCHQIDLLIHEDRSNQNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEEL 291
AL +E+ +I L+ ++++++++ V+K AEEK K++E E++ A+++ + L
Sbjct: 222 ALKTEVAHSRVRIKELLRYQQADRHELDSVVKQLAEEKLLSKNKEVERMSSAVQSVRKAL 281
Query: 292 KVEKKLRRQTERLNKKIAKEMANVKASHLKASKELEREKRAKEILEQICDELANGIGEDR 351
+ E+KLR+++E L++K+A+E++ VK+S KELER ++ +++E +CDE A GI
Sbjct: 282 EDERKLRKRSESLHRKMARELSEVKSSLSNCVKELERGSKSNKMMELLCDEFAKGIKSYE 341
Query: 352 AQVEELKRESAXX--XXXXXXXXXMLQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRN 409
++ LK+++ +L +A+ +ER+QM+L + VL+KL
Sbjct: 342 EEIHGLKKKNLDKDWAGRGGGDQLVLHIAESWLDERMQMRLEGGDTLNGKNRSVLDKLEV 401
Query: 410 ELECFMRTKDEE 421
E+E F++ K E
Sbjct: 402 EIETFLQEKRNE 413
>AT1G64180.1 | Symbols: | intracellular protein transport protein
USO1-related | chr1:23821640-23824193 FORWARD LENGTH=593
Length = 593
Score = 106 bits (264), Expect = 5e-23, Method: Compositional matrix adjust.
Identities = 118/404 (29%), Positives = 188/404 (46%), Gaps = 65/404 (16%)
Query: 37 LVRKYRFKRAILVGKKA---GSTTPVPLWKXXXXXXXXXXXXLHHSQLPPKDKELSVSAR 93
+ K R +RA+ VG ++ +TPV H P S S+R
Sbjct: 25 FIEKLR-RRAVFVGHRSVFRRPSTPV------------------HISFNPNKNPSSASSR 65
Query: 94 KLAATLWEI-----ND----LPSADPVR----GRDKAANFSSSRSGLLRPHMSDPSQSPL 140
KLAA+LWE ND P+A + G +N R G + ++D + L
Sbjct: 66 KLAASLWEFYQYYDNDHLIHPPAATKMHRAPLGSAGPSNSRRLRHGHGKAAVADNNGIEL 125
Query: 141 LERMKGFEGDGHKRRVSGL-----SHQLQSGDYLLEGLDSCSSARLIEKNCGKCTDGVKS 195
+ E G RR G H + D+ L+ + S +E +
Sbjct: 126 TDHQP--ESAGSIRRQIGQMLMKHHHLTERNDHALQPVSPTSYDSSLEFRG-------RR 176
Query: 196 RLKEARSGLSTSKKLLKVLNQVCIREHQSSTK-SLILALGSELDRVCHQIDLLIHEDRSN 254
R E + + TS +LLKVLN++ I E Q S SLI +L +EL +I L+ +++
Sbjct: 177 RAGEPNNNIKTSTELLKVLNRIWILEEQHSANISLIKSLKTELAHSRARIKDLLRCKQAD 236
Query: 255 QNDMEYVIKCFAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMAN 314
+ DM+ +K AEEK + ++E HD L +A + L+ E+KLR+++E L +K+A+E++
Sbjct: 237 KRDMDDFVKQLAEEKLSKGTKE----HDRLSSAVQSLEDERKLRKRSESLYRKLAQELSE 292
Query: 315 VKASHLKASKELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXM 374
VK++ KE+ER +K+ILE++CDE A GI ++ LK++ M
Sbjct: 293 VKSTLSNCVKEMERGTESKKILERLCDEFAKGIKSYEREIHGLKQKLDKNWKGWDEQDHM 352
Query: 375 -LQLADVLREERVQMKLSEAKYQFEEKNDVLEKLRNELECFMRT 417
L +A+ +ER+Q A LEKL E+E F++T
Sbjct: 353 ILCIAESWLDERIQSGNGSA----------LEKLEFEIETFLKT 386
>AT2G46250.1 | Symbols: | myosin heavy chain-related |
chr2:18991386-18993201 FORWARD LENGTH=468
Length = 468
Score = 73.9 bits (180), Expect = 3e-13, Method: Compositional matrix adjust.
Identities = 100/373 (26%), Positives = 172/373 (46%), Gaps = 68/373 (18%)
Query: 42 RFKRAIL-VGKKAGSTTPVPLWKXXXXXXXXXXXXLHHSQLPPK---DKELSV----SAR 93
+ +R+ + ++AG +TP P W+ L S PP+ KE S R
Sbjct: 13 KLRRSFMGYTRRAGPSTPPPTWR------------LEFS--PPRVGATKEFLANSEDSVR 58
Query: 94 KLAATLWEINDLPSADPVRGRDKAAN--FSSSRSGLLRPHMSDPSQSPLLERMKGFEGDG 151
KL A LWE +R + + S SRS L H PS++ L
Sbjct: 59 KLCADLWETEQFRQRIELRRCRRRDSDVESHSRSPL---HDHPPSRASL----------- 104
Query: 152 HKRRVSGLSHQLQSGDYLLEGLDSCSSA------RLIEKNCGKCTDGVKSRLKEARSGLS 205
RR + ++GD LL+ + S + +++ + G S +K A GL
Sbjct: 105 --RRQIAATDDYRNGD-LLQPISPASCSSSSSSLQVVVRKPAFSQTG-SSAVKSASYGLG 160
Query: 206 TSKKLLKVLNQV-CIREHQSSTKSLILALGSELDRVCHQIDLLIHEDRSNQNDMEYVIKC 264
+S KLLKVLN++ + E ++ SL+ AL ELD +I + + R +D
Sbjct: 161 SSTKLLKVLNRIWSLEEQNTANMSLVRALKMELDECRAEIKEV--QQRKKLSD------- 211
Query: 265 FAEEKAAWKSREREKIHDALKNAAEELKVEKKLRRQTERLNKKIAKEMANVKASHLKASK 324
+ K +E E++ D ++ EL E+K+R+++E L++K+ +E+ K KA K
Sbjct: 212 ----RPLRKKKEEEEVKDVFRSIKRELDDERKVRKESETLHRKLTRELCEAKHCLSKALK 267
Query: 325 ELEREKRAKEILEQICDELANGIGEDRAQVEELKRESAXXXXXXXXXXXMLQLADVLREE 384
+LE+E + + ++E +CDE A + + +V + ++S ++Q+A+V ++
Sbjct: 268 DLEKETQERVVVENLCDEFAKAVKDYEDKVRRIGKKSP------VSDKVIVQIAEVWSDQ 321
Query: 385 RVQMKLSEAKYQF 397
R+QMKL E F
Sbjct: 322 RLQMKLEEDDKTF 334