Miyakogusa Predicted Gene
- Lj1g3v4724810.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj1g3v4724810.1 Non Characterized Hit- tr|I1JQ59|I1JQ59_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.23381
PE,83.38,0,seg,NULL; coiled-coil,NULL; FAMILY NOT
NAMED,NULL,CUFF.33047.1
(661 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score    E
Sequences producing significant alignments:                        (bits)  Value
AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma me...    536   e-152
AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    167   2e-41
AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN: biologic...    154   2e-37
AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN: biologic...    144   2e-34
AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...    135   7e-32
AT1G64180.1 | Symbols: | intracellular protein transport protei...    118   1e-26
AT2G46250.1 | Symbols: | myosin heavy chain-related | chr2:1899...     96   6e-20
AT1G11690.1 | Symbols: | unknown protein; BEST Arabidopsis thal...     55   2e-07
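
The summary rows above end in two whitespace-separated fields, the bit score and the E-value, so they can be split from the right. The following is a minimal sketch, not part of the BLAST output; the function name `parse_hit_line` is hypothetical, and the handling of the abbreviated `e-152` form (read as `1e-152`) is an assumption about how this BLAST version prints very small E-values.

```python
def parse_hit_line(line):
    """Split one summary row into (sequence id, bit score, E-value).

    Assumes the last two whitespace-separated fields are the bit score
    and the E-value, as in the rows of this report.
    """
    description, bits, evalue = line.rsplit(None, 2)
    # Very small E-values are printed without a mantissa (e.g. "e-152");
    # reading that as 1e-152 is an assumption of this sketch.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    seq_id = description.split(None, 1)[0]
    return seq_id, float(bits), float(evalue)


row = "AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis thal... 167 2e-41"
print(parse_hit_line(row))
```

Splitting from the right keeps the sketch independent of the description text, which itself contains spaces and `|` separators.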
>AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma
membrane; EXPRESSED IN: 22 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT5G22310.1);
Has 22320 Blast hits to 15179 proteins in 1213 species:
Archae - 372; Bacteria - 2307; Metazoa - 10906; Fungi -
1700; Plants - 1146; Viruses - 65; Other Eukaryotes -
5824 (source: NCBI BLink). | chr3:3660628-3663537
FORWARD LENGTH=622
Length = 622
Score = 536 bits (1380), Expect = e-152, Method: Compositional matrix adjust.
Identities = 336/624 (53%), Positives = 405/624 (64%), Gaps = 56/624 (8%)
Query: 3 LIPGKIRKRGCXXXXXXXXXML-HNYRFKRAILVGKRGGSTTPVPTWNLLSSRSPA---- 57
L+ GKIRKRGC +L YRFKRAI+VGKRGGSTTPVPTW L+ RSP+
Sbjct: 12 LLLGKIRKRGCSSPTSSTSSILREGYRFKRAIVVGKRGGSTTPVPTWRLMG-RSPSPRAS 70
Query: 58 SVRESPKYPPSQVGG--GAKTRQAPVSARKLAATLWEMNEIPSPSVRE-----VRDHHHN 110
+ P S G G + APVSARKLAATLWEMNE+PSP V E +R
Sbjct: 71 GALHAAASPSSHCGSKTGKVSAPAPVSARKLAATLWEMNEMPSPRVVEEAAPMIR----- 125
Query: 111 TRVKKELRAKERVPRST-RXXXXXXXXXXXXXXXXXERMDXXXXXXXXXXXXXXXXKPRL 169
+ +KE A PRS+ ERM+ K RL
Sbjct: 126 -KSRKERIAPLPPPRSSVHSGSLPPHLSDPSHSPVSERMERSGTGSRQRRASSTVQKLRL 184
Query: 170 TEHHVPPLDSRSNASLMEVETRSRAHTPALSTVGVKTRLKDVSNALTTSKELLKIINRMW 229
+ +V D ++ S M++ETRSR TP STVGVKTRLKD SNALTTSKELLKIINRMW
Sbjct: 185 GDCNVGARDPINSGSFMDIETRSRVETPTGSTVGVKTRLKDCSNALTTSKELLKIINRMW 244
Query: 230 GHEDRPSSSMSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQ 289
G +DRPSSSMSL+SALH+ELERARLQVNQLI E + + N+ISYLMK FAEEKA WKS EQ
Sbjct: 245 GQDDRPSSSMSLVSALHSELERARLQVNQLIHEHKPENNDISYLMKRFAEEKAVWKSNEQ 304
Query: 290 EIVEAAIESVAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIE 349
E+VEAAIESVAGEL+VERKLRRRFESLNKKLG+ELAETK++L+K VKE+E+EKRAR ++E
Sbjct: 305 EVVEAAIESVAGELEVERKLRRRFESLNKKLGKELAETKSALMKAVKEIENEKRARVMVE 364
Query: 350 QVCDELARDVDEDKSEIDKQKRVATKAFEDVEKEKEMIQLTDMLREERAQKKLSEAKYQM 409
+VCDELARD+ EDK+E+++ KR + K E+VEKE+EM+QL D LREER Q KLSEAK+Q+
Sbjct: 365 KVCDELARDISEDKAEVEELKRESFKVKEEVEKEREMLQLADALREERVQMKLSEAKHQL 424
Query: 410 EEKNAAVDNLRNQLEAFLGSKQVREKGR--SSTHLNDDEIAAYLGRS-RLASHHNEDKED 466
EEKNAAVD LRNQL+ +L +K+ +EK R T L+++E YL S++ ED
Sbjct: 425 EEKNAAVDKLRNQLQTYLKAKRCKEKTREPPQTQLHNEEAGDYLNHHISFGSYNIED--- 481
Query: 467 DGGEVDNGVECEEESAESDLHSIELNMDNNNKSYKWASPSESRFDTRRYPTGEGVKXXXX 526
GEV+NG EE S ESDLHSIELN+D NKSYKW P GE
Sbjct: 482 --GEVENG--NEEGSGESDLHSIELNID--NKSYKW-------------PYGE----ENR 518
Query: 527 XXXXXXXXXXXLQRSISEGIEWGVQAEKLQSSGD-GLDWEGFYELEKQAQGKGYGDEMLG 585
LQRSIS+ ++W VQ+EKLQ SGD GLDW ++E KGY DE
Sbjct: 519 GRKSTPRKSLSLQRSISDCVDWVVQSEKLQKSGDGGLDWGRSIDVEP----KGYIDETQA 574
Query: 586 YKSVKGLR--DQILAGSKLASSRG 607
YK K IL+GS+L++ RG
Sbjct: 575 YKPNKASSKDHHILSGSRLSNFRG 598
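
The percentages on the Identities / Positives / Gaps line can be checked from the counts beside them. In every alignment of this report the printed value matches the ratio truncated to a whole percent (for example 336/624 is shown as 53%), which is an observation about this output rather than a documented rule. A minimal sketch, with the hypothetical helper name `blast_percent`:

```python
def blast_percent(count, alignment_length):
    """Whole-number percentage, truncated rather than rounded.

    Truncation is an assumption inferred from the figures in this
    report, not taken from BLAST documentation.
    """
    return count * 100 // alignment_length


# Counts from the AT3G11590.1 alignment: identities, positives, gaps over 624 columns.
for label, count in (("Identities", 336), ("Positives", 405), ("Gaps", 56)):
    print(label, blast_percent(count, 624))
```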
>AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G11590.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:7383742-7385345 REVERSE LENGTH=481
Length = 481
Score = 167 bits (424), Expect = 2e-41, Method: Compositional matrix adjust.
Identities = 169/522 (32%), Positives = 252/522 (48%), Gaps = 124/522 (23%)
Query: 7 KIRKRGCXXXXXXXXXMLHNYRFKRAILVGKR-----GGSTTPVPTWNLLSSRSPASVRE 61
KIRKRG + RFKRAI GKR GGS TPV + + ++++P +
Sbjct: 9 KIRKRG--GSSSSSSSLARRNRFKRAIFAGKRAAQDDGGSGTPVKS--ITAAKTPVLLSF 64
Query: 62 SPKYPPSQVGGGAKTRQAPVSARKLAATLWEMNEIPSPSVREVRDHHHNT-----RVKK- 115
SP+ P + +++ VSARKLAATLWE+N+ P V +D + R KK
Sbjct: 65 SPENLPID---HHQLQKSCVSARKLAATLWEINDDADPPVNSDKDCLRSKKPSRYRAKKS 121
Query: 116 -ELRAKERVPRSTRXXXXXXXXXXXXXXXXXERMDXXXXXXXXXXXXXXXXKPRLTEHHV 174
E + + PRS+ ER+D P E+ +
Sbjct: 122 TEFSSIDFPPRSS----------DPISRLSSERIDLCDDMIRRRSTNPQKLNP--IEYKI 169
Query: 175 PPLDSRSNASLMEVETRSRAHTPALSTVGVKTRLKDVSNALTTSKELLKIINRMWG-HED 233
+S V+TR K+VS+ LTTSKEL+K++ R+ +D
Sbjct: 170 IGANS--------VKTR----------------FKNVSDGLTTSKELVKVLKRIGELGDD 205
Query: 234 RPSSSMSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQEIVE 293
++S LISAL EL+RAR + + +LM EE+ + + + E
Sbjct: 206 HKTASNRLISALLCELDRAR--------------SSLKHLMSELDEEEEEKRRLIESLQE 251
Query: 294 AAIESVAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIEQVCD 353
A+ VERKLRRR E +N++LGREL E K + K+ +E++ EKRA++++E+VCD
Sbjct: 252 EAM--------VERKLRRRTEKMNRRLGRELTEAKETERKMKEEMKREKRAKDVLEEVCD 303
Query: 354 ELARDVDEDKSEIDKQKRVATKAFEDVEKEKEMIQLTDMLREERAQKKLSEAKYQMEEKN 413
EL + + +DK E+ EKE+EM+ + D+LREER Q KL+EAK++ E+K
Sbjct: 304 ELTKGIGDDKKEM--------------EKEREMMHIADVLREERVQMKLTEAKFEFEDKY 349
Query: 414 AAVDNLRNQLEAFLGSKQVREKGRSSTHLNDDEIAAYLGRSRLASHHNEDKEDDGGEVDN 473
AAV+ L+ +L L E+G+ S+ EI L EV +
Sbjct: 350 AAVERLKKELRRVLDG----EEGKGSS-----EIRRIL------------------EVID 382
Query: 474 GVECEEESAESDLHSIELNMDNNNKSYKWASPSESRFDTRRY 515
G +++ ESDL SIELNM++ + KW +S D RR+
Sbjct: 383 GSGSDDDE-ESDLKSIELNMESGS---KWGY-VDSLKDRRRF 419
>AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: chloroplast;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT3G20350.1); Has 21445 Blast
hits to 15134 proteins in 1325 species: Archae - 461;
Bacteria - 2309; Metazoa - 11052; Fungi - 1737; Plants -
1035; Viruses - 42; Other Eukaryotes - 4809 (source:
NCBI BLink). | chr1:18771386-18774385 FORWARD LENGTH=725
Length = 725
Score = 154 bits (390), Expect = 2e-37, Method: Compositional matrix adjust.
Identities = 102/308 (33%), Positives = 176/308 (57%), Gaps = 31/308 (10%)
Query: 214 ALTTSKELLKIINRMWGHEDRPSSSMSLISALHTELERARLQVNQLIQEQRSDQNEISYL 273
L T +E+ +I + M D+ +++SL+S+L ELE A ++ L E+RS + ++
Sbjct: 212 CLDTMEEVHQIYSNM-KRIDQQVNAVSLVSSLEAELEEAHARIEDLESEKRSHKKKLEQF 270
Query: 274 MKCFAEEKAAWKSKEQEIVEAAIESVAGELDVERKLRRRFESLNKKLGRELAETKASLLK 333
++ +EE+AAW+S+E E V A I+ + +++ E+K R+R E +N KL ELA++K ++ +
Sbjct: 271 LRKVSEERAAWRSREHEKVRAIIDDMKTDMNREKKTRQRLEIVNHKLVNELADSKLAVKR 330
Query: 334 VVKELESEKRAREIIEQVCDELARDVDEDKSEIDKQKRVATKAFEDVEKEKEMIQLTDML 393
+++ E E++ARE+IE+VCDELA+++ EDK+EI+ KR + E+V+ E+ M+Q+ ++
Sbjct: 331 YMQDYEKERKARELIEEVCDELAKEIGEDKAEIEALKRESMSLREEVDDERRMLQMAEVW 390
Query: 394 REERAQKKLSEAKYQMEEKNAAVDNLRNQLEAFLGS-------KQVRE------------ 434
REER Q KL +AK +EE+ + ++ L LE+FL S K+VRE
Sbjct: 391 REERVQMKLIDAKVALEERYSQMNKLVGDLESFLRSRDIVTDVKEVREAELLRETAASVN 450
Query: 435 ----KGRSSTHLNDDEIAAYLGRSRLASHHNEDKEDDGGEVDNGVECEEESAESDLHSIE 490
K + N D+I A L H+ E++ V S +S +H++
Sbjct: 451 IQEIKEFTYVPANPDDIYAVFEEMNLGEAHDR-------EMEKSVAYSPISHDSKVHTVS 503
Query: 491 LNMDNNNK 498
L+ + NK
Sbjct: 504 LDANMMNK 511
>AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: cotyledon; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT1G50660.1);
Has 15095 Blast hits to 11224 proteins in 1051 species:
Archae - 223; Bacteria - 1586; Metazoa - 7000; Fungi -
1255; Plants - 746; Viruses - 40; Other Eukaryotes -
4245 (source: NCBI BLink). | chr3:7096602-7099372
FORWARD LENGTH=673
Length = 673
Score = 144 bits (363), Expect = 2e-34, Method: Compositional matrix adjust.
Identities = 75/194 (38%), Positives = 131/194 (67%)
Query: 239 MSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQEIVEAAIES 298
+SL S++ +L+ AR + L E+RS + ++ +K +EE+AAW+S+E E V A I+
Sbjct: 213 VSLASSIELKLQEARACIKDLESEKRSQKKKLEQFLKKVSEERAAWRSREHEKVRAIIDD 272
Query: 299 VAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIEQVCDELARD 358
+ +++ E+K R+R E +N KL ELA++K ++ + + + + E++ARE+IE+VCDELA++
Sbjct: 273 MKADMNQEKKTRQRLEIVNSKLVNELADSKLAVKRYMHDYQQERKARELIEEVCDELAKE 332
Query: 359 VDEDKSEIDKQKRVATKAFEDVEKEKEMIQLTDMLREERAQKKLSEAKYQMEEKNAAVDN 418
++EDK+EI+ K + E+V+ E+ M+Q+ ++ REER Q KL +AK +EEK + ++
Sbjct: 333 IEEDKAEIEALKSESMNLREEVDDERRMLQMAEVWREERVQMKLIDAKVTLEEKYSQMNK 392
Query: 419 LRNQLEAFLGSKQV 432
L +EAFL S+
Sbjct: 393 LVGDMEAFLSSRNT 406
>AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast, plasma membrane; EXPRESSED IN: 9 plant
structures; EXPRESSED DURING: 6 growth stages; BEST
Arabidopsis thaliana protein match is: intracellular
protein transport protein USO1-related
(TAIR:AT1G64180.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:16646330-16648776 FORWARD LENGTH=623
Length = 623
Score = 135 bits (341), Expect = 7e-32, Method: Compositional matrix adjust.
Identities = 93/263 (35%), Positives = 159/263 (60%), Gaps = 4/263 (1%)
Query: 172 HHVPPLDSRSNASLMEVETRSRAHTPALSTVGVKTRLKDVSNALTTSKELLKIINRMWGH 231
H + P+ S S +EV T ++A TP+ S ++ L TS ELLK++NR+W
Sbjct: 150 HALQPVSPASYGSSLEVTTYNKAVTPSSSLEFRGRPSREPHYNLKTSTELLKVLNRIWSL 209
Query: 232 EDRPSSSMSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQEI 291
E++ S++SLI AL TE+ +R+++ +L++ Q++D++E+ ++K AEEK K+KE E
Sbjct: 210 EEQHVSNISLIKALKTEVAHSRVRIKELLRYQQADRHELDSVVKQLAEEKLLSKNKEVER 269
Query: 292 VEAAIESVAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIEQV 351
+ +A++SV L+ ERKLR+R ESL++K+ REL+E K+SL VKELE ++ +++E +
Sbjct: 270 MSSAVQSVRKALEDERKLRKRSESLHRKMARELSEVKSSLSNCVKELERGSKSNKMMELL 329
Query: 352 CDELARDVDEDKSEID--KQKRVATKAFEDVEKEKEMIQLTDMLREERAQKKLSEAKYQM 409
CDE A+ + + EI K+K + ++ ++ + + +ER Q +L E +
Sbjct: 330 CDEFAKGIKSYEEEIHGLKKKNLDKDWAGRGGGDQLVLHIAESWLDERMQMRL-EGGDTL 388
Query: 410 EEKNAAV-DNLRNQLEAFLGSKQ 431
KN +V D L ++E FL K+
Sbjct: 389 NGKNRSVLDKLEVEIETFLQEKR 411
>AT1G64180.1 | Symbols: | intracellular protein transport protein
USO1-related | chr1:23821640-23824193 FORWARD LENGTH=593
Length = 593
Score = 118 bits (295), Expect = 1e-26, Method: Compositional matrix adjust.
Identities = 82/263 (31%), Positives = 148/263 (56%), Gaps = 29/263 (11%)
Query: 170 TEHHVPPLDSRSNASLMEVETRSRAHTPALSTVGVKTRLKDVSNALTTSKELLKIINRMW 229
+H + P+ S S +E R RA P +N + TS ELLK++NR+W
Sbjct: 154 NDHALQPVSPTSYDSSLEFRGRRRAGEP--------------NNNIKTSTELLKVLNRIW 199
Query: 230 GHEDRPSSSMSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQ 289
E++ S+++SLI +L TEL +R ++ L++ +++D+ ++ +K AEEK + +KE
Sbjct: 200 ILEEQHSANISLIKSLKTELAHSRARIKDLLRCKQADKRDMDDFVKQLAEEKLSKGTKEH 259
Query: 290 EIVEAAIESVAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIE 349
+ + +A++S L+ ERKLR+R ESL +KL +EL+E K++L VKE+E +++I+E
Sbjct: 260 DRLSSAVQS----LEDERKLRKRSESLYRKLAQELSEVKSTLSNCVKEMERGTESKKILE 315
Query: 350 QVCDELARDVDEDKSEIDKQKRVATKAFEDVEKEKEMI-QLTDMLREERAQKKLSEAKYQ 408
++CDE A+ + + EI K+ K ++ +++ MI + + +ER Q
Sbjct: 316 RLCDEFAKGIKSYEREIHGLKQKLDKNWKGWDEQDHMILCIAESWLDERIQSG------- 368
Query: 409 MEEKNAAVDNLRNQLEAFLGSKQ 431
+A++ L ++E FL + Q
Sbjct: 369 ---NGSALEKLEFEIETFLKTNQ 388
>AT2G46250.1 | Symbols: | myosin heavy chain-related |
chr2:18991386-18993201 FORWARD LENGTH=468
Length = 468
Score = 96.3 bits (238), Expect = 6e-20, Method: Compositional matrix adjust.
Identities = 71/208 (34%), Positives = 119/208 (57%), Gaps = 20/208 (9%)
Query: 197 PALSTVGVKTRLKDVSNALTTSKELLKIINRMWGHEDRPSSSMSLISALHTELERARLQV 256
PA S G + +K S L +S +LLK++NR+W E++ +++MSL+ AL EL+ R ++
Sbjct: 142 PAFSQTG-SSAVKSASYGLGSSTKLLKVLNRIWSLEEQNTANMSLVRALKMELDECRAEI 200
Query: 257 NQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQEIVEAAIESVAGELDVERKLRRRFESL 316
++ Q ++ ++ K KE+E V+ S+ ELD ERK+R+ E+L
Sbjct: 201 KEVQQRKK-------------LSDRPLRKKKEEEEVKDVFRSIKRELDDERKVRKESETL 247
Query: 317 NKKLGRELAETKASLLKVVKELESEKRAREIIEQVCDELARDVDEDKSEIDKQKRVATKA 376
++KL REL E K L K +K+LE E + R ++E +CDE A+ V K DK +R+ K+
Sbjct: 248 HRKLTRELCEAKHCLSKALKDLEKETQERVVVENLCDEFAKAV---KDYEDKVRRIGKKS 304
Query: 377 FEDVEKEKEMIQLTDMLREERAQKKLSE 404
+K ++Q+ ++ ++R Q KL E
Sbjct: 305 ---PVSDKVIVQIAEVWSDQRLQMKLEE 329
>AT1G11690.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G20350.1); Has 5959 Blast hits to 4807 proteins
in 476 species: Archae - 156; Bacteria - 436; Metazoa -
2789; Fungi - 309; Plants - 336; Viruses - 9; Other
Eukaryotes - 1924 (source: NCBI BLink). |
chr1:3941469-3942212 FORWARD LENGTH=247
Length = 247
Score = 55.1 bits (131), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 47/191 (24%), Positives = 102/191 (53%), Gaps = 21/191 (10%)
Query: 239 MSLISALHTELERARLQVNQLIQEQRSDQNEISYLMKCFAEEKAAWKSKEQEIVEAAIES 298
+L+ L TEL +A+ ++ +L E+ + I L++ EK +E ++
Sbjct: 32 FNLVPCLQTELWKAQTRIKELEAEKFKSEETIRCLIRNQRNEK-------EETTNPFVDY 84
Query: 299 VAGELDVERKLRRRFESLNKKLGRELAETKASLLKVVKELESEKRAREIIEQVCDELARD 358
+ +L ER+ ++R ++ N +L +++ + ++S+ ++ +R R+ +E+VC+EL
Sbjct: 85 LKEKLSKEREEKKRVKAENSRLKKKILDMESSVNRL-------RRERDTMEKVCEELVTR 137
Query: 359 VDEDKSEIDKQKRVATKAFEDVEKEKEMIQLTDMLREERAQKKLSEAKYQMEEKNAAVDN 418
+DE K + +++ E+E++M+Q+ +M REER + K +AK ++EK ++
Sbjct: 138 IDELKVN-------TRRVWDETEEERQMLQMAEMWREERVRVKFMDAKLALQEKYEEMNL 190
Query: 419 LRNQLEAFLGS 429
+LE L +
Sbjct: 191 FVVELEKCLET 201
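
The Expect values in this report can be related to the bit scores through the standard approximation E ≈ m · n · 2^(-S), where S is the bit score, m the query length and n the database size in letters. The sketch below uses the raw figures from the header (661 query letters, 14,482,855 database letters); BLAST itself uses length-corrected "effective" values, so the result is only expected to land near the reported E-value, typically somewhat above it. The function name `approx_evalue` is hypothetical.

```python
def approx_evalue(bit_score, query_len, db_letters):
    """Approximate E-value from a bit score: E ~= m * n * 2**(-S).

    Uses raw lengths; BLAST's effective-length correction is not
    modelled here, which is a simplifying assumption.
    """
    return query_len * db_letters * 2.0 ** (-bit_score)


# Weakest hit in this report: AT1G11690.1, 55.1 bits, reported Expect = 2e-07.
print(approx_evalue(55.1, 661, 14_482_855))
```

For the 55.1-bit hit the approximation gives a value of the same order of magnitude as the reported 2e-07.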