Miyakogusa Predicted Gene
- Lj6g3v0920480.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj6g3v0920480.1 Non Characterized Hit- tr|G7IXM4|G7IXM4_MEDTR
Putative uncharacterized protein OS=Medicago truncatul,73.55,0,FAMILY
NOT NAMED,NULL; coiled-coil,NULL; seg,NULL,CUFF.58518.1
(693 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                   Score  E
Sequences producing significant alignments:                       (bits)  Value
AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN: biologic...   617  e-176
AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN: biologic...   534  e-151
AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma me...   121  2e-27
AT1G11690.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    74  2e-13
AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    70  6e-12
AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...    69  8e-12
AT1G64180.1 | Symbols: | intracellular protein transport protei...    68  2e-11
AT2G46250.1 | Symbols: | myosin heavy chain-related | chr2:1899...    51  4e-06
>AT1G50660.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: chloroplast;
EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
growth stages; BEST Arabidopsis thaliana protein match
is: unknown protein (TAIR:AT3G20350.1); Has 21445 Blast
hits to 15134 proteins in 1325 species: Archae - 461;
Bacteria - 2309; Metazoa - 11052; Fungi - 1737; Plants -
1035; Viruses - 42; Other Eukaryotes - 4809 (source:
NCBI BLink). | chr1:18771386-18774385 FORWARD LENGTH=725
Length = 725
Score = 617 bits (1590), Expect = e-176, Method: Compositional matrix adjust.
Identities = 351/679 (51%), Positives = 455/679 (67%), Gaps = 40/679 (5%)
Query: 50 RRSGASVGKRSRPETPLSKWKIHE-DRERCGAGGDPIEEPDSRPC-------RKREPKQP 101
RRSG S G+RSRPETPL KWK+ + ++ER G D E D+ RK K
Sbjct: 52 RRSGPSGGRRSRPETPLLKWKVEDRNKERSGVVEDDDYEDDNHQVARSETTRRKDRRKIA 111
Query: 102 VVVSARKLAAGLWRLQLPEVAA--GDPGRRVGSKLQHEVGHVNHPFLSHQNGMMHGSAMK 159
VS RKLAAGLWRLQ+P+ ++ G+ + G Q G++ P+L H + G
Sbjct: 112 RPVSVRKLAAGLWRLQVPDASSSGGERKGKEGLGFQGNGGYMGVPYLYHHSDKPSGGQSN 171
Query: 160 NPSQSPRFISGT---MVC--EPSLQLSNTAMEGATKWDPVCFKTSDEVQHFYSQMKFLDQ 214
Q+P I+ T +C EPS+ ++AMEGATKWDPVC T +EV YS MK +DQ
Sbjct: 172 KIRQNPSTIATTKNGFLCKLEPSMPFPHSAMEGATKWDPVCLDTMEEVHQIYSNMKRIDQ 231
Query: 215 KVSTVXXXXXXXXXXXQARVQIQELETECHSSKKKLEHYLKKVSEERASWRTKEHEKIRA 274
+V+ V +A +I++LE+E S KKKLE +L+KVSEERA+WR++EHEK+RA
Sbjct: 232 QVNAVSLVSSLEAELEEAHARIEDLESEKRSHKKKLEQFLRKVSEERAAWRSREHEKVRA 291
Query: 275 YIDDIKAELNRERKSRQRIEIVNSKLVNELADAKLFAKRYMKDYEKERKGRELIEEVCDE 334
IDD+K ++NRE+K+RQR+EIVN KLVNELAD+KL KRYM+DYEKERK RELIEEVCDE
Sbjct: 292 IIDDMKTDMNREKKTRQRLEIVNHKLVNELADSKLAVKRYMQDYEKERKARELIEEVCDE 351
Query: 335 LANEIGEDKAEVEALKXXXXXXXXXXXXXXXXXXXXXVWREERVHMKLIDAKIALDEKYS 394
LA EIGEDKAE+EALK VWREERV MKLIDAK+AL+E+YS
Sbjct: 352 LAKEIGEDKAEIEALKRESMSLREEVDDERRMLQMAEVWREERVQMKLIDAKVALEERYS 411
Query: 395 QMNKLVADLETFVKSTDVNSNAKEMREAQSLQQAAAAVNIQDIKGFSYEPPNPDDIFAIF 454
QMNKLV DLE+F++S D+ ++ KE+REA+ L++ AA+VNIQ+IK F+Y P NPDDI+A+F
Sbjct: 412 QMNKLVGDLESFLRSRDIVTDVKEVREAELLRETAASVNIQEIKEFTYVPANPDDIYAVF 471
Query: 455 EDVNSGEPNEREIESCIAYSPASQASNIHMVSPEANLIRKANLQRHSDVFMDDNGEV-ED 513
E++N GE ++RE+E +AYSP S S +H VS +AN++ K RHSD + NG++ ED
Sbjct: 472 EEMNLGEAHDREMEKSVAYSPISHDSKVHTVSLDANMMNKKG--RHSDAYTHQNGDIEED 529
Query: 514 ESGWETVSHVEDQGSSCSPEGSTLSMTK---NSRVSNIS--GRSVLE--WEENACEATPL 566
+SGWETVSH+E+QGSS SP+GS S+ N R SN S G L W++ TP
Sbjct: 530 DSGWETVSHLEEQGSSYSPDGSIPSVNNKNHNHRHSNASSGGTESLGKVWDDT---MTPT 586
Query: 567 TEISEXXXXXXXXXXXXXXITRLWRS-GQANGD---SYKIISMDGMN-GRLSNGRVSNGG 621
TEISE I +LWRS G +NGD +YK+ISM+GMN GR+SNGR S+ G
Sbjct: 587 TEISEVCSIPRRSSKKVSSIAKLWRSTGASNGDRDSNYKVISMEGMNGGRVSNGRKSSAG 646
Query: 622 IVSPDWGPGKGGLSP-QDILCQL-SSPESGN---LHNRGKKGCI--PRTAQKNSLKARLL 674
+VSPD KGG SP D++ Q SSPES N ++ G KGCI PR AQK+SLK++L+
Sbjct: 647 MVSPDRVSSKGGFSPMMDLVGQWNSSPESANHPHVNRGGMKGCIEWPRGAQKSSLKSKLI 706
Query: 675 EARMETQKFQLRHVLKQKI 693
EAR+E+QK QL+HVLKQ+I
Sbjct: 707 EARIESQKVQLKHVLKQRI 725
>AT3G20350.1 | Symbols: | unknown protein; INVOLVED IN:
biological_process unknown; LOCATED IN: plasma membrane;
EXPRESSED IN: cotyledon; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT1G50660.1);
Has 15095 Blast hits to 11224 proteins in 1051 species:
Archae - 223; Bacteria - 1586; Metazoa - 7000; Fungi -
1255; Plants - 746; Viruses - 40; Other Eukaryotes -
4245 (source: NCBI BLink). | chr3:7096602-7099372
FORWARD LENGTH=673
Length = 673
Score = 534 bits (1375), Expect = e-151, Method: Compositional matrix adjust.
Identities = 314/662 (47%), Positives = 413/662 (62%), Gaps = 43/662 (6%)
Query: 50 RRSGASVGKRSRPETPLSKWKIHEDR-ERCGAGGD-PIEEPDSRPCRKREPKQPVVV-SA 106
RRSG SV + SRPETP K K+ + ERCG D E+ D R +E + V +
Sbjct: 37 RRSGVSVRRLSRPETPQLKSKVEDQNIERCGGVEDGDNEDDDCNKMRCQERSRSVRPDTV 96
Query: 107 RKLAAGLWRLQLPEVAAGDPGRRVGSKLQHE-----VGHVNHPFLSHQNGMMHGSAMKNP 161
RKLAAG+WRL++P+ + +R +L+ + G++ F H + H N
Sbjct: 97 RKLAAGVWRLRVPDAVSSGGDKRSKDRLRFQETAGPAGNLGPLFYYHHHDDKHSGFQSNN 156
Query: 162 SQS--PRFISGTMVCEPSLQLSNTAMEGATKWDPVCFKTSDEVQHFYSQMKFLDQKVSTV 219
S++ RF+ EPS+ + AMEGATKWDP+C T D+V Y+ +K+ +Q+V+ V
Sbjct: 157 SRNKHSRFLCKH---EPSVPFPHCAMEGATKWDPICLDTRDDVHQIYTNVKWNNQQVNDV 213
Query: 220 XXXXXXXXXXXQARVQIQELETECHSSKKKLEHYLKKVSEERASWRTKEHEKIRAYIDDI 279
+AR I++LE+E S KKKLE +LKKVSEERA+WR++EHEK+RA IDD+
Sbjct: 214 SLASSIELKLQEARACIKDLESEKRSQKKKLEQFLKKVSEERAAWRSREHEKVRAIIDDM 273
Query: 280 KAELNRERKSRQRIEIVNSKLVNELADAKLFAKRYMKDYEKERKGRELIEEVCDELANEI 339
KA++N+E+K+RQR+EIVNSKLVNELAD+KL KRYM DY++ERK RELIEEVCDELA EI
Sbjct: 274 KADMNQEKKTRQRLEIVNSKLVNELADSKLAVKRYMHDYQQERKARELIEEVCDELAKEI 333
Query: 340 GEDKAEVEALKXXXXXXXXXXXXXXXXXXXXXVWREERVHMKLIDAKIALDEKYSQMNKL 399
EDKAE+EALK VWREERV MKLIDAK+ L+EKYSQMNKL
Sbjct: 334 EEDKAEIEALKSESMNLREEVDDERRMLQMAEVWREERVQMKLIDAKVTLEEKYSQMNKL 393
Query: 400 VADLETFVKSTDVNSNAKEMREAQSLQQAAAAV-NIQDIKGFSYEPPNPDDIFAIFEDVN 458
V D+E F+ S + + KE+R A+ L++ AA+V NIQ+IK F+YEP PDDI +FE +N
Sbjct: 394 VGDMEAFLSSRNT-TGVKEVRVAELLRETAASVDNIQEIKEFTYEPAKPDDILMLFEQMN 452
Query: 459 SGEPNEREIESCIAYSPASQASNIHMVSPEANLIRKANLQRHSDVFMDDNGEV-EDESGW 517
GE +RE E +AYSP S AS H VSP+ NLI K RHS+ F D NGE ED+SGW
Sbjct: 453 MGENQDRESEQYVAYSPVSHASKAHTVSPDVNLINKG---RHSNAFTDQNGEFEEDDSGW 509
Query: 518 ETVSHVEDQGSSCSPEGSTLSMTK-NSRVSNISGRSVLEWEENACEATPLTEISEXXXXX 576
ETVSH E+ GSS SP+ S +++ + R SN+S E T L EI E
Sbjct: 510 ETVSHSEEHGSSYSPDESIPNISNTHHRNSNVSMNGT------EYEKTLLREIKEVCSVP 563
Query: 577 XXXXXXXXXITRLWRSGQANGDSYKIISMDGMNGRLSNGRVSNGGIVSPDWGPGKGGLSP 636
+ +LW S++GMNGR+SN R S +VSP+ G KGG +
Sbjct: 564 RRQSKKLPSMAKLWS------------SLEGMNGRVSNARKSTVEMVSPETGSNKGGFNT 611
Query: 637 QDILCQL-SSPES--GNLHNRGKKGCI--PRTAQKNSLKARLLEARMETQKFQLRHVLKQ 691
D++ Q SSP+S NL+ G+KGCI PR A KNSLK +L+EA++E+QK QL+HVL+
Sbjct: 612 LDLVGQWSSSPDSANANLNRGGRKGCIEWPRGAHKNSLKTKLIEAQIESQKVQLKHVLEH 671
Query: 692 KI 693
KI
Sbjct: 672 KI 673
>AT3G11590.1 | Symbols: | unknown protein; LOCATED IN: plasma
membrane; EXPRESSED IN: 22 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT5G22310.1);
Has 22320 Blast hits to 15179 proteins in 1213 species:
Archae - 372; Bacteria - 2307; Metazoa - 10906; Fungi -
1700; Plants - 1146; Viruses - 65; Other Eukaryotes -
5824 (source: NCBI BLink). | chr3:3660628-3663537
FORWARD LENGTH=622
Length = 622
Score = 121 bits (303), Expect = 2e-27, Method: Compositional matrix adjust.
Identities = 76/230 (33%), Positives = 120/230 (52%), Gaps = 1/230 (0%)
Query: 195 CFKTSDEVQHFYSQMKFLDQK-VSTVXXXXXXXXXXXQARVQIQELETECHSSKKKLEHY 253
TS E+ ++M D + S++ +AR+Q+ +L E + +
Sbjct: 229 ALTTSKELLKIINRMWGQDDRPSSSMSLVSALHSELERARLQVNQLIHEHKPENNDISYL 288
Query: 254 LKKVSEERASWRTKEHEKIRAYIDDIKAELNRERKSRQRIEIVNSKLVNELADAKLFAKR 313
+K+ +EE+A W++ E E + A I+ + EL ERK R+R E +N KL ELA+ K +
Sbjct: 289 MKRFAEEKAVWKSNEQEVVEAAIESVAGELEVERKLRRRFESLNKKLGKELAETKSALMK 348
Query: 314 YMKDYEKERKGRELIEEVCDELANEIGEDKAEVEALKXXXXXXXXXXXXXXXXXXXXXVW 373
+K+ E E++ R ++E+VCDELA +I EDKAEVE LK
Sbjct: 349 AVKEIENEKRARVMVEKVCDELARDISEDKAEVEELKRESFKVKEEVEKEREMLQLADAL 408
Query: 374 REERVHMKLIDAKIALDEKYSQMNKLVADLETFVKSTDVNSNAKEMREAQ 423
REERV MKL +AK L+EK + ++KL L+T++K+ +E + Q
Sbjct: 409 REERVQMKLSEAKHQLEEKNAAVDKLRNQLQTYLKAKRCKEKTREPPQTQ 458
>AT1G11690.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G20350.1); Has 5959 Blast hits to 4807 proteins
in 476 species: Archae - 156; Bacteria - 436; Metazoa -
2789; Fungi - 309; Plants - 336; Viruses - 9; Other
Eukaryotes - 1924 (source: NCBI BLink). |
chr1:3941469-3942212 FORWARD LENGTH=247
Length = 247
Score = 74.3 bits (181), Expect = 2e-13, Method: Compositional matrix adjust.
Identities = 68/258 (26%), Positives = 123/258 (47%), Gaps = 30/258 (11%)
Query: 185 MEGATKWDPVCFKTSDEVQ--HFYSQMKFLDQKVSTVXXXXXXXXXXXQARVQIQELETE 242
ME T+WD +T V+ + + +FLD + +A+ +I+ELE E
Sbjct: 1 MESITEWDLGSLRTYYSVEPSENFQEDEFLDFNLVPCLQTELW-----KAQTRIKELEAE 55
Query: 243 CHSSKKKLEHYLKKVSEERASWRTKEHEKIRAYIDDIKAELNRERKSRQRIEIVNSKLVN 302
S++ + ++ + R ++ E ++D +K +L++ER+ ++R++ NS+L
Sbjct: 56 KFKSEETIRCLIR-------NQRNEKEETTNPFVDYLKEKLSKEREEKKRVKAENSRLKK 108
Query: 303 ELADAKLFAKRYMKDYEKERKGRELIEEVCDELANEIGEDKAEVEALKXXXXXXXXXXXX 362
++ D + R R+ R+ +E+VC+EL I E LK
Sbjct: 109 KILDMESSVNRL-------RRERDTMEKVCEELVTRIDE-------LKVNTRRVWDETEE 154
Query: 363 XXXXXXXXXVWREERVHMKLIDAKIALDEKYSQMNKLVADLETFVKST-DVNS-NAKEMR 420
+WREERV +K +DAK+AL EKY +MN V +LE +++ +V K +R
Sbjct: 155 ERQMLQMAEMWREERVRVKFMDAKLALQEKYEEMNLFVVELEKCLETAREVGGIEEKRLR 214
Query: 421 EAQSLQQAAAAVNIQDIK 438
+ L + A ++ + D K
Sbjct: 215 HGEGLIKMAKSMEVVDSK 232
>AT5G22310.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G11590.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:7383742-7385345 REVERSE LENGTH=481
Length = 481
Score = 69.7 bits (169), Expect = 6e-12, Method: Compositional matrix adjust.
Identities = 41/118 (34%), Positives = 66/118 (55%), Gaps = 14/118 (11%)
Query: 286 ERKSRQRIEIVNSKLVNELADAKLFAKRYMKDYEKERKGRELIEEVCDELANEIGEDKAE 345
ERK R+R E +N +L EL +AK ++ ++ ++E++ ++++EEVCDEL IG+DK E
Sbjct: 256 ERKLRRRTEKMNRRLGRELTEAKETERKMKEEMKREKRAKDVLEEVCDELTKGIGDDKKE 315
Query: 346 VEALKXXXXXXXXXXXXXXXXXXXXXVWREERVHMKLIDAKIALDEKYSQMNKLVADL 403
+E V REERV MKL +AK ++KY+ + +L +L
Sbjct: 316 ME--------------KEREMMHIADVLREERVQMKLTEAKFEFEDKYAAVERLKKEL 359
>AT5G41620.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast, plasma membrane; EXPRESSED IN: 9 plant
structures; EXPRESSED DURING: 6 growth stages; BEST
Arabidopsis thaliana protein match is: intracellular
protein transport protein USO1-related
(TAIR:AT1G64180.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr5:16646330-16648776 FORWARD LENGTH=623
Length = 623
Score = 69.3 bits (168), Expect = 8e-12, Method: Compositional matrix adjust.
Identities = 54/216 (25%), Positives = 104/216 (48%), Gaps = 3/216 (1%)
Query: 196 FKTSDEVQHFYSQMKFLD-QKVSTVXXXXXXXXXXXQARVQIQELETECHSSKKKLEHYL 254
KTS E+ +++ L+ Q VS + +RV+I+EL + + +L+ +
Sbjct: 193 LKTSTELLKVLNRIWSLEEQHVSNISLIKALKTEVAHSRVRIKELLRYQQADRHELDSVV 252
Query: 255 KKVSEERASWRTKEHEKIRAYIDDIKAELNRERKSRQRIEIVNSKLVNELADAKLFAKRY 314
K+++EE+ + KE E++ + + ++ L ERK R+R E ++ K+ EL++ K
Sbjct: 253 KQLAEEKLLSKNKEVERMSSAVQSVRKALEDERKLRKRSESLHRKMARELSEVKSSLSNC 312
Query: 315 MKDYEKERKGRELIEEVCDELANEIGEDKAEVEALKXXXXXX--XXXXXXXXXXXXXXXV 372
+K+ E+ K +++E +CDE A I + E+ LK
Sbjct: 313 VKELERGSKSNKMMELLCDEFAKGIKSYEEEIHGLKKKNLDKDWAGRGGGDQLVLHIAES 372
Query: 373 WREERVHMKLIDAKIALDEKYSQMNKLVADLETFVK 408
W +ER+ M+L + S ++KL ++ETF++
Sbjct: 373 WLDERMQMRLEGGDTLNGKNRSVLDKLEVEIETFLQ 408
>AT1G64180.1 | Symbols: | intracellular protein transport protein
USO1-related | chr1:23821640-23824193 FORWARD LENGTH=593
Length = 593
Score = 67.8 bits (164), Expect = 2e-11, Method: Compositional matrix adjust.
Identities = 60/232 (25%), Positives = 115/232 (49%), Gaps = 29/232 (12%)
Query: 196 FKTSDEVQHFYSQMKFLD-QKVSTVXXXXXXXXXXXQARVQIQELETECHSSKKKLEHYL 254
KTS E+ +++ L+ Q + + +R +I++L + K+ ++ ++
Sbjct: 185 IKTSTELLKVLNRIWILEEQHSANISLIKSLKTELAHSRARIKDLLRCKQADKRDMDDFV 244
Query: 255 KKVSEERASWRTKEHEKIRAYIDDIKAELNRERKSRQRIEIVNSKLVNELADAKLFAKRY 314
K+++EE+ S TKEH+++ + + L ERK R+R E + KL EL++ K
Sbjct: 245 KQLAEEKLSKGTKEHDRLSSAVQS----LEDERKLRKRSESLYRKLAQELSEVKSTLSNC 300
Query: 315 MKDYEKERKGRELIEEVCDELANEIGEDKAEVEALKXXXXXXXXXXXXXXXXXXXXXVWR 374
+K+ E+ + ++++E +CDE A I + E+ LK W+
Sbjct: 301 VKEMERGTESKKILERLCDEFAKGIKSYEREIHGLKQKLDKN----------------WK 344
Query: 375 --EERVHMKLIDAKIALDEKY-----SQMNKLVADLETFVKSTDVNSNAKEM 419
+E+ HM L A+ LDE+ S + KL ++ETF+K T+ N+++ E+
Sbjct: 345 GWDEQDHMILCIAESWLDERIQSGNGSALEKLEFEIETFLK-TNQNADSNEI 395
>AT2G46250.1 | Symbols: | myosin heavy chain-related |
chr2:18991386-18993201 FORWARD LENGTH=468
Length = 468
Score = 50.8 bits (120), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 33/104 (31%), Positives = 52/104 (50%), Gaps = 6/104 (5%)
Query: 279 IKAELNRERKSRQRIEIVNSKLVNELADAKLFAKRYMKDYEKERKGRELIEEVCDELANE 338
IK EL+ ERK R+ E ++ KL EL +AK + +KD EKE + R ++E +CDE A
Sbjct: 230 IKRELDDERKVRKESETLHRKLTRELCEAKHCLSKALKDLEKETQERVVVENLCDEFAKA 289
Query: 339 IGEDKAEVEALKXXXXXXXXXXXXXXXXXXXXXVWREERVHMKL 382
+ + + +V + VW ++R+ MKL
Sbjct: 290 VKDYEDKVRRI------GKKSPVSDKVIVQIAEVWSDQRLQMKL 327