Miyakogusa Predicted Gene
- Lj5g3v0402730.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj5g3v0402730.1 tr|G7IB73|G7IB73_MEDTR Chloroplast unusual
positioning 1A OS=Medicago truncatula GN=MTR_1g016290
PE=,75.31,0,SUBFAMILY NOT NAMED,NULL; FAMILY NOT NAMED,NULL; seg,NULL;
coiled-coil,NULL,CUFF.52928.1
(791 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score    E
Sequences producing significant alignments:                         (bits) Value
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    330   2e-90
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    330   3e-90
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    330   3e-90
AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like su...     325   5e-89
AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein famil...     288   1e-77
AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...     236   5e-62
AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis thal...      64   5e-10
AT1G52080.1 | Symbols: AR791 | actin binding protein family | ch...     55   2e-07
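The summary table above can be read programmatically. The following is a minimal sketch, not part of the BLAST output itself: `parse_hit_line` is a hypothetical helper name, and the approach assumes that the last two whitespace-separated fields of each row are the bit score and the E-value, as they are in the rows shown here. The handling of E-values printed without a leading digit (such as `e-105`) is an assumption about how some legacy reports abbreviate very small values; no such value occurs in this report.

```python
def parse_hit_line(line):
    """Split one summary row into (hit_id, bit_score, e_value).

    Hypothetical helper: assumes the row ends with '<bits> <evalue>'.
    """
    description, bits, evalue = line.rsplit(None, 2)
    # Assumption: a bare exponent such as 'e-105' means '1e-105'.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    hit_id = description.split()[0]
    return hit_id, float(bits), float(evalue)


# Three rows copied from the table above.
rows = [
    "AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ... 330 2e-90",
    "AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis thal... 64 5e-10",
    "AT1G52080.1 | Symbols: AR791 | actin binding protein family | ch... 55 2e-07",
]

hits = [parse_hit_line(row) for row in rows]

# Keep only hits below an illustrative E-value cutoff (the threshold is arbitrary).
strong = [hit_id for hit_id, bits, evalue in hits if evalue < 1e-50]
print(strong)
```

Splitting from the right avoids depending on the description text, which contains spaces, pipes, and truncation marks.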
>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD LENGTH=863
Length = 863
Score = 330 bits (847), Expect = 2e-90, Method: Compositional matrix adjust.
Identities = 157/265 (59%), Positives = 202/265 (76%), Gaps = 1/265 (0%)
Query: 522 VKRAPQVVELYHSLMKRDSRKDSSNGGLSDAP-DVANVRSSMIGEIENRSSHLLAIKADI 580
V RAP++VE Y SLMKR+S+K+ + +S + + R++MIGEIENRS+ LLA+KAD+
Sbjct: 577 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 636
Query: 581 ETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHFEWPEKKADTLR 640
ETQG+FV SL EV+ + + +IED++AFV WLD+EL FLVDERAVLKHF+WPE KAD LR
Sbjct: 637 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 696
Query: 641 EAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTRESLMRNCK 700
EAAF YQDL KLE +V+S+ DDP + C+ ALKKM L EK+E++VY LLRTR+ + K
Sbjct: 697 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 756
Query: 701 DFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYMLLQGVRFA 760
+F IP++W+ D SV+LAK YMKRVA EL + S DKDP +++LLQGVRFA
Sbjct: 757 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 816
Query: 761 FRIHQFAGGFDAETMHAFEELRNLA 785
FR+HQFAGGFDAE+M AFEELR+ A
Sbjct: 817 FRVHQFAGGFDAESMKAFEELRSRA 841
Score = 100 bits (248), Expect = 5e-21, Method: Compositional matrix adjust.
Identities = 47/81 (58%), Positives = 62/81 (76%)
Query: 232 QMQKRNLTCRLSSLEAQLPCPANSSESDIVAKIKAEASLLRHTNEDLSKQVEGLQTSRLN 291
Q +KR L+ +L S EA++ +N +ESD VAK++ E + L+H NEDL KQVEGLQ +R +
Sbjct: 143 QHEKRELSIKLDSAEARIATLSNMTESDKVAKVREEVNNLKHNNEDLLKQVEGLQMNRFS 202
Query: 292 EVEELAYLRWVNSCLRNELKN 312
EVEEL YLRWVN+CLR EL+N
Sbjct: 203 EVEELVYLRWVNACLRYELRN 223
>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 330 bits (845), Expect = 3e-90, Method: Compositional matrix adjust.
Identities = 157/265 (59%), Positives = 202/265 (76%), Gaps = 1/265 (0%)
Query: 522 VKRAPQVVELYHSLMKRDSRKDSSNGGLSDAP-DVANVRSSMIGEIENRSSHLLAIKADI 580
V RAP++VE Y SLMKR+S+K+ + +S + + R++MIGEIENRS+ LLA+KAD+
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777
Query: 581 ETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHFEWPEKKADTLR 640
ETQG+FV SL EV+ + + +IED++AFV WLD+EL FLVDERAVLKHF+WPE KAD LR
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 837
Query: 641 EAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTRESLMRNCK 700
EAAF YQDL KLE +V+S+ DDP + C+ ALKKM L EK+E++VY LLRTR+ + K
Sbjct: 838 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 897
Query: 701 DFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYMLLQGVRFA 760
+F IP++W+ D SV+LAK YMKRVA EL + S DKDP +++LLQGVRFA
Sbjct: 898 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 957
Query: 761 FRIHQFAGGFDAETMHAFEELRNLA 785
FR+HQFAGGFDAE+M AFEELR+ A
Sbjct: 958 FRVHQFAGGFDAESMKAFEELRSRA 982
Score = 100 bits (248), Expect = 5e-21, Method: Compositional matrix adjust.
Identities = 53/102 (51%), Positives = 70/102 (68%), Gaps = 6/102 (5%)
Query: 232 QMQKRNLTCRLSSLEAQLPCPANSSESDIVAKIKAEASLLRHTNEDLSKQVEGLQTSRLN 291
Q +KR L+ +L S EA++ +N +ESD VAK++ E + L+H NEDL KQVEGLQ +R +
Sbjct: 284 QHEKRELSIKLDSAEARIATLSNMTESDKVAKVREEVNNLKHNNEDLLKQVEGLQMNRFS 343
Query: 292 EVEELAYLRWVNSCLRNELKN------TCSALESDKPSSPQS 327
EVEEL YLRWVN+CLR EL+N SA + K SP+S
Sbjct: 344 EVEELVYLRWVNACLRYELRNYQTPAGKISARDLSKNLSPKS 385
>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 330 bits (845), Expect = 3e-90, Method: Compositional matrix adjust.
Identities = 157/265 (59%), Positives = 202/265 (76%), Gaps = 1/265 (0%)
Query: 522 VKRAPQVVELYHSLMKRDSRKDSSNGGLSDAP-DVANVRSSMIGEIENRSSHLLAIKADI 580
V RAP++VE Y SLMKR+S+K+ + +S + + R++MIGEIENRS+ LLA+KAD+
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777
Query: 581 ETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHFEWPEKKADTLR 640
ETQG+FV SL EV+ + + +IED++AFV WLD+EL FLVDERAVLKHF+WPE KAD LR
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 837
Query: 641 EAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTRESLMRNCK 700
EAAF YQDL KLE +V+S+ DDP + C+ ALKKM L EK+E++VY LLRTR+ + K
Sbjct: 838 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 897
Query: 701 DFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYMLLQGVRFA 760
+F IP++W+ D SV+LAK YMKRVA EL + S DKDP +++LLQGVRFA
Sbjct: 898 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 957
Query: 761 FRIHQFAGGFDAETMHAFEELRNLA 785
FR+HQFAGGFDAE+M AFEELR+ A
Sbjct: 958 FRVHQFAGGFDAESMKAFEELRSRA 982
Score = 100 bits (248), Expect = 5e-21, Method: Compositional matrix adjust.
Identities = 53/102 (51%), Positives = 70/102 (68%), Gaps = 6/102 (5%)
Query: 232 QMQKRNLTCRLSSLEAQLPCPANSSESDIVAKIKAEASLLRHTNEDLSKQVEGLQTSRLN 291
Q +KR L+ +L S EA++ +N +ESD VAK++ E + L+H NEDL KQVEGLQ +R +
Sbjct: 284 QHEKRELSIKLDSAEARIATLSNMTESDKVAKVREEVNNLKHNNEDLLKQVEGLQMNRFS 343
Query: 292 EVEELAYLRWVNSCLRNELKN------TCSALESDKPSSPQS 327
EVEEL YLRWVN+CLR EL+N SA + K SP+S
Sbjct: 344 EVEELVYLRWVNACLRYELRNYQTPAGKISARDLSKNLSPKS 385
>AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like
superfamily protein | chr4:10231439-10234534 FORWARD
LENGTH=642
Length = 642
Score = 325 bits (834), Expect = 5e-89, Method: Compositional matrix adjust.
Identities = 167/273 (61%), Positives = 199/273 (72%), Gaps = 9/273 (3%)
Query: 519 SAMVKRAPQVVELYHSLMKRDS---RKDSSNGGLSDAPDV---ANVRSSMIGEIENRSSH 572
SA V+R P+VVE YHSLM+RDS R+DS+ GG + A + +N R MIGEIENRS +
Sbjct: 350 SAKVRRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARD-MIGEIENRSVY 408
Query: 573 LLAIKADIETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHFEWP 632
LLAIK D+ETQG+F+ LIKEV NA + +IEDVV FVKWLDDEL +LVDERAVLKHFEWP
Sbjct: 409 LLAIKTDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLKHFEWP 468
Query: 633 EKKADTLREAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTR 692
E+KAD LREAAF Y DLKKL SE S +++DPR ALKKM AL EK+E VY+L R R
Sbjct: 469 EQKADALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYSLSRMR 528
Query: 693 ESLMRNCKDFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYM 752
ES K FQIP++WML+ SVKLA YMKRV+ EL+A P + +
Sbjct: 529 ESAATKFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEA--IEGGGPEEEEL 586
Query: 753 LLQGVRFAFRIHQFAGGFDAETMHAFEELRNLA 785
++QGVRFAFR+HQFAGGFDAETM AFEELR+ A
Sbjct: 587 IVQGVRFAFRVHQFAGGFDAETMKAFEELRDKA 619
>AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr1:17835196-17837553 FORWARD LENGTH=558
Length = 558
Score = 288 bits (736), Expect = 1e-77, Method: Compositional matrix adjust.
Identities = 138/266 (51%), Positives = 190/266 (71%)
Query: 517 SNSAMVKRAPQVVELYHSLMKRDSRKDSSNGGLSDAPDVANVRSSMIGEIENRSSHLLAI 576
+ +A +++P V +L+ L K+D+ ++ S + V + +S++GEI+NRS+HL+AI
Sbjct: 274 AKAARAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAI 333
Query: 577 KADIETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHFEWPEKKA 636
KADIET+GEF+N LI++V + ++EDV+ FV WLD EL L DERAVLKHF+WPEKKA
Sbjct: 334 KADIETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHFKWPEKKA 393
Query: 637 DTLREAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTRESLM 696
DTL+EAA Y++LKKLE E+SSY DDP I +ALKKM L +K E+ + L+R R S M
Sbjct: 394 DTLQEAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRLVRLRGSSM 453
Query: 697 RNCKDFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYMLLQG 756
R+ +DF+IP+EWMLD+ S+KLAK YM RVA ELQ+ LD++ + +LLQG
Sbjct: 454 RSYQDFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDRESTKEALLLQG 513
Query: 757 VRFAFRIHQFAGGFDAETMHAFEELR 782
VRFA+R HQFAGG D ET+ A EE++
Sbjct: 514 VRFAYRTHQFAGGLDPETLCALEEIK 539
>AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast envelope; EXPRESSED IN: inflorescence
meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
BEST Arabidopsis thaliana protein match is:
Tetratricopeptide repeat (TPR)-like superfamily protein
(TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
- 75 (source: NCBI BLink). | chr1:2184874-2186580
REVERSE LENGTH=392
Length = 392
Score = 236 bits (602), Expect = 5e-62, Method: Compositional matrix adjust.
Identities = 134/317 (42%), Positives = 189/317 (59%), Gaps = 26/317 (8%)
Query: 474 VPNPPPRPSSCSISNITKQESSAQVXXXXXXXXXXXXLNFASRSNSAMVKRAPQVVELYH 533
V NP P+P+ S TK + RS V+RAP+VVE Y
Sbjct: 93 VRNPNPKPTIQGQSTATKPPPPPPLPSKR---------TLGKRS----VRRAPEVVEFYR 139
Query: 534 SLMKRDSR---KDSSNGGLSDAPDVANVRSSMIGEIENRSSHLLAIKADIETQGEFVNSL 590
+L KR+S K + NG LS A +MIGEIENRS +L IK+D + + ++ L
Sbjct: 140 ALTKRESHMGNKINQNGVLSPA-----FNRNMIGEIENRSKYLSDIKSDTDRHRDHIHIL 194
Query: 591 IKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVLKHF-EWPEKKADTLREAAFGYQDL 649
I +V+ A + +I +V FVKW+D+EL LVDERAVLKHF +WPE+K D+LREAA Y+
Sbjct: 195 ISKVEAATFTDISEVETFVKWIDEELSSLVDERAVLKHFPKWPERKVDSLREAACNYKRP 254
Query: 650 KKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTVYNLLRTRESLMRNCKDFQIPIEWM 709
K L +E+ S+KD+P+ AL+++ +L +++E +V N + R+S + KDFQIP EWM
Sbjct: 255 KNLGNEILSFKDNPKDSLTQALQRIQSLQDRLEESVNNTEKMRDSTGKRYKDFQIPWEWM 314
Query: 710 LDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDKDPAMDYMLLQGVRFAFRIHQFAGG 769
LD S++LA+ YMKR+A EL++ + + M LQGVRFA+ IHQFAGG
Sbjct: 315 LDTGLIGQLKYSSLRLAQEYMKRIAKELESNGSGKEGNLM----LQGVRFAYTIHQFAGG 370
Query: 770 FDAETMHAFEELRNLAS 786
FD ET+ F EL+ + +
Sbjct: 371 FDGETLSIFHELKKITT 387
>AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
proteins in 1063 species: Archae - 116; Bacteria - 2262;
Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
958; Other Eukaryotes - 5007 (source: NCBI BLink). |
chr4:2544210-2547893 REVERSE LENGTH=880
Length = 880
Score = 63.5 bits (153), Expect = 5e-10, Method: Compositional matrix adjust.
Identities = 65/273 (23%), Positives = 128/273 (46%), Gaps = 26/273 (9%)
Query: 519 SAMVKRAPQVVELYHSLM-KRDSR------KDSSNG--GLSDAPDVANVRSSM---IGEI 566
++ ++R+ Q+ LY +L K + R K +S G +++ V RS M + E+
Sbjct: 587 TSKLRRSAQIANLYWALKGKLEGRGVEGKTKKASKGQNSVAEKSPVKVARSGMADALAEM 646
Query: 567 ENRSSHLLAIKADIETQGEFVNSLIKEVKNAVYQNIEDVVAFVKWLDDELCFLVDERAVL 626
RSS+ I+ D++ + + L + + +++++++ F ++ L L DE VL
Sbjct: 647 TKRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKELLEFHSKVESILEKLTDETQVL 706
Query: 627 KHFE-WPEKKADTLREAAFGYQDLKKLESEVSSYKDDPRIPCDIALKKMVALSEKMERTV 685
FE +PEKK + +R A Y+ L + E+ ++K +P P + L K+ K + +
Sbjct: 707 ARFEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEP--PLNDLLDKIERYFNKFKGEI 764
Query: 686 YNLLRTRESLMRNCKDFQIPIEWMLDNXXXXXXXXXSVKLAKNYMKRVAMELQAKSALDK 745
+ RT++ + K + I I++ + V ++ N M+ E + + K
Sbjct: 765 ETVERTKDEDAKMFKRYNINIDFEV----LVQVKETMVDVSSNCMELALKERREANEEAK 820
Query: 746 DPAMDYM-------LLQGVRFAFRIHQFAGGFD 771
+ M L + +FAF+++ FAGG D
Sbjct: 821 NGEESKMKEERAKRLWRAFQFAFKVYTFAGGHD 853
>AT1G52080.1 | Symbols: AR791 | actin binding protein family |
chr1:19369788-19371862 FORWARD LENGTH=573
Length = 573
Score = 55.1 bits (131), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 29/81 (35%), Positives = 48/81 (59%), Gaps = 2/81 (2%)
Query: 232 QMQKRNLTCRLSSLEAQLPCPANSSESDIVAKIKAEASLLRHTNEDLSKQVEGLQTSRLN 291
Q + L+ +L S+ Q+ + E + + ++ + + LR NE+L K VE LQ R
Sbjct: 294 QFENFELSEKLESV--QIIANSKLEEPEEIETLREDCNRLRSENEELKKDVEQLQGDRCT 351
Query: 292 EVEELAYLRWVNSCLRNELKN 312
++E+L YLRW+N+CLR EL+
Sbjct: 352 DLEQLVYLRWINACLRYELRT 372
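The per-alignment statistics lines in this report ("Identities = ..., Positives = ..., Gaps = ...") can be extracted with a short regular expression. This is a sketch under stated assumptions: `parse_stats` is a hypothetical helper name, and the pattern only covers the line layout seen in this report. Checking the counts against the printed percentages suggests that the percentages here are truncated rather than rounded (for example, 62/81 is about 76.5% and is printed as 76%).

```python
import re

# Matches fields such as 'Identities = 29/81 (35%)'.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {field: (count, alignment_length, reported_percent)}.

    Hypothetical helper for the statistics line format used in this report.
    """
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT.findall(line)
    }


# Statistics line copied from the AT1G52080.1 alignment above.
line = "Identities = 29/81 (35%), Positives = 48/81 (59%), Gaps = 2/81 (2%)"
stats = parse_stats(line)

for name, (count, length, percent) in stats.items():
    # Integer division reproduces the truncated percentage printed in the report.
    print(name, count, length, percent, 100 * count // length)
```

Lines without a Gaps field, such as "Identities = 47/81 (58%), Positives = 62/81 (76%)", simply yield a dictionary with two entries.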