Miyakogusa Predicted Gene
- Lj0g3v0080909.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0080909.1 Non Characterized Hit- tr|I1MHN4|I1MHN4_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.7743
PE=,74.52,0,seg,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT NAMED,NULL;
coiled-coil,NULL,CUFF.4166.1
(579 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score  E
Sequences producing significant alignments:                         (bits)  Value
AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein famil...     439  e-123
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    261  1e-69
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    261  1e-69
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    261  1e-69
AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like su...     238  7e-63
AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...     219  5e-57
AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis thal...      62  1e-09
AT1G61080.1 | Symbols: | Hydroxyproline-rich glycoprotein famil...      53  5e-07
>AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr1:17835196-17837553 FORWARD LENGTH=558
Length = 558
Score = 439 bits (1128), Expect = e-123, Method: Compositional matrix adjust.
Identities = 251/519 (48%), Positives = 331/519 (63%), Gaps = 30/519 (5%)
Query: 75 NSRAKRGLVMMNKSKPNEE---VEGTQKCREVEEAKVAAQYA------SRHSVEQLARPK 125
N AKR +++ ++K EE V Q+ R V V Q+ SR S E +
Sbjct: 49 NDPAKRRSILLKRAKSAEEEMAVLAPQRARSVNRPAVVEQFGCPRRPISRKSEETVMATA 108
Query: 126 RGVGDFVLKMNREEIHGXXXXXXXXXVRESLIKNLQSEVLALKAELGKVKSLNVELDSHN 185
+ +M EE+ V ESLIK+LQ +VL LK EL + ++ NVEL+ +N
Sbjct: 109 AAEDEKRKRM--EELE------EKLVVNESLIKDLQLQVLNLKTELEEARNSNVELELNN 160
Query: 186 RKLIQNLAAAEAKVATIGSCEKEPIGEHESSKFKHIQKLIADKLEKSKLKKEAITESCIV 245
RKL Q+L +AEAK++++ S +K P EH++S+FK IQ+LIA KLE+ K+KKE ES +
Sbjct: 161 RKLSQDLVSAEAKISSLSSNDK-PAKEHQNSRFKDIQRLIASKLEQPKVKKEVAVESSRL 219
Query: 246 KEPVPAPKA---------VLAIPGATSSRIGTNXXXXXXXXXXXXXXXXXXXXXXXXAKA 296
P P+P L P ++ +G AKA
Sbjct: 220 SPPSPSPSRLPPTPPLPKFLVSPASS---LGKRDENSSPFAPPTPPPPPPPPPPRPLAKA 276
Query: 297 TSTQKAPSFEKLFHLLKNQEGMKDTNGSVKQQKPVAVSVHSSIVGEIQNRSAHLLSIRAD 356
QK+P +LF LL Q+ ++ + SV K S H+SIVGEIQNRSAHL++I+AD
Sbjct: 277 ARAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAIKAD 336
Query: 357 IETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAM 416
IETKGEFIN LI+KV+ ++D+EDV+ FV+WLD EL++LADERAVLKHF WPEKKAD +
Sbjct: 337 IETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHFKWPEKKADTL 396
Query: 417 REAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSY 476
+EAAVEYRELK LE+++SSY DDP I G +L+KMA+LLDKSE I+RLV+LR S MRSY
Sbjct: 397 QEAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRLVRLRGSSMRSY 456
Query: 477 QEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHF 536
Q++KIP WMLDSGM+ KIK AS+ L K Y+ RV EL SARN R+S++E+LLLQG+ F
Sbjct: 457 QDFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDRESTKEALLLQGVRF 516
Query: 537 AYRAHQFAGGLDAETLCAFEEIRKSVQGHLAGSRELLSG 575
AYR HQFAGGLD ETLCA EEI++ V HL +R ++G
Sbjct: 517 AYRTHQFAGGLDPETLCALEEIKQRVPSHLRLARGNMAG 555
>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD LENGTH=863
Length = 863
Score = 261 bits (666), Expect = 1e-69, Method: Compositional matrix adjust.
Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)
Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
+AP + + L +E K+ S + + + ++++GEI+NRS LL+++AD+ET
Sbjct: 579 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 638
Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
+G+F+ L +V +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 639 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 698
Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
A EY++L LE+ ++S+ DDP +SC +L+KM LL+K E S+ L++ R+ + Y+E+
Sbjct: 699 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 758
Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
IP W+ D+G+V KIK +S+ L K Y+KRV EL S S + ++E LLLQG+ FA+R
Sbjct: 759 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 818
Query: 540 AHQFAGGLDAETLCAFEEIR 559
HQFAGG DAE++ AFEE+R
Sbjct: 819 VHQFAGGFDAESMKAFEELR 838
>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 261 bits (666), Expect = 1e-69, Method: Compositional matrix adjust.
Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)
Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
+AP + + L +E K+ S + + + ++++GEI+NRS LL+++AD+ET
Sbjct: 720 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 779
Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
+G+F+ L +V +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 780 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 839
Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
A EY++L LE+ ++S+ DDP +SC +L+KM LL+K E S+ L++ R+ + Y+E+
Sbjct: 840 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 899
Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
IP W+ D+G+V KIK +S+ L K Y+KRV EL S S + ++E LLLQG+ FA+R
Sbjct: 900 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 959
Query: 540 AHQFAGGLDAETLCAFEEIR 559
HQFAGG DAE++ AFEE+R
Sbjct: 960 VHQFAGGFDAESMKAFEELR 979
>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 261 bits (666), Expect = 1e-69, Method: Compositional matrix adjust.
Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)
Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
+AP + + L +E K+ S + + + ++++GEI+NRS LL+++AD+ET
Sbjct: 720 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 779
Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
+G+F+ L +V +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 780 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 839
Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
A EY++L LE+ ++S+ DDP +SC +L+KM LL+K E S+ L++ R+ + Y+E+
Sbjct: 840 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 899
Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
IP W+ D+G+V KIK +S+ L K Y+KRV EL S S + ++E LLLQG+ FA+R
Sbjct: 900 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 959
Query: 540 AHQFAGGLDAETLCAFEEIR 559
HQFAGG DAE++ AFEE+R
Sbjct: 960 VHQFAGGFDAESMKAFEELR 979
>AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like
superfamily protein | chr4:10231439-10234534 FORWARD
LENGTH=642
Length = 642
Score = 238 bits (608), Expect = 7e-63, Method: Compositional matrix adjust.
Identities = 118/265 (44%), Positives = 178/265 (67%), Gaps = 7/265 (2%)
Query: 300 QKAPSFEKLFHLLKNQEGM---KDTNG--SVKQQKPVAVSVHSSIVGEIQNRSAHLLSIR 354
++ P + +H L ++ +D+ G + + +A S ++GEI+NRS +LL+I+
Sbjct: 354 RRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARDMIGEIENRSVYLLAIK 413
Query: 355 ADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKAD 414
D+ET+G+FI LIK+V AA++DIEDV+ FV WLD ELS L DERAVLKHF WPE+KAD
Sbjct: 414 TDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLKHFEWPEQKAD 473
Query: 415 AMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMR 474
A+REAA Y +LK L + S +++DP S ++L+KM +L +K E+ + L ++R S
Sbjct: 474 ALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYSLSRMRESAAT 533
Query: 475 SYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGM 534
++ ++IP WML++G+ ++IK AS+ L Y+KRV+ EL + +E L++QG+
Sbjct: 534 KFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEAIEGG--GPEEEELIVQGV 591
Query: 535 HFAYRAHQFAGGLDAETLCAFEEIR 559
FA+R HQFAGG DAET+ AFEE+R
Sbjct: 592 RFAFRVHQFAGGFDAETMKAFEELR 616
>AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast envelope; EXPRESSED IN: inflorescence
meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
BEST Arabidopsis thaliana protein match is:
Tetratricopeptide repeat (TPR)-like superfamily protein
(TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
- 75 (source: NCBI BLink). | chr1:2184874-2186580
REVERSE LENGTH=392
Length = 392
Score = 219 bits (557), Expect = 5e-57, Method: Compositional matrix adjust.
Identities = 140/413 (33%), Positives = 222/413 (53%), Gaps = 35/413 (8%)
Query: 162 SEVLALKAELGKVKSLNVELDSHNRKLIQNLAAAEAKVATIGSCEKEPIGEHESSKFKHI 221
S++L L EL N +L+ N +L Q +A A+V+ + S E E +S +K +
Sbjct: 9 SDLLRLVKELQAYLVRNDKLEKENHELRQEVARLRAQVSNLKSHE----NERKSMLWKKL 64
Query: 222 QK-LIADKLEKSKLKK----EAITESCIVKEPVPAPKAVLAIPGATSSRIGTNXXXXXXX 276
Q + S LK ++ T+ V+ P P P I G +++
Sbjct: 65 QSSYDGSNTDGSNLKAPESVKSNTKGQEVRNPNPKP----TIQGQSTA------------ 108
Query: 277 XXXXXXXXXXXXXXXXXAKATSTQKAPSFEKLFHLLKNQEGMKDTNGSVKQQKPVAVSVH 336
S ++AP + + L +E + Q ++ + +
Sbjct: 109 ---TKPPPPPPLPSKRTLGKRSVRRAPEVVEFYRALTKRESH--MGNKINQNGVLSPAFN 163
Query: 337 SSIVGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSL 396
+++GEI+NRS +L I++D + + I+ LI KV A +TDI +V FV W+D ELSSL
Sbjct: 164 RNMIGEIENRSKYLSDIKSDTDRHRDHIHILISKVEAATFTDISEVETFVKWIDEELSSL 223
Query: 397 ADERAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLL 455
DERAVLKHF WPE+K D++REAA Y+ K L +I S+KD+P+ S +L+++ SL
Sbjct: 224 VDERAVLKHFPKWPERKVDSLREAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSLQ 283
Query: 456 DKSEYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELG 515
D+ E S+ K+R+S + Y++++IP WMLD+G++ ++K++S+ L + Y+KR+ EL
Sbjct: 284 DRLEESVNNTEKMRDSTGKRYKDFQIPWEWMLDTGLIGQLKYSSLRLAQEYMKRIAKELE 343
Query: 516 SARNSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLCAFEEIRKSVQGHLAG 568
S SG++ +L+LQG+ FAY HQFAGG D ETL F E++K G G
Sbjct: 344 S-NGSGKEG---NLMLQGVRFAYTIHQFAGGFDGETLSIFHELKKITTGETRG 392
>AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
proteins in 1063 species: Archae - 116; Bacteria - 2262;
Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
958; Other Eukaryotes - 5007 (source: NCBI BLink). |
chr4:2544210-2547893 REVERSE LENGTH=880
Length = 880
Score = 62.4 bits (150), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 56/250 (22%), Positives = 117/250 (46%), Gaps = 16/250 (6%)
Query: 324 SVKQQKPVAV--SVHSSIVGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIED 381
SV ++ PV V S + + E+ RS++ I D++ + I L + D+++
Sbjct: 625 SVAEKSPVKVARSGMADALAEMTKRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKE 684
Query: 382 VLNFVNWLDGELSSLADERAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDP 440
+L F + ++ L L DE VL F +PEKK + +R A Y++L + ++ ++K +P
Sbjct: 685 LLEFHSKVESILEKLTDETQVLARFEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEP 744
Query: 441 EISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHA-- 498
++ L K+ +K + I+ + + ++ + ++ Y I +D ++ ++K
Sbjct: 745 PLN--DLLDKIERYFNKFKGEIETVERTKDEDAKMFKRYNIN----IDFEVLVQVKETMV 798
Query: 499 --SMNLVKVYIKR---VTMELGSARNSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLC 553
S N +++ +K E + S + + L + FA++ + FAGG D C
Sbjct: 799 DVSSNCMELALKERREANEEAKNGEESKMKEERAKRLWRAFQFAFKVYTFAGGHDERADC 858
Query: 554 AFEEIRKSVQ 563
++ +Q
Sbjct: 859 LTRQLAHEIQ 868
>AT1G61080.1 | Symbols: | Hydroxyproline-rich glycoprotein family
protein | chr1:22493194-22497019 REVERSE LENGTH=907
Length = 907
Score = 53.1 bits (126), Expect = 5e-07, Method: Compositional matrix adjust.
Identities = 54/225 (24%), Positives = 97/225 (43%), Gaps = 32/225 (14%)
Query: 340 VGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADE 399
+ EI +SA+ L I+ADI IN L ++ + D+ ++L+F ++ L +L DE
Sbjct: 707 LAEITKKSAYFLQIQADIAKYMTSINELKIEITKFQTKDMTELLSFHRRVESVLENLTDE 766
Query: 400 RAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKS 458
VL +P+KK +AMR A Y +L + ++ + K +P ++ LLDK
Sbjct: 767 SQVLARCEGFPQKKLEAMRMAVALYTKLHGMITELQNMKIEPPLN---------QLLDKV 817
Query: 459 EYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSAR 518
E ++ + + + E + D +V+ S+ +GSA+
Sbjct: 818 ERYFTKIKETMVDISSNCMELALKEK--RDEKLVSPDAKPSLKKT----------VGSAK 865
Query: 519 NSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLCAFEEIRKSVQ 563
+L + FA++ + FAGG D E+ +Q
Sbjct: 866 ----------MLWRAFQFAFKVYTFAGGHDDRADSLTRELAHEIQ 900