Miyakogusa Predicted Gene

Lj0g3v0080909.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj0g3v0080909.1 Non Characterized Hit- tr|I1MHN4|I1MHN4_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.7743
PE=,74.52,0,seg,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT NAMED,NULL;
coiled-coil,NULL,CUFF.4166.1
         (579 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   439   e-123
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   261   1e-69
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   261   1e-69
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   261   1e-69
AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like su...   238   7e-63
AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   219   5e-57
AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    62   1e-09
AT1G61080.1 | Symbols:  | Hydroxyproline-rich glycoprotein famil...    53   5e-07

>AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr1:17835196-17837553 FORWARD LENGTH=558
          Length = 558

 Score =  439 bits (1128), Expect = e-123,   Method: Compositional matrix adjust.
 Identities = 251/519 (48%), Positives = 331/519 (63%), Gaps = 30/519 (5%)

Query: 75  NSRAKRGLVMMNKSKPNEE---VEGTQKCREVEEAKVAAQYA------SRHSVEQLARPK 125
           N  AKR  +++ ++K  EE   V   Q+ R V    V  Q+       SR S E +    
Sbjct: 49  NDPAKRRSILLKRAKSAEEEMAVLAPQRARSVNRPAVVEQFGCPRRPISRKSEETVMATA 108

Query: 126 RGVGDFVLKMNREEIHGXXXXXXXXXVRESLIKNLQSEVLALKAELGKVKSLNVELDSHN 185
               +   +M  EE+           V ESLIK+LQ +VL LK EL + ++ NVEL+ +N
Sbjct: 109 AAEDEKRKRM--EELE------EKLVVNESLIKDLQLQVLNLKTELEEARNSNVELELNN 160

Query: 186 RKLIQNLAAAEAKVATIGSCEKEPIGEHESSKFKHIQKLIADKLEKSKLKKEAITESCIV 245
           RKL Q+L +AEAK++++ S +K P  EH++S+FK IQ+LIA KLE+ K+KKE   ES  +
Sbjct: 161 RKLSQDLVSAEAKISSLSSNDK-PAKEHQNSRFKDIQRLIASKLEQPKVKKEVAVESSRL 219

Query: 246 KEPVPAPKA---------VLAIPGATSSRIGTNXXXXXXXXXXXXXXXXXXXXXXXXAKA 296
             P P+P            L  P ++   +G                          AKA
Sbjct: 220 SPPSPSPSRLPPTPPLPKFLVSPASS---LGKRDENSSPFAPPTPPPPPPPPPPRPLAKA 276

Query: 297 TSTQKAPSFEKLFHLLKNQEGMKDTNGSVKQQKPVAVSVHSSIVGEIQNRSAHLLSIRAD 356
              QK+P   +LF LL  Q+  ++ + SV   K    S H+SIVGEIQNRSAHL++I+AD
Sbjct: 277 ARAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAIKAD 336

Query: 357 IETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAM 416
           IETKGEFIN LI+KV+   ++D+EDV+ FV+WLD EL++LADERAVLKHF WPEKKAD +
Sbjct: 337 IETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHFKWPEKKADTL 396

Query: 417 REAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSY 476
           +EAAVEYRELK LE+++SSY DDP I  G +L+KMA+LLDKSE  I+RLV+LR S MRSY
Sbjct: 397 QEAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRLVRLRGSSMRSY 456

Query: 477 QEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHF 536
           Q++KIP  WMLDSGM+ KIK AS+ L K Y+ RV  EL SARN  R+S++E+LLLQG+ F
Sbjct: 457 QDFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDRESTKEALLLQGVRF 516

Query: 537 AYRAHQFAGGLDAETLCAFEEIRKSVQGHLAGSRELLSG 575
           AYR HQFAGGLD ETLCA EEI++ V  HL  +R  ++G
Sbjct: 517 AYRTHQFAGGLDPETLCALEEIKQRVPSHLRLARGNMAG 555


>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD LENGTH=863
          Length = 863

 Score =  261 bits (666), Expect = 1e-69,   Method: Compositional matrix adjust.
 Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)

Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
           +AP   + +  L  +E  K+   S +      + +  ++++GEI+NRS  LL+++AD+ET
Sbjct: 579 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 638

Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
           +G+F+  L  +V  +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 639 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 698

Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
           A EY++L  LE+ ++S+ DDP +SC  +L+KM  LL+K E S+  L++ R+  +  Y+E+
Sbjct: 699 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 758

Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
            IP  W+ D+G+V KIK +S+ L K Y+KRV  EL S   S +  ++E LLLQG+ FA+R
Sbjct: 759 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 818

Query: 540 AHQFAGGLDAETLCAFEEIR 559
            HQFAGG DAE++ AFEE+R
Sbjct: 819 VHQFAGGFDAESMKAFEELR 838


>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  261 bits (666), Expect = 1e-69,   Method: Compositional matrix adjust.
 Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)

Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
           +AP   + +  L  +E  K+   S +      + +  ++++GEI+NRS  LL+++AD+ET
Sbjct: 720 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 779

Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
           +G+F+  L  +V  +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 780 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 839

Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
           A EY++L  LE+ ++S+ DDP +SC  +L+KM  LL+K E S+  L++ R+  +  Y+E+
Sbjct: 840 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 899

Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
            IP  W+ D+G+V KIK +S+ L K Y+KRV  EL S   S +  ++E LLLQG+ FA+R
Sbjct: 900 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 959

Query: 540 AHQFAGGLDAETLCAFEEIR 559
            HQFAGG DAE++ AFEE+R
Sbjct: 960 VHQFAGGFDAESMKAFEELR 979


>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  261 bits (666), Expect = 1e-69,   Method: Compositional matrix adjust.
 Identities = 123/260 (47%), Positives = 181/260 (69%), Gaps = 1/260 (0%)

Query: 301 KAPSFEKLFHLLKNQEGMKDTNGS-VKQQKPVAVSVHSSIVGEIQNRSAHLLSIRADIET 359
           +AP   + +  L  +E  K+   S +      + +  ++++GEI+NRS  LL+++AD+ET
Sbjct: 720 RAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADVET 779

Query: 360 KGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKADAMREA 419
           +G+F+  L  +V  +++TDIED+L FV+WLD ELS L DERAVLKHF WPE KADA+REA
Sbjct: 780 QGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALREA 839

Query: 420 AVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEY 479
           A EY++L  LE+ ++S+ DDP +SC  +L+KM  LL+K E S+  L++ R+  +  Y+E+
Sbjct: 840 AFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYKEF 899

Query: 480 KIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGMHFAYR 539
            IP  W+ D+G+V KIK +S+ L K Y+KRV  EL S   S +  ++E LLLQG+ FA+R
Sbjct: 900 GIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFAFR 959

Query: 540 AHQFAGGLDAETLCAFEEIR 559
            HQFAGG DAE++ AFEE+R
Sbjct: 960 VHQFAGGFDAESMKAFEELR 979


>AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like
           superfamily protein | chr4:10231439-10234534 FORWARD
           LENGTH=642
          Length = 642

 Score =  238 bits (608), Expect = 7e-63,   Method: Compositional matrix adjust.
 Identities = 118/265 (44%), Positives = 178/265 (67%), Gaps = 7/265 (2%)

Query: 300 QKAPSFEKLFHLLKNQEGM---KDTNG--SVKQQKPVAVSVHSSIVGEIQNRSAHLLSIR 354
           ++ P   + +H L  ++     +D+ G  +   +  +A S    ++GEI+NRS +LL+I+
Sbjct: 354 RRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARDMIGEIENRSVYLLAIK 413

Query: 355 ADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADERAVLKHFSWPEKKAD 414
            D+ET+G+FI  LIK+V  AA++DIEDV+ FV WLD ELS L DERAVLKHF WPE+KAD
Sbjct: 414 TDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLKHFEWPEQKAD 473

Query: 415 AMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKSEYSIQRLVKLRNSVMR 474
           A+REAA  Y +LK L  + S +++DP  S  ++L+KM +L +K E+ +  L ++R S   
Sbjct: 474 ALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYSLSRMRESAAT 533

Query: 475 SYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSARNSGRQSSQESLLLQGM 534
            ++ ++IP  WML++G+ ++IK AS+ L   Y+KRV+ EL +         +E L++QG+
Sbjct: 534 KFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEAIEGG--GPEEEELIVQGV 591

Query: 535 HFAYRAHQFAGGLDAETLCAFEEIR 559
            FA+R HQFAGG DAET+ AFEE+R
Sbjct: 592 RFAFRVHQFAGGFDAETMKAFEELR 616


>AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           chloroplast envelope; EXPRESSED IN: inflorescence
           meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
           anthesis, petal differentiation and expansion stage;
           BEST Arabidopsis thaliana protein match is:
           Tetratricopeptide repeat (TPR)-like superfamily protein
           (TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
           in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
           Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
           - 75 (source: NCBI BLink). | chr1:2184874-2186580
           REVERSE LENGTH=392
          Length = 392

 Score =  219 bits (557), Expect = 5e-57,   Method: Compositional matrix adjust.
 Identities = 140/413 (33%), Positives = 222/413 (53%), Gaps = 35/413 (8%)

Query: 162 SEVLALKAELGKVKSLNVELDSHNRKLIQNLAAAEAKVATIGSCEKEPIGEHESSKFKHI 221
           S++L L  EL      N +L+  N +L Q +A   A+V+ + S E     E +S  +K +
Sbjct: 9   SDLLRLVKELQAYLVRNDKLEKENHELRQEVARLRAQVSNLKSHE----NERKSMLWKKL 64

Query: 222 QK-LIADKLEKSKLKK----EAITESCIVKEPVPAPKAVLAIPGATSSRIGTNXXXXXXX 276
           Q        + S LK     ++ T+   V+ P P P     I G +++            
Sbjct: 65  QSSYDGSNTDGSNLKAPESVKSNTKGQEVRNPNPKP----TIQGQSTA------------ 108

Query: 277 XXXXXXXXXXXXXXXXXAKATSTQKAPSFEKLFHLLKNQEGMKDTNGSVKQQKPVAVSVH 336
                                S ++AP   + +  L  +E        + Q   ++ + +
Sbjct: 109 ---TKPPPPPPLPSKRTLGKRSVRRAPEVVEFYRALTKRESH--MGNKINQNGVLSPAFN 163

Query: 337 SSIVGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSL 396
            +++GEI+NRS +L  I++D +   + I+ LI KV  A +TDI +V  FV W+D ELSSL
Sbjct: 164 RNMIGEIENRSKYLSDIKSDTDRHRDHIHILISKVEAATFTDISEVETFVKWIDEELSSL 223

Query: 397 ADERAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLL 455
            DERAVLKHF  WPE+K D++REAA  Y+  K L  +I S+KD+P+ S   +L+++ SL 
Sbjct: 224 VDERAVLKHFPKWPERKVDSLREAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSLQ 283

Query: 456 DKSEYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELG 515
           D+ E S+    K+R+S  + Y++++IP  WMLD+G++ ++K++S+ L + Y+KR+  EL 
Sbjct: 284 DRLEESVNNTEKMRDSTGKRYKDFQIPWEWMLDTGLIGQLKYSSLRLAQEYMKRIAKELE 343

Query: 516 SARNSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLCAFEEIRKSVQGHLAG 568
           S   SG++    +L+LQG+ FAY  HQFAGG D ETL  F E++K   G   G
Sbjct: 344 S-NGSGKEG---NLMLQGVRFAYTIHQFAGGFDGETLSIFHELKKITTGETRG 392


>AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
           proteins in 1063 species: Archae - 116; Bacteria - 2262;
           Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
           958; Other Eukaryotes - 5007 (source: NCBI BLink). |
           chr4:2544210-2547893 REVERSE LENGTH=880
          Length = 880

 Score = 62.4 bits (150), Expect = 1e-09,   Method: Compositional matrix adjust.
 Identities = 56/250 (22%), Positives = 117/250 (46%), Gaps = 16/250 (6%)

Query: 324 SVKQQKPVAV--SVHSSIVGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIED 381
           SV ++ PV V  S  +  + E+  RS++   I  D++   + I  L   +      D+++
Sbjct: 625 SVAEKSPVKVARSGMADALAEMTKRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKE 684

Query: 382 VLNFVNWLDGELSSLADERAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDP 440
           +L F + ++  L  L DE  VL  F  +PEKK + +R A   Y++L  +  ++ ++K +P
Sbjct: 685 LLEFHSKVESILEKLTDETQVLARFEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEP 744

Query: 441 EISCGASLRKMASLLDKSEYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHA-- 498
            ++    L K+    +K +  I+ + + ++   + ++ Y I     +D  ++ ++K    
Sbjct: 745 PLN--DLLDKIERYFNKFKGEIETVERTKDEDAKMFKRYNIN----IDFEVLVQVKETMV 798

Query: 499 --SMNLVKVYIKR---VTMELGSARNSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLC 553
             S N +++ +K       E  +   S  +  +   L +   FA++ + FAGG D    C
Sbjct: 799 DVSSNCMELALKERREANEEAKNGEESKMKEERAKRLWRAFQFAFKVYTFAGGHDERADC 858

Query: 554 AFEEIRKSVQ 563
              ++   +Q
Sbjct: 859 LTRQLAHEIQ 868


>AT1G61080.1 | Symbols:  | Hydroxyproline-rich glycoprotein family
           protein | chr1:22493194-22497019 REVERSE LENGTH=907
          Length = 907

 Score = 53.1 bits (126), Expect = 5e-07,   Method: Compositional matrix adjust.
 Identities = 54/225 (24%), Positives = 97/225 (43%), Gaps = 32/225 (14%)

Query: 340 VGEIQNRSAHLLSIRADIETKGEFINGLIKKVVEAAYTDIEDVLNFVNWLDGELSSLADE 399
           + EI  +SA+ L I+ADI      IN L  ++ +    D+ ++L+F   ++  L +L DE
Sbjct: 707 LAEITKKSAYFLQIQADIAKYMTSINELKIEITKFQTKDMTELLSFHRRVESVLENLTDE 766

Query: 400 RAVLKHF-SWPEKKADAMREAAVEYRELKLLEQDISSYKDDPEISCGASLRKMASLLDKS 458
             VL     +P+KK +AMR A   Y +L  +  ++ + K +P ++          LLDK 
Sbjct: 767 SQVLARCEGFPQKKLEAMRMAVALYTKLHGMITELQNMKIEPPLN---------QLLDKV 817

Query: 459 EYSIQRLVKLRNSVMRSYQEYKIPTAWMLDSGMVTKIKHASMNLVKVYIKRVTMELGSAR 518
           E    ++ +    +  +  E  +      D  +V+     S+             +GSA+
Sbjct: 818 ERYFTKIKETMVDISSNCMELALKEK--RDEKLVSPDAKPSLKKT----------VGSAK 865

Query: 519 NSGRQSSQESLLLQGMHFAYRAHQFAGGLDAETLCAFEEIRKSVQ 563
                     +L +   FA++ + FAGG D        E+   +Q
Sbjct: 866 ----------MLWRAFQFAFKVYTFAGGHDDRADSLTRELAHEIQ 900