Miyakogusa Predicted Gene

Lj4g3v2226740.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj4g3v2226740.1 Non Characterized Hit- tr|I1LYF0|I1LYF0_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,73.15,0,seg,NULL;
coiled-coil,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT
NAMED,NULL,CUFF.50573.1
         (583 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   446   e-125
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   272   6e-73
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   271   7e-73
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   271   7e-73
AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like su...   252   4e-67
AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   214   9e-56
AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    68   2e-11
AT1G61080.1 | Symbols:  | Hydroxyproline-rich glycoprotein famil...    53   7e-07

>AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr1:17835196-17837553 FORWARD LENGTH=558
          Length = 558

 Score =  446 bits (1147), Expect = e-125,   Method: Compositional matrix adjust.
 Identities = 255/530 (48%), Positives = 340/530 (64%), Gaps = 27/530 (5%)

Query: 68  ISSTRAKSVPPDLKNSSKAKRGLVLNKAKSIEE---VEGSQKGREGEEAKVVVLSAAAR- 123
           ++  + KS   D+KN    +R ++L +AKS EE   V   Q+ R      VV      R 
Sbjct: 35  LTGGKPKSSGYDVKNDPAKRRSILLKRAKSAEEEMAVLAPQRARSVNRPAVVEQFGCPRR 94

Query: 124 -IRRRVGD--FGLRRGEDDPDGXXXXXXXXXXXXXXXXXXXXXXLIKNLQSEMLELKVEL 180
            I R+  +        ED+                         LIK+LQ ++L LK EL
Sbjct: 95  PISRKSEETVMATAAAEDE--------KRKRMEELEEKLVVNESLIKDLQLQVLNLKTEL 146

Query: 181 DKARSVNMELESQNRKLTQELSAAEAKIAALGNSVKEPIGEHQSPKFKDIQKLIADKLER 240
           ++AR+ N+ELE  NRKL+Q+L +AEAKI++L ++ K P  EHQ+ +FKDIQ+LIA KLE+
Sbjct: 147 EEARNSNVELELNNRKLSQDLVSAEAKISSLSSNDK-PAKEHQNSRFKDIQRLIASKLEQ 205

Query: 241 SKVKKETVPEAIFVKASIPAPTPSRAV---------PETNIGRK--SXXXXXXXXXXXXX 289
            KVKKE   E+  +    P+P+              P +++G++  +             
Sbjct: 206 PKVKKEVAVESSRLSPPSPSPSRLPPTPPLPKFLVSPASSLGKRDENSSPFAPPTPPPPP 265

Query: 290 XXXXXXXXAKLANIQKPPAIVELFHSLKNQDGKKDSKGMVNHQRPVASSAHSSIVGEIQN 349
                   AK A  QK P + +LF  L  QD  ++    VN  +   +SAH+SIVGEIQN
Sbjct: 266 PPPPPRPLAKAARAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQN 325

Query: 350 RSAHLLAIRADIETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKH 409
           RSAHL+AI+ADIETKGEFINDLI+KV+   + D+E+V+KFVDWLD EL+TLADERAVLKH
Sbjct: 326 RSAHLIAIKADIETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKH 385

Query: 410 FKWPEKKADAMREAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRL 469
           FKWPEKKAD ++EAAVEYRELK LE+E+SS+ DD +I  G +L+KMA LLDKSE+ I+RL
Sbjct: 386 FKWPEKKADTLQEAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRL 445

Query: 470 IKLRSSAIRSYQVYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESS 529
           ++LR S++RSYQ + IP  WMLDSGM+ KIK+AS+ L K YM R+  EL+S RN DRES+
Sbjct: 446 VRLRGSSMRSYQDFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDREST 505

Query: 530 QDSLLLQGVHFAYKAHQFAGGLDSETLCAFEEIRQRVPGNLAGSRELLAG 579
           +++LLLQGV FAY+ HQFAGGLD ETLCA EEI+QRVP +L  +R  +AG
Sbjct: 506 KEALLLQGVRFAYRTHQFAGGLDPETLCALEEIKQRVPSHLRLARGNMAG 555


>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD LENGTH=863
          Length = 863

 Score =  272 bits (695), Expect = 6e-73,   Method: Compositional matrix adjust.
 Identities = 124/264 (46%), Positives = 192/264 (72%), Gaps = 1/264 (0%)

Query: 303 IQKPPAIVELFHSLKNQDGKKD-SKGMVNHQRPVASSAHSSIVGEIQNRSAHLLAIRADI 361
           + + P +VE + SL  ++ KK+ +  +++     +S+A ++++GEI+NRS  LLA++AD+
Sbjct: 577 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 636

Query: 362 ETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKHFKWPEKKADAMR 421
           ET+G+F+  L  +V  +++ DIE++L FV WLD ELS L DERAVLKHF WPE KADA+R
Sbjct: 637 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 696

Query: 422 EAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRLIKLRSSAIRSYQ 481
           EAA EY++L  LE++++SF DD ++ C  +L+KM  LL+K E+S+  L++ R  AI  Y+
Sbjct: 697 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 756

Query: 482 VYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESSQDSLLLQGVHFA 541
            + IP  W+ D+G++ KIK +S+ L K YMKR+  EL+S+  SD++ +++ LLLQGV FA
Sbjct: 757 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 816

Query: 542 YKAHQFAGGLDSETLCAFEEIRQR 565
           ++ HQFAGG D+E++ AFEE+R R
Sbjct: 817 FRVHQFAGGFDAESMKAFEELRSR 840


>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  271 bits (694), Expect = 7e-73,   Method: Compositional matrix adjust.
 Identities = 124/264 (46%), Positives = 192/264 (72%), Gaps = 1/264 (0%)

Query: 303 IQKPPAIVELFHSLKNQDGKKD-SKGMVNHQRPVASSAHSSIVGEIQNRSAHLLAIRADI 361
           + + P +VE + SL  ++ KK+ +  +++     +S+A ++++GEI+NRS  LLA++AD+
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777

Query: 362 ETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKHFKWPEKKADAMR 421
           ET+G+F+  L  +V  +++ DIE++L FV WLD ELS L DERAVLKHF WPE KADA+R
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 837

Query: 422 EAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRLIKLRSSAIRSYQ 481
           EAA EY++L  LE++++SF DD ++ C  +L+KM  LL+K E+S+  L++ R  AI  Y+
Sbjct: 838 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 897

Query: 482 VYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESSQDSLLLQGVHFA 541
            + IP  W+ D+G++ KIK +S+ L K YMKR+  EL+S+  SD++ +++ LLLQGV FA
Sbjct: 898 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 957

Query: 542 YKAHQFAGGLDSETLCAFEEIRQR 565
           ++ HQFAGG D+E++ AFEE+R R
Sbjct: 958 FRVHQFAGGFDAESMKAFEELRSR 981


>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  271 bits (694), Expect = 7e-73,   Method: Compositional matrix adjust.
 Identities = 124/264 (46%), Positives = 192/264 (72%), Gaps = 1/264 (0%)

Query: 303 IQKPPAIVELFHSLKNQDGKKD-SKGMVNHQRPVASSAHSSIVGEIQNRSAHLLAIRADI 361
           + + P +VE + SL  ++ KK+ +  +++     +S+A ++++GEI+NRS  LLA++AD+
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777

Query: 362 ETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKHFKWPEKKADAMR 421
           ET+G+F+  L  +V  +++ DIE++L FV WLD ELS L DERAVLKHF WPE KADA+R
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADALR 837

Query: 422 EAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRLIKLRSSAIRSYQ 481
           EAA EY++L  LE++++SF DD ++ C  +L+KM  LL+K E+S+  L++ R  AI  Y+
Sbjct: 838 EAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISRYK 897

Query: 482 VYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESSQDSLLLQGVHFA 541
            + IP  W+ D+G++ KIK +S+ L K YMKR+  EL+S+  SD++ +++ LLLQGV FA
Sbjct: 898 EFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVRFA 957

Query: 542 YKAHQFAGGLDSETLCAFEEIRQR 565
           ++ HQFAGG D+E++ AFEE+R R
Sbjct: 958 FRVHQFAGGFDAESMKAFEELRSR 981


>AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like
           superfamily protein | chr4:10231439-10234534 FORWARD
           LENGTH=642
          Length = 642

 Score =  252 bits (644), Expect = 4e-67,   Method: Compositional matrix adjust.
 Identities = 127/270 (47%), Positives = 185/270 (68%), Gaps = 7/270 (2%)

Query: 301 ANIQKPPAIVELFHSLKNQDG---KKDSKGMVNH--QRPVASSAHSSIVGEIQNRSAHLL 355
           A +++ P +VE +HSL  +D    ++DS G  N   +  +A+S    ++GEI+NRS +LL
Sbjct: 351 AKVRRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARDMIGEIENRSVYLL 410

Query: 356 AIRADIETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKHFKWPEK 415
           AI+ D+ET+G+FI  LIK+V +AA+ DIE+V+ FV WLD ELS L DERAVLKHF+WPE+
Sbjct: 411 AIKTDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLKHFEWPEQ 470

Query: 416 KADAMREAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRLIKLRSS 475
           KADA+REAA  Y +LK L  E S F++D      ++L+KM  L +K E  +  L ++R S
Sbjct: 471 KADALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYSLSRMRES 530

Query: 476 AIRSYQVYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESSQDSLLL 535
           A   ++ + IP  WML++G+ S+IK AS+ L   YMKR++ ELE+I        ++ L++
Sbjct: 531 AATKFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEAIEGG--GPEEEELIV 588

Query: 536 QGVHFAYKAHQFAGGLDSETLCAFEEIRQR 565
           QGV FA++ HQFAGG D+ET+ AFEE+R +
Sbjct: 589 QGVRFAFRVHQFAGGFDAETMKAFEELRDK 618


>AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           chloroplast envelope; EXPRESSED IN: inflorescence
           meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
           anthesis, petal differentiation and expansion stage;
           BEST Arabidopsis thaliana protein match is:
           Tetratricopeptide repeat (TPR)-like superfamily protein
           (TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
           in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
           Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
           - 75 (source: NCBI BLink). | chr1:2184874-2186580
           REVERSE LENGTH=392
          Length = 392

 Score =  214 bits (546), Expect = 9e-56,   Method: Compositional matrix adjust.
 Identities = 110/271 (40%), Positives = 168/271 (61%), Gaps = 7/271 (2%)

Query: 303 IQKPPAIVELFHSLKNQDGKKDSKGMVNHQRPVASSAHSSIVGEIQNRSAHLLAIRADIE 362
           +++ P +VE + +L  ++    +K  +N    ++ + + +++GEI+NRS +L  I++D +
Sbjct: 128 VRRAPEVVEFYRALTKRESHMGNK--INQNGVLSPAFNRNMIGEIENRSKYLSDIKSDTD 185

Query: 363 TKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLKHF-KWPEKKADAMR 421
              + I+ LI KV  A + DI EV  FV W+D ELS+L DERAVLKHF KWPE+K D++R
Sbjct: 186 RHRDHIHILISKVEAATFTDISEVETFVKWIDEELSSLVDERAVLKHFPKWPERKVDSLR 245

Query: 422 EAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQRLIKLRSSAIRSYQ 481
           EAA  Y+  K L  EI SFKD+       +L+++  L D+ E S+    K+R S  + Y+
Sbjct: 246 EAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSLQDRLEESVNNTEKMRDSTGKRYK 305

Query: 482 VYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRESSQDSLLLQGVHFA 541
            + IP  WMLD+G++ ++K +S+ L + YMKR+  ELE    S+    + +L+LQGV FA
Sbjct: 306 DFQIPWEWMLDTGLIGQLKYSSLRLAQEYMKRIAKELE----SNGSGKEGNLMLQGVRFA 361

Query: 542 YKAHQFAGGLDSETLCAFEEIRQRVPGNLAG 572
           Y  HQFAGG D ETL  F E+++   G   G
Sbjct: 362 YTIHQFAGGFDGETLSIFHELKKITTGETRG 392


>AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
           proteins in 1063 species: Archae - 116; Bacteria - 2262;
           Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
           958; Other Eukaryotes - 5007 (source: NCBI BLink). |
           chr4:2544210-2547893 REVERSE LENGTH=880
          Length = 880

 Score = 67.8 bits (164), Expect = 2e-11,   Method: Compositional matrix adjust.
 Identities = 67/280 (23%), Positives = 130/280 (46%), Gaps = 32/280 (11%)

Query: 301 ANIQKPPAIVELFHSLKNQ--------DGKKDSKGM--VNHQRPV--ASSAHSSIVGEIQ 348
           + +++   I  L+ +LK +          KK SKG   V  + PV  A S  +  + E+ 
Sbjct: 588 SKLRRSAQIANLYWALKGKLEGRGVEGKTKKASKGQNSVAEKSPVKVARSGMADALAEMT 647

Query: 349 NRSAHLLAIRADIETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADERAVLK 408
            RS++   I  D++   + I +L   +      D++E+L+F   ++  L  L DE  VL 
Sbjct: 648 KRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKELLEFHSKVESILEKLTDETQVLA 707

Query: 409 HFK-WPEKKADAMREAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKSERSIQ 467
            F+ +PEKK + +R A   Y++L  +  E+ ++K  ++ P    L K+    +K +  I+
Sbjct: 708 RFEGFPEKKLEVIRTAGALYKKLDGILVELKNWK--IEPPLNDLLDKIERYFNKFKGEIE 765

Query: 468 RLIKLRSSAIRSYQVYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIRNSDRE 527
            + + +    + ++ YNI     +D  ++ ++K+   T+V +    + + L+  R ++ E
Sbjct: 766 TVERTKDEDAKMFKRYNIN----IDFEVLVQVKE---TMVDVSSNCMELALKERREANEE 818

Query: 528 SSQDS----------LLLQGVHFAYKAHQFAGGLDSETLC 557
           +               L +   FA+K + FAGG D    C
Sbjct: 819 AKNGEESKMKEERAKRLWRAFQFAFKVYTFAGGHDERADC 858


>AT1G61080.1 | Symbols:  | Hydroxyproline-rich glycoprotein family
           protein | chr1:22493194-22497019 REVERSE LENGTH=907
          Length = 907

 Score = 52.8 bits (125), Expect = 7e-07,   Method: Compositional matrix adjust.
 Identities = 58/222 (26%), Positives = 94/222 (42%), Gaps = 56/222 (25%)

Query: 344 VGEIQNRSAHLLAIRADIETKGEFINDLIKKVVDAAYVDIEEVLKFVDWLDGELSTLADE 403
           + EI  +SA+ L I+ADI      IN+L  ++      D+ E+L F   ++  L  L DE
Sbjct: 707 LAEITKKSAYFLQIQADIAKYMTSINELKIEITKFQTKDMTELLSFHRRVESVLENLTDE 766

Query: 404 RAVLKHFK-WPEKKADAMREAAVEYRELKMLEQEISSFKDDLDIPCGASLRKMACLLDKS 462
             VL   + +P+KK +AMR A   Y +L  +  E+ + K  ++ P          LLDK 
Sbjct: 767 SQVLARCEGFPQKKLEAMRMAVALYTKLHGMITELQNMK--IEPPLNQ-------LLDKV 817

Query: 463 ERSIQRLIKLRSSAIRSYQVYNIPTAWMLDSGMMSKIKQASMTLVKMYMKRLTIELESIR 522
           ER                                +KIK+   T+V +    + + L+  R
Sbjct: 818 ER------------------------------YFTKIKE---TMVDISSNCMELALKEKR 844

Query: 523 NSDRESSQDS------------LLLQGVHFAYKAHQFAGGLD 552
           + ++  S D+            +L +   FA+K + FAGG D
Sbjct: 845 D-EKLVSPDAKPSLKKTVGSAKMLWRAFQFAFKVYTFAGGHD 885