Miyakogusa Predicted Gene

Lj1g3v3892320.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj1g3v3892320.1 tr|A9SWX5|A9SWX5_PHYPA Predicted protein
OS=Physcomitrella patens subsp. patens
GN=PHYPADRAFT_234309,40,0.000000000000007,FAMILY NOT NAMED,NULL;
seg,NULL; BAR/IMD domain-like,NULL,CUFF.31391.1
         (497 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT2G33490.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   273   2e-73
AT5G41100.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   237   1e-62
AT5G41100.2 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   237   2e-62
AT3G26910.2 | Symbols:  | hydroxyproline-rich glycoprotein famil...   231   1e-60
AT3G26910.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   231   1e-60
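The one-line hit summaries above can be pulled apart mechanically. The following is a minimal sketch, not part of the BLAST output: the names `HIT_LINE` and `parse_hit_line` are hypothetical, and the pattern assumes each summary line starts with an identifier followed by ` | ` and ends with the bit score and E value. Legacy BLAST can print very small E values without a leading digit (for example `e-100`), so the sketch prefixes a `1` in that case.

```python
import re

# Assumed layout of a summary line: "<id> | <description...>   <bits>   <E value>"
HIT_LINE = re.compile(
    r"^(?P<id>\S+)\s+\|.*\s(?P<bits>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)\s*$"
)


def parse_hit_line(line):
    """Return (identifier, bit score, E value) or None if the line does not match."""
    m = HIT_LINE.match(line)
    if m is None:
        return None
    evalue = m.group("evalue")
    if evalue.startswith("e"):
        # Legacy BLAST may abbreviate "1e-100" to "e-100".
        evalue = "1" + evalue
    return m.group("id"), float(m.group("bits")), float(evalue)


summary = """\
AT2G33490.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   273   2e-73
AT5G41100.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   237   1e-62
AT5G41100.2 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   237   2e-62
AT3G26910.2 | Symbols:  | hydroxyproline-rich glycoprotein famil...   231   1e-60
AT3G26910.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   231   1e-60
"""

# Keep only the lines that look like hit summaries.
hits = [h for h in map(parse_hit_line, summary.splitlines()) if h is not None]

for ident, bits, evalue in hits:
    print(ident, bits, evalue)
```

For anything beyond a quick look, a dedicated BLAST parser is likely a safer choice than a hand-written pattern, since description text can itself contain numbers.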

>AT2G33490.1 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr2:14183552-14187666 FORWARD LENGTH=623
          Length = 623

 Score =  273 bits (699), Expect = 2e-73,   Method: Compositional matrix adjust.
 Identities = 202/497 (40%), Positives = 277/497 (55%), Gaps = 50/497 (10%)

Query: 1   MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
           M+R CDEKR+VYE M+ +Q              + QQLQ AHD+Y+ E TL  FRLKSLK
Sbjct: 132 MQRLCDEKRNVYEGMLTRQREKGRSKGGKGETFSPQQLQEAHDDYENETTLFVFRLKSLK 191

Query: 61  QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGEDC 120
           QGQ+RSLLTQAARH+AAQL FF+K L SLE V+PHV+MV   QHIDY FSGLEDDDG+D 
Sbjct: 192 QGQTRSLLTQAARHHAAQLCFFKKALSSLEEVDPHVQMVTESQHIDYHFSGLEDDDGDDE 251

Query: 121 SEDAGND-DEIVEGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAEIDK 179
            E+  ND  E+ +  ELSF YR N +     S++  S+E+  S +++ +     TA+ + 
Sbjct: 252 IENNENDGSEVHDDGELSFEYRVNDKDQDADSSAGGSSELGNSDITFPQIGGPYTAQ-EN 310

Query: 180 NQGDFKFS---TRDRRVSSYSAPIFAEKKFD-PAEKVRLLLSSSAAKPNAYVLPTPVNIK 235
            +G+++ S    RD R  S SAP+F E +   P+EK+  + S+   K N Y LPTPV   
Sbjct: 311 EEGNYRKSHSFRRDVRAVSQSAPLFPENRTTPPSEKLLRMRSTLTRKFNTYALPTPVETT 370

Query: 236 ETKTXXXXXXXXXXXXHN--------LWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILK 287
            + +             N        +W+SSPL+ +   K       S   +     +L+
Sbjct: 371 RSPSSTTSPGHKNVGSSNPTKAITKQIWYSSPLETRGPAK-----VSSRSMVALKEQVLR 425

Query: 288 ESNSDNTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------ 341
           ESN  NTS +LP P A+GL   +             R++FSGPLT+KPL  KP+      
Sbjct: 426 ESNK-NTS-RLPPPLADGLLFSRLGTLK--------RRSFSGPLTSKPLPNKPLSTTSHL 475

Query: 342 -SGAFPRLPMPQPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKSSRVGHSAP 400
            SG  PR P+ +     +            +ISELHELPRPP   S+K   S  +G+SAP
Sbjct: 476 YSGPIPRNPVSKLPKVSSSPTASPTFVSTPKISELHELPRPPPRSSTK--SSRELGYSAP 533

Query: 401 LVFRNPDHPAANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQ 460
           LV R     +      ++++ ASPLP PP  ++RSFSIP+SN RA  L+++ T  L T++
Sbjct: 534 LVSR-----SQLLSKPLITNSASPLPIPP-AITRSFSIPTSNLRASDLDMSKTS-LGTKK 586

Query: 461 IPEKVEVAASPPLTPIS 477
           +        SPPLTP+S
Sbjct: 587 L-----GTPSPPLTPMS 598
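The Expect value reported for this hit can be related to its bit score through the standard relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database size in letters. The sketch below is only an approximation under stated assumptions: it uses the raw lengths printed in this report (497 query letters, 14,482,855 database letters) rather than the effective lengths BLAST computes internally, and the bit score of 273 is itself rounded. The function name `expected_hits` is hypothetical. Under those assumptions the estimate should land within a small factor of the reported 2e-73 rather than match it exactly.

```python
def expected_hits(bit_score, query_len, db_letters):
    """Approximate E value from a bit score using raw (not effective) lengths."""
    return query_len * db_letters * 2.0 ** (-bit_score)


# Values taken from the report: 273 bits, 497-letter query, TAIR10_pep size.
e = expected_hits(273, 497, 14_482_855)
print(f"approximate E value: {e:.1e}")
```

Because each additional bit halves the expected number of chance hits, the 36-bit gap between this hit (273 bits) and the next one (237 bits) corresponds to a factor of 2^36 in E value.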


>AT5G41100.1 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           plasma membrane; EXPRESSED IN: 23 plant structures;
           EXPRESSED DURING: 13 growth stages; BEST Arabidopsis
           thaliana protein match is: hydroxyproline-rich
           glycoprotein family protein (TAIR:AT3G26910.2); Has 1503
           Blast hits to 1197 proteins in 220 species: Archae - 4;
           Bacteria - 108; Metazoa - 481; Fungi - 318; Plants -
           186; Viruses - 39; Other Eukaryotes - 367 (source: NCBI
           BLink). | chr5:16447429-16450610 FORWARD LENGTH=586
          Length = 586

 Score =  237 bits (605), Expect = 1e-62,   Method: Compositional matrix adjust.
 Identities = 200/492 (40%), Positives = 267/492 (54%), Gaps = 86/492 (17%)

Query: 1   MKRQCDEKRDVYEYMIAQQXX-XXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSL 59
           MK+QC+EKRDV ++M+ +               +  +QL+ A DE ++EATLC FRLKSL
Sbjct: 133 MKQQCEEKRDVVKHMLMEHVKDKVQVKGTKGERLIRRQLETARDELQDEATLCIFRLKSL 192

Query: 60  KQGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGED 119
           K+GQ+RSLLTQAARH+ AQ++ F  GLKSLEAVE HVR+ A  QHID   S  +  +  D
Sbjct: 193 KEGQARSLLTQAARHHTAQMHMFFAGLKSLEAVEQHVRIAADRQHIDCVLS--DPGNEMD 250

Query: 120 CSEDAGNDDEIV-EGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAEID 178
           CSED  +DD +V    ELSF+Y +++Q     ST   S +++++ LS+ R S A +A ++
Sbjct: 251 CSEDNDDDDRLVNRDGELSFDYITSEQRVEVISTPHGSMKMDDTDLSFQRPSPAGSATVN 310

Query: 179 KN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKPNAYVLPTPVNIKET 237
            + + +   S RDRR SS+SAP+F +KK D A++    ++ SA   NAY+LPTPV+ K +
Sbjct: 311 ADPREEHSVSNRDRRTSSHSAPLFPDKKADLADRSMRQMTPSA---NAYILPTPVDSKSS 367

Query: 238 KTXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSDNTSIQ 297
                          NLWHSSPL               EP I  AH   K++ S N   +
Sbjct: 368 PIFTKPVTQTNHSA-NLWHSSPL---------------EP-IKTAH---KDAES-NLYSR 406

Query: 298 LPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPVSGAFPRLPMP-----Q 352
           LPRPS                       AFSGPL  KP S         RLP+P     Q
Sbjct: 407 LPRPS---------------------EHAFSGPL--KPSST--------RLPVPVAVQAQ 435

Query: 353 PTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRVGHSAPLVFRNPDHP 409
            +SP+             RI+ELHELPRPPG Q + P +S     VGHSAPL   N +  
Sbjct: 436 SSSPRISPTASPPLASSPRINELHELPRPPG-QFAPPRRSKSPGLVGHSAPLTAWNQERS 494

Query: 410 AANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQIPEKVE--V 467
                 ++V   ASPLP PP++V RS+SIPS NQRAMA           + +PE+ +  V
Sbjct: 495 NVVVSTNIV---ASPLPVPPLVVPRSYSIPSRNQRAMA----------QQPLPERNQNRV 541

Query: 468 AASP--PLTPIS 477
           A+ P  PLTP S
Sbjct: 542 ASPPPLPLTPAS 553


>AT5G41100.2 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           plasma membrane; EXPRESSED IN: 23 plant structures;
           EXPRESSED DURING: 13 growth stages; BEST Arabidopsis
           thaliana protein match is: hydroxyproline-rich
           glycoprotein family protein (TAIR:AT3G26910.2); Has 1497
           Blast hits to 1191 proteins in 214 species: Archae - 4;
           Bacteria - 102; Metazoa - 485; Fungi - 316; Plants -
           187; Viruses - 37; Other Eukaryotes - 366 (source: NCBI
           BLink). | chr5:16447429-16450686 FORWARD LENGTH=582
          Length = 582

 Score =  237 bits (604), Expect = 2e-62,   Method: Compositional matrix adjust.
 Identities = 202/494 (40%), Positives = 267/494 (54%), Gaps = 90/494 (18%)

Query: 1   MKRQCDEKRDVYEYMIAQQXX-XXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSL 59
           MK+QC+EKRDV ++M+ +               +  +QL+ A DE ++EATLC FRLKSL
Sbjct: 133 MKQQCEEKRDVVKHMLMEHVKDKVQVKGTKGERLIRRQLETARDELQDEATLCIFRLKSL 192

Query: 60  KQGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGE- 118
           K+GQ+RSLLTQAARH+ AQ++ F  GLKSLEAVE HVR+ A  QHID   S    D G  
Sbjct: 193 KEGQARSLLTQAARHHTAQMHMFFAGLKSLEAVEQHVRIAADRQHIDCVLS----DPGNE 248

Query: 119 -DCSEDAGNDDEIV-EGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAE 176
            DCSED  +DD +V    ELSF+Y +++Q     ST   S +++++ LS+ R S A +A 
Sbjct: 249 MDCSEDNDDDDRLVNRDGELSFDYITSEQRVEVISTPHGSMKMDDTDLSFQRPSPAGSAT 308

Query: 177 IDKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKPNAYVLPTPVNIK 235
           ++ + + +   S RDRR SS+SAP+F +KK D A++    ++ SA   NAY+LPTPV+ K
Sbjct: 309 VNADPREEHSVSNRDRRTSSHSAPLFPDKKADLADRSMRQMTPSA---NAYILPTPVDSK 365

Query: 236 ETKTXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSDNTS 295
            +               NLWHSSPL               EP I  AH   K++ S N  
Sbjct: 366 SSPIFTKPVTQTNHSA-NLWHSSPL---------------EP-IKTAH---KDAES-NLY 404

Query: 296 IQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPVSGAFPRLPMP---- 351
            +LPRPS                       AFSGPL  KP S         RLP+P    
Sbjct: 405 SRLPRPS---------------------EHAFSGPL--KPSST--------RLPVPVAVQ 433

Query: 352 -QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRVGHSAPLVFRNPD 407
            Q +SP+             RI+ELHELPRPPG Q + P +S     VGHSAPL   N +
Sbjct: 434 AQSSSPRISPTASPPLASSPRINELHELPRPPG-QFAPPRRSKSPGLVGHSAPLTAWNQE 492

Query: 408 HPAANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQIPEKVE- 466
                   ++V   ASPLP PP++V RS+SIPS NQRAMA           + +PE+ + 
Sbjct: 493 RSNVVVSTNIV---ASPLPVPPLVVPRSYSIPSRNQRAMA----------QQPLPERNQN 539

Query: 467 -VAASP--PLTPIS 477
            VA+ P  PLTP S
Sbjct: 540 RVASPPPLPLTPAS 553


>AT3G26910.2 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr3:9915304-9918511 REVERSE LENGTH=614
          Length = 614

 Score =  231 bits (589), Expect = 1e-60,   Method: Compositional matrix adjust.
 Identities = 192/503 (38%), Positives = 266/503 (52%), Gaps = 74/503 (14%)

Query: 1   MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
           MK+QCD KR+VYE  + ++                 + + A+ E+ +EAT+C FRLKSLK
Sbjct: 134 MKQQCDGKRNVYEMSLVKEKGRPKSSKGERH--IPPESRPAYSEFHDEATMCIFRLKSLK 191

Query: 61  QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFS--GLEDDDGE 118
           +GQ+RSLL QA RH+ AQ+  F  GLKSLEAVE HV++    QHID   S  G E +  E
Sbjct: 192 EGQARSLLIQAVRHHTAQMRLFHTGLKSLEAVERHVKVAVEKQHIDCDLSVHGNEMEASE 251

Query: 119 DCSEDAGNDDEIVEGTELSFNYRSNKQGPYTASTS-PNSAEVEESRLSYVRASTAETAEI 177
           D  +D    +   EG ELSF+YR+N+Q    +S S P + +++++ LS+ R ST   A +
Sbjct: 252 DDDDDGRYMNR--EG-ELSFDYRTNEQKVEASSLSTPWATKMDDTDLSFPRPSTTRPAAV 308

Query: 178 DKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKP--NAYVLPTPVNI 234
           + + + ++  STRD+ +SS+SAP+F EKK D +E++R       A P  NAYVLPTP + 
Sbjct: 309 NADHREEYPVSTRDKYLSSHSAPLFPEKKPDVSERLR------QANPSFNAYVLPTPNDS 362

Query: 235 KETK--TXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSD 292
           + +K  +             N+WHSSPL+  K+ KD  D                ESNS 
Sbjct: 363 RYSKPVSQALNPRPTNHSAGNIWHSSPLEPIKSGKDGKDA---------------ESNSF 407

Query: 293 NTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------SGAFP 346
               +LPRPS       Q         +   R AFSGPL  +P S KP+      SGAF 
Sbjct: 408 YG--RLPRPSTTDTHHHQ--------QQAAGRHAFSGPL--RPSSTKPITMADSYSGAFC 455

Query: 347 RLPMP--------QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRV 395
            LP P          +SP+             R++ELHELPRPPG  +  P ++     V
Sbjct: 456 PLPTPPVLQSHPHSSSSPRVSPTASPPPASSPRLNELHELPRPPGHFAPPPRRAKSPGLV 515

Query: 396 GHSAPLVFRNPDHPAAN-KFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTK 454
           GHSAPL   N +        PS  +  ASPLP PP++V RS+SIPS NQR ++       
Sbjct: 516 GHSAPLTAWNQERSTVTVAVPSATNIVASPLPVPPLVVPRSYSIPSRNQRVVS------- 568

Query: 455 YLDTRQIPEKVEVAASPPLTPIS 477
               R +  + ++ ASPPLTP+S
Sbjct: 569 ---QRLVERRDDIVASPPLTPMS 588


>AT3G26910.1 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr3:9915338-9918511 REVERSE LENGTH=608
          Length = 608

 Score =  231 bits (588), Expect = 1e-60,   Method: Compositional matrix adjust.
 Identities = 192/503 (38%), Positives = 266/503 (52%), Gaps = 74/503 (14%)

Query: 1   MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
           MK+QCD KR+VYE  + ++                 + + A+ E+ +EAT+C FRLKSLK
Sbjct: 134 MKQQCDGKRNVYEMSLVKEKGRPKSSKGERH--IPPESRPAYSEFHDEATMCIFRLKSLK 191

Query: 61  QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFS--GLEDDDGE 118
           +GQ+RSLL QA RH+ AQ+  F  GLKSLEAVE HV++    QHID   S  G E +  E
Sbjct: 192 EGQARSLLIQAVRHHTAQMRLFHTGLKSLEAVERHVKVAVEKQHIDCDLSVHGNEMEASE 251

Query: 119 DCSEDAGNDDEIVEGTELSFNYRSNKQGPYTASTS-PNSAEVEESRLSYVRASTAETAEI 177
           D  +D    +   EG ELSF+YR+N+Q    +S S P + +++++ LS+ R ST   A +
Sbjct: 252 DDDDDGRYMNR--EG-ELSFDYRTNEQKVEASSLSTPWATKMDDTDLSFPRPSTTRPAAV 308

Query: 178 DKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKP--NAYVLPTPVNI 234
           + + + ++  STRD+ +SS+SAP+F EKK D +E++R       A P  NAYVLPTP + 
Sbjct: 309 NADHREEYPVSTRDKYLSSHSAPLFPEKKPDVSERLR------QANPSFNAYVLPTPNDS 362

Query: 235 KETK--TXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSD 292
           + +K  +             N+WHSSPL+  K+ KD  D                ESNS 
Sbjct: 363 RYSKPVSQALNPRPTNHSAGNIWHSSPLEPIKSGKDGKDA---------------ESNSF 407

Query: 293 NTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------SGAFP 346
               +LPRPS       Q         +   R AFSGPL  +P S KP+      SGAF 
Sbjct: 408 YG--RLPRPSTTDTHHHQ--------QQAAGRHAFSGPL--RPSSTKPITMADSYSGAFC 455

Query: 347 RLPMP--------QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKSSR---V 395
            LP P          +SP+             R++ELHELPRPPG  +  P ++     V
Sbjct: 456 PLPTPPVLQSHPHSSSSPRVSPTASPPPASSPRLNELHELPRPPGHFAPPPRRAKSPGLV 515

Query: 396 GHSAPLVFRNPDHPAAN-KFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTK 454
           GHSAPL   N +        PS  +  ASPLP PP++V RS+SIPS NQR ++       
Sbjct: 516 GHSAPLTAWNQERSTVTVAVPSATNIVASPLPVPPLVVPRSYSIPSRNQRVVS------- 568

Query: 455 YLDTRQIPEKVEVAASPPLTPIS 477
               R +  + ++ ASPPLTP+S
Sbjct: 569 ---QRLVERRDDIVASPPLTPMS 588