Create a multiple sequence alignment

Build your own multiple sequence alignment

The multiple sequence alignment is built from the proteins of one or more protein families, narrowed down by a number of protein filters (currently only a species filter is available).

The first step towards building your own multiple sequence alignment is selecting one or more protein families from the tree. You are free to select whatever you want; however, the quality of the alignment will be best if you select families that are close together in the tree.
Selected protein families appear on the right side of the page. From there you can remove any family you selected by mistake, or submit the selected families, which takes you to the next step of the process: selecting the filters.

After you have selected your protein families, the filter section is shown. Using these filters you can determine which proteins and which residue positions are included in the alignment. More information about how the filters work and how to use them is provided when you need it.

Note: Due to computational restrictions it is not possible to generate very large alignments. The upper limit lies around 175,000 residues. This is approximately equivalent to 850 full proteins, or more if you select specific residues to be aligned. If you select a bigger set, you will get a warning telling you that you should limit your selection.
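
The size check described in this note can be sketched as follows. This is only an illustration: the limit is the approximate figure quoted above, and the function names and the positions_per_protein parameter are hypothetical, not part of the site.

```python
# Approximate upper limit quoted in the note above (the real limit is "around" this).
MAX_RESIDUES = 175_000


def alignment_size(protein_lengths, positions_per_protein=None):
    """Return the total number of residues that would be aligned.

    protein_lengths: sequence lengths of the selected proteins.
    positions_per_protein: hypothetical parameter; if given, only that many
    residue positions are taken from each protein (capped at its length).
    """
    if positions_per_protein is None:
        return sum(protein_lengths)
    return sum(min(length, positions_per_protein) for length in protein_lengths)


def fits_limit(protein_lengths, positions_per_protein=None, limit=MAX_RESIDUES):
    """True if the selection stays within the limit, False if it should be reduced."""
    return alignment_size(protein_lengths, positions_per_protein) <= limit


if __name__ == "__main__":
    selection = [350] * 600          # 600 proteins of 350 residues each
    print(fits_limit(selection))     # full proteins: 210,000 residues
    print(fits_limit(selection, positions_per_protein=200))  # 120,000 residues
```

Selecting specific residue positions lowers the per-protein count, which is why more proteins fit under the same limit in that case.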

When both the protein families and the filters have been submitted, a submit button appears that submits the alignment job. Please be patient: some jobs can take quite a while.

Step 1: Select the proteins to align

Protein search help

This search form contains a number of fields, all of which are optional (although at least one field is needed to perform a search). The search terms you enter are 'ANDed' together, meaning that the proteins that are found comply with ALL the terms you have specified.
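
As a rough illustration of how 'ANDed' terms behave, the sketch below filters a small in-memory list. The record layout and function names are assumptions made for this example, and using a case-insensitive substring match on every field is a simplification; the actual search runs on the server and may treat some fields differently.

```python
# Two example records; the field names are hypothetical.
PROTEINS = [
    {"id": "adrb2_human", "gene": "ADRB2", "species": "Homo sapiens",
     "description": "Beta-2 adrenergic receptor"},
    {"id": "opsd_bovin", "gene": "RHO", "species": "Bos taurus",
     "description": "Rhodopsin"},
]


def matches_all(protein, terms):
    """True only if every non-empty term is found in its corresponding field."""
    return all(
        value.lower() in str(protein.get(field, "")).lower()
        for field, value in terms.items()
        if value  # empty fields are optional and therefore ignored
    )


def search(proteins, terms):
    """Return the proteins that comply with ALL the terms that were filled in."""
    if not any(terms.values()):
        raise ValueError("at least one search field is needed")
    return [p for p in proteins if matches_all(p, terms)]


if __name__ == "__main__":
    hits = search(PROTEINS, {"species": "homo sapiens", "description": "adren"})
    print([p["id"] for p in hits])
```

Adding a term can only shrink the result list: a protein that fails any single term is dropped.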

IDs and accession codes

The identifiers in this database correspond to the ID fields in UniProt records and records from other databases. These look like 'adrb2_human' and 'opsd_bovin'. To search for entries like 'P07550' use the accession code field.
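
To tell the two kinds of term apart, a small heuristic like the one below may help. The accession pattern is the one UniProt documents for its accession numbers; the underscore rule for IDs is only an assumption drawn from the two examples above, and the function name is hypothetical.

```python
import re

# Accession number format as documented by UniProt (e.g. P07550).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def search_field_for(term):
    """Guess which search field a term belongs in (a heuristic, not a guarantee)."""
    if ACCESSION_RE.fullmatch(term.upper()):
        return "accession"
    if "_" in term:
        # Assumption: IDs look like 'name_species', as in the examples above.
        return "id"
    return "unknown"


if __name__ == "__main__":
    for term in ("adrb2_human", "opsd_bovin", "P07550"):
        print(term, "->", search_field_for(term))
```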

Gene names

A large number of proteins are linked to a gene, which you can search for using the 'gene' box. Searching is case insensitive.

Descriptions

Most proteins have a short description of their function. These descriptions can be searched by using the description field. Searching is case insensitive and wildcards are used on both sides of your query, so 'rhodop' and 'opsin' will both match a description containing the word 'Rhodopsin'.

Species

You can search for proteins from certain species by entering (part of) the scientific name of the species. If you would like to search for human proteins, entering 'homo sapiens' in the species search field would do just that.

Families

Using the families search box you can search for proteins that have a specific family name. This search is done on all the subfamily levels: searching for 'amine' will return all proteins in the amine protein family (as well as proteins from protein families which have amine in their family name). Searching is case insensitive and wildcards are used on both sides of your query, so 'adrenoceptor' will match all the beta-adrenoceptor and alpha-adrenoceptor families.
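
The family search described above can be pictured as a case-insensitive substring match tried against every level of a protein's family path. The sketch below assumes a simple list-of-levels layout and uses illustrative family names; neither reflects the database's actual schema.

```python
# Hypothetical family paths, from the top-level family down to the subfamily.
PROTEIN_FAMILIES = {
    "adrb2_human": ["Amine", "Adrenoceptors", "Beta-adrenoceptors"],
    "ada1a_human": ["Amine", "Adrenoceptors", "Alpha-adrenoceptors"],
    "opsd_bovin": ["Opsins", "Vertebrate opsins"],
}


def family_matches(query, family_path):
    """True if the query occurs (ignoring case) in the name of any family level."""
    q = query.casefold()
    return any(q in level.casefold() for level in family_path)


def find_by_family(query, families=PROTEIN_FAMILIES):
    """Return the IDs of all proteins whose family path matches the query."""
    return sorted(pid for pid, path in families.items() if family_matches(query, path))


if __name__ == "__main__":
    print(find_by_family("adrenoceptor"))  # matches both alpha and beta subfamilies
    print(find_by_family("amine"))         # matches on the parent level
```

Because every level is checked, a query that names a parent family also returns the proteins of all its subfamilies.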

Mutant data

Selecting the option 'has mutant data' will return only proteins for which there is mutant data available.

Structure data

Selecting the option 'has structure data' will return only proteins for which there is structure data available.

Oligomer data

Selecting the option 'has oligomer data' will return only proteins for which there is oligomerization data available.

Ligand binding data

Selecting the option 'has ligand data' will return only proteins for which there is ligand binding data available.

Select the proteins you would like to align by searching the database and selecting your protein.
To search, use the search box below.

Selected proteins will appear on the right, where you can edit the list or submit it for alignment.

 PROTEIN SEARCH 

Selected proteins