help

phyloT automatically generates phylogenetic trees based on the NCBI taxonomy or the Genome Taxonomy Database (GTDb). NCBI taxonomy attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, and phyloT generated trees which use NCBI as a source simply represent the current taxonomic structure of the NCBI taxonomy database. They are not "proper" phylogenetic trees, and do not contain branch lengths or clade support values. On the other hand, phyloT trees which use GTDb as a source are pruned versions of the GTDb reference trees (either Bacterial or Archaeal), and therefore contain branch lengths and clade support values.

Please check the data source web sites (NCBI Taxonomy and GTDb) for detailed information on their methodology.

How to use phyloT

After parsing the complete NCBI taxonomy or GTDb, phyloT will generate a pruned tree in the selected format, based on the tree elements you provide. Tree elements can be typed or pasted into the provided input box ("Tree elements"), or uploaded in a plain text file. Depending on the taxonomy source selected, you can use any combination of the following element types, separated by commas or newlines:

NCBI Taxonomy:

NCBI taxonomy IDs: they will appear as leaves in the generated tree, regardless of their taxonomic class, unless you provide 'overlapping' IDs, i.e. some IDs are child nodes of other IDs in the input. In that case, some IDs may appear as internal tree nodes. Classes can be mixed freely, i.e. one ID can represent a species and another a phylum.
NCBI scientific names: make sure you provide the proper NCBI scientific name (including proper capitalization). For example, human should be specified as "Homo sapiens". Otherwise, they function exactly the same as NCBI taxonomy IDs. Please note that names containing special characters are excluded from our database; use numeric taxonomy IDs for those instead. You can also use the taxonomy names search box, described below.
UniProt species identification codes: codes should be in capital letters, exactly as listed in UniProt speclist.txt. They will be replaced by the corresponding NCBI taxonomy ID or scientific name.
UniProt protein IDs or ACCs: proteins will appear as leaves in the generated tree, while the clades will be NCBI taxonomy IDs. If multiple IDs map to the same species, they will all be grouped into a multifurcating clade.
NCBI GenBank locus ACCs: they function the same as UniProt IDs. Please omit the version number and use only the base ACC (e.g. use KEJ01913 and not KEJ01913.1).
RefSeq protein IDs: they function the same as UniProt IDs. Only RefSeq IDs which can be mapped directly to a species can be used. Please omit the version number and use only the base ID (e.g. use WP_076177537 and not WP_076177537.1).
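
If your accession list comes from another tool and still carries version suffixes, they can be removed before pasting. The following is a minimal sketch, not part of phyloT; the function name `strip_version` is hypothetical. It should be applied only to GenBank locus ACCs and RefSeq protein IDs, since GTDb genome IDs need to keep their version number.

```python
import re

# Matches a trailing ".<digits>" version suffix, e.g. ".1" in "KEJ01913.1".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(accession: str) -> str:
    """Return the base accession without its version suffix (hypothetical helper)."""
    return _VERSION_SUFFIX.sub("", accession.strip())


base_ids = [strip_version(a) for a in ["KEJ01913.1", "WP_076177537.1", "KEJ01913"]]
print(base_ids)
```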

Genome Taxonomy Database:

GTDb taxonomy names: they function the same way as NCBI taxonomy IDs or scientific names described above. You can omit the class prefix and the two underscores (e.g. either s__Escherichia coli or Escherichia coli can be used).
GTDb genome IDs: these should be fully qualified RefSeq or GenBank genome accession numbers (e.g. GCF_000005845.2)
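
Because both the prefixed and unprefixed forms of a GTDb name are accepted, a list that mixes the two can be brought to a single form before submission. This is only an illustrative sketch; `strip_rank_prefix` is a hypothetical helper, and the prefix letters (d, p, c, o, f, g, s) are the usual GTDb rank prefixes from domain down to species.

```python
import re

# One rank letter followed by two underscores, e.g. "s__" or "g__".
_RANK_PREFIX = re.compile(r"^[dpcofgs]__")


def strip_rank_prefix(name: str) -> str:
    """Remove a leading GTDb rank prefix if one is present (hypothetical helper)."""
    return _RANK_PREFIX.sub("", name.strip())


names = [strip_rank_prefix(n) for n in ["s__Escherichia coli", "Escherichia coli"]]
print(names)
```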

Click any of the Example buttons below the main input field to see various example tree element combinations.
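
To illustrate the "separated by commas or newlines" rule, here is a small sketch of how a mixed element list could be split into individual elements. It is an assumption about reasonable behaviour, not phyloT's actual parser, and `split_tree_elements` is a hypothetical name.

```python
import re


def split_tree_elements(raw: str) -> list[str]:
    """Split raw input on commas or newlines, dropping empty entries."""
    parts = re.split(r"[,\n]", raw)
    # strip() also removes stray carriage returns from Windows line endings.
    return [p.strip() for p in parts if p.strip()]


elements = split_tree_elements("9606, Mus musculus\nECOLI\n\nKEJ01913")
print(elements)
```

Scientific names keep their internal spaces, because only commas and newlines act as separators.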

Searching for a taxonomic name

Use the 'Search taxonomy' box to quickly search the full database of NCBI taxonomy or GTDb names. Each result will show the full scientific name with its taxonomic class and taxonomy ID. Clicking on any result will append its tax ID to the current tree elements.

Including full clades

phyloT supports generation of full clade trees for any taxonomy ID or scientific name. To include the complete sub-tree for a taxonomy ID (or scientific name), simply append a vertical line followed by the keyword "subtree" to the input element.

For example:

Mammalia|subtree
7147|subtree

The example input above would generate a tree containing 2 clades: Mammalia and Diptera (NCBI Tax ID 7147). All NCBI nodes belonging to these clades would be included.
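
The "|subtree" syntax can be sketched as a simple split on the last vertical line. The helper below is hypothetical and only shows how an element and its keyword are told apart; it does not reflect phyloT's internal code.

```python
def parse_element(element: str) -> tuple[str, bool]:
    """Return (identifier, include_full_clade) for one tree element."""
    name, sep, keyword = element.strip().rpartition("|")
    if sep and keyword.strip() == "subtree":
        return name.strip(), True
    # No "|subtree" suffix: the element is used as a single node.
    return element.strip(), False


for item in ["Mammalia|subtree", "7147|subtree", "Homo sapiens"]:
    print(parse_element(item))
```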

Filtering nodes

In the example above, the final tree would contain ~33 000 nodes, making it hard to manipulate or visualize. phyloT offers two mechanisms for filtering nodes from generated trees:

Interrupting at a specific class
Using the "Interrupt at" selector, you can specify a taxonomic class where the tree generation will be stopped. For example, if "Genus" is selected, leaves of the included clades will correspond to nodes with taxonomic class genus. Note that this applies only to full clades (i.e. nodes appended with keyword "|subtree"). You can still include additional elements corresponding to "lower" taxonomic classes, and these will be present in the tree.
For example:
```
Mammalia|subtree
7147|subtree
Escherichia coli
```
In the example above, with "Interrupt at" set to Genus, Escherichia coli would still appear as a regular leaf in the tree, even though its class is species (i.e. 'lower' than genus).
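
Conceptually, interrupting at a class means that descent into a full clade stops once a node of that class is reached. The sketch below shows the idea on a toy tree; the `Node` class and `interrupt_at` function are hypothetical and say nothing about how phyloT implements the option.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    rank: str
    children: list["Node"] = field(default_factory=list)


def interrupt_at(node: Node, stop_rank: str) -> Node:
    """Copy the tree, turning every node of rank `stop_rank` into a leaf."""
    if node.rank == stop_rank:
        return Node(node.name, node.rank)
    kept = [interrupt_at(child, stop_rank) for child in node.children]
    return Node(node.name, node.rank, kept)


tree = Node("Hominidae", "family", [
    Node("Homo", "genus", [Node("Homo sapiens", "species")]),
    Node("Pan", "genus", [Node("Pan troglodytes", "species"),
                          Node("Pan paniscus", "species")]),
])
pruned = interrupt_at(tree, "genus")
print([child.name for child in pruned.children])
```
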
Removing nodes matching a text pattern
You can remove all nodes whose scientific names match a certain pattern. Simply type the words and phrases which should be filtered into the Filtering field, separated with commas.

For example:
```
Ascomycota|subtree
```
In the example above, the full Ascomycota phylum tree would contain ~55 000 nodes. However, if various unclassified species and environmental samples are removed, only ~3500 nodes remain. This can be accomplished by entering, for example, "environmental_sample,unclassified" into the Filtering field.
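
The effect of the Filtering field can be pictured as dropping every node whose name contains one of the listed words. The sketch below assumes case-insensitive substring matching, which is a guess and not documented behaviour; `filter_names` and the sample names are hypothetical.

```python
def filter_names(names: list[str], patterns: list[str]) -> list[str]:
    """Keep only names that contain none of the given patterns."""
    # ASSUMPTION: matching is a case-insensitive substring test.
    lowered = [p.strip().lower() for p in patterns if p.strip()]
    return [n for n in names if not any(p in n.lower() for p in lowered)]


sample = [
    "Saccharomyces cerevisiae",
    "unclassified Ascomycota",
    "Ascomycota environmental sample",
]
kept = filter_names(sample, "environmental,unclassified".split(","))
print(kept)
```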

Genome Taxonomy Database specific options

GTDb taxonomy covers Bacteria and Archaea only, and phyloT uses their respective reference trees when creating a pruned version. Since these trees are independent, you have to specify which one to use, by selecting the correct entry under Source taxonomy.

Branch lengths and node support values

As opposed to NCBI taxonomy, the GTDb trees are proper phylogenetic reference trees, containing branch lengths and clade support values (bootstraps). If you want to include these values in the phyloT generated tree, set the Support/BRL option to Yes.

Including genome IDs

If you use genome IDs (RefSeq or GenBank accession numbers) to create the tree, these will normally be mapped to their corresponding species name (which will be shown as a leaf in the tree). If you select the option to include the genome IDs in the tree, these will be added as leaves, while their parent node will be the species name.

Output options

Node identifiers (only for NCBI taxonomy)

Nodes of the generated tree can be represented by four identifier types:

Scientific names: nodes will be NCBI scientific names, e.g. Homo sapiens
NCBI Taxonomy IDs: nodes will be NCBI taxonomy IDs, e.g. 9606
Name|Tax ID: nodes will contain both NCBI scientific name and NCBI taxonomy IDs (separated with a vertical line), e.g. Homo sapiens|9606
Tax ID|Name: nodes will contain both NCBI taxonomy ID and NCBI scientific name (separated with a vertical line), e.g. 9606|Homo sapiens

When setting the identifier format to "NCBI Taxonomy IDs", all internal nodes of the tree will be prefixed with keyword "INT" (for example, node 7147 will be labeled as INT7147). This is done to prevent various tree parsers from misidentifying these IDs as tree support values (bootstraps).
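
When post-processing such a tree in your own scripts, the "INT" prefix usually has to be removed again to recover the numeric taxonomy ID. A minimal sketch follows; `tax_id_from_label` is a hypothetical helper that expects labels from the "NCBI Taxonomy IDs" identifier format only.

```python
import re

# Internal node labels look like "INT7147"; leaf labels are plain digits.
_INT_LABEL = re.compile(r"^INT(\d+)$")


def tax_id_from_label(label: str) -> int:
    """Return the numeric taxonomy ID for a leaf or internal node label."""
    match = _INT_LABEL.match(label)
    return int(match.group(1)) if match else int(label)


print(tax_id_from_label("INT7147"), tax_id_from_label("9606"))
```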

Collapsing internal nodes (only for NCBI taxonomy)

Due to the nature of the underlying NCBI taxonomy data and depending on the provided tree elements, generated trees will often have many internal nodes which have only one child. phyloT therefore offers an option to remove such nodes, by setting the "Internal nodes" option to "Collapsed".
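
What collapsing means can be shown on a toy tree: every chain of single-child internal nodes is skipped, so that each remaining internal node has at least two children. This is an illustrative sketch under that assumption; `Node` and `collapse` are hypothetical, and phyloT may treat special cases (such as the root) differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)


def collapse(node: Node) -> Node:
    """Return a copy of the tree without single-child internal nodes."""
    # Walk down past any chain of nodes that have exactly one child.
    while len(node.children) == 1:
        node = node.children[0]
    return Node(node.name, [collapse(child) for child in node.children])


# A -> B -> (C, D -> E): A and D each have only one child.
tree = Node("A", [Node("B", [Node("C"), Node("D", [Node("E")])])])
collapsed = collapse(tree)
print(collapsed.name, [child.name for child in collapsed.children])
```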

Forcing creation of binary trees (only for NCBI taxonomy)

Many nodes in NCBI taxonomy are highly polytomic (have many child branches). If your tree visualization or analysis software requires a strictly binary tree, where each node must have exactly two children, you can set the "Polytomy" option to "No". phyloT will then randomly combine multifurcating nodes into separate bifurcating structures and introduce additional ("fake") internal nodes as required, producing a proper binary tree.
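
The random resolution described above can be sketched as repeatedly pairing two randomly chosen children under a new unnamed node until only two children remain. This is one possible way to do it, not necessarily the procedure phyloT uses; `Node`, `resolve_polytomies` and the helper functions are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)


def resolve_polytomies(node: Node, rng: random.Random) -> Node:
    """Return a copy in which no node has more than two children."""
    children = [resolve_polytomies(child, rng) for child in node.children]
    while len(children) > 2:
        first = children.pop(rng.randrange(len(children)))
        second = children.pop(rng.randrange(len(children)))
        # Unnamed "fake" internal node joining the two picked children.
        children.append(Node("", [first, second]))
    return Node(node.name, children)


def leaf_names(node: Node) -> list[str]:
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaf_names(child)]


def is_binary(node: Node) -> bool:
    return len(node.children) in (0, 2) and all(is_binary(c) for c in node.children)


polytomy = Node("root", [Node(n) for n in "abcde"])
binary = resolve_polytomies(polytomy, random.Random(0))
print(is_binary(binary), sorted(leaf_names(binary)))
```

Because the pairing is random, the added internal nodes carry no phylogenetic meaning; only the leaf set is preserved.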

Tree format and file name

In addition to the commonly used Newick format, you can also download the generated trees in NEXUS or phyloXML formats. If a file name is not provided when generating a tree, a randomly generated string will be used.

Visualizing generated trees in iTOL

If you only want to visualize the generated tree, phyloT offers a direct link to iTOL: interactive Tree Of Life. Simply click the "Visualize in iTOL" button. iTOL is an online phylogenetic tree visualization tool, offering powerful annotation features. Check the iTOL website for more details.

Uploading generated trees to your iTOL account

If you have an iTOL: interactive Tree Of Life account, you can upload generated trees directly to any of your iTOL projects. Simply click the "Upload to your iTOL account" button, and a list of your iTOL workspaces and projects will be shown. Select the project where you want to upload the tree and click the Upload to selected project button. Note that you must first get the iTOL API key (from your iTOL user info page), and save it in your phyloT account, by clicking the Update iTOL API key button.

User account

Trees created from more than 10 elements require either an active phyloT subscription, institutional access or a tree generation token. Please create a phyloT account first, and then visit your personal info page to view all available options.

Tree generation tokens:

If you don't use phyloT often, tree generation tokens are the simplest solution. They do not expire, and can be purchased directly, or simply obtained by using our sponsor's adverts. Each generated tree requires one token, but you can freely change the phyloT tree options and file format, as long as the tree elements remain the same. Each tree (defined by the tree elements used to generate it) remains freely accessible for 6 months. You can also share our sponsor advert links with your friends. Any tokens generated through those links will be credited to your phyloT account.

Subscription:

phyloT offers monthly or annual subscriptions for unrestricted access. Click the 'Start/extend subscription' button on your personal info page to display the available options. Note that all phyloT subscriptions are non-recurring and will never be extended automatically. Once your subscription expires, you will have to manually reactivate it.

Institutional access:

If your institution currently has a valid phyloT license, you either have direct unrestricted access to phyloT via your IP address, or you were given a phyloT access key. To use an access key, click on the 'Provide access key' button on your personal info page to activate your account. If you are interested in this mode of access, please ask your librarian or IT department to obtain a license.