What is dbCAN3
dbCAN3 server is a web server for automated Carbohydrate-active enzyme ANnotation, funded by the NSF (DBI-1933521) and NIH (R01GM140370). Similar resources on the web include CAZy, CAT (obsolete), and CUPP. dbCAN3 server is an updated version of dbCAN (obsolete) and dbCAN2 (obsolete), and has the following new features (thanks to dbCAN users all over the world for their suggestions):
- dbCAN3 server allows users to predict glycan substrates for CAZymes by searching against dbCAN-sub, and for CAZyme gene clusters (CGCs) by using two approaches: searching against the PULs of dbCAN-PUL, and dbCAN-sub majority voting
- dbCAN3 server, like dbCAN2, allows submission of nucleotide sequences: prokaryotic genomes (fna file) or metagenome-assembled genomes (MAGs); for eukaryotic genomes, please still submit protein sequences (faa file)
- dbCAN3 server, like dbCAN2, integrates three state-of-the-art tools/databases for automated CAZyme annotation:
  - HMMER search for CAZyme family annotation vs. dbCAN CAZyme domain HMM database
  - DIAMOND search for BLAST hits in the CAZy database
  - HMMER search for CAZyme subfamily annotation vs. dbCAN-sub HMM database of CAZyme subfamilies (derived from eCAMI classification of CAZyDB families)
- dbCAN3 server can identify transcription factors (TFs), transporters (TCs), signal transduction proteins (STPs), and further CAZyme gene clusters (CGCs) using CGC-Finder if users submit faa+gff files or an fna file
- dbCAN3 server combines the results from the three tools and allows visualization of detailed results as tables/graphs
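As an illustration of what combining the results of the three tools can look like, here is a minimal Python sketch. It is not the server's actual implementation: the function name `combine_tool_calls`, the tool labels, the input layout, and the `min_tools` threshold are all assumptions made for this example, and the protein IDs and family labels are made up.

```python
from collections import defaultdict


def combine_tool_calls(calls_by_tool, min_tools=2):
    """Merge per-tool CAZyme calls and keep proteins supported by enough tools.

    calls_by_tool: dict mapping a tool label to {protein_id: set of family labels}.
    min_tools: hypothetical threshold (an assumption, not stated by dbCAN3).
    Returns {protein_id: {tool label: set of family labels}}.
    """
    support = defaultdict(dict)
    for tool, calls in calls_by_tool.items():
        for protein_id, families in calls.items():
            if families:  # ignore proteins with no hit from this tool
                support[protein_id][tool] = set(families)
    # Keep only proteins called by at least `min_tools` different tools.
    return {
        protein_id: per_tool
        for protein_id, per_tool in support.items()
        if len(per_tool) >= min_tools
    }


# Made-up example input: one dict per tool.
example_calls = {
    "hmmer_dbcan": {"prot1": {"GH5"}, "prot2": {"GT2"}},
    "diamond_cazy": {"prot1": {"GH5"}},
    "hmmer_dbcan_sub": {"prot1": {"GH5_e1"}, "prot3": set()},
}
consensus = combine_tool_calls(example_calls)
print(sorted(consensus))
```

Keeping the per-tool labels side by side, rather than collapsing them into one label, mirrors the idea of showing detailed results in a table where each tool has its own column.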
dbCAN3 server will be updated once a year to use the most recent CAZy database, dbCAN HMMdb, and dbCAN-sub HMMdb.
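The dbCAN-sub majority voting mentioned in the feature list can be pictured with a small Python sketch. The exact voting rule used by dbCAN3 is described in the dbCAN-seq update paper and is not reproduced here; the version below is a plain plurality vote in which ties yield no prediction, and both of those choices are assumptions. The function name `majority_substrate` and the substrate labels are hypothetical example values.

```python
from collections import Counter


def majority_substrate(cazyme_substrates):
    """Return the most frequent substrate among the CAZymes of one CGC.

    cazyme_substrates: list with one substrate label (or None) per CAZyme.
    Returns None when no CAZyme has a substrate or when the top vote is tied
    (the tie handling is an assumption made for this sketch).
    """
    votes = Counter(s for s in cazyme_substrates if s)
    if not votes:
        return None
    ranked = votes.most_common()
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # tie between the two most frequent substrates
    return top_label


# Made-up example: substrates inferred for four CAZymes in one gene cluster.
prediction = majority_substrate(["xylan", "xylan", "pectin", None])
print(prediction)
```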
News
- 8/2/2023: dbCAN HMMdb v12 is released (based on CAZyDB 7/26/2023). Now the HMMdb contains 783 CAZyme HMMs (470 family HMMs + 3 bacterial cellulosome HMMs + 2 fungal cellulosome HMMs + 308 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 2,816,770 fasta sequences. See readme for details.
- 05/01/2023: dbCAN3 paper, featuring substrate prediction, is published in Nucleic Acids Research
- 02/11/2023: dbCAN2 is updated to dbCAN3 with glycan substrate prediction functions: 1. CAZyme substrate prediction based on dbCAN-sub; 2. CGC substrate prediction based on dbCAN-PUL searching and dbCAN-sub majority voting. For CGC substrate prediction, please see our dbCAN-seq update paper for details. With these new functions (esp. the dbCAN-sub search), dbCAN3 now takes longer to return results. Please be patient!
- 8/9/2022: dbCAN HMMdb v11 is released (based on CAZyDB 8/7/2022). Now the HMMdb contains 699 CAZyme HMMs (452 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 2,428,817 fasta sequences. See readme for details.
- 06/29/2022: dbCAN-sub (an HMMdb built from eCAMI subfamilies that allows EC and substrate inferences) is now deployed on the dbCAN meta server and replaces eCAMI (which consumed too much RAM and was too slow).
- 12/21/2021: updated run_dbcan python package to V3.0.1. Major updates include: (1) replaced Hotpep with eCAMI (recommended by an evaluation study); (2) added EC number in the overview output file (inferred by eCAMI); (3) formatted cgc.out to make it more readable. The web server has been updated accordingly.
- 10/03/2021: updated CAZyDB for Diamond search. Now this file contains 2,161,786 fasta sequences. The old CAZyDB fasta file CAZyDB.07292021.fa was deleted from the download folder. See readme for details.
- 8/17/2021: dbCAN HMMdb v10 is released (based on CAZyDB 7/26/2020). Now the HMMdb contains 692 CAZyme HMMs (445 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,776,583 fasta sequences. See readme for details.
- 04/28/2021: We received an NIH R01 award to continue the development of dbCAN family tools
- 8/04/2020: dbCAN HMMdb v9 is released (based on CAZyDB 7/30/2020). Now the HMMdb contains 681 CAZyme HMMs (434 family HMMs + 3 cellulosome HMMs + 244 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,716,043 fasta sequences. See readme for details.
- 04/21/2020: dbCAN2 Hotpep PPR patterns updated to the most recent release of CAZyDB (2019). Missing group EC# files for families were also added.
- 10/07/2019: run_dbcan python package is released. In addition to running pip install run-dbcan==2.0.0 to download it, you should install Miniconda or Anaconda to install the dependency packages (conda install -c bioconda diamond hmmer=3.1b2 prodigal fraggenescan). Then use a single command to download and compress all the related databases from the Download section. Read more on run_dbcan2.
- /08/2019: dbCAN HMMdb v8 is released (based on CAZyDB 7/26/2019). Now the HMMdb contains 641 CAZyme HMMs (421 family HMMs + 3 cellulosome HMMs + 217 subfamily HMMs). The CAZyDB for Diamond search is also updated, containing in total 1,386,849 fasta sequences. See readme for details.
- 4/01/2019: dbCAN2 has a docker version written by Haidong Yi.
- 3/19/2019: dbCAN2 web server has moved to UNL and has a new URL
- 1/20/2019: dbCAN2 standalone package is available on github; if you prefer to still use the old hmmscan way, the data are available in the download page
- 8/25/2018: dbCAN HMMdb v7 is released (based on CAZyDB 7/31/2018): HMMs of 15 new families were added (AA14, AA15, CBM82, CBM83, GH146, GH147, GH148, GH149, GH150, GH151, GH152, GH153, GT105, GT106, PL28), and the GT2 family HMM is now replaced with 8 Pfam HMMs (GT2_Chitin_synth_1, GT2_Chitin_synth_2, GT2_Glycos_transf_2, GT2_Glyco_tranf_2_2, GT2_Glyco_tranf_2_3, GT2_Glyco_tranf_2_4, GT2_Glyco_tranf_2_5, GT2_Glyco_trans_2_3)
- 5/2/2018: dbCAN2 meta server paper is accepted for publication in Nucleic Acids Research
- 8/15/2017: Tanner and Le Huang begin to work on dbCAN2 meta server
- 7/1/2017: Yanbin is awarded the NSF CAREER grant for CAZyme bioinformatics research