TaxoTagger, 2023-present
A Deep Learning and Semantic Search based classifier for DNA Barcode Identification
Project: AID: Artificial Intelligence for DNA barcode identification, a Small-Scale Initiatives Machine Learning project, awarded by the Netherlands eScience Center
Github: https://github.com/MycoAI/TaxoTagger
Github: https://github.com/MycoAI/taxotagger-webapp
Developers: Cunliang Geng, Sonja Georgievska, Duong Vu
MycoAI, 2024
A software package, written in Python, for exploring various AI models including MycoAI-CNN and MycoAI-BERT for (fungal) eDNA identification.
Github: https://github.com/MycoAI/MycoAI
Developers: Luuk Romeijn, Andrius Bernatavicius, Duong Vu
Paper: Romeijn, L., Bernatavicius, A., & Vu, D. (2024). MycoAI: Fast and accurate taxonomic classification for fungal ITS sequences. Molecular Ecology Resources, 24, e14006. https://doi.org/10.1111/1755-0998.14006
Dnabarcoder web interface, 2023
Webpage: http://dnabarcoder.org/
Developers: Ruby van der Holst, Duong Vu
Dnabarcoder, 2022
A software package for computing similarity cut-offs for different (fungal) clades and for classifying eDNA sequences
Github: https://github.com/vuthuyduong/dnabarcoder
Developers: Duong Vu
Paper: Vu, D., Nilsson, R. H., & Verkley, G. J. M. (2022). Dnabarcoder: An open-source software package for analysing and predicting DNA sequence similarity cutoffs for fungal sequence identification. Molecular Ecology Resources, 22, 2793–2809. https://doi.org/10.1111/1755-0998.13651
WI SNP variant calling, 2021
A pipeline for SNP variant calling from whole-genome sequencing data
Github: https://github.com/vuthuyduong/SNPanalysis
Developers: Duong Vu
This pipeline was listed in the following community paper comparing the 14 pipelines developed by different labs around the world for SNP variant calling analysis:
Paper: Xiao Li; José F. Muñoz; Lalitha Gade; Silvia Argimon; Marie-Elisabeth Bougnoux; Jolene R. Bowers; Nancy A. Chow; Isabel Cuesta; Rhys A. Farrer; Corinne Maufrais; Juan Monroy-Nieto; Dibyabhaba Pradhan; Jessie Uehling; Duong Vu; Corin A. Yeats; David M. Aanensen; Christophe d’Enfert; David M. Engelthaler; David W. Eyre; Matthew C. Fisher; Ferry Hagen; Wieland Meyer; Gagandeep Singh; Ana Alastruey-Izquierdo; Anastasia P. Litvintseva; Christina A. Cuomo (2023). Comparing genomic variant identification protocols for fungi. Microbial Genomics. https://doi.org/10.1099/mgen.0.000979
Deep learning based fungal classifiers, 2020
A framework for applying deep learning models, including Convolutional Neural Networks (CNN) and Deep Belief Networks (DBN), to fungal classification, and for comparing their outputs with traditional methods such as BLAST and RDP
Github: https://github.com/vuthuyduong/fungiclassifiers
Developers: Duong Vu
Paper: Vu, D., Groenewald, M. & Verkley, G. Convolutional neural networks improve fungal classification. Sci Rep 10, 12628 (2020). https://doi.org/10.1038/s41598-020-69245-y
Fast Multi-Level Clustering, 2018
fMLC is the official implementation of the MultiLevel Clustering (MLC) algorithm described in Vu D. et al. (2014), used to cluster massive numbers of DNA sequences. It is written in C++ and supports multi-threaded parallelism. fMLC is also integrated with an interactive web-based tool called DIVE, which visualizes the resulting DNA sequence-based embeddings in 2D or 3D. The work is financially supported by the Westerdijk Fungal Biodiversity Institute and the Netherlands eScience Center.
Project: AID: Artificial Intelligence for DNA barcode identification, a Small-Scale Initiatives Machine Learning project, awarded by the Netherlands eScience Center
Github: https://github.com/vuthuyduong/fMLC
Developers: Szaniszlo Szoke, Sonja Georgievska, Duong Vu
Paper: D Vu, S Georgievska, S Szoke, A Kuzniar, V Robert, fMLC: fast multi-level clustering and visualization of large molecular datasets. Bioinformatics, Volume 34, Issue 9, May 2018, Pages 1577–1579, https://doi.org/10.1093/bioinformatics/btx810