Tanbir Ahmed

Tanbir Ahmed, Ph.D.

Some AI-technology resources

Over the years, while exploring Artificial Intelligence and Machine Learning, I have encountered many AI frameworks, libraries, and tools. Some have become integral to my work and are used frequently; others I have explored only occasionally, usually driven by the challenges and requirements of a particular project. This page is a curated repository of links to those resources.

Having spent several years experimenting with cutting-edge AI tools and technologies, I understand how overwhelming it can be to keep track of the vast array of resources available. Generative AI models, classical machine learning frameworks, and tools for data processing and visualization each play a role in shaping AI solutions. I’ve created this page not just as a personal reference but also as a resource for fellow AI enthusiasts, researchers, and developers seeking clarity amid the noise.

Generative AI technologies

Text Generation Technologies
• BERT (Bidirectional Encoder Representations from Transformers) - Contextual text understanding from Google Research. (Paper)
• BLOOM - Open multilingual language generation model from the BigScience research workshop, coordinated by Hugging Face. (Paper)
• Claude - AI assistant for dialogue and creative writing from Anthropic.
• Cohere - Language generation and retrieval-augmented generation.
• GPT (Generative Pre-trained Transformer) - Natural language understanding and generation from OpenAI. (Paper)
• LLaMA (Large Language Model Meta AI) - Natural language understanding and generation from Meta AI. (Paper)
• T5 (Text-to-Text Transfer Transformer) - Text-to-text problem-solving framework from Google Research. (Paper)

Image Generation Technologies
• DALL·E - Text-to-image generation from OpenAI. (Paper)
• Imagen 3 - Generative text-to-image model from Google DeepMind. (Paper)
• Stable Diffusion - Open-source text-to-image generator from Stability AI. (Paper)

Video Generation Technologies
• Gen-3 Alpha - Video generation model from Runway, part of its tools for AI-powered image and video creation.
• Imagen Video - High-definition video generation from text by Google Research.
• Movie Gen - Text-to-video model by Meta.
• Veo - Generative video model from Google DeepMind.

Audio Generation Technologies
• WaveNet - Natural text-to-speech synthesis from Google DeepMind.
• Jukebox - Music generation from OpenAI.
• VALL-E - Voice replication model from Microsoft Research.
• Overdub - Custom voice cloning for content creation by Descript.
• Flow Machines - AI-generated music by Sony.
• Lyrebird - AI for ultra-realistic voice cloning by Descript.

3D and Animation Technologies
• DreamFusion - Text-to-3D model creation.
• MetaHuman Creator - High-quality, customizable 3D human avatars.

Code Generation Technologies
• Codex - Code generation and debugging by OpenAI.
• AlphaCode - Competitive programming problem-solving from Google DeepMind.
• CodeT5 - A Transformer-based model for code-related tasks from Salesforce Research.

Scientific Applications
• AlphaFold - Protein structure prediction from Google DeepMind.

Optimization and Design Technologies
• Generative Design - AI-powered optimization tools for engineering and architecture by Autodesk.
• Runway ML - AI tools for art creation and design.

Text-to-Anything Models
• Gemini 2.0 - Multimodal generalist AI model from Google DeepMind. (Paper)
• Make-A-Video - Text-to-video generation by Meta AI. (Paper)
• Phenaki - Long-form video generation from text by Google Research. (Paper)

NLP technologies

Language Models
• See "Text Generation Technologies" above.

Text Classification
• BERT for Classification - Pre-trained model fine-tuned for text classification.
• RoBERTa (Robustly Optimized BERT Pretraining Approach) - A BERT variant with an improved pretraining procedure, used for downstream tasks.
• fastText - Lightweight and efficient text classification.

Named Entity Recognition (NER)
• spaCy - Open-source library for NER, parsing, and tokenization.
• Flair - Framework for NER using contextual embeddings.
• BERT-based NER Models - Pre-trained BERT fine-tuned for NER tasks.

Question Answering
• SQuAD (Stanford Question Answering Dataset) - Benchmark dataset for QA models.
• BERT for QA - Fine-tuned BERT models for extracting answers from context.
• AllenNLP - Framework for building QA systems.

Machine Translation
• Google Neural Machine Translation (GNMT) - Neural network for language translation.
• MarianMT - Multilingual translation models on Hugging Face.
• DeepL - High-quality translation service.

Text Summarization
• T5 for Summarization - Summarizes text using the text-to-text framework.
• PEGASUS - Specialized model for abstractive text summarization.
• OpenNMT - Toolkit for machine translation and summarization.

Speech Recognition and Text-to-Speech
• Whisper - Automatic speech recognition by OpenAI.
• WaveNet - Text-to-speech synthesis by DeepMind.
• Mozilla TTS - Open-source text-to-speech framework.
• Kaldi - Toolkit for speech recognition.

Information Retrieval
• Elasticsearch - Distributed search engine for text-based queries.
• Weaviate - Vector search engine for semantic search.
• Cohere Semantic Search - AI-powered semantic search APIs.

Sentiment Analysis
• NLTK - Toolkit for text preprocessing and sentiment analysis.
• RoBERTa for Sentiment Analysis - Fine-tuned sentiment classification models.
• VADER - Sentiment analysis tool optimized for social media.

Text Generation
• GPT for Text Generation - Generates coherent and creative text.
• Hugging Face Transformers - Library for fine-tuning models and generating text.

Preprocessing Libraries
• NLTK - Toolkit for tokenization, stemming, and parsing.
• spaCy - Industrial-strength NLP for large-scale projects.
• Flair - Easy-to-use library for embeddings and preprocessing.