Tanbir Ahmed, Ph.D.

Some AI-technology resources

Over the years, while exploring Artificial Intelligence and Machine Learning, I have encountered many AI frameworks, libraries, and tools. Some have become integral to my work and I use them frequently; others I have explored only occasionally, usually when a project's challenges or requirements called for them. This page is a curated collection of links to these resources.

Having spent several years experimenting with cutting-edge AI tools and technologies, I understand how overwhelming it can be to keep track of the vast array of resources available. Whether it is a generative AI model, a classical machine learning framework, or a tool for data processing and visualization, each plays a role in shaping AI solutions. I created this page not just as a personal reference but also as a resource for fellow AI enthusiasts, researchers, and developers seeking clarity amid the noise.

Generative AI technologies

Text Generation Technologies
GPT (Generative Pre-trained Transformer) - Natural language understanding and generation from OpenAI. (Paper)
BERT (Bidirectional Encoder Representations from Transformers) - Contextual text understanding from Google Research. (Paper)
T5 (Text-to-Text Transfer Transformer) - Text-to-text problem-solving framework from Google Research. (Paper)
BLOOM - Open multilingual language generation model from the BigScience workshop, coordinated by Hugging Face. (Paper)
Cohere - Language generation and retrieval-augmented generation.
Claude - AI assistant for dialogue and creative writing from Anthropic.

Image Generation Technologies
Imagen 3 - Generative text-to-image model from Google DeepMind.
DALL·E - Text-to-image generation from OpenAI.
Stable Diffusion - Open-source text-to-image generator from Stability AI.

Video Generation Technologies
Veo - Generative video model from Google DeepMind.
Movie Gen - Text-to-video model by Meta.
Gen-3 Alpha - Accessible tools for AI-powered image and video creation from Runway ML.
Imagen Video - High-definition video generation from text by Google Research.

Audio Generation Technologies
WaveNet - Natural text-to-speech synthesis from Google DeepMind.
Jukebox - Music generation from OpenAI.
VALL-E - Voice replication model from Microsoft Research.
Overdub - Custom voice cloning for content creation by Descript.
Flow Machines - AI-generated music by Sony.
Lyrebird - AI for ultra-realistic voice cloning by Descript.

3D and Animation Technologies
DreamFusion - Text-to-3D model creation.
MetaHuman Creator - High-quality, customizable 3D human avatars.

Code Generation Technologies
Codex - Code generation and debugging by OpenAI.
AlphaCode - Competitive programming problem-solving from Google DeepMind.
CodeT5 - A Transformer-based model for code-related tasks from Salesforce Research.

Scientific Applications
AlphaFold - Protein structure prediction from Google DeepMind.

Optimization and Design Technologies
Generative Design - AI-powered optimization tools for engineering and architecture by Autodesk.
Runway ML - AI tools for art creation and design.

Text-to-Anything Models
Gemini 2.0 - Generalist AI model from Google DeepMind.
Make-A-Video - Text-to-video generation by Meta AI. (Paper)
Phenaki - Long-form video generation from text by Google Research. (Paper)


NLP technologies

Language Models
See "Text Generation Technologies" above.

Text Classification and Sentiment Analysis
BERT for Classification - Pre-trained model fine-tuned for text classification.
RoBERTa (A Robustly Optimized BERT) - An optimized version of BERT for downstream tasks.
FastText - Lightweight and efficient text classification.

Named Entity Recognition (NER)
spaCy - Open-source library for NER, parsing, and tokenization.
Flair - Framework for NER using contextual embeddings.
BERT-based NER Models - Pre-trained BERT fine-tuned for NER tasks.

Question Answering
SQuAD (Stanford Question Answering Dataset) - Benchmark dataset for QA models.
BERT for QA - Fine-tuned BERT models for extracting answers from context.
AllenNLP - Framework for building QA systems.

Machine Translation
Google Neural Machine Translation (GNMT) - Neural network for language translation.
MarianMT - Multilingual translation models on Hugging Face.
DeepL - High-quality translation service.

Text Summarization
T5 for Summarization - Summarizes text using text-to-text framework.
PEGASUS - Specialized model for abstractive text summarization.
OpenNMT - Toolkit for machine translation and summarization.

Speech Recognition and Text-to-Speech
Whisper - Automatic speech recognition by OpenAI.
WaveNet - Text-to-speech synthesis by DeepMind.
Mozilla TTS - Open-source text-to-speech framework.
Kaldi - Toolkit for speech recognition.

Information Retrieval
Elasticsearch - Distributed search engine for text-based queries.
Weaviate - Vector search engine for semantic search.
Cohere Semantic Search - AI-powered semantic search APIs.

Sentiment Analysis
NLTK - Toolkit for text preprocessing and sentiment analysis.
RoBERTa for Sentiment Analysis - Fine-tuned sentiment classification models.
VADER - Sentiment analysis tool optimized for social media.

Text Generation
GPT for Text Generation - Generate coherent and creative text.
Hugging Face Transformers - API for fine-tuning and generating text.

Preprocessing libraries
NLTK - Toolkit for tokenization, stemming, and parsing.
spaCy - Industrial-strength NLP for large-scale projects.
Flair - Easy-to-use library for embeddings and preprocessing.