AI News & Library Index

A curated, link-backed index of AI library releases, model updates, and tooling changes. It is useful if you are studying modern ML/AI stacks or choosing what to learn next alongside Paath.online tutoring. Each entry links to its source; we summarize context rather than replace the documentation.

What is this index?

This page tracks new AI and ML library releases and AI news. Each library entry includes a short summary and a source link; each news item includes a longer summary and a source link. We update it regularly to help you stay current with Python AI frameworks, RAG tools, agent libraries, and model releases.

New library

  • 3 May 2026

    TradingAgents — multi-agent LLM financial trading framework (LangGraph)

    Tauric Research open-source Python framework: analyst/researcher/trader/risk agents, multiple LLM providers (OpenAI, Gemini, Claude, Grok, DeepSeek, Qwen, GLM, OpenRouter, Ollama, Azure), CLI + TradingAgentsGraph API, decision log and checkpoint resume. Apache-2.0; cite arXiv:2412.20138.

    Source: TauricResearch (GitHub)
  • 6 Apr 2026

    Docling — structured PDF & document conversion for AI/RAG pipelines

    Docling, from the docling-project organization, converts PDFs and office-style documents into structured JSON and Markdown outputs with layout awareness; it is widely used upstream of RAG and agent tooling. Official code and docs live on GitHub and docling-project.github.io.

    Source: docling-project (GitHub)
  • 6 Apr 2026

    OpenDataLoader — PDF ingestion focused on reproducible benchmarks

    OpenDataLoader publishes an open PDF data loader stack and benchmark materials aimed at comparable evaluation across parsers and retrieval setups; see opendataloader.org and the GitHub org for install and API details.

    Source: OpenDataLoader
  • 6 Apr 2026

    Scrapling — adaptive Python scraping with resilient selectors

    Scrapling (D4Vinci) is a Playwright-based scraping toolkit that emphasizes selectors that adapt when sites change; distributed on PyPI with documentation on Read the Docs. Teams should still respect robots.txt, site terms, and rate limits.

    Source: Scrapling (GitHub)
  • 5 Apr 2026

    Elasticsearch — Reciprocal Rank Fusion (RRF) retriever

    Official Elastic docs describe RRF for merging two or more child retrievers (for example kNN vector search plus lexical query) into a single ranking without requiring scores on a common scale. Parameters include rank_constant (default 60) and rank_window_size.

    Source: Elastic documentation
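
As a rough illustration of the entry above, the sketch below builds a search request body that nests a lexical retriever and a kNN retriever under an `rrf` retriever. The index field names `body` and `embedding`, the helper name `build_rrf_search_body`, and the chosen `rank_window_size`, `k`, and `num_candidates` values are assumptions for illustration; check the Elastic documentation for the options your version supports.

```python
import json


def build_rrf_search_body(query_text, query_vector, rank_constant=60,
                          rank_window_size=100, k=10, num_candidates=50):
    """Build a search body that fuses a lexical and a kNN retriever with RRF.

    Field names "body" and "embedding" are hypothetical; replace them with
    the text and dense_vector fields of your own index mapping.
    """
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Child retriever 1: lexical match query.
                    {"standard": {"query": {"match": {"body": query_text}}}},
                    # Child retriever 2: approximate kNN over a vector field.
                    {
                        "knn": {
                            "field": "embedding",
                            "query_vector": query_vector,
                            "k": k,
                            "num_candidates": num_candidates,
                        }
                    },
                ],
                "rank_constant": rank_constant,
                "rank_window_size": rank_window_size,
            }
        }
    }


request_body = build_rrf_search_body("error code E1234", [0.1, 0.2, 0.3])
print(json.dumps(request_body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request; no scores need to be rescaled because RRF works on ranks only.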
  • 5 Apr 2026

    Pinecone — Hybrid search (dense + sparse vectors)

    Pinecone’s guides explain combining semantic (dense embedding) retrieval with lexical (sparse) signals—either in one hybrid-capable index or separate indexes—plus trade-offs for reranking and operations.

    Source: Pinecone docs
  • 4 Apr 2026

    Weaviate — Hybrid search (vector + BM25)

    Weaviate documents hybrid queries that fuse dense vector similarity with BM25-style keyword relevance, including fusion strategies so teams can balance “meaning match” vs “exact term match.”

    Source: Weaviate docs
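
To make the "meaning match" vs "exact term match" balance concrete, here is a minimal sketch of one common fusion approach: min-max normalize each retriever's scores, then take a weighted sum. This is a generic illustration, not Weaviate's implementation; the function names and the `alpha` convention (1.0 means vector only, 0.0 means keyword only) are assumptions made for the example.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_hybrid(vector_scores, keyword_scores, alpha=0.5):
    """Fuse two score maps; alpha weights vector scores, (1 - alpha) keyword scores."""
    v = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    # Sort by fused score (descending), breaking ties by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


vector_scores = {"a": 0.9, "b": 0.7, "c": 0.5}      # e.g. cosine similarities
keyword_scores = {"b": 12.0, "c": 8.0, "d": 4.0}    # e.g. BM25 scores

balanced = [doc for doc, _ in weighted_hybrid(vector_scores, keyword_scores, alpha=0.5)]
vector_only = [doc for doc, _ in weighted_hybrid(vector_scores, keyword_scores, alpha=1.0)]
print(balanced)
print(vector_only)
```

Normalization matters here because raw BM25 scores and cosine similarities live on different scales; without it, one retriever would dominate regardless of `alpha`.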
  • 3 Apr 2026

    OpenSearch — Neural + lexical search patterns

    OpenSearch documents neural (vector) search alongside traditional text search, used in many RAG stacks that need both semantic recall and exact keyword hits (SKUs, IDs, citations).

    Source: OpenSearch docs
  • 15 Mar 2026

    Unsloth — Run & fine‑tune open models locally (LoRA/QLoRA, Studio)

    Unsloth helps you run and train 500+ open models locally with strong VRAM efficiency. Includes Unsloth Studio UI, LoRA/QLoRA fine-tuning guides, exports (GGUF/safetensors), and training observability.

    Source: Unsloth docs
  • 11 Mar 2026

    NVIDIA Nemotron 3 Super — Open hybrid Mamba‑Transformer MoE for agentic reasoning

    Open weights + recipes for an agent-focused hybrid Mamba‑Transformer MoE with native long context. Built to reduce “thinking tax” and improve throughput for multi-step agent workflows.

    Source: NVIDIA developer blog
  • 9 Mar 2026

    Context Hub — Andrew Ng’s CLI for AI coding agent docs & persistent memory

    Open-source CLI from DeepLearning.AI: curated, versioned API documentation for AI agents. chub search/get/annotate; language-specific docs (Python, JS); persistent local annotations. MIT, npm @aisuite/chub.

    Source: GitHub
  • 5 Mar 2026

    OpenAI Symphony — Agentic framework for autonomous AI agents

    Open-source agentic framework (Elixir/BEAM) for orchestrating autonomous AI coding agents with structured, scalable implementation runs. Integrates with issue trackers like Linear and focuses on reliable agent workflows.

    Source: MarkTechPost
  • 5 Mar 2026

    Luma Agents — Creative AI agents on Unified Intelligence models

    End-to-end creative AI agents powered by Luma’s Uni-1 unified intelligence model. Orchestrate multimodal workflows across text, image, video, and audio with persistent context, collaborating with models like Google Veo 3 and ElevenLabs.

    Source: TechCrunch
  • 1 Mar 2026

    Open SWE — Open-source framework for internal coding agents

    LangChain’s Open SWE shares production patterns for internal coding agents: sandboxed execution, curated toolsets, and multi-step agent orchestration with LangGraph.

    Source: LangChain blog
  • 15 Feb 2026

    Recursive Language Models (RLMs) — MIT CSAIL

    Novel inference paradigm for long-context AI: process 10M+ tokens via symbolic recursion and code execution. RLM-Qwen3-8B reportedly outperforms its base model, Qwen3-8B, by 28.3% on long-context tasks. Paper and code available.

    Source: arXiv paper
  • 1 Feb 2026

    LiteMind v2026.2 — Unified multimodal AI framework

    Unified API for OpenAI, Anthropic, Google Gemini, and Ollama. Agentic ReAct-style framework, built-in RAG, tool integration. Native support for text, images, audio, video, and PDFs. Python 3.10+.

    Source: PyPI
  • 1 Feb 2026

    Helix — Production agent framework with budget limits

    Semantic caching (40–70% API cost reduction), persistent memory, multi-agent teams, YAML pipelines. Supports OpenAI, Anthropic, Gemini, Groq, Mistral, and 8+ providers.

    Source: GitHub
  • 1 Feb 2026

    PageIndex — Vectorless, reasoning-based RAG

    RAG without vector DBs: builds a tree-structured index (table-of-contents) from documents and uses LLM reasoning + tree search for retrieval. 98.7% on FinanceBench; explainable, section-level references. Chat, API, MCP. By VectifyAI.

    Source: GitHub
  • 1 Feb 2026

    LlamaIndex 0.14 — RAG & agent updates

    Security and crash fixes, TokenBudgetHandler for cost control, agent retry logic for empty LLM responses, LangChain 1.x support. RAG and workflow framework for production.

    Source: LlamaIndex changelog
  • 1 Feb 2026

    AIST aiaccel — ML research acceleration

    Toolkit for HPC clusters: PyTorch/Lightning training, hyperparameter optimization, OmegaConf config. For large-scale ML research.

    Source: PyPI
  • 15 Jan 2026

    Voyage 4 — Embedding models & multimodal

    voyage-4-large (RTEB leaderboard), voyage-4-lite, voyage-4-nano (open-weights). Shared embedding space; voyage-multimodal-3.5 with video retrieval. On Azure, AWS, GCP, MongoDB Atlas.

    Source: Voyage AI blog
  • 1 Jan 2026

    RAGdb — Embeddable SQLite RAG (no vector DB)

    Single-file .ragdb SQLite database: ingestion, multimodal extraction, hybrid retrieval (TF-IDF + keyword) in one portable file. No Docker/cloud; ~99.5% smaller than typical RAG stacks. Python 3.9+, pip install ragdb.

    Source: GitHub
  • 1 Jan 2026

    Orca AI SDK — Unified LLM interface

    Provider-agnostic library for OpenAI, Anthropic, Google Gemini, OpenRouter. Full async/sync and streaming. Simplifies multi-provider apps.

    Source: PyPI
  • 1 Jan 2026

    Trinity-RFT — Reinforcement fine-tuning for LLMs

    Framework for training LLMs with reinforcement fine-tuning (RFT). Python 3.10+. For researchers and practitioners scaling RFT.

    Source: PyPI
  • 30 Sep 2025

    Model Context Protocol (MCP) — Agent tool standard

    Open standard for connecting AI assistants to tools and data. Adopted by major vendors. Safer, auditable integrations for agents.

    Source: MCP site

AI news

  • 27 Apr 2026

    OpenAI & Microsoft amend partnership — multicloud distribution, non-exclusive IP license to 2032

    OpenAI’s “next phase” post spells out commercial clarity: Microsoft remains the primary cloud partner with ship-first rights subject to capability support; OpenAI gains flexibility to host offerings broadly; Microsoft’s model/product IP license runs through 2032 but is no longer exclusive; financial flows shift (Microsoft stops paying OpenAI revenue share; OpenAI continues capped revenue share to Microsoft through 2030). Microsoft’s official blog mirrors the story for enterprise customers. The takeaway for builders is structural: frontier APIs will increasingly be multicloud and multi-surface, even when one hyperscaler remains “first among equals.”

    Source: OpenAI
  • 24 Apr 2026

    Google Cloud Next ’26 — TPU 8t/8i, Virgo fabric, and agent-native Kubernetes

    Google Cloud’s Next ’26 infrastructure posts describe hardware purpose-built for agentic workloads: TPU 8t superpod scale and TPU 8i memory/latency optimizations (including about 80% better inference performance per dollar than the prior generation, by Google’s accounting), Virgo networking for megascale fabrics, and storage advances (Managed Lustre, Rapid Buckets) to keep accelerators fed. On the software side, the TorchTPU preview and GKE enhancements aim to cut time-to-first-token and operational friction for multi-agent services. For learners, this is a practical reminder that agent UX depends as much on orchestration, silicon, and networking as on raw model scores.

    Source: Google Cloud Blog
  • 23 Apr 2026

    OpenAI releases GPT‑5.5 — agentic coding, stronger knowledge work, expanded safeguards

    OpenAI’s “Introducing GPT‑5.5” post positions the model as a step change for messy, multi-step computer work: coding, research, documents, spreadsheets, and tool-heavy workflows. The announcement publishes benchmark tables (for example Terminal-Bench 2.0 at 82.7% and SWE-Bench Pro at 58.6% in OpenAI’s charts) and discusses latency parity with GPT‑5.4 while improving capability and token efficiency on Codex tasks. Safety content highlights expanded preparedness testing and an updated GPT‑5.5 system card. Developers should read the primary post for availability timelines (ChatGPT tiers vs API) and policy constraints rather than relying on third-party summaries alone.

    Source: OpenAI
  • 22 Apr 2026

    OpenAI ships Privacy Filter — open-weight PII detection model (Apache 2.0)

    OpenAI Privacy Filter targets high-throughput privacy workflows: token-classification architecture, up to 128K-token context, eight label categories spanning people, contact details, account numbers, and secrets, and reported strong F1 scores on PII-Masking-300k (including a corrected benchmark variant). The release is explicitly not a compliance certification—teams still need policy, review, and domain fine-tuning. Availability under Apache 2.0 makes it suitable for on-prem and pipeline integration before data hits third-party LLM APIs—an increasingly common requirement for education, healthcare-adjacent tooling, and enterprise RAG.

    Source: OpenAI
  • 16 Apr 2026

    Anthropic ships Claude Opus 4.7 (GA) with stronger coding, vision, and new cybersecurity guardrails

    Anthropic’s announcement positions Opus 4.7 as a step up from Opus 4.6 on difficult software-engineering work, with image inputs up to 2,576 px on the long edge (~3.75 MP). The post warns that prompts tuned for older models may need retuning because instruction following is more literal. For cybersecurity, Anthropic states Opus 4.7 is less broadly capable than Claude Mythos Preview and ships safeguards that refuse many high-risk requests, steering legitimate security professionals toward its Cyber Verification Program. Developers should read the linked system card and migration guide (tokenizer/token usage changes, effort levels including xhigh). Third-party model comparisons in charts reference “best reported model version available via API” for GPT‑5.4 and Gemini 3.1 Pro per the announcement footnote.

    Source: Anthropic
  • 7 Apr 2026

    Project Glasswing & Claude Mythos Preview — Anthropic’s defensive cybersecurity coalition

    On anthropic.com/glasswing, Anthropic frames Project Glasswing as industry collaboration to secure critical software using frontier model capabilities, centered on Claude Mythos Preview. The page includes evaluation-style figures such as CyberGym “Cybersecurity Vulnerability Reproduction” (reported Mythos Preview vs Claude Opus 4.6) with methodology context on the same page. Separately, Anthropic publishes a Claude Mythos Preview risk report and a Frontier Red Team article (red.anthropic.com) for deeper technical detail. The Claude Opus 4.7 announcement explains that broad Mythos-class availability depends on advancing safeguards—Opus 4.7 is positioned as the first generally available model with new automated cybersecurity refusal patterns to gather deployment lessons.

    Source: Anthropic
  • 5 Apr 2026

    Hybrid retrieval for RAG: why teams fuse semantic search, keywords, and RRF

    Retrieval-Augmented Generation quality depends heavily on finding the right chunks. Pure semantic (dense vector) search handles paraphrases well but can miss exact tokens such as error codes, SKUs, or legal clause numbers. Pure keyword (BM25-style) search does the opposite. Hybrid pipelines run both retrievers and merge results: rank-based fusion methods like Reciprocal Rank Fusion (RRF), described by Cormack, Clarke, and Buettcher (SIGIR 2009) and documented for production in Elasticsearch’s RRF retriever, assign each document a score from the sum of 1/(k + rank) across lists, with a constant k (Elastic’s default rank_constant is 60). Separately, some engines apply weighted combinations when scores are normalized to a comparable range—Weaviate’s hybrid search documentation discusses balancing vector vs keyword contribution. These patterns are now mainstream in vector DB and search-engine documentation, not experimental-only.

    Source: Elastic + IR literature
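
The RRF scoring rule described in the entry above, a sum of 1/(k + rank) across ranked lists, can be sketched in a few lines. This is a minimal illustration rather than Elasticsearch's implementation; the function name and the sample document IDs are made up for the example, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document IDs using Reciprocal Rank Fusion.

    Each document scores the sum of 1 / (k + rank) over every list that
    contains it, with ranks starting at 1. Documents missing from a list
    contribute nothing for that list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stability.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_results = ["doc_a", "doc_b", "doc_c"]    # e.g. from vector search
lexical_results = ["doc_c", "doc_a", "doc_d"]  # e.g. from BM25

fused = reciprocal_rank_fusion([dense_results, lexical_results])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here `doc_a` ranks first because it appears near the top of both lists, while `doc_b` and `doc_d` each appear in only one list and fall to the bottom. Only ranks are used, so the two retrievers never need scores on a common scale.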
  • 26 Mar 2026

    Google launches Gemini 3.1 Flash Live (audio & real-time voice)

    In March 2026, Google introduced Gemini 3.1 Flash Live as a real-time audio/voice model built for more natural conversations and lower latency. The update emphasizes improved precision in spoken dialogue and better tonal understanding, making it relevant for AI tutoring, live assistants, and voice-driven agent workflows. Developers can use the Gemini Live preview tooling (Google AI Studio/Gemini platform) to build experiences that respond with less delay—especially important for educational and interactive scenarios.

    Source: Google blog
  • 24 Mar 2026

    vLLM KV cache + continuous batching update: scheduling by full input sequence length

    In March 2026, vLLM’s continuous batching and paged KV cache work added scheduler behavior that accounts for full input sequence length. The practical impact is fewer preemptions and better generation throughput because the scheduler doesn’t over-admit requests based only on the first chunk. This directly matters for applications that run long prompts (education, tutoring, RAG transcripts) where stable throughput is crucial for both UX latency and cost efficiency.

    Source: vLLM GitHub PR
  • 17 Mar 2026

    Gemini API tooling updates: tool combos, context circulation & Maps grounding

    Google announced Gemini API tooling updates in March 2026 to improve agentic development patterns. The updates include combining built-in tools (like Google Search and Maps grounding) with developer function calls in a single flow, plus improved context behavior so the model can access tool call results in later steps. For developers building planning + tool-use agents, these changes reduce orchestration complexity and can improve end-to-end latency. They also make location-aware tutoring assistants and “plan a task using maps + documents” experiences more straightforward to prototype.

    Source: Google blog
  • 15 Mar 2026

    Unsloth gains traction for VRAM‑efficient local LLM fine‑tuning (LoRA/QLoRA)

    Unsloth has become a popular choice for students and developers who want to fine‑tune open LLMs locally without large GPU budgets. The official documentation emphasizes a practical workflow (install → choose LoRA vs QLoRA → train → export → deploy) and provides a detailed LoRA hyperparameters guide covering learning rate, epochs, effective batch size (batch_size × gradient_accumulation_steps), rank (r), alpha, and target modules. This trend aligns with the broader shift toward local-first AI development: lower cost, better privacy, and faster iteration for projects like tutoring assistants, domain chatbots, and RAG pipelines.

    Source: Unsloth docs
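
The effective batch size relation mentioned above (batch_size × gradient_accumulation_steps) is easy to check with a small calculation. The sketch below is illustrative only: the helper names are hypothetical, the steps-per-epoch figure assumes a single device and that the final partial batch is kept, and the scaling factor alpha / r follows the standard LoRA formulation rather than any Unsloth-specific default.

```python
import math


def effective_batch_size(batch_size, gradient_accumulation_steps):
    """Number of examples that contribute to each optimizer update."""
    return batch_size * gradient_accumulation_steps


def optimizer_steps_per_epoch(num_examples, batch_size, gradient_accumulation_steps):
    """Approximate optimizer updates per epoch (single device, last batch kept)."""
    per_update = effective_batch_size(batch_size, gradient_accumulation_steps)
    return math.ceil(num_examples / per_update)


def lora_scaling(rank, alpha):
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return alpha / rank


# Example: a small GPU fits only 2 examples per step, so accumulate 8 steps.
print(effective_batch_size(2, 8))             # 16 examples per update
print(optimizer_steps_per_epoch(1000, 2, 8))  # ceil(1000 / 16) = 63
print(lora_scaling(rank=16, alpha=32))        # 2.0
```

Raising gradient accumulation trades wall-clock time for memory: the effective batch grows without increasing per-step VRAM use.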
  • 15 Mar 2026

    Model Context Protocol (MCP) 2026 roadmap: Working Groups & production focus

    In early 2026, the MCP ecosystem shifted toward a more production-oriented roadmap: instead of only planning around releases, the community focused on working groups that improve core reliability (agent communication), transport scalability, and error handling. The roadmap also highlights governance practices for safe extension evolution so tools do not silently break across client/server upgrades. For builders, this is a strong signal that MCP is maturing into long-lived infrastructure for tool-and-context integration across AI assistants.

    Source: MCP blog
  • 11 Mar 2026

    OpenAI launches new tools for building agents (Responses API + Agents SDK)

    In March 2026, OpenAI announced new tooling aimed at making agentic applications easier to build and operate. The update highlights a new Responses API that unifies tool use with simple request patterns, an Agents SDK for orchestrating single and multi-agent workflows, built-in tools like web search and file search, and tracing/observability features to debug agent runs. For teams shipping AI products, this reflects a broader shift from “prompt-only apps” toward agent systems that need reliable tool calling, guardrails, and production-grade monitoring.

    Source: OpenAI
  • 10 Mar 2026

    NVIDIA OpenShell — safer runtime patterns for autonomous agents

    As agentic systems become more capable, safety and containment become core engineering concerns. NVIDIA’s OpenShell focuses on running autonomous, self-evolving agents more safely by combining sandboxed execution with policy-based restrictions (what tools/commands are allowed) and operational guardrails. This trend matters for developers building coding agents and automation assistants: the “runtime” and permissions model can be as important as the model itself.

    Source: NVIDIA developer blog
  • 9 Mar 2026

    Andrew Ng’s team releases Context Hub: API docs & persistent memory for AI coding agents

    In March 2026, Andrew Ng’s team at DeepLearning.AI released Context Hub, an open-source CLI tool that acts as a “package manager for AI-readable documentation.” AI coding agents often hallucinate API signatures or use outdated endpoints because they are trained on static data; Context Hub lets them search, fetch, and use up-to-date docs (e.g. chub search openai, chub get openai/chat --lang py) and annotate locally with chub annotate so that learnings persist across sessions. The project is MIT-licensed, available as npm package @aisuite/chub, and has seen strong adoption with integrations for Claude Code and other AI coding tools. It underscores the trend toward giving agents accurate, maintainable context instead of relying only on model weights.

    Source: MarkTechPost
  • 6 Mar 2026

    Microsoft releases Phi-4-Reasoning-Vision-15B open-weight multimodal model

    In March 2026, Microsoft announced Phi-4-Reasoning-Vision-15B, a compact 15-billion-parameter open-weight multimodal model focused on math, science, and graphical user interface understanding. The model is trained to balance reasoning quality with efficient compute and data usage, making it attractive for teams that want strong performance without frontier-model costs. Because it is open-weight, developers can fine-tune and self-host Phi-4-Reasoning-Vision for AI coding assistants, educational tools, and agentic systems that need reliable tool use and screen understanding. The release continues the trend of high-quality open-weight models that compete closely with proprietary offerings.

    Source: MarkTechPost
  • 5 Mar 2026

    Luma launches creative AI agents on Unified Intelligence models

    In March 2026, Luma introduced Luma Agents, a suite of creative AI agents powered by its new Uni-1 model from the Unified Intelligence family. The agents support multi-step, multimodal workflows across text, images, video, and audio, and maintain persistent context across assets, collaborators, and tools. Luma positions these creative AI agents as production-ready building blocks for studios and brands, integrating with third-party models like Google's Veo 3, ByteDance's Seedream, and ElevenLabs. The launch reflects a broader shift toward agentic AI systems that prioritize reliability, orchestration, and real-world outcomes over single prompt generations.

    Source: TechCrunch
  • 3 Mar 2026

    Gemini 3.1 Flash-Lite — Google’s fast, cost-effective AI model

    Google introduced Gemini 3.1 Flash-Lite in March 2026 as its most cost-effective AI model for high-volume workloads. Priced significantly lower per token than previous generations, Flash-Lite delivers up to 2.5× faster performance than Gemini 2.5 Flash while maintaining similar or better quality on many tasks. It targets use cases like real-time translation, content moderation, UI generation, and large-scale simulations where latency and cost per request matter more than frontier-level reasoning. For developers, this model fits neatly into AI monetization strategies that demand sustainable economics at scale.

    Source: Google DeepMind blog
  • 1 Mar 2026

    AI search trends 2026 — AEO, AI Overviews, and topical authority

    Recent reports on AI search in 2026 highlight how AI Overviews and other generative answer features are reshaping SEO strategy. Instead of only optimizing for blue-link rankings, publishers now focus on Answer Engine Optimization (AEO), building topical authority and strong E-E-A-T signals so AI models trust and cite their content. Concepts like semantic relevance, brand visibility, and third-party citations matter more than exact-match keywords, because AI systems fan out from a query to related intents and synthesize answers across multiple trusted sources. For AI news and developer content, this means balancing rich, human-readable explanations with clear keywords around AI coding assistants, agentic systems, open-weight models, and AI search trends so that both users and AI search engines can understand the topic.

    Source: SEO.com blog
  • 19 Feb 2026

    Google Gemini 3.1 Pro — Flagship model release

    Google launched Gemini 3.1 Pro in February 2026 as its most capable model to date. It delivers roughly twice the reasoning performance of Gemini 3 Pro and scores 77.1% on the ARC-AGI-2 benchmark. The model supports a 1 million token context window and can output up to 65K tokens, making it suitable for long-document and code-generation tasks. It ranks first on 12 of 18 tracked benchmarks and excels at software engineering (80.6% on SWE-Bench Verified). Developers can access it via the Gemini API, Google AI Studio, Android Studio, and consumer-facing products.

    Source: Google AI for Developers
  • 17 Feb 2026

    Anthropic Sonnet 4.6 — 1M context, stronger coding & computer use

    Anthropic released Claude Sonnet 4.6 in February 2026 with a context window expanded to 1 million tokens (up from 200K, a fivefold increase). The model scores 60.4% on ARC-AGI-2, a benchmark aimed at human-like reasoning. Improvements focus on coding, instruction-following, and computer use (screen understanding and control). Sonnet 4.6 became the default model for both Free and Pro plan users on claude.ai and via the API, offering a strong balance of speed and capability for developers and power users.

    Source: Anthropic
  • 16 Feb 2026

    Alibaba Qwen 3.5 — Agentic AI model with vision

    Alibaba unveiled Qwen 3.5 in February 2026, positioning it for the "agentic AI era." The company claims around 60% lower cost and up to 8× better performance on large workloads compared to the previous generation. The model includes visual agentic capabilities, allowing it to understand screens and take actions across applications independently. It targets enterprise and developer use with stronger reasoning and tool use while reducing inference cost, and is available through Alibaba Cloud and open-weight variants.

    Source: Alibaba Cloud
  • 1 Feb 2026

    HyperNova 60B — Compressed open LLM on Hugging Face

    Spanish startup Multiverse Computing released HyperNova 60B 2602 in February 2026, a 50% compressed version of OpenAI's gpt-oss-120B model. Memory footprint drops from 61GB to 32GB using the company's quantum-inspired CompactifAI compression technology. The model shows significant gains in tool-calling and agentic coding, with around 1.5× improvement on the BFCL v4 benchmark. It is freely available on Hugging Face, offering a smaller, faster alternative for teams that need strong reasoning and tool use without the full 120B footprint.

    Source: Hugging Face
  • 20 Jan 2026

    India AI Summit 2026 — $1.1B fund, 7-Sutra governance

    The India AI Summit (India AI Impact Summit) in January 2026 set the tone for India's "AI for All" push. The government announced a $1.1 billion state-backed venture capital fund targeting AI and advanced manufacturing startups, with a goal to attract over $200 billion in AI infrastructure investment within two years. Compute will expand by 20,000 GPUs on top of the existing 38,000. India also released AI Governance Guidelines built around seven principles (the "7 Sutras"): Trust is the Foundation, People First, Innovation over Restraint, Fairness & Equity, Accountability, Understandable by Design, and Safety, Resilience & Sustainability. New institutions include the AI Governance Group, Technology & Policy Expert Committee, and AI Safety Institute. OpenAI will open offices in Bengaluru and Mumbai; Anthropic opened its first Indian office in Bengaluru. Eighty-eight countries signed the New Delhi AI Declaration, and India joined the Pax Silica group for AI infrastructure supply chain resilience.

    Source: PIB
