Author: Aswin Ak

132 posts, 0 comments
Aswin AK is a consulting intern at MarkTechPost. He is pursuing his Dual Degree at the Indian Institute of Technology, Kharagpur. He is passionate about data science and machine learning, bringing a strong academic background and hands-on experience in solving real-life cross-domain challenges.

Meet PydanticAI: A New Python-based Agent Framework to Build Production-Grade LLM-Powered Applications

Building large language model (LLM)-powered applications for real-world production scenarios is challenging. Developers often face issues such as inconsistent responses from models, difficulties in...

Reimagining Paradigms for Interpretability in Artificial Intelligence

Ensuring AI models provide faithful and reliable explanations of their decision-making processes is still challenging. Faithfulness in the sense of explanations faithfully representing the...

ChatRex: A Multimodal Large Language Model (MLLM) with a Decoupled Perception Design

Multimodal Large Language Models (MLLMs) have shown impressive capabilities in visual understanding. However, they face significant challenges in fine-grained perception tasks such as object...

Chameleon: An AI System for Efficient Large Language Model Inference Using Adaptive Caching and Multi-Level Scheduling Techniques

Large language models (LLMs) have transformed the landscape of natural language processing, becoming indispensable tools across industries such as healthcare, education, and technology. These...

NVIDIA AI Research Unveils ‘Star Attention’: A Novel AI Algorithm for Efficient LLM Long-Context Inference

Transformer-based Large Language Models (LLMs) face significant challenges in efficiently processing long sequences due to the quadratic complexity of the self-attention mechanism. This will...

Enhanced IDS Framework with usfAD for Detecting Unknown Attacks

Intrusion detection systems (IDS) encounter significant challenges in detecting zero-day or unknown cyberattacks, which are not included in the training data. These attacks do...

GRAF: A Machine Learning Framework that Converts Multiplex Heterogeneous Networks to Homogeneous Networks to Make Them More Suitable for Graph Representation Learning

Real-world networks, such as those in biomedical and multi-omics datasets, often present complex structures characterized by multiple types of nodes and edges, making them...

Insight-V: Empowering Multi-Modal Models with Scalable Long-Chain Reasoning

The capability of multimodal large language models (MLLMs) to enable complex long-chain reasoning that incorporates text and vision raises an even greater barrier in...

aiOla Releases Whisper-NER: An Open Source AI Model for Joint Speech Transcription and Entity Recognition

Speech recognition technology has made significant progress, with advancements in AI improving accessibility and accuracy. However, it still faces challenges, particularly in understanding spoken...

Researchers from the University of Maryland and Adobe Introduce DynaSaur: The LLM Agent that Grows Smarter by Writing its Own Functions

Traditional large language model (LLM) agent systems face significant challenges when deployed in real-world scenarios due to their limited flexibility and adaptability. Existing LLM...

Google Upgrades Gemini-exp-1121: Advancing AI Performance in Coding, Math, and Visual Understanding

The field of artificial intelligence (AI) continues to evolve, with competition among large language models (LLMs) remaining intense. Despite recent advances pushing the boundaries...

Jina AI Introduces Jina-CLIP v2: A 0.9B Multilingual Multimodal Embedding Model that Connects Image with Text in 89 Languages

In an interconnected world, effective communication across multiple languages and mediums is increasingly important. Multimodal AI faces challenges in combining images and text for...