Author: Aswin Ak

Aswin AK is a consulting intern at MarkTechPost. He is pursuing his Dual Degree at the Indian Institute of Technology, Kharagpur. He is passionate about data science and machine learning, bringing a strong academic background and hands-on experience in solving real-life cross-domain challenges.

MosAIC: A Multi-Agent AI Framework for Cross-Cultural Image Captioning

Large Multimodal Models (LMMs) excel in many vision-language tasks, but they remain less effective in cross-cultural contexts. This is because they need to...

PLAID: A New AI Approach for Co-Generating Sequence and All-Atom Protein Structures by Sampling from the Latent Space of ESMFold

Designing accurate all-atom protein structures is a critical challenge in bioengineering, as it involves generating both 3D structural data and 1D sequence information to...

Anthropic Introduces Clio: A New AI System that Automatically Identifies Trends in Claude Usage Across the World

Artificial intelligence systems are becoming integral to various aspects of society, yet understanding their real-world impact presents significant challenges. While user data offers valuable...

AMD Releases AMD ROCm 6.3: An Open-Source Platform with Advanced Tools and Optimizations to Enhance AI, ML, and HPC Workloads

As artificial intelligence (AI), machine learning (ML), and high-performance computing (HPC) become central to innovation across industries, they also bring challenges that cannot be...

Meet Ivy-VL: A Lightweight Multimodal Model with Only 3 Billion Parameters for Edge Devices

The ongoing advancement in artificial intelligence highlights a persistent challenge: balancing model size, efficiency, and performance. Larger models often deliver superior capabilities but require...

Google AI Releases Gemini 2.0 Flash: A New AI Model that is 2x Faster than Gemini 1.5 Pro

Google AI Research introduces Gemini 2.0 Flash, the latest iteration of its Gemini AI model. This release focuses on performance improvements, notably a significant...

ByteDance Introduces Infinity: An Autoregressive Model with Bitwise Modeling for High-Resolution Image Synthesis

High-resolution, photorealistic image generation presents a multifaceted challenge in text-to-image synthesis, requiring models to achieve intricate scene creation, prompt adherence, and realistic detailing. Among...

This AI Paper from UC Santa Cruz and the University of Edinburgh Introduces CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions

Web-crawled image-text datasets are critical for training vision-language models, enabling advancements in tasks such as image captioning and visual question answering. However, these datasets...

This AI Paper from UCLA Unveils ‘2-Factor Retrieval’ for Revolutionizing Human-AI Decision-Making in Radiology

Integrating AI into clinical practice is challenging, especially in radiology. While AI has been shown to improve diagnostic accuracy, its "black-box"...

Google DeepMind Open-Sources GenCast: A Machine Learning-based Weather Model that can Predict Different Weather Conditions up to 15 Days Ahead

Accurately forecasting weather remains a complex challenge due to the inherent uncertainty in atmospheric dynamics and the nonlinear nature of weather systems. As such,...

Revolutionizing In-Context Learning: The HiAR-ICL Paradigm for Advanced Reasoning with MCTS

Large language models handle many tasks well but struggle with complex reasoning, especially on math problems. Current In-Context Learning (ICL)...

Cohere AI Introduces Rerank 3.5: A New Era in Search Technology

Search and information retrieval have evolved beyond simply finding content—they are now crucial for business efficiency and productivity. Companies often rely on search capabilities...