AI that thinks ahead

Vladimir Vasilev 13 August, 2025

Artificial intelligence (AI) has been around for decades, but it was the rise of generative AI that catapulted it into the public spotlight.

Driving this revolution are large language models (LLMs) – systems that have redefined how machines understand and generate human language. 

Today, the AI landscape is home to multiple LLMs, each built with billions of parameters and trained on vast datasets, which is what earns them the name “large”.

As we look to 2025 and beyond, the question is no longer if these models will advance, but how they will evolve and what new possibilities they will unlock. 

From natural conversations to intelligent workflows, Vladimir Vasilev, digital lead at Baker Tilly (Dominican Republic), explores four major developments poised to change everything. 

Multimodal AI: beyond text 

Imagine an AI assistant that can listen to your voice, analyse a photo you upload and respond with a custom-generated video – all within a single conversation. 

Well, that may be closer than you think.  

Tomorrow’s AI won’t be limited to words. Multimodal models are being built to understand and generate audio, images and even video within a single system.  

This evolution will make AI interactions feel far more natural and human-like. 

Agentic AI: from small tasks to full workflows 

Most AI tools today are built for one-off tasks. Agentic AI, however, is a huge leap forward.  

These systems are designed to manage entire workflows over time – acting like intelligent assistants that can plan, execute and adapt as they go.  

Think AI that doesn’t just draft a single email, but manages your calendar, writes reports, books appointments, and follows up on emails over hours, days or even weeks.  
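The plan, execute and adapt cycle described above can be pictured as a simple loop. The sketch below is purely illustrative: the function names, the three steps and the simulated failure are all assumptions made for the example, and a real agent would call an LLM and external tools where this code uses placeholders.

```python
from collections import deque


def plan(goal):
    # Hypothetical planner: a real agent would ask an LLM to break the goal down.
    return deque(["draft report", "book meeting", "send follow-up"])


def execute(step, attempt):
    # Hypothetical tool call. "book meeting" fails on its first attempt
    # so that the loop has something to adapt to.
    return not (step == "book meeting" and attempt == 0)


def run_agent(goal, max_attempts=3):
    queue = plan(goal)
    attempts = {}
    log = []
    while queue:
        step = queue.popleft()
        n = attempts.get(step, 0)
        ok = execute(step, n)
        log.append((step, "done" if ok else "retry"))
        if not ok and n + 1 < max_attempts:
            attempts[step] = n + 1
            queue.append(step)  # adapt: reschedule the failed step for later
    return log


log = run_agent("prepare quarterly review")
for step, status in log:
    print(f"{step}: {status}")
```

The point of the sketch is the shape of the loop rather than the details: the agent keeps its own to-do list, notices when a step fails and reschedules it instead of stopping.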

AI that runs your computer 

Recent breakthroughs show AI can now directly operate software interfaces – clicking buttons, navigating menus and completing tasks in apps. 

This could redefine automation, moving beyond the rigid rules of robotic process automation to create flexible, learning-based agents. 
 
Imagine an AI that sorts your email, responds in your tone of voice, fills out online forms and learns your preferences.  

Instead of scripted steps, it would simply observe, act and adapt, becoming an intelligent layer across all your tools. 

World models: learning through experience 

The next leap in AI could come from models that don’t just read about the world – they experience it. 

So-called “world models” simulate environments where AI agents learn by exploring, much like humans do. 

These virtual worlds offer a powerful new way to train AI – not just to mimic language but to understand and interact with complex systems. 

As we move beyond static datasets, world models open the door to AI that learns continuously through experience.

New opportunities  

These emerging technologies – multimodal AI, agentic systems, embodied software agents and experiential world models – mark a shift towards AI that can act, learn and reason in more human-like ways.  

As they evolve in the coming years, they’ll unlock new opportunities in how we work, create and connect.  

The future of AI isn’t just about smarter tools – it’s about true partners in solving real-world challenges. 

What is an LLM?

A large language model (LLM) is an AI system trained to understand and generate human-like text. To achieve this, it is trained on massive datasets – often starting with raw internet data from sources like Common Crawl, a non-profit specialising in large-scale web data collection. This data is heavily processed through URL filtering, language selection and the removal of personal information (similar to how Google Maps blurs faces and licence plates). For context, Common Crawl’s April 2024 dataset includes 2.7 billion web pages, totalling 424 terabytes of raw HTML. After filtering and normalisation, curated collections such as FineWeb, which draws on many Common Crawl snapshots, come to around 44 terabytes of text. This is the kind of corpus used for LLM pre-training.
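To make the three processing steps more concrete, here is a minimal sketch of URL filtering, language selection and personal-information removal applied to a handful of made-up pages. Everything in it is an assumption for illustration: the blocklist, the word-count language check and the email pattern are toy stand-ins, not the methods actually used by Common Crawl or FineWeb.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist and patterns, for illustration only.
BLOCKED_DOMAINS = {"spam.example"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# A few common English words, used as a crude language check.
ENGLISH_HINTS = {"the", "and", "of", "to", "is", "in"}


def url_allowed(url):
    # Step 1: URL filtering against a blocklist of domains.
    host = urlparse(url).hostname or ""
    return host not in BLOCKED_DOMAINS


def looks_english(text, threshold=0.1):
    # Step 2: language selection (real pipelines use trained classifiers).
    words = text.lower().split()
    if not words:
        return False
    hits = sum(1 for w in words if w in ENGLISH_HINTS)
    return hits / len(words) >= threshold


def scrub_pii(text):
    # Step 3: removal of personal information (only email addresses here).
    return EMAIL_RE.sub("[EMAIL]", text)


def curate(pages):
    kept = []
    for url, text in pages:
        if url_allowed(url) and looks_english(text):
            kept.append(scrub_pii(text))
    return kept


pages = [
    ("https://news.example/a",
     "The model is trained on text. Contact jane@example.org to learn more."),
    ("https://spam.example/b", "The best deals of the year are in the shop."),
    ("https://blog.example/c", "Hola mundo, esto no es inglés."),
]
curated = curate(pages)
print(curated)
```

Of the three sample pages, only the first survives: the second is dropped by the URL filter, the third by the language check, and the email address in the first is masked before the text is kept.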