42 Experts, One Mission: Advancing Platform Engineering

Platform engineering expertise is growing rapidly among IT management and developer teams as adoption spreads around the world, but until recently there was no central clearinghouse to gather all that valuable and hard-earned knowledge.
That changed last December, when the global developer community [website] launched the Platform Engineering Community Ambassador Program, which is bringing together some of the best and brightest minds driving innovation and exploration in this still-nascent field.
The Ambassador program aims to connect leading platform engineering experts so they can discuss best practices and frameworks and learn from each other. It also gives these experts public exposure and recognition for their work and accomplishments in platform engineering.
That mission is what drew Cansu Kavili-Örnek, a principal AI consultant in Red Hat’s AI business unit, to the program, she told The New Stack. “I believe we evolve for the better when we openly share knowledge and experiences, and I see the Ambassador program as a perfect opportunity to contribute to this ecosystem,” she said.
“By being an Ambassador, I get to engage with a community of like-minded professionals, exchange ideas, and help others benefit from platform engineering,” said Kavili-Örnek. “It’s motivating to know that I can play a small part in shaping how teams and organizations adopt practices that make their work more efficient and impactful.”
She said she has seen similar results when sharing stories about her experiences with the larger platform engineering community, and that those stories seem to resonate with other members. “When they launched the Ambassador program, it felt like a natural next step,” she said.
Another important attribute of the program is that it welcomes a broad diversity of voices, said Kavili-Örnek. “By learning from different perspectives, the program helps refine the concept of platform engineering and make it more useful and accessible. This way, people and companies who are exploring or already practicing platform engineering can gain fresh ideas and practical solutions to improve their work.”
So far, 42 experienced platform engineering experts have enrolled in the Ambassador program and added their expertise to the effort, Sam Barlien, the head of ecosystem for [website], told The New Stack. More experts continue to apply to join after providing details about their backgrounds, expertise, accomplishments, activities and training.
“The program’s goal is about sharing knowledge, growing platform engineering, and helping companies by … connecting experts, and putting them in front of others via PlatformCon, other IT events, blogs, and more,” said Barlien. “A big part of it is giving these experts the opportunity to bounce ideas off each other, discuss things, and share – you know, did you try a process or tool? How is it working for you? That is an element of it. That is what an Ambassador Program is about.”
Ambassador Program members can be contacted for more information about their work and experience through their LinkedIn profiles or through [website] Slack channel.
Another new member of the Ambassador program, Faisal Afzal, a principal technical consultant with cloud consulting and technology services vendor AHEAD, told The New Stack that sharing his platform engineering experiences with others is what attracts him to the program.
“It provides me with an opportunity to mentor and enable the community,” he said. “One of my focuses as an Ambassador is to have conversations on a consistent basis around the discipline of platform engineering, including how it is enabling businesses and improving productivity.”
The growing availability of this expertise can be transformative for companies, developers and engineers as they explore platform engineering, said Afzal.
And having these kinds of helpful and approachable resources would have been personally beneficial back when he was entering the field of platform engineering, he added.
“Access to this thought leadership, evolving artifacts, and more mainstream discussion with like-minded people would have helped me by bringing in best practices and real-world insights and accelerating the journey,” said Afzal. “Instead, it took far longer and was painful at the time.”
But things are continuing to change, and it is becoming easier to gain a wealth of useful platform engineering knowledge, he said.
“Cloud native and application modernization efforts often fail when organizations lack a platform engineering discipline to deliver reliable, scalable infrastructure or neglect to treat the platform as a product designed to empower internal developers,” he noted. “Without this focus, teams face inefficiencies, fragmented tooling, and poor developer experiences, which hinder innovation and business agility. These challenges motivated me to become a platform engineering Ambassador, driving best practices and enabling organizations to build platforms that truly support their developers and modernization goals.”
Top 5 Large Language Models and How To Use Them Effectively

This article has been updated since it was originally published in 2023.
Modern Large Language Models (LLMs) are pre-trained on a large corpus of self-supervised textual data, then tuned to human preferences via techniques such as reinforcement learning from human feedback (RLHF).
LLMs have seen rapid advances over the last decade, particularly since the introduction of the transformer architecture in 2017 and the first generative pre-trained transformers (GPTs) in 2018. Google’s BERT, introduced in 2018, represented a significant advance in capability and architecture and was followed by OpenAI’s release of GPT-3 in 2020 and GPT-4 in 2023.
While open sourcing AI models is controversial given the potential for widespread abuse — from generating spam and disinformation, to misuse in synthetic biology — we have seen a number of open source alternatives which can be cheaper and as good as their proprietary counterparts.
Given how new this all is, we’re still getting to grips with what may or may not be possible with the technology. But the capabilities of LLMs are undoubtedly interesting, with a wide range of potential applications in business. These include chatbots for customer support, code generation for developers and business users, and audio transcription, summarization, paraphrasing, translation and content generation.
You can imagine, for example, that customer meetings could be both transcribed and summarized by a suitably-trained LLM in near real time, with the results shared with the sales, marketing and product teams. Or an organization’s web pages might automatically be translated into different languages. In both cases, the results would be imperfect but could be quickly reviewed and fixed by a human reviewer.
LLMs could prove useful when working with other forms of unstructured data in some industries. “In wealth management,” Madhukar Kumar, CMO of SingleStore, a relational database firm, told The New Stack, “we are working with end users who have a huge amount of unstructured and structured data, such as legal documents stored in PDFs and user details in database tables, and we want to be able to query them in plain English using a Large Language Model.”
SingleStore is seeing clients using LLMs to perform both deterministic and non-deterministic querying simultaneously.
“In wealth management, I might want to say, ‘Show me the income statements of everyone aged 45 to 55 who recently quit their job,’ because I think they are right for my 401(k) product,” Kumar said.
“This requires both database querying via SQL, and the ability to work with that corpus of unstructured PDF data. This is the sort of use case we are seeing a lot.”
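To make the pattern concrete, here is a minimal TypeScript sketch of combining the two query styles Kumar describes: a deterministic SQL filter over structured client data, followed by a semantic search over embedded PDF chunks. Everything in it — the table names, the Db and Embedder interfaces, and the dot_product() SQL function — is a hypothetical stand-in rather than any vendor’s documented API.

```typescript
// Hybrid query sketch: deterministic SQL over structured columns, then a
// vector similarity search over unstructured document chunks.
// All names here (tables, interfaces, dot_product) are illustrative assumptions.

interface Db {
  query<T>(sql: string, params: unknown[]): Promise<T[]>;
}

interface Embedder {
  embed(text: string): Promise<number[]>; // vector for the natural-language question
}

interface MatchedChunk {
  clientId: string;
  chunk: string; // passage extracted from a PDF
  score: number; // similarity to the question
}

async function incomeStatementCandidates(
  db: Db,
  embedder: Embedder,
  question: string
): Promise<MatchedChunk[]> {
  // 1. Deterministic part: plain SQL over structured client data.
  const clients = await db.query<{ client_id: string }>(
    `SELECT client_id FROM clients
      WHERE age BETWEEN 45 AND 55 AND employment_status = 'recently_left'`,
    []
  );
  const ids = clients.map((c) => c.client_id);
  if (ids.length === 0) return [];

  // 2. Non-deterministic part: semantic search over the unstructured corpus,
  //    restricted to the clients found above.
  const queryVector = await embedder.embed(question);
  const placeholders = ids.map(() => "?").join(",");
  return db.query<MatchedChunk>(
    `SELECT client_id AS clientId, chunk, dot_product(embedding, ?) AS score
       FROM document_chunks
      WHERE client_id IN (${placeholders})
      ORDER BY score DESC
      LIMIT 20`,
    [JSON.stringify(queryVector), ...ids]
  );
}
```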
An emerging application of AI is for agentic systems. “We’re seeing a number of new AI companies among our customers who are looking to make their data immediately available to build agentic systems,” Kumar told us. “In cybersecurity, for example, you might take several live video feeds and give that to an AI to make decisions very quickly.”
Large language models have been applied to areas such as sentiment analysis. This can be useful for organizations gathering data and feedback to improve customer satisfaction. Sentiment analysis is also helpful for identifying common themes and trends in a large body of text, which may assist with both decision-making and more targeted business strategies.
As we’ve noted elsewhere, one significant challenge with using LLMs is that they make stuff up. For example, the winning solution for a benchmarking competition — organized by Meta and based on Retrieval Augmented Generation (RAG) and complex questions — was wrong about half the time. These findings are similar to those from NewsGuard, a rating system for news and information websites, which showed that the 10 leading chatbots made false claims up to 40% of the time and gave no answer to 22% of questions. Using RAG and a variety of other techniques can help, but eliminating errors completely looks to be impossible. In view of this, LLMs should not be used unchecked in any situation where accuracy matters.
Training an LLM from scratch remains a major undertaking, so it makes more sense to build on top of an existing model where possible. We should also note that the environmental costs of both training and running an LLM are considerable; because of this we recommend only using an LLM where there isn’t a smaller, cheaper alternative. We would also encourage you to ask the vendor or OSS project to disclose their figures for training and running the model, though at the time of writing this information is increasingly hard to obtain.
With Kumar’s help, we’ve compiled a list of what we think are the five most important LLMs at the moment. If you are looking to explore potential uses for LLMs yourself, these are the ones we think you should consider.
Reasoning models produce responses incrementally, simulating to a certain extent how humans grapple with problems or ideas.
OpenAI’s o3-mini-high has been fine-tuned for STEM problems, specifically programming, math and science. As such, and with all the usual caveats that apply to benchmarking, it currently scores highest on the GPQA benchmark commonly used for comparing reasoning performance.
Developers can choose between three reasoning effort options—low, medium and high—to optimize for their specific use cases. This flexibility allows o3‑mini to ‘think harder’ when tackling complex challenges, or prioritize speed when latency is a concern. It is also OpenAI’s first small reasoning model to support function calling, structured outputs, and developer messages.
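As a rough illustration, a request along the following lines — using the OpenAI Node SDK — shows what selecting a reasoning effort might look like. The prompt is invented, and you should check the current API reference for supported models and parameter names before relying on it.

```typescript
// Minimal sketch of choosing a reasoning effort with the OpenAI Node SDK.
// The prompt is illustrative; consult the API reference for current model IDs
// and parameter support before relying on this.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function solve(problem: string): Promise<string | null> {
  const completion = await client.chat.completions.create({
    model: "o3-mini",
    reasoning_effort: "high", // "low" | "medium" | "high"
    messages: [{ role: "user", content: problem }],
  });
  return completion.choices[0].message.content;
}

solve("How many ways can 8 non-attacking rooks be placed on a chessboard?")
  .then(console.log)
  .catch(console.error);
```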
OpenAI no longer discloses carbon emissions, though model size does make a difference, and claimed improvements to response times imply a lower overall carbon running cost.
DeepSeek reasoning models were, they claim, trained on a GPU cluster a fraction of the size of any of the major western AI labs. They’ve also released a paper explaining what they did, though some of the details are sparse. The model is free to download and use under an MIT license.
R1 scores highly on the GPQA benchmark, though it is now beaten by o3-mini. DeepSeek says it has been able to do this cheaply — the researchers behind it claim it cost $6m (£[website]) to train, a fraction of the “over $100m” alluded to by OpenAI boss, Sam Altman, when discussing GPT-4. It also uses less memory than its rivals, ultimately reducing the carbon and other associated costs for users.
DeepSeek is trained to avoid politically sensitive questions — for example, it will not give any details about the Tiananmen Square massacre on June 4, 1989.
You don’t necessarily need to stick to the version DeepSeek provides, of course. “You could use it to distill a model like Qwen [website] or Llama [website], and it is much cheaper than OpenAI,” Kumar said.
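To sketch what that distillation workflow involves: a strong reasoning model acts as the teacher, its responses are collected as prompt/completion pairs, and a smaller student model is then fine-tuned on them. In the illustrative snippet below, callTeacher() is a hypothetical stand-in for whichever API serves the teacher model.

```typescript
// Sketch of building a distillation dataset from a teacher model's outputs.
// callTeacher() is a hypothetical stand-in for the teacher model's API; the
// resulting pairs would feed a fine-tuning job for a smaller student model.

type Example = { prompt: string; completion: string };

async function buildDistillationSet(
  prompts: string[],
  callTeacher: (prompt: string) => Promise<string>
): Promise<Example[]> {
  const examples: Example[] = [];
  for (const prompt of prompts) {
    const completion = await callTeacher(prompt); // teacher's (reasoned) answer
    examples.push({ prompt, completion });
  }
  return examples; // typically serialized to JSONL for the fine-tuning step
}
```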
While speed of typing or lines of code have long since been debunked as a good measure of developer performance — and many experienced developers have expressed reservations about using AI-generated code — coding is one of the areas where GenAI appears to have early product market fit. It works well because mistakes are typically easy to spot or test for, meaning that the aforementioned accuracy problems are less of an issue.
While most developers will likely favor the code-completion system built into their IDE, such as JetBrains AI or GitHub Copilot, the current best-in-class on the HumanEval benchmark is Claude [website] Sonnet from Anthropic. “When it comes to coding, Claude is still the best,” Kumar told us. “I’ve personally used it for hours and hours, and there is very little debate around it.”
This proprietary model also scores well on agentic coding and tool use tasks. On TAU-bench, an agentic tool use task, it scores [website] in the retail domain, and 46% in the airline domain. It also scores 49% on SWE-bench Verified.
At the time of writing, Anthropic has just released Claude [website] Sonnet which, the vendor says, “displays particularly strong improvements in coding and frontend web development.” Claude [website] Sonnet with extended thinking — letting you see Claude’s thought process alongside its response — is offered as part of a Pro plan. Anthropic also offers a GitHub integration across all Claude plans, allowing developers to connect code repositories directly to Claude.
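For a sense of what extended thinking looks like programmatically, here is a hedged sketch using Anthropic’s Node SDK. The model ID is deliberately read from an environment variable (version names change), the token budgets are arbitrary, and the exact response shape should be checked against the current documentation.

```typescript
// Sketch: requesting extended thinking via Anthropic's Node SDK so the model's
// reasoning is returned alongside its answer. The model ID comes from an
// environment variable (version names change); token budgets are arbitrary.
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY
const MODEL = process.env.ANTHROPIC_MODEL ?? ""; // set to the current Claude Sonnet ID

async function review(snippet: string): Promise<void> {
  const message = await anthropic.messages.create({
    model: MODEL,
    max_tokens: 4096,
    thinking: { type: "enabled", budget_tokens: 2048 }, // extended thinking
    messages: [
      { role: "user", content: `Review this function and suggest fixes:\n${snippet}` },
    ],
  });
  // The response interleaves "thinking" blocks with the final "text" blocks.
  for (const block of message.content) {
    if (block.type === "thinking") console.log("[thinking]", block.thinking);
    if (block.type === "text") console.log(block.text);
  }
}

review("function add(a, b) { return a - b; }").catch(console.error);
```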
Both OpenAI’s o3 and DeepSeek’s R1 score highly as general-purpose models, but we’re fans of Meta’s Llama family of open source models, which comes close. The family uses a Mixture of Experts (MoE) approach, an ensemble-style technique that scales model capacity without significantly increasing training or inference costs: MoEs can dramatically increase the number of parameters without a proportional increase in computational cost.
Llama [website] 405b scores [website] on the MMLU benchmark, putting it a hair’s breadth behind the considerably more computationally expensive alternatives.
Google’s experimental Gemini Flash [website] scores lower than Llama on the MMLU benchmark, at [website], but it has other capabilities that make it interesting. It supports multimodal output, such as natively generated images mixed with text and steerable text-to-speech (TTS) multilingual audio. It can also natively call tools like Google Search and code execution, as well as third-party user-defined functions. It is also impressively fast and has one of the largest context windows, at 1 million tokens.
Google is also actively exploring agentic systems through Project Astra and Project Mariner, and Flash [website] is built with the intention of making it particularly suitable for agentic systems.
Once you’ve drawn up a shortlist of LLMs, and identified one or two low-risk use cases to experiment with, you have the option of running multiple tests using different models to see which one works best for you — as you might do if you were evaluating an observability tool or similar.
It’s also worth considering whether you can use multiple LLMs in concert. “I think that the future will involve not just picking one, but an ensemble of LLMs that are good at different things,” Kumar told us.
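One way to picture that ensemble approach is a thin routing layer that sends each request to whichever model suits the task. The sketch below uses placeholder model names and an injected callLlm() helper rather than any real SDK.

```typescript
// Sketch of routing across an ensemble of LLMs by task type. The model names
// are placeholders and callLlm() is an injected helper, not a real SDK call.

type TaskKind = "code" | "reasoning" | "general";
type LlmCall = (model: string, prompt: string) => Promise<string>;

const MODEL_FOR_TASK: Record<TaskKind, string> = {
  code: "code-model",           // e.g. a Claude Sonnet variant
  reasoning: "reasoning-model", // e.g. o3-mini or DeepSeek R1
  general: "general-model",     // e.g. a self-hosted Llama variant
};

function classify(prompt: string): TaskKind {
  // Crude heuristic for illustration; a real router might use a small model here.
  if (/\bfunction\b|\bclass\b|\bdef\b|;\s*$/.test(prompt)) return "code";
  if (/prove|step by step|why/i.test(prompt)) return "reasoning";
  return "general";
}

async function route(prompt: string, callLlm: LlmCall): Promise<string> {
  return callLlm(MODEL_FOR_TASK[classify(prompt)], prompt);
}
```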
Of course, none of this is particularly useful to you unless you have timely access to data. During our conversation, Kumar suggested that this was where contextual databases like SingleStore come in.
“To truly use the power of LLMs,” he said, “you need the ability to do both lexical and semantic searches, manage structured and unstructured data, handle both metadata and the vectorized data, and handle all of that in milliseconds, as you are now sitting between the end user and the LLM’s response.”
The Modern CDN Means Complex Decisions for Developers

Whether you were building a web site or an application, hosting choices used to be about bandwidth, latency, security and availability (as well as cost), with content delivery networks (CDNs) handling static assets and application delivery networks relying on load balancing for scale.
All those things still matter, but there are also many more choices to take into account — from the backend implications of frontend decisions, to where your development environment lives. CDNs have become complex, multilayer distributed computing systems that might include distributed databases, serverless functions and edge computing. What they deliver is less about static (or even dynamic and personalized) assets and more about global reach, efficiency and user experience.
In fact, the term CDN may not always be that helpful any more, because modern “CDNs” are increasingly handling not just content delivery, but also API management, distributed security and even AI inference at the edge.
It may be more helpful to think of the modern “CDN” as the front door for computing infrastructure.
Platforms like Cloudflare and Fastly describe themselves as “connectivity clouds” and “edge clouds” to reflect their expanded capabilities. (Vercel bills itself as a “frontend cloud.”) On the other hand, traditional CDN providers like Akamai offer compute in their infrastructure to compete with hyperscale cloud providers like AWS, Azure and GCP. Those companies also offer services like load balancing, access management, dynamic site acceleration, rate limiting, web application firewalls, TLS offloading, DDoS protection, API security, bot mitigation and geoblocking — not to mention traditional content delivery capabilities.
With the line between CDN, application platform and cloud computing so blurred, and with ongoing changes in the size and type of what content gets delivered, it may be more helpful to think of the modern “CDN” as the front door for computing infrastructure, whatever that looks like.
Even the traditional CDN has evolved from caching the bandwidth-gobbling images and videos on a site to handling streaming and delivery generally, whether that’s a new game download, dynamically rendered 4K textures streaming in an open world game, or supporting e-commerce sites on Black Friday, Akamai VP of products Jon Alexander told The New Stack.
“Bulk delivery: video downloads, software, the unidirectional push of content out to consumers updating their PlayStation or Xbox, watching the video,” said Alexander. That still requires low latency and a large footprint in key regions to deliver faster traffic throughput.
But CDNs are equally critical for delivering apps and digital experiences where you care more about being close to a specific set of people — or collecting data from a specific set of devices, because they’re no longer only about broadcasting or even narrowcasting content. You might also be collecting that data from sensors, IoT devices or user systems; Alexander notes that some banks are using distributed databases for real-time payment transaction data. That requires not just good bandwidth across the internet backbone, but also a lot of connections at the edge to handle very different traffic patterns — shifting from large file delivery to smaller, more frequent API calls, real-time data streams and small islands of compute.
More of the data Akamai handles is now bidirectional streams. “They’re increasingly chatty; very high request rates, often with much smaller payloads. The average object size on our network has decreased year over year, so as requests are going up, the payloads are reducing. We’re not moving around huge game files anymore. It’s weather updates, it’s beacons.”
In fact, one of the reasons for the shifts in what CDNs offer is this change in payloads, which also changes the business model of running a CDN because what organizations are willing to pay a CDN for has changed along with the technologies they’re looking for. “It’s very hard to have a sustainable, chargeable model for APIs, because they are super light by design,” APIContext CEO Mayur Upadhyaya points out. One reason there’s less and less separation between being a delivery platform and a compute platform may be because “compute becomes the commodity you can charge for.”.
For developers, Alexander suggests, this convergence of CDN and cloud computing into a distributed architecture “gives them options around how and where they run their code, without increasing the toil or complexity.”
CDNs (and caches) are key to using static site generation tools like Hugo, Jekyll and Eleventy for building web applications with client-side rendering of APIs via JavaScript (often referred to as “Jamstack” development), because making even a small change republishes the entire site — so there’s no risk of mixing updated and stale content on a page.
Although the majority of developers are still building infrastructure-centric applications to run in a specific region, Alexander also points to the rise of serverless and edge-centric frameworks allowing developers to abstract infrastructure from where the code physically runs.
“Your code runs anywhere or everywhere, and so does the tooling, the observability and the automation that goes with it. You have the potential now to run the application in many places, have your data in many places, and to give an incredibly interactive experience. Dynamic generation of personalized content is now feasible at reasonable latencies, so having a personalized version of a commerce page load in one or two seconds is now very viable with distributed databases at the edge.”
Increasingly, apps need to be global, but to still feel like they’re local. The modern internet works at a global scale, and that changes the role of the CDN as well. Instead of focusing on regional and metropolitan application delivery, they need to deliver at a much larger scale, routing and storing content wherever there’s demand. Instead of offering pure content caches, the CDN becomes a dynamic, self-reconfiguring, self-repairing network providing edge access to applications and services. It also needs to both respond to and take advantage of cloud availability zones, routing user data to live services in the event of a failure.
A whole landscape of development frameworks — from simple static site generators to complex tooling like [website] — have grown up around the idea of what Netlify CEO Mathias Biilmann calls a “frontend cloud”: an “edge-distributed, incredibly fast, globally accelerated deployment layer with integrated serverless functionality.”
That’s a long way from “hosting plus a CDN,” he argues. “Hosting is: Where do you put the files, where do you put your server? But that’s not enough to operate in production. You need to have security. If you want to offer a competitive user experience, you need to integrate the concept of a CDN or an application delivery network, you need to be able to get very low ‘time to first byte’ and get things in front of clients really fast. You need things like image CDNs, resizing and so on that you take for granted to optimize the frontend performance and responsiveness, and you need a backend for the frontend layer with serverless functions and some level of cloud storage.”
Converging these different services and layers of compute, storage, networking, security and delivery requires integration, automation and logging to allow development teams to iterate quickly, especially if you’re using services like identity and authentication, e-commerce or headless CMSes.
“You’ve got to have a way to roll back mistakes; you’ve got to have not just a production environment, but test environments and staging environments,” says Biilmann. “Regardless of where your hosting is, you need a local developer experience for your developers that makes their day-to-day development experience map cleanly to that hosting, so there’s not a discrepancy once they push it live, and so it’s easy to get set up with all the right secrets and environment variables. And around all of that, you need a layer of insights and observability: How does that hosting and automation layer talk to your normal observability platform? How can you make sure that you can see anomalies over time and get alerted early to anything that broke due to a deploy, and associate the two easily?”
Although platforms like Netlify and its competitors offer a wide range of services for both frontend and backend, taking advantage of this approach requires architecture that decouples the frontend layer from the backend. Biilmann maintains that choosing multiple services from a single provider should be about performance, or the benefits of convenience and integration, rather than a technical limitation.
“We used to be in a world where you would buy a CDN for one part of it, you would buy a hosting provider from some different parts of this, you would buy a CI/CD platform for part of this — and have to build all of this yourself. You would have to do a ton of instrumentation manually to figure out [connections] between all of these, and the security layer you would buy separately, and they would each be operated by a different team, which in itself often creates conflict.”.
This shift also allows CDNs to expose some of their traditional functionality to developers in ways that help them deliver a more effective user experience. Using tools like Cloudflare Workers, you can build tools that route traffic, not only to geographically appropriate servers, but also to servers that have available capacity, using event-driven functions to respond as the workload changes. Or you can use information about a device to, for example, route translated content based on browser language settings, without having to have that same content duplicated on every edge cache. You can even put different parts of what used to be your server in different locations — an approach Cloudflare’s Sunil Pai dubs “spatial compute.”.
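For a flavor of what that looks like in code, here is a minimal Cloudflare Worker-style sketch that picks an origin based on the visitor’s Accept-Language header; the origin host names are invented for illustration.

```typescript
// Sketch of edge routing in a Cloudflare Worker: choose an origin based on the
// visitor's Accept-Language header so translated content doesn't need to be
// duplicated on every edge cache. The origin host names are hypothetical.

export default {
  async fetch(request: Request): Promise<Response> {
    const lang = (request.headers.get("Accept-Language") ?? "en")
      .slice(0, 2)
      .toLowerCase();

    const origin =
      lang === "de" ? "https://de.example.com" :
      lang === "fr" ? "https://fr.example.com" :
                      "https://www.example.com";

    // Forward the original path and query string to the chosen origin,
    // preserving the incoming request's method, headers and body.
    const url = new URL(request.url);
    const target = `${origin}${url.pathname}${url.search}`;
    return fetch(new Request(target, request));
  },
};
```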
Even at that extreme level of disaggregation, applications will still rely on APIs, cloud services and data storage, whether that’s S3-style object storage or relational databases. “Queuing and the way you wire up applications is going to be critical, because these things do live in lots of different places,” explains John Engates, Field CTO at Cloudflare. “You’re going to have to have ways to orchestrate applications in places that are maybe distinct from where your core applications live, and you’ve got to connect all this stuff together.”
Don’t expect the same level of functionality you get from Azure or AWS, though. Instead, CDN-hosted compute — like Cloudflare Workers — offers options to customize content for where the user is and what device they’re using, as well as ways to configure software-defined networking.
One of the biggest drivers for the change in how content is delivered is simply that we’re building applications differently. Early CDN architecture was designed for monolithic applications that could be fronted with transparent caches, so they could be used to manage static content. “The first era of CDNs was just caching things so we could scale the performance and do it globally,” Cloudflare’s Engates remembers. But the days of putting a handful of NetApp Filers in front of servers (when mechanical drive speed was as much of a physical limit as the speed of light) are long gone.
“Now smart developers and engineers are building out full-fledged application platforms on top of what originated as the CDN,” noted Engates.
CDNs are also having to respond to the underlying changes in modern application architectures.
Many applications are now global, running across multiple data centers and clouds, with content tailored for different regions and routed appropriately, down to monitoring the state of the BGP tables to ensure that it’s delivered as quickly as possible. Developers write code that consumes not only their own services, but needs to work with third-party APIs, which often come with their own CDNs and API management layers. Wrapping services like Shopify and Stripe inside an application means taking a dependency on their infrastructure, as well as their choice of CDNs — something it’s a lot easier not to think about.
“Developers should not have to be experts on how to scale an application; that should just be automatic. But equally, they should not have to be experts on where to serve an application to stay compliant with all these different patchworks of requirements; that should be more or less automatic,” Engates argues. “You should be able to flip a few switches and say ‘I need to be XYZ compliant in these countries,’ and the policy should then flow across that network and orchestrate where traffic is encrypted and where it’s served and where it’s delivered and what constraints are around it.”
Along with the physical constraint of the speed of light and the rise of data protection and compliance regimes, Alexander also highlights the challenge of costs as something developers want modern CDNs to help them with. “Egress fees between clouds are one of the artificial barriers put in place,” he says. That can be 10%, 20% or even 30% of overall cloud spend. “People can’t build the application that they want, they can’t optimize, because of some of these taxes that are added on moving data around.”
Update patterns aren’t always straightforward either. Take a wiki like Fandom, where Fastly founder and CTO Artur Bergman was previously CTO. “If you think about a wiki, the content doesn’t change very often, but when it changes, it has to change right away, because it’s collaborative editing.” One of his reasons for starting Fastly was that although putting the initially [website] site’s images on a CDN dropped access times for UK users from 22 seconds (a statistic that has him reflecting that “it’s a miracle we had any users at all”) to 11, it took running a point of presence and caching the articles in the UK with a cache invalidation time of 150 milliseconds to get page load time down to two seconds. “And surprise, surprise, our usage in the UK went up, because now people could actually consume the product.”
Many services will face similar challenges as they grow. “Ultimately, the driver is how to reach global audiences, and you have two choices,” he says. “You can either spin up your application on every GCP or AWS location around the world — and managing all those duplicated instances and data is expensive — or you can embrace the edge and only move the things that matter to the edge.”
Putting the right content in the right abstraction layer reduces complexity and cost at your origin source. “Any piece of content that is reusable between users and doesn’t change every single moment should be cached and served from a server reasonably close to the user, because it’s always going to be more efficient, cheaper and faster to do that than to go around the world and run a database query somewhere.”
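In HTTP terms, that usually comes down to telling shared caches how long they may keep a response. The helper below is a generic sketch using the standard Fetch API Response type, with illustrative max-age values rather than recommendations.

```typescript
// Sketch: mark reusable responses as cacheable by shared caches (the CDN),
// and keep personalized responses out of them. Max-age values are illustrative.

function withCachePolicy(response: Response, reusable: boolean): Response {
  const headers = new Headers(response.headers);
  if (reusable) {
    // Shared caches may keep this for an hour; browsers revalidate after a minute.
    headers.set("Cache-Control", "public, max-age=60, s-maxage=3600");
  } else {
    // Personalized content: never store it in a shared cache.
    headers.set("Cache-Control", "private, no-store");
  }
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```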
That works for both static and dynamic content on GitHub (which uses Fastly as a CDN) but also for e-commerce sites, which might be able to serve 80% of inventory queries from cache, or for retail point of sale, where using the edge avoids having to reconcile multiple copies of the truth.
“You don’t need to cache everything: You need to cache the things that most people use most of the time,” said Bergman.
The edge might have started as a necessity, but Bergman encourages more developers to think about the benefits — although he notes that this is easier with platforms like Magento, which have plug-ins and modules that integrate CDN options like active cache invalidation into the pipeline so the CDN never serves stale content, than when developers have to implement it themselves.
“The edge is still viewed as something only a few developers should have access to: that it isn’t really part of the core architecture, it’s something you put there because you have to. People like to cache behind their app server next to the database, and they are not generally very familiar with caching in front of their application, between the application and the end user.”
Too often, developers see the boundaries of their code as the edge of their own network, not the edge of the entire network they’re using to deliver their applications. But the shift to building distributed applications, often event-driven and using messaging technologies, means thinking about not just north/south connectivity from the server to the end user, but also the east/west traffic that used to be inside one data center but now needs to move across geographic replicas and even across clouds.
CDNs need to facilitate these connections between nodes, using their caches to help ensure consistency across an application. One option is hosting modern message queues, managing clusters and using their networks to ensure rapid distribution of events to queues. Using CDNs for this east/west connectivity should help keep cloud networking costs to a minimum.
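As one example of what CDN-hosted queuing can look like, here is a hedged sketch in the style of Cloudflare Queues: a producer Worker enqueues events and a consumer handler drains them in batches, keeping that east/west traffic on the CDN’s network. The EVENTS binding name and message shape are assumptions, and the binding would need to be declared in the project’s Wrangler configuration.

```typescript
// Sketch of edge-hosted messaging in the style of Cloudflare Queues: a producer
// Worker enqueues events and a consumer handler drains them in batches.
// The EVENTS binding name and message shape are assumptions; the Queue and
// MessageBatch types come from @cloudflare/workers-types.

interface OrderEvent {
  orderId: string;
  region: string;
}

interface Env {
  EVENTS: Queue<OrderEvent>;
}

export default {
  // Producer: enqueue the event instead of calling a distant origin directly.
  async fetch(request: Request, env: Env): Promise<Response> {
    const event: OrderEvent = {
      orderId: crypto.randomUUID(),
      region: request.headers.get("CF-IPCountry") ?? "unknown",
    };
    await env.EVENTS.send(event);
    return new Response("queued", { status: 202 });
  },

  // Consumer: the platform invokes this with batches of queued messages.
  async queue(batch: MessageBatch<OrderEvent>, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      // Replicate or process the event, then acknowledge it.
      console.log("processing", msg.body.orderId, "from", msg.body.region);
      msg.ack();
    }
  },
};
```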
This is complicated by the types of proxy, though; CDNs need to support API access to servers, running reverse proxies alongside traditional user-facing caches. While some of this functionality is provided by cloud networking tools, integrating API management into CDN infrastructure will allow developers to get the most from their networking infrastructure, putting all the relevant proxy and cache functionality in one logical layer. This simplifies future-proofing, Bergman believes. “Using the edge gives you much more flexibility in adapting to the new requirements you discover as you keep developing.”
That might include managing migrations between cloud providers to save money, he points out. That way you can update the DNS to point at the new location, but the CDN will use its existing cache to handle any requests that come in before the data in the new location is fully cached. “They continuously migrate behind the scenes, without the user knowing.”
Different services have made different choices about how to build out their infrastructure, reflecting what content they’re delivering and where — but new technologies are changing those trade-offs. Fastly chose to have fewer, more powerful POPs, but Bergman notes that “the shift to more edge compute and running WebAssembly is certainly tilting the ratios of things like CPU to memory to disk needed at the edge.”
Akamai and Cloudflare are both starting to offer more powerful hardware at the edge, including GPUs for running AI inference. “Imagine serverless compute, storage, all the security elements, and then combining in the ability to run open source LLM models right there on the edge,” Cloudflare’s Engates says.
AI at the edge can deliver enhanced experiences, Akamai’s Alexander agrees. “Whether it’s a commerce application trying to improve conversions, or a financial services company trying to detect fraud, I think more data available at the right time will really help with some of those outcomes.”
Dedicated NPUs and technologies like ONNX will help here, reducing inferencing costs and keeping power demands down (which should be reflected in CDN charges).
CDNs only work if you use them. It might sound obvious, but developers often struggle with the complexity of setting up CDNs and the logic of what to include, because they’re now working with a multitier distributed system that might span multiple clouds and providers rather than a simple app or website, with traffic flowing between all of those rather than just from the app or website to the end user. Routing traffic doesn’t happen automatically: It needs to be set up — and also maintained as all the providers you depend on make changes to their own infrastructure.
Forgetting to route microservices and APIs as well as web content through the CDN — something even large services get wrong from time to time — can cause performance issues when usage ramps up some time after launch. Part of the problem is that the people writing code are often many steps removed from the team that handles those network settings, and may not have a deep understanding of their infrastructure or the dependencies of the services and frameworks they’re working with.
“How many of them are actually thinking, every time they spin up a new microservice, that if they don’t put it under the host name being routed by a CDN, it’s going to be outside the CDN,” asks Upadhyaya from APIContext. “Sometimes you end up with applications and mobile apps where maybe the ‘front of house’ is over a CDN: fully distributed, terminating at the edge, great last-mile delivery. But then all the core microservices behind it are routing to a single piece of tin in Virginia because the developer created a new domain and forgot to put it underneath the thing that’s configured by the network team.”
CDNs aren’t “set and forget.” Even if it works well initially, infrastructure and services change. “We see people push out applications where they do the testing, everything is great, but six months later things have started to slow down, they’re not optimized and they come back and [discover] a lot of refactoring needs to happen,” Alexander said.
Not using your CDN fully results in missed opportunities, including security issues as well as performance problems, he warns.
“With choice and flexibility comes risk. You can build an application very easily but often without fully understanding all the potential issues. We’ve seen developers launch an application and leave an API endpoint unsecured that they didn’t realize was going to be internet-facing.”
“The great thing about CDNs is, one, they’re inline, and secondly, you can offload,” Upadhyaya adds. “There’s so much done back at source that could be done in the CDN: token refreshing, even schema validation could be done on the fly. CDNs could be workhorses for them; if they could just offload this stuff, there’s a lot of hygiene and guardrails they’d get out of the box.”
That’s another reason the edge has become so essential for CDNs. “Even if you’re only targeting a small group of people or a small geographic area, your attack surface is global; you can get DDoSed and attacked from around the world,” Fastly’s Bergman points out. “You protect your more expensive origin from attacks around the world. Just as the cheapest place to serve a piece of content is as close to the user as possible, the cheapest place to stop an attack from consuming resources is as close to the attacker as possible.”.
Organizations are increasingly buying in services for critical areas like identity and access management rather than writing their own. Those services are probably better written and more secure, but the chain of dependencies they bring with them means your application now depends on a complex interconnected network that could involve telecoms infrastructure, cloud platforms, multiple services and your own on-premises systems.
“You have your own digital delivery chain that’s dependent on the public cloud, the public internet, your own internal APIs and all the complexities about how your developers might stick them together, with or without CDNs, and all the third-party systems you buy have this same amount of fragility in them and the same challenges,” Upadhyaya warns.
All the capabilities of modern CDNs are a big opportunity for how you deliver apps with a great user experience — but also a lot of work. “We’re asking a lot of developers today, because they’ve got to be network experts, cyber experts, database experts, frontend developers — and get their job done.”
Market Impact Analysis
Market Growth Trend
| 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
| --- | --- | --- | --- | --- | --- | --- |
| 7.5% | 9.0% | 9.4% | 10.5% | 11.0% | 11.4% | 11.5% |
Quarterly Growth Rate
| Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 |
| --- | --- | --- | --- |
| 10.8% | 11.1% | 11.3% | 11.5% |
Market Segments and Growth Drivers
| Segment | Market Share | Growth Rate |
| --- | --- | --- |
| Enterprise Software | 38% | 10.8% |
| Cloud Services | 31% | 17.5% |
| Developer Tools | 14% | 9.3% |
| Security Software | 12% | 13.2% |
| Other Software | 5% | 7.5% |
Competitive Landscape Analysis
| Company | Market Share |
| --- | --- |
| Microsoft | 22.6% |
| Oracle | 14.8% |
| SAP | 12.5% |
| Salesforce | 9.7% |
| Adobe | 8.3% |
Future Outlook and Predictions
The Technology Updates and Analysis landscape is evolving rapidly, driven by technological advancements, changing threat vectors, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:
Year-by-Year Technology Evolution
Based on current trajectory and expert analyses, we can project the following development timeline:
Technology Maturity Curve
Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:
Innovation Trigger
- Generative AI for specialized domains
- Blockchain for supply chain verification
Peak of Inflated Expectations
- Digital twins for business processes
- Quantum-resistant cryptography
Trough of Disillusionment
- Consumer AR/VR applications
- General-purpose blockchain
Slope of Enlightenment
- AI-driven analytics
- Edge computing
Plateau of Productivity
- Cloud infrastructure
- Mobile applications
Technology Evolution Timeline
- Technology adoption accelerating across industries
- Digital transformation initiatives becoming mainstream
- Significant transformation of business processes through advanced technologies
- New digital business models emerging
- Fundamental shifts in how technology integrates with business and society
- Emergence of new technology paradigms
Expert Perspectives
Leading experts in the software dev sector provide diverse perspectives on how the landscape will evolve over the coming years:
"Technology transformation will continue to accelerate, creating both challenges and opportunities."
— Industry Expert
"Organizations must balance innovation with practical implementation to achieve meaningful results."
— Technology Analyst
"The most successful adopters will focus on business outcomes rather than technology for its own sake."
— Research Director
Areas of Expert Consensus
- Acceleration of Innovation: The pace of technological evolution will continue to increase
- Practical Integration: Focus will shift from proof-of-concept to operational deployment
- Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
- Regulatory Influence: Regulatory frameworks will increasingly shape technology development
Short-Term Outlook (1-2 Years)
In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing software dev challenges:
- Technology adoption accelerating across industries
- Digital transformation initiatives becoming mainstream
These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.
Mid-Term Outlook (3-5 Years)
As technologies mature and organizations adapt, more substantial transformations will emerge in how security is approached and implemented:
- Significant transformation of business processes through advanced technologies
- New digital business models emerging
This period will see significant changes in security architecture and operational models, with increasing automation and integration between previously siloed security functions. Organizations will shift from reactive to proactive security postures.
Long-Term Outlook (5+ Years)
Looking further ahead, more fundamental shifts will reshape how cybersecurity is conceptualized and implemented across digital ecosystems:
- Fundamental shifts in how technology integrates with business and society
- Emergence of new technology paradigms
These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach security as a fundamental business function rather than a technical discipline.
Key Risk Factors and Uncertainties
Several critical factors could significantly impact the trajectory of software dev evolution:
Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.
Alternative Future Scenarios
The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:
Optimistic Scenario
Rapid adoption of advanced technologies with significant business impact
Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.
Probability: 25-30%
Base Case Scenario
Measured implementation with incremental improvements
Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.
Probability: 50-60%
Conservative Scenario
Technical and organizational barriers limiting effective adoption
Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.
Probability: 15-20%
Scenario Comparison Matrix
| Factor | Optimistic | Base Case | Conservative |
| --- | --- | --- | --- |
| Implementation Timeline | Accelerated | Steady | Delayed |
| Market Adoption | Widespread | Selective | Limited |
| Technology Evolution | Rapid | Progressive | Incremental |
| Regulatory Environment | Supportive | Balanced | Restrictive |
| Business Impact | Transformative | Significant | Modest |
Transformational Impact
Technology becoming increasingly embedded in all aspects of business operations. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.
The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.
Implementation Challenges
Technical complexity and organizational readiness remain key challenges. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.
Regulatory uncertainty, particularly around emerging technologies like AI in security applications, will require flexible security architectures that can adapt to evolving compliance requirements.
Key Innovations to Watch
Artificial intelligence, distributed systems, and automation technologies leading innovation. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.
Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.
Technical Glossary
Key technical terms and definitions to help understand the technologies discussed in this article.
Understanding the following technical concepts is essential for grasping the full implications of the technologies discussed in this article. These definitions provide context for both technical and non-technical readers.