The history and future of 3DMark, the world’s most popular gaming benchmark

If you have ever cared about how powerful your PC really is, then you’ve almost certainly used a piece of Futuremark software.

But you probably don’t know the story of where that company came from.

PCMark, Sysmark, VRMark and, most famously, 3DMark are just some of the benchmarks to come out of a company that has been on the cutting edge of the technology behind the latest games for the best part of 20 years. Today, just as when gamers were asking each other “can it run Crysis?”, overclockers and enthusiasts continue to push systems to their limit in the hope of earning the bragging rights that come with the highest 3DMark scores in the world.

But what makes a benchmark? And what are Futuremark’s plans for the future?

Fortunately, there is one man we can turn to for answers to these questions — Futuremark’s commercial director Jani Joki, who has been with the company from its very earliest days.

Early days

Futuremark is an offshoot of Remedy Entertainment, which is best known for the Max Payne games. Back in the late ’90s, Remedy had created its first game, Death Rally, and was working on its second, which would ultimately become the pioneer of bullet-time, Max Payne.

“This was the age when 3D acceleration was first coming into being,” said Joki. “Remedy was contacted by a magazine publisher, VNU Publications. Together they had the idea of creating a classical demo scene made for 3D accelerators, that would also measure at the same time.”

Remedy’s programmers were capable of building some of the best graphical showcases in the world. Looking to market that skill, it agreed to team up with VNU Publications to create what would become the first benchmark aimed at gamers and 3D acceleration. The magazine would publish the branded benchmark as its own, as a tool for readers and to show “something cool that they did,” as Joki puts it.

“It was produced as a side project alongside the games Remedy was working on and debuted at Assembly, a large demoscene event,” he said. “It caught on, because it had a basic measuring system to it. It was largely supposed to look cool, but the concept that this new 3D acceleration should be measured, really caught on.”

The late ’90s was an interesting time for graphical hardware. Today’s gamers have no choice but to debate the merits of Nvidia or AMD’s latest graphics cards, but back then, gamers had many companies to choose from. Yet there was little knowledge among consumers, or even hardware and software developers, about which 3D accelerators were any good, or what techniques worked best.

Many companies claimed to have the best solution. Futuremark’s first benchmark provided a chance to prove it.

Building a benchmark

Futuremark’s founding came at an important time in gaming, as many exciting new technological developments were emerging.

“DirectX 7 was a huge new thing,” said Joki. “It was immensely popular, and T&L was standardized then. Almost everyone was using it.”

T&L, or transform, clipping and lighting, combines three steps: transforming a 3D scene into a two-dimensional view, clipping away the parts of the scene that will not appear in the finished picture, and lighting, which alters the color of each surface depending on how the scene is lit.
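
For readers who want to see those three steps in concrete terms, here is a minimal Python sketch. It is purely illustrative and is not taken from 3DMark, DirectX or any real renderer: the function names, the single-axis rotation, the simple per-point visibility test (real hardware clips whole triangles) and the basic diffuse lighting model are all assumptions chosen to keep the example short.

```python
import math

def transform(vertex, angle_deg, offset):
    """Transform: rotate a 3D point about the Y axis, then move it by offset."""
    x, y, z = vertex
    a = math.radians(angle_deg)
    xr = x * math.cos(a) + z * math.sin(a)
    zr = -x * math.sin(a) + z * math.cos(a)
    return (xr + offset[0], y + offset[1], zr + offset[2])

def project(vertex, focal=1.0):
    """Flatten a camera-space 3D point onto the 2D image plane (perspective divide)."""
    x, y, z = vertex
    return (focal * x / z, focal * y / z)

def is_visible(vertex, near=0.1, far=100.0, focal=1.0):
    """Clipping (simplified): keep only points in front of the camera
    whose projection lands inside the [-1, 1] view square."""
    x, y, z = vertex
    if not (near <= z <= far):
        return False
    px, py = project(vertex, focal)
    return -1.0 <= px <= 1.0 and -1.0 <= py <= 1.0

def lambert(normal, light_dir, base_color):
    """Lighting: scale a surface color by the cosine of the angle
    between the surface normal and the direction to the light."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    length = math.sqrt(sum(n * n for n in normal)) * math.sqrt(sum(l * l for l in light_dir))
    intensity = max(0.0, dot / length)  # surfaces facing away receive no light
    return tuple(round(c * intensity) for c in base_color)

# One point of a hypothetical model, pushed 5 units in front of the camera.
point = transform((1, 0, 0), 0, (0, 0, 5))
if is_visible(point):
    screen_xy = project(point)
    color = lambert((0, 0, 1), (0, 0, 1), (200, 100, 50))
```

Before hardware T&L, the CPU ran calculations like these for every vertex in every frame, which is why moving them onto the graphics card made such a difference.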

Until supporting APIs and hardware were engineered, T&L had been handled exclusively in software and processed by the CPU. It was hardware support for technologies like this that gave people a real reason to upgrade to powerful, dedicated graphics cards and, in turn, a reason to run testing software like 3DMark 2000. But just as those technologies drive consumers, they also give Futuremark the spark to create new benchmarks.

“It can be a new generation of DirectX, a new generation of Windows or hardware or a combination, like DirectX 12 and Windows 10,” said Joki. “We don’t just create benchmarks for the hell of it, there has to be some kind of demand.”

That’s not to say that Futuremark is entirely reactive. It is aware of what’s coming one or two years down the pipeline, so it can prepare its benchmarks accordingly.

Part of that is the natural progression of hardware and software, which Joki says lets Futuremark make educated guesses about the long-term future. But thanks to close ties with hardware and software developers, it also knows about some specific technologies that are on the way.

“When we launch a new benchmark, it’s imperative that it’s valid for what’s already been launched in the past six to 12 months, and should be valid for what’s coming in the next one to two years,” he explained.

Falling behind the artistic curve

Futuremark’s focus on underlying technologies is an important aspect of what makes its software so useful. Because its tests are developed in-house, it’s engine agnostic, which means it’s testing the underlying tools used by everything, from the latest CryEngine to recent Unity releases, on hardware from various manufacturers.

“The neutrality we have held is always absolute,” laughed Joki. “We joke within Futuremark that if any one hardware vendor is too happy with a benchmark we’re developing, we need to take another look at it.”

That’s important because, if it were to build its test in any one engine already available, there would be technologies utilized in different ways by others, which it wouldn’t be able to cater to as well. It would also make it easy for hardware developers to ‘cheat’ the system, by optimizing for whatever engine is most common.

But that’s also why Futuremark’s software doesn’t look as good as some of the prettiest games. While Futuremark was born at a time when programmers held the key to divine artistry, today it has much more to do with the artists themselves.

“When 3DMark 2001 came out, we could pretty easily say that we had graphical superiority over most games. The graphics we could create were more interesting and more realistic looking than just about anything that was out at that time,” said Joki. “While we can still do that on a technical level today, games now look better because of the graphical artists that work on them. We still employ five great graphical artists, but we can’t compete with companies that hire hundreds to work on their game.”

That’s something that Futuremark has come to terms with over the past decade, and Joki believes it’s a little more obvious in recent years. Its latest benchmarks might not be quite as pleasing to the eye as contemporary games, but they still tax systems like never before. And that, ultimately, is what matters most.

Sure, but can it run Crysis?

Even if today’s 3DMark software doesn’t look as impressive as some of the AAA titles out there, this isn’t the first time developers have challenged the beauty of Futuremark’s software, or its ability to test the viability of hardware. At the turn of the century, as the popularity of 3DMark and Futuremark increased, game developers caught on that offering their game as a testing suite provided more content for gamers and, with reviewers regularly using those games for testing purposes, free publicity.

Many games over the years have been used for this purpose, but one still stands out as a paragon not only of beauty, but of the ability to crush hardware hopes and dreams: Crysis.

“Can it run Crysis?” is an old meme at this point, but one that persists in comment sections to this day. And to Joki, it’s still an important question. Not because of Crysis itself, but because in his mind, if you are buying a PC to play a specific game, no test can better tell you how that game will run than a test baked into the game itself.

“What we at Futuremark try to do, is create something that, should you buy a bunch of games and measure all of them and then aggregate all of them into one number, that’s roughly what 3DMark is designed to do,” Joki said.
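
To make the idea of boiling many games down to one number concrete, here is a small Python sketch. The article does not describe how 3DMark actually calculates its scores, so everything here is an assumption made for illustration: the geometric mean is simply one common way to combine benchmark results, and the game names, frame rates and the 1,000-point scale are hypothetical.

```python
import math

def aggregate_score(fps_by_game, baseline_fps, scale=1000):
    """Combine per-game frame rates into a single score.

    Each game's result is first expressed relative to a reference
    system, then the ratios are combined with a geometric mean so
    that no single game dominates the final number.
    """
    ratios = [fps_by_game[game] / baseline_fps[game] for game in fps_by_game]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return scale * math.exp(log_mean)

# Hypothetical measurements: frames per second in three made-up games.
reference_pc = {"game_a": 60.0, "game_b": 30.0, "game_c": 90.0}
my_pc = {"game_a": 120.0, "game_b": 60.0, "game_c": 180.0}

score = aggregate_score(my_pc, reference_pc)  # twice as fast everywhere
```

A system that matches the reference in every game lands on the scale value itself, and one that is twice as fast in every game scores roughly double, which is the kind of at-a-glance comparison a single benchmark number is meant to provide.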

But did Futuremark think Crysis was a good way to measure PC performance when it was released?

“In some ways yes,” said Joki, tactfully. “It had a lot of cool stuff in it, so I have nothing against it, though it did use some effects perhaps a little excessively. Because of that, it was a pretty good test for certain aspects of graphics hardware, but the scaling between different graphics components was not necessarily accurate. Using Crysis to predict how well your PC might run Civilization IV would not have been that easy.”

Changing of the guard

With all its programming skill and artistic talent, you might ask why Futuremark hasn’t made games of its own. If you know a little about its history already, though, you’ll know it has. It launched the Futuremark Games Studio in 2008, and released its first game, Shattered Horizon, in 2009.

“This was something a lot of people requested from us,” explained Joki. “We had the resources, the manpower, and the interest, and we decided to try and see what would actually happen.”

While Shattered Horizon saw moderate success, it wasn’t the financial hit that Futuremark’s owners were looking for. After five years, the gaming division was sold off to Rovio, the developer of Angry Birds. That profit-driven thinking is why today Futuremark is owned by Underwriters Laboratories, an American safety consultation and testing company. You might be familiar with its logo, which can be found on a variety of equipment, including most laptop and smartphone power chargers.

Until then, Futuremark had been owned by venture capitalists who invested in its earliest days, back in 1999. In 2014, with eyes on a different horizon, those investors sold Futuremark off to the testing firm, which Joki sees as a perfect fit.

“Put simply, UL is a company which does testing, and we’re a company that makes testing software,” said Joki. “Including benchmarking in UL’s testing system makes sense for everyone and that gelled from our first meeting together.”

A surprising benefit of this move is that it lends more credibility to Futuremark’s stance as a disinterested third party in the graphical hardware market. UL is legally bound to be impartial, so Futuremark is now also bound by such constraints. Remember that the next time you read someone complaining that its latest benchmark unfairly favors one graphics card manufacturer over another.

Is there an event horizon for graphics?

Futuremark’s extensive work in standardization of 3D benchmarking has an end goal beyond the creation of an accurate benchmark. It’s also part of the push for faster, better hardware. Games are constantly pushing the boundaries of what’s possible in real-time visual rendering. Photorealism has been touted for years as the point of no return, where we stop being able to distinguish between rendered graphics and reality. But does that mean 3DMark will eventually engineer itself out of work?

That’s not something Joki’s too worried about in the near future. It’s not a problem you can solve just by throwing more hardware at it, he says. You need the software development right alongside the hardware to be able to achieve true-to-life visuals. And while we’re closer to photo-realism than ever before, each success makes the next step harder.

Still, even if we do reach a point where we can make a scene look close to reality, there is still much more to be done, and much more to be tested.

“I just don’t see a point where we can say that there is sufficient performance,” Joki said. “Let’s say you can render a house at a photo-realistic level. Add 100 more houses and suddenly the scene is vastly more complex.”

“Plus, I’m pretty sure that the need for measuring how fast things are in different scenarios will still be needed […] There are situations even today where people could say that it doesn’t matter what hardware you have. If you only use Gmail, then any computer will do, but even the people buying pre-built systems for that still like to see how much faster it is than their old one. People who buy cars still like to know what the 0-60 speed is, even when buying a family sedan.”

Getting down to the mantle

With Futuremark not yet foreseeing its own demise, it has a lot of work still to do, and much of that will involve taking advantage of new APIs. With DirectX 12 gathering support among developers, and similar APIs like Vulkan helping draw more power from existing hardware, that’s going to be an immediate focus for the benchmark developer.

Joki tells us that there are new DirectX 12 tests coming to VRMark, as well as an entirely new version which will be shown off at the Game Developers Conference at the end of this month and will launch shortly after. There’s also “some sort of 3DMark launch” planned before the end of the year, and Futuremark is already working on the next full version of it.

Whatever’s released, it’s certain that Futuremark will remain an essential part of the 3D graphics community. Its long history has rightly given it great influence, and we’ll need the company to use that in building next-generation benchmarks for tomorrow’s gamers.

Jon Martindale