edfcmc - Thursday, November 23, 2006 - link
I always thought the dude with the valve in his eye was Gabe Newell. Now I know better.
msva124 - Thursday, November 9, 2006 - link
quote: As Valve sees things, however, the era of pretty visuals is coming to an end. We have now reached the point where in terms of graphics most people are more than satisfied with what they see.
Um.....you're kidding, right?
JarredWalton - Thursday, November 9, 2006 - link
Nope. That's what Valve said. Graphics and animations can be improved, but there are lots of other gameplay issues that have been pushed to the side in pursuit of better graphics. With cards like the GeForce 8800, they should be able to do just about anything they want on the graphics side of things, so now they just need to do more in other areas.
msva124 - Thursday, November 9, 2006 - link
quote: With cards like the GeForce 8800, they should be able to do just about anything they want on the graphics side of things
And 640K of RAM should be enough for anybody? Yeah right. There are certainly other things besides graphics that need to be tended to, but when even rendered cutscenes don't look convincing, it's extremely premature to say so.
JarredWalton - Thursday, November 9, 2006 - link
So you would take further increases in graphics over anything else? Personally, I'm quite happy with what I see in many titles of the past 3 years. Doom 3, Far Cry, Quake 4, Battlefield 2, Half-Life 2... I can list many more. All of those look more than good enough to me. Could they be better? SURE! Do they need to be? Not really. I'd much rather have some additional improvements besides just prettier graphics, and that's what Valve was getting at.
What happens if you manage to create a photo-realistic game, but the AI sucks, the physics sucks, and the way things actually move and interact with each other isn't at all convincing? Is photo-realism (which is basically the next step -- just look at Crysis screenshots and tell me that 8800 GTX isn't powerful enough) so important that we should ignore everything else? Heck, some games are even better because they *don't* try for realism. Psychonauts anyone? Or even Darwinia? Team Fortress 2 is going for a more cartoony and stylistic presentation, and it looks pretty damn entertaining.
The point is, ignoring most other areas and focusing on graphics is becoming a dead end for a lot of people. What game is the biggest moneymaker right now? World of WarCraft! A game that will play exceptionally well on anything at the level of an X800 Pro/GeForce 6800 GT or faster. There are 7 million people paying $15 per month who have basically said that compelling multiplayer environments are more important to them than graphics.
msva124 - Thursday, November 9, 2006 - link
Fine. You win.
JarredWalton - Thursday, November 9, 2006 - link
Sorry if I was a bit too argumentative. Basically, the initial statement is still *Valve's* analysis. You can choose to agree or disagree, but I think it's pretty easy to agree that in general there are certainly other things that can be done besides just improving graphics. I don't think Valve intends to *not* improve graphics, though; just that it's not the only thing they need to worry about. Until we get next-gen games that make use of quad cores, though, the jury is out on whether or not "new gameplay" is going to be as compelling as better visuals.
Cheers!
Jarred
cryptonomicon - Wednesday, November 8, 2006 - link
http://images.anandtech.com/reviews/tradeshows/200...
So, what are those steel or aluminum models at the end? Are they the real-world references for the Team Fortress Source weapon models? :D
yyrkoon - Wednesday, November 8, 2006 - link
quote: We will take a look at how CPU cache and memory bandwidth affects performance in the future, but at present it pretty clear that Core 2 once again holds a commanding performance lead over AMD's Athlon 64/X2 processors.
I'm pretty sure 'it' in this sentence should be "it's", or "it is" (sorry, but it was bad enough to stop me when reading, thinking I misread the sentence somehow).
Good article, and it will be interesting to see who follows suit, and when. Hopefully this will become the latest fad in programming; it already has me wanting to code my own services here at home for encoding video, or anything that takes more than a few minutes ;)
yyrkoon - Wednesday, November 8, 2006 - link
meh, sorry, that 'typo' is on the second to the last page, I guess about half way down :/
Nighteye2 - Wednesday, November 8, 2006 - link
Ok, so that's how Valve will implement multi-threading. But what about other companies, like Epic? How does the latest Unreal Engine multi-thread?
Justin Case - Wednesday, November 8, 2006 - link
Why aren't any high-end AMD CPUs tested? You're testing 2GHz AMD CPUs against 2.6+ GHz Intel CPUs. Doesn't Anandtech have access to faster AMD chips? I know the point of the article is to compare single- and multi-core CPUs, but it seems a bit odd that all the Intel CPUs are top-of-the-line while all AMD CPUs are low end.
JarredWalton - Wednesday, November 8, 2006 - link
AnandTech? Yes. Jarred? Not right now. I have a 5000+ AM2, but you can see that performance scaling doesn't change the situation. 1MB AMD chips do perform better than 512K versions, almost equaling a full CPU bin - a 2.2GHz Opteron on 939 was nearly equal to the 2.4GHz 3800+ (both OC'ed). A 2.8 GHz FX-62 still isn't going to equal any of the upper Core 2 Duo chips.
archcommus - Tuesday, November 7, 2006 - link
It must be a really great feeling for Valve knowing they have the capacity and capability to deliver this new engine to EVERY customer and player of their games as soon as it's ready. What a massive and ugly patch that would be for virtually any other developer.
Don't really see how you could hate on Steam nowadays considering things like that. It's really powerful and works really well.
Zanfib - Tuesday, November 7, 2006 - link
While I design software (so not so much programming as GUI design and whatnot), I can remember my University courses dealing with threading, and all the pain threading can bring.
I predicted (though I'm sure many could say this and I have no public proof) that Valve would be one of the first to do such work. They are a very forward-thinking company with large resources (like Google--they want to work on ANYthing they can...), a great deal of experience and, as noted in the article, the content delivery system to support it all.
Great article about a great subject, goes a long way to putting to rest some of the fears myself and others have about just how well multi-core chips will be used (with the exception of Cell, but after reading a lot about Cell's hardware I think it will always be an insanely difficult chip to code for).
Bonesdad - Tuesday, November 7, 2006 - link
mmmmmmmmm, chicken and mashed potatoes....
Aquila76 - Tuesday, November 7, 2006 - link
Jarred, I wanted to thank you for explaining in terms simple enough for my extremely non-technical wife to understand why I just bought a dual-core CPU! That was a great progression on it as well, going through the various multi-threading techniques. I am saving that for future reference.
archcommus - Tuesday, November 7, 2006 - link
Another excellent article, I am extremely pleased with the depth your articles provide, and somehow, every time I come up with questions while reading, you always seem to answer exactly what I was thinking! It's great to see you can write on a technical level but still think like a common reader so you know how to appeal to them.
With regards to Valve, well, I knew they were the best since Half-Life 1 and it still appears to be so. I remember back in the days when we weren't even sure if Half-Life 2 was being developed. Fast forward a few years and Valve is once again revolutionizing the industry. I'm glad HL2 was so popular as to give them the monetary resources to do this kind of development.
Right now I'm still sitting on a single core system with XP Pro and have lots of questions bustling in my head. What will be the sweet spot for Episode 2? Will a quad core really offer substantially better features than a dual core, or a dual core over a single core? Will Episode 2 be fully DX10, and will we need DX10 compliant hardware and Vista by its release? Will the rollout of the multithreaded Source engine affect the performance I already see in HL2 and Episode 1? Will Valve actually end up distributing different versions of the game based on your hardware? I thought that would not be necessary due to the fact that their engine is specifically designed to work for ANY number of cores, so that takes care of that automatically. Will having one core versus four make big graphical differences or only differences in AI and physics?
Like you said yourself, more questions than answers at this point!
archcommus - Tuesday, November 7, 2006 - link
One last question I forgot to put in. Say it was somehow possible to build a 10 or 15 GHz single core CPU with reasonable heat output. Would this be better than the multi-core direction we are moving towards today? In other words, are we only moving to multi-core because we CAN'T increase clock speeds further, or is this the preferred direction even if we could?
You got it.
A higher clock speed processor would be better, assuming performance scaled well enough anyway. Parallel hardware is less general than serial hardware at increasing performance because it requires parallelism to be present in the workload. If the work is highly serial, then adding parallelism to the hardware does nothing at all. Conversely, even if the workload is highly parallel, doubling serial performance still doubles performance. Doubling the width of a unit could double the performance of that unit for certain workloads, while doing nothing at all for others. In general, if you can accelerate the entire system equally, doubling serial performance will always double program speed, regardless of the program.
That's the theory anyway. Practice says you can only make certain parts faster. So you might get away with doubling clock speed, but probably not halving memory latency, so your serial performance doesn't scale like you'd hope. Not to mention increasing serial performance is extremely expensive compared to parallel performance. But if it were possible, no one would ever bother with parallelism. It's a huge pain in the ass from a software perspective, and it's becoming big now mostly because we're starting to run out of tricks to increase serial performance.
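To put a rough number on that trade-off, here is a tiny illustrative sketch (my own, not from the comment above) of Amdahl's law: the serial fraction of a frame limits what extra cores can buy, while a faster single core speeds up everything.

    #include <cstdio>

    // Amdahl's law: overall speedup when 'parallelFraction' of the work can be
    // spread across 'cores' and the rest stays serial.
    double amdahlSpeedup(double parallelFraction, int cores)
    {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    int main()
    {
        // Hypothetical frame where 70% of the work parallelizes cleanly.
        std::printf("2 cores: %.2fx\n", amdahlSpeedup(0.70, 2)); // ~1.54x
        std::printf("4 cores: %.2fx\n", amdahlSpeedup(0.70, 4)); // ~2.11x
        // A core that is simply twice as fast would give a flat 2.00x.
        return 0;
    }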
JarredWalton - Tuesday, November 7, 2006 - link
What's with the octal posting? Too many CPU cores running? ;)
I deleted the other 7 identical posts for you. Careful with that Post Comment button!
saratoga - Tuesday, November 7, 2006 - link
Server kept timing out when I hit post, so I assumed it wasn't committing :)
exdeath - Tuesday, November 7, 2006 - link
You can see my recent comments on this topic here:
http://www.dailytech.com/Article.aspx?newsid=4847&...
In my experience relying on atomic CPU swap operations isn't enough as it only works with a single value (32 bit word for example).
While you lock and swap a 32-bit Y value, another thread may have just read the newly written X value but beaten you to the lock, so it reads the old Y value before your update. Clearly whole data structures need to be coherent, not just small atomic values.
Also, it's unusual to modify an object's observable state mid-frame. Even if you avoided the above example so that the X,Y pair was always updated together, you'd still have different objects seeing that object's position in different places at different times. State data must be held constant to all observers throughout the context of a single frame.
exdeath - Tuesday, November 7, 2006 - link
Even if you avoided the above example so that the X,Y pair was always updated together, you'd still have different objects interpreting the position as a whole of that object in different places at different times in the same frame.
JarredWalton - Tuesday, November 7, 2006 - link
I'm assuming your comment is in regards to the PS3/Cell comments on the last page? It sort of sounds like you're arguing about the way Valve has chosen to go about doing things, or that you disagree with some of the opinions they've expressed concerning other hardware. We have only tried to provide a very high-level overview of what Valve is doing, and we hardly touched the low-level details -- Valve didn't spend a lot of time on specific implementation issues either. All they did was provide us with some information about what they are doing, and a bit of opinion on what they think of the rest of the hardware options.
Preventing anything else from doing write operations to the world state during an entire frame in order to keep things coherent is a big problem with multithreading. Apparently Valve has found a way around that, or at least found a way to do it more efficiently, using lock-free and wait-free algorithms. No, I can't honestly say I really understand what those algorithms do, but if they say it worked better for their code base I'm willing to trust them.
As far as the PS3/Cell processor goes, Valve did say that they have various thoughts on how to properly utilize the architecture. It is simply going to be more difficult to do relative to Xbox 360 and PC. It's not impossible, and companies are definitely going to tackle this problem. As far as how they tackle it, I'm more than a bit rusty on my coding background, and other than high-level details I'm not too concerned how they improve their multithreading code on any specific platform, just that they do it.
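For anyone wondering what "lock-free" looks like in practice, here is a minimal sketch (my own illustration, not Valve's code) of a lock-free stack push built on an atomic compare-and-swap: no thread ever blocks on a mutex, and a thread that loses the race simply retries.

    #include <atomic>

    // Minimal lock-free stack push (illustrative only).
    struct Node {
        int   value;
        Node* next;
    };

    std::atomic<Node*> g_head{nullptr};

    void push(Node* n)
    {
        n->next = g_head.load(std::memory_order_relaxed);
        // If another thread changed the head first, compare_exchange_weak fails,
        // updates n->next to the current head, and we simply try again.
        while (!g_head.compare_exchange_weak(n->next, n,
                                             std::memory_order_release,
                                             std::memory_order_relaxed))
        {
        }
    }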
exdeath - Tuesday, November 7, 2006 - link
The other issue is OS support.
Compiler add-ons or third party APIs can only serve to hide the details or make things look cleaner. But no matter what, the final barrier between the application and the OS is the set of API calls provided by the OS threading model. Thus no third party implementation can be better than the OS thread model itself in terms of performance and overhead. All those can do is make it easier to use at the top by handling the OS details.
I imagine threading APIs on popular OSes will start to evolve, just like graphics APIs have, once everyone gets on the multi-core bandwagon and starts to get a feel for what's available in the OS APIs and what they'd rather have. So far, Vista's thread pool API looks good, but I still don't see an API to determine such basic things as checking if the work queue is empty and all threads are idle, etc.
Currently I find it's easier to implement my own thread pool manager which does atomic increments and decrements on a 'task count' variable as tasks are entered or completed in the queue. Checking if all tasks are done involves testing that task count against 0 and signaling an event flag that wakes any management threads sleeping until all of their work tasks complete. It also allows for more flexibility in 'before and after' housekeeping as work threads move from task to task, and that kind of control isn't offered in XP's built-in thread pool API, nor Vista's as far as I can tell.
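As a sketch of the scheme described above (my own illustration in modern C++, not the poster's actual Win32 code): an atomic task counter goes up as work is queued, down as tasks finish, and a condition variable wakes anyone waiting once it reaches zero.

    #include <atomic>
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Illustrative thread pool: an atomic 'task count' tracks outstanding work,
    // and waitForIdle() sleeps until every queued task has completed.
    class SimplePool {
    public:
        explicit SimplePool(unsigned workers) {
            for (unsigned i = 0; i < workers; ++i)
                threads.emplace_back([this] { workerLoop(); });
        }
        ~SimplePool() {
            { std::lock_guard<std::mutex> lk(queueMutex); stopping = true; }
            queueCv.notify_all();
            for (auto& t : threads) t.join();
        }
        void submit(std::function<void()> task) {
            tasksOutstanding.fetch_add(1);
            { std::lock_guard<std::mutex> lk(queueMutex); tasks.push(std::move(task)); }
            queueCv.notify_one();
        }
        void waitForIdle() {                       // wake when the count hits zero
            std::unique_lock<std::mutex> lk(idleMutex);
            idleCv.wait(lk, [this] { return tasksOutstanding.load() == 0; });
        }
    private:
        void workerLoop() {
            for (;;) {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lk(queueMutex);
                    queueCv.wait(lk, [this] { return stopping || !tasks.empty(); });
                    if (stopping && tasks.empty()) return;
                    task = std::move(tasks.front());
                    tasks.pop();
                }
                task();                            // run outside the queue lock
                if (tasksOutstanding.fetch_sub(1) == 1) {
                    std::lock_guard<std::mutex> lk(idleMutex);
                    idleCv.notify_all();           // last task done: wake the manager
                }
            }
        }
        std::vector<std::thread>          threads;
        std::queue<std::function<void()>> tasks;
        std::mutex                        queueMutex, idleMutex;
        std::condition_variable           queueCv, idleCv;
        std::atomic<int>                  tasksOutstanding{0};
        bool                              stopping = false;
    };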
exdeath - Tuesday, November 7, 2006 - link
Not arguing their methods, a lot of things in this article are in line with my own opinions on multithreading, pretty much the best way to go about it. I'm just pointing out that atomic lock/swap operations in hardware are very primitive and typically operate only on CPU word size values, not entire data structures. Thus it's possible that between doing two atomic operations on two variables on one core, another core can get an old version of one variable and a new version of another.
core1: compute X
core2: ...
core1: lock/write x
core2: read x, get newly written version
core1: compute Y
core2: read Y, get old y before the update
core1: lock/write Y
core2: ...
The task on core2 is working with inconsistent data, the new X and the old Y. If the task on core2 only uses the data as input, e.g. AI tracking another AI entity, it has the wrong position, and won't know about it since it has no need to perform its own lock/write (so it never gets the exception that says the value changed). Even if it did, it would have to throw out all its work and redo it with the new Y, and then the value could possibly change again.
Looping and retrying seems wasteful. And I’m thinking the only way to catch such a hardware error on a failed lock/write update is via exceptions, and handling a thrown exception on an attempt to write a single 32 bit value is very wasteful of CPU cycles.
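One common way around the torn X/Y read described above, sketched here as my own illustration (not from the article or the poster's code): pack the pair into a single word-sized atomic so the update is indivisible.

    #include <atomic>
    #include <cstdint>

    // Pack X and Y into one 64-bit word so a reader can never see a new X
    // paired with an old Y. Names and types are hypothetical.
    struct Position { int32_t x, y; };

    std::atomic<uint64_t> g_packedPos{0};

    void writePosition(Position p)              // e.g. the task on core1
    {
        uint64_t bits = (uint64_t(uint32_t(p.x)) << 32) | uint32_t(p.y);
        g_packedPos.store(bits, std::memory_order_release);
    }

    Position readPosition()                     // e.g. the task on core2
    {
        uint64_t bits = g_packedPos.load(std::memory_order_acquire);
        return { int32_t(uint32_t(bits >> 32)), int32_t(uint32_t(bits)) };
    }

For structures bigger than a machine word, the usual options are a version counter that readers re-check (a seqlock) or the per-frame double buffering described next.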
In my own research I have had excellent results with double buffering any modified data. Each threaded task only updates its hidden internal working state for frame n+1 while all reads to the object are read from its external current state for frame n. At the end of the frame when all parallel tasks have completed, the current/working states are swapped, and the work queue is filled again to start the next frame.
This ensures that throughout the entire computation of frame n+1, the current frame n state will be available to all threads, and guaranteed to not be modified through the duration of current frame. So basically all threads can read anything they want and modify their own data. On PC/360 the time to swap everything is basically nothing; you just swap a few pointers, or a single pointer to an array/structure of current/working data for the frame.
On the PS3 some data copying and moving will be required, but this is mandatory due to design anyway and assisted by an extremely smart and powerful DMAC.
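A bare-bones sketch of that double-buffering arrangement (my own illustration, with hypothetical names): everyone reads the frame-n buffer all frame long, each update task writes only the frame-n+1 buffer, and a single index flip at the end of the frame publishes the new state.

    #include <vector>

    struct EntityState { float x, y, z; };       // hypothetical per-entity state

    struct Entity {
        EntityState state[2];                    // one buffer per frame (n and n+1)
    };

    struct World {
        std::vector<Entity> entities;
        int readIndex = 0;                       // which buffer holds frame n

        // Any task on any thread may read this during the frame.
        const EntityState& read(const Entity& e) const { return e.state[readIndex]; }

        // Only the entity's own update task writes this during the frame.
        EntityState& work(Entity& e) { return e.state[1 - readIndex]; }

        // Called once, after all frame n+1 tasks have completed.
        void endFrame() { readIndex = 1 - readIndex; }
    };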
One area to be careful about is message passing between objects, since it requires posting (writing) data to be picked up by another object. But the time to lock/post/unlock a queue is negligible compared to the time it takes to process the results leading up to the creation of the message. This is similar to the D3D notion of doing as much as you can before you lock, doing only the minimal work needed inside the lock, and unlocking as quickly as possible.
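In that spirit, a small sketch (again my own illustration, not the article's code) of a message queue where the expensive part, building the message, happens before the lock, and the lock guards nothing but the push:

    #include <mutex>
    #include <queue>
    #include <string>

    struct Message { int targetId; std::string payload; };   // hypothetical message

    class MessageQueue {
    public:
        void post(Message m)                 // 'm' was fully built outside the lock
        {
            std::lock_guard<std::mutex> lock(m_mutex);        // lock...
            m_queue.push(std::move(m));                       // ...do the minimum...
        }                                                     // ...unlock immediately

        bool tryPop(Message& out)
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (m_queue.empty()) return false;
            out = std::move(m_queue.front());
            m_queue.pop();
            return true;
        }

    private:
        std::mutex          m_mutex;
        std::queue<Message> m_queue;
    };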
GhandiInstinct - Tuesday, November 7, 2006 - link
Jarred Walton,
My question: Will Valve's games in 2007 be released with specifications such as "For minimum requirements you need a dual-core CPU, for maximum results you need a quad-core" or anything of that nature? Because I'm a bit confused about what Valve is targeting: dual or quad or both or neither or something different, and what I should get to best utilize their games and multi-core software in general.
Thanks.
JarredWalton - Tuesday, November 7, 2006 - link
Episode Two should come out sometime in 2007, and before that happens you will get the multithreading patch affecting previous Source engine titles. Right now, it doesn't sound like anything released in the next year or so from Valve is going to require dual cores. That's what I was trying to get at on the conclusion page where I mentioned that they are targeting an "equivalent experience" regardless of what sort of processor you are running.
So just like you could turn down the level of detail in Half-Life 2 and run it on DX8 or even DX7 hardware, the Source engine should be able to accommodate single core processors all the way up through N-core processors. The engine will spawn as many threads as you have processor cores, with one main thread serving as the controller and N - 1 helper threads. Xbox 360, for example, would have 5 helper threads plus the master thread, because it has three cores each capable of executing two threads simultaneously.
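A rough sketch of that "one controller plus N - 1 helpers" arrangement (my own illustration in standard C++, not Valve's actual code):

    #include <thread>
    #include <vector>

    // Spawn one helper per remaining hardware thread; the calling thread stays
    // as the controller. A quad core gets 3 helpers; a 3-core, 2-way SMT chip
    // like the Xbox 360's would get 5.
    void runWithHelpers(void (*helperLoop)())
    {
        unsigned hw = std::thread::hardware_concurrency();
        unsigned helperCount = (hw > 1) ? hw - 1 : 0;

        std::vector<std::thread> helpers;
        for (unsigned i = 0; i < helperCount; ++i)
            helpers.emplace_back(helperLoop);

        // ... the main thread acts as the controller, handing out work ...

        for (auto& t : helpers)
            t.join();
    }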
Patrese - Tuesday, November 7, 2006 - link
Great article, good to see dual-quad cores being used for something in games. By the way, the kitchen examples made me hungry... :)
primer - Tuesday, November 7, 2006 - link
It's a nice article. I'm anxious to see what develops out of Valve in the near future.
How about using some Athlon FX, Opteron 100/1200 series, or higher speed Athlon X2s for crying out loud? I know that there is a performance gap even with the higher AMD models currently, but please show us an attempt at not being so one-sided.
jm20 - Tuesday, November 7, 2006 - link
The tests at 2.0 and 2.4 GHz for the AMD chips are there, so you can plot how it 'should' improve. Here are my scaling projections based on the data given for the 939 platform.
AMD Speed   Single   Dual
2.0         14.0     27.0
2.2         15.5     29.5
2.4         17.0     32.0
2.6         18.8     35.0
2.8         20.8     38.2
3.0         23.1     41.7
I just hope this will show up correctly :/
Very interesting article, I'm very excited to see an almost linear improvement from single to dual core.
yacoub - Tuesday, November 7, 2006 - link
Your Tested Systems details on page 8 - the last three show a CPU of a Venice 3200+ OC'd to 2.4GHz (hey, that's what I run too!), but the headings suggest they should really show Intel chips. :)
yacoub - Tuesday, November 7, 2006 - link
quote: If you think about the way people move through the real world, they are constantly bumping into other objects or touching other objects and people. While this would be largely a visual effect, ... Two characters running past each other could even bump and react realistically, with arms and bodies being nudged to the side instead of mysteriously gliding past each other.
Yes please make it only visual and not actually a part of the game, because much like trying too hard to make graphics look realistic, it really just adds more frustration for the player than anything else. It's already annoying in HL2 getting caught up on the edge of prop_physics related objects. The last thing we need is also getting caught or bumped by another player ;)
JarredWalton - Tuesday, November 7, 2006 - link
If the world is more interactive, there might not actually be a need for "prop physics" objects. There's also the potential to make a game with different gameplay mechanics depending on how the character interacts with the world. It sounds like initially Valve will make it a visual enhancement, to make entities look more lifelike in their behavior, but down the road others could potentially do more. Like most of the enhancements, what we want to see is how they can actually change and improve gameplay beyond just being visual.
yacoub - Tuesday, November 7, 2006 - link
"You're doing a disservice to the customer if you're not using all of the CPU power."okay so that alone shows the need for dual-core so the customer can offload background tasks to a second core so the game can truly get 100% of a core. ;)
timmiser - Tuesday, November 7, 2006 - link
Background tasks?? I agree with Gabe that I want all of my CPU working the game. I'm sitting here with an Athlon X2 4800+ and I can't run Flight Sim X at an acceptable frame rate, all the while I watch the graph that shows CPU#2 at idle while CPU#1 tries to run the entire program by itself!
On another note, how about Microsoft develop some type of new API (DirecThread?) that automatically takes care of utilizing multiple cores so that game companies like Valve don't have to employ entire multi-threaded R&D divisions just so their games use both cores.
It seems like we are going back to the caveman days before DirectInput, DirectSound, & DirectX etc. Remember when you had to choose your sound card, joystick, and video card from a list within the game?? Let's not go there again!!
-Tim
JarredWalton - Tuesday, November 7, 2006 - link
That was a direct quote from... Gabe I think. Others might disagree, but I thought it was an interesting take on things. As for background tasks, they usually only need a bit of CPU time (less than 5% in most cases), so unless you really want to encode videos and play games at the same time....
yacoub - Tuesday, November 7, 2006 - link
What a worthless attempt at sarcasm:
Why not just say something normal like:
It's not omg-witty, but at least that sentence doesn't end with a preposition. @___@
JarredWalton - Tuesday, November 7, 2006 - link
Which obviously ruined the whole article. Seriously yacoub, a little tact with posting would be nice. Rather than stating:
quote: What a worthless attempt at sarcasm:
You could have just said something normal like:
quote: The sarcasm was a bit flat, and you ended with a preposition.
We are real people, and derogatory adjectives like "worthless" tend to irritate more than help. You may not intend it that way, but imagine for a second someone talking about what you worked on for a couple days last week and describing it as "absolute garbage". The key word in "constructive criticism" is to actually make it constructive.
Now, thanks for the suggestion, and I'm more than happy to change things a bit to appease people. Like most people, I just appreciate a bit more consideration, even if I'm just doing my job here. :) As the old cliche goes, you can catch more flies with honey than with vinegar. (Well, my grandpa used to say that anyway.)
yacoub - Wednesday, November 8, 2006 - link
sorry about that. got a little too excited.
Ruark - Tuesday, November 7, 2006 - link
Page 6: ". . . everything crawls to a slow."duploxxx - Tuesday, November 7, 2006 - link
Page 8 test setup is clearly a cut/paste job... all of the CPUs are the same Athlon.
Put an Allendale in the benchmark; lots of people want to see how cache related this multithreaded software is.
Valve talks about 64-bit... tests are 32-bit? Since some competitors are talking about 64-bit code in their gaming also, it should be interesting to see what the difference is vs 32-bit.
JarredWalton - Tuesday, November 7, 2006 - link
Sorry 'bout that - I've fixed the CPU line. All systems were tested at stock and 20% OC. The problem with Allendale is that it has different stock clock speeds, so you're not just comparing different CPU speeds (unless you use a lower multiplier on Conroe). Anyway, this is a first look, and the next CPU article should have additional benches.
The tests are all 32-bit. This is not full Episode 2 code, and most people are waiting for Vista to switch to running 64-bit anyway. All we know is that Episode 2 will support 64-bits natively, but we weren't given any information on how it will perform in such an environment yet.
brshoemak - Tuesday, November 7, 2006 - link
Nice to know where things are headed. Great article.
Jarred, 2nd page, 2nd paragraph
should be 'heating the oven' - although quite funny as is, you may want to keep it ;)
JarredWalton - Tuesday, November 7, 2006 - link
Darn speech recognition. And "b" and "h" looked close enough that I missed it. Heh. No ovens were harmed in the making of this article.
MrJim - Tuesday, November 7, 2006 - link
This was a very interesting article to read, the future looks bright for us with multi-core systems and Valve games. Excellent work Mr Walton!
timmiser - Tuesday, November 7, 2006 - link
Shoot, I didn't make it past the dinner description. Got too hungry!
George Powell - Tuesday, November 7, 2006 - link
I quite agree. Top notch article there. It is great to see how Valve are committed to giving us the best gaming experience.
Regs - Tuesday, November 7, 2006 - link
The future looks bright for people willing to buy Valve games and multi-core CPUs!!
Regs - Tuesday, November 7, 2006 - link
And I hope Valve pulls it off too. Didn't mean anything by the above post.
puffpio - Tuesday, November 7, 2006 - link
Is it just me, or does the pic of Tom Leonard showcase a huge underarm sweat stain? :P
peldor - Tuesday, November 7, 2006 - link
Tom's pic makes it look like he's been fighting with multithreading and losing.
Badly.
PeteRoy - Thursday, November 9, 2006 - link
I loved your comment.
JarredWalton - Tuesday, November 7, 2006 - link
It was taken after about two hours in the conference room. Sorry Tom! :)