Our SMP core is very different from the way other distributed computing projects handle multi-core CPUs, and I thought it might be interesting for the FAH community to hear about the differences, pros and cons. As I think most people interested in computers know, Moore's law, which states that the transistor count in CPUs will double every 1.5 years, has continued to hold for decades. Most people think of Moore's law in terms of CPU speed, but this isn't what Moore originally had in mind. In the past, more transistors have led to greater CPU speeds, but that essentially ended (at least for traditional CPUs) a few years ago.
But if Moore's law is still marching along (as it is), what do all those transistors do? Over the last few years, more transistors have translated into more CPU cores, i.e. more CPUs on a chip. While this is not what we wanted, it is not necessarily a disaster, if one can use these multiple CPUs to get faster calculations. If we simply do more calculations (i.e. multiple Work Units, or WU's, simultaneously) rather than faster calculations (a WU completed in less time), distributed computing runs into the same problem that faces supercomputers: how to scale to lots and lots of processors -- i.e. how can we use all these processors to complete a calculation faster overall.
In FAH, we've taken a different approach to multi-core CPUs. Instead of just doing more WU's (e.g. running 8 WU's simultaneously), we are applying methods to do a single WU faster. This is typically much more valuable to a scientific project, and it's important to us. However, it comes with new challenges. Getting a calculation to scale to many cores can be a challenge, as can running complex multi-core calculations originally meant for supercomputers on operating systems not designed for this (e.g. Windows).
Right now, our SMP client seems to be running fairly well under Linux and OSX -- operating systems based on UNIX, as found on supercomputers. We use a standard supercomputing library (MPI) to run these WU's, and MPI behaves well on Unix-based machines. MPI does not run well on Windows, and we've been running into problems there. However, as Windows MPI implementations mature, our SMP/Windows app will behave better. Along the way, we also have a few tricks up our sleeve which may help as well. However, if we can't get it to run as well as we'd like on Windows, we may choose to overhaul the whole code, as we did with the GPU1 client (which was really hard to run).
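For readers curious what "using MPI to run a WU across cores" looks like in practice, here is a minimal sketch in C. It is not the actual FAH SMP core, which is far more involved; the problem size and compute_energy() function below are hypothetical stand-ins. The idea is that each MPI rank (one per core) takes a slice of the work and the partial results are then combined, so a single WU finishes sooner rather than several WU's running side by side.

/* Minimal illustrative sketch: splitting one calculation across cores with MPI.
 * N_PARTICLES and compute_energy() are hypothetical stand-ins, not FAH code. */
#include <mpi.h>
#include <stdio.h>

#define N_PARTICLES 1024            /* hypothetical problem size */

static double compute_energy(int i) /* stands in for real per-particle work */
{
    return (double)i * 0.001;
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which core am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many cores in total? */

    /* Each rank handles a contiguous slice of the particles, so one WU is
     * computed faster instead of running several WU's at once. */
    int chunk = (N_PARTICLES + size - 1) / size;
    int start = rank * chunk;
    int end   = (start + chunk > N_PARTICLES) ? N_PARTICLES : start + chunk;

    double local = 0.0;
    for (int i = start; i < end; i++)
        local += compute_energy(i);

    /* Combine the partial results from every core on rank 0. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total energy = %f (computed on %d cores)\n", total, size);

    MPI_Finalize();
    return 0;
}

With an MPI implementation installed, a sketch like this would be compiled with mpicc and launched with something like mpirun -np 4, which is roughly the kind of multi-process launch the SMP client automates behind the scenes.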
We're very excited about what the SMP client has been able to do so far. One of our recent papers (#53 on our papers web site, http://folding.stanford.edu/English/Papers) would have been impossible without the SMP client and represents a landmark calculation in the simulation of protein folding. We're looking forward to more exciting results like that in the years to come!
QUOTE "However, if we can't get it to run as well as we'd like on Windows, we may choose to overhaul the whole code, as we did with the GPU1 client (which was really hard to run)"
What exactly does overhaul mean? Since the GPU1 client was shut down, is the WIN SMP client nearing the same fate?
Posted by: EvilAlchemist | June 16, 2008 at 01:20 AM
I can't agree more; stability is hard to get with it at the moment.
The last 3 days, constant EUE's on 2665 WU's.
Did a new install, but no luck.....
Posted by: HuntWarrior | June 16, 2008 at 07:02 AM
STOPPED FOLDING with SMP until NV client arrives.
Posted by: Perl-Freak | June 16, 2008 at 10:31 AM
Actually, Moore's law states that the number of transistors per unit area will double, or their cost will halve, every 18 months. This is the problem Intel is having now: cost is halving, but transistor count is barely moving.
Posted by: Annon | June 16, 2008 at 10:39 AM
nvidia gpu client 1week++ and counting :(
Posted by: NiFkE | June 16, 2008 at 10:57 AM
Stability of the Windows SMP client is really hampering the amount of work my systems can do. Right now I have two systems running SMP 24/7, but I could have two more if the stability (particularly the wireless connection freeze issue) was improved. Folding it into the main client (ha!) would be a big improvement.
Posted by: Clint Oakley | June 16, 2008 at 12:49 PM
If you are folding on them 24/7, why not just run them in Linux?
Posted by: PlayLoud | June 16, 2008 at 10:33 PM
You all need to cool off regarding NVIDIA GPUs. If some internet site announced speculation on a release date, it does not mean it will be released that particular day/week. Until we get an official announcement from Stanford that it will be released tomorrow or the day after, nothing is set in stone.
Posted by: muziqaz | June 16, 2008 at 11:34 PM
Quote from June 6: "Just as we turn the page on GPU1, we are nearing our open beta release for the GPU2/NVIDIA client. If all goes well, we hope to have that available in a week. I'll give an update if it looks like it will be delayed more than a week."
Liar, liar... c'mon, there are maaany people waiting. I think at least some announcement would be appropriate.
Posted by: kenah | June 17, 2008 at 01:42 AM
Re:PlayLoud
Some of them are community machines that run established (read: old) programs for laboratory equipment, so switching OS isn't an option. They're idle most of the time, and when they're not, the usage is pretty light. Having SMP built into the main client would also help with the inevitable problem of someone sitting down at the computer, asking "Oh, what's this black DOS box doing here?" and closing it.
Posted by: Clint Oakley | June 17, 2008 at 10:31 AM
The only people who see the open beta are the press, so they can bench the GTX 260/280 from Nvidia (like Anandtech.com / ocaholic / computerbase.de).
So far it's not for normal people like us (who build the strong base of F@H)... you are cheating us... my frustration, and that of others, is growing with each second without any information.
Posted by: Perl-Freak | June 17, 2008 at 10:41 AM
I guess I've been lucky. I've been running the SMP client on Windows since it came out, and haven't had any issues. Pair of AMD Opterons.
Posted by: Dom | July 24, 2008 at 11:50 AM