Monday, November 1, 2010

Food for HPC thought

I am happy with the changes at i2Basque: it is going to be integrated with Ikerbasque (and, by the way, I hope the excellent team stays beyond 2011). I have seen many new HPC centers fail at something that Ikerbasque does very well: partnering with each research institution in the Basque arena. And we can say they have been quite successful at it.

Because the major failure of new supercomputing centers is that they cannot do science: they do not publish, or at least not enough. And yet that is what they were built for! Yes, but the reality is that without people, without ideas, without real scientific centers behind them, they are nothing but an empty box, rented out to other scientists and usually badly managed, because their directors are researchers who have to stop their research in order to manage (so what the hell do we have management graduates for?). I have always argued this, and I still believe that supercomputing centers as such are an old, empty idea nowadays. As an example, I have heard that some of the new ones have operated for a few years without managing to get access to online publication services (from the universities that use their own systems!)... We need more focused science and better service.

Having said this: what can these centers offer or improve? Definitely service. It is absurd that nowadays, when most researchers need fewer than 512 cores, there is so much bureaucracy involved in requesting a few hours over a few months. This is unfortunately becoming a major problem, as most of their codes will end up running on in-house facilities: no paperwork, no time frame. That is why new supercomputing centers still fail: service is conceived as if there were going to be huge runs on their systems (>10,000 CPUs or TBs of memory). But that is not practical, and given the huge shifts in the price/performance of computer technology, it is not sustainable in the present economic environment. Big clusters are for big projects. Let's say that nowadays a cost-effective system is built upon a few thousand (4k) CPUs.

But my point today is service and computer systems research. These two are my wishes for the new i2Basque. Top-quality service (being inside EJIE is the first step, so congrats for that), top-quality systems, 24x7 if needed, a top-quality code support team, a good helpdesk and, more importantly, happy and passionate employees (let's stop the HPC cowboys forever, please). On the other side, i2Basque has one potential that cannot be ignored: there should be some sort of commitment to pure systems research. What is more, what is the science debt clock with the Linux kernel itself? How long will this run for? How many of our science projects would not exist without it? For how long will supercomputing centers go on without committing patches and updates to the Linux kernel?

