ScotGrid

25 June 2004: ScotGrid's 2nd birthday

ScotGrid clocks up 1 million CPU hours

The ScotGrid system was first deployed on 25 June 2002, and today marks the second anniversary of that deployment. Two years on, total usage exceeds 1,000,000 CPU hours and more than 100,000 jobs have been processed. This is a major landmark for the system.
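As a rough cross-check of these headline figures, the averages they imply can be worked out directly. The sketch below is illustrative only: it treats the quoted totals as exact (the text gives them as "more than"), and the function and key names are hypothetical rather than part of any ScotGrid tool.

```python
from datetime import date

# Totals quoted in the text; both are stated as "more than",
# so the derived averages are approximate.
CPU_HOURS = 1_000_000
JOBS = 100_000

# Deployment date and second anniversary, as given in the text.
START = date(2002, 6, 25)
END = date(2004, 6, 25)


def summarise(cpu_hours, jobs, start, end):
    """Return simple averages implied by the totals over the period."""
    wall_hours = (end - start).days * 24  # elapsed wall-clock hours
    return {
        "wall_hours": wall_hours,
        "cpu_hours_per_job": cpu_hours / jobs,
        "jobs_per_hour": jobs / wall_hours,
        "mean_busy_cpus": cpu_hours / wall_hours,
    }


summary = summarise(CPU_HOURS, JOBS, START, END)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the totals correspond to roughly 10 CPU hours per job, between 5 and 6 jobs per hour, and the equivalent of about 57 processors kept busy around the clock for two years.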

Michael Connarty visits ScotGrid

Michael Connarty, MP for Falkirk East, is shown the Glasgow Compute Farm by David Martin (ScotGrid System Manager) and Will Bell.

Utilisation (the duty cycle) currently peaks at around 80-90% and has been maintained at around 70% during the year. Peak demand has been controlled using a Maui-based "fair share" scheduling policy. The user service for 12 groups (ATLAS, BaBar, Bioinformatics, CDF, Device Modelling, Grid Data Management, Information Retrieval, LHCb, Medipix, MICE, UKQCD and ZEUS), corresponding to approximately 100 individual users, has been available for 90% of the year, despite a major planned re-build of the system.
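The general idea behind a "fair share" policy can be sketched in a few lines: each group has a target share of the machine, recent usage counts more than older usage, and groups that are furthest below their target are served first. The code below is a minimal illustration of that idea only. It is not Maui's implementation or ScotGrid's configuration, and the target shares, usage figures, decay factor and function names are all hypothetical.

```python
def decayed_usage(window_usage, decay):
    """Sum usage over time windows, most recent first, weighting older windows less."""
    return sum(used * decay**age for age, used in enumerate(window_usage))


def fair_share_order(targets, history, decay=0.5):
    """Order groups so that those furthest below their target share come first.

    targets: group name -> relative target share (any positive units)
    history: group name -> CPU hours used per window, most recent first
    """
    usage = {g: decayed_usage(history.get(g, []), decay) for g in targets}
    total_usage = sum(usage.values())
    total_target = sum(targets.values())

    deficits = {}
    for group, target in targets.items():
        actual = usage[group] / total_usage if total_usage else 0.0
        # Positive deficit: the group has used less than its target share.
        deficits[group] = target / total_target - actual

    return sorted(deficits, key=lambda g: deficits[g], reverse=True)


# Hypothetical shares and usage, chosen only to illustrate the ordering.
targets = {"ATLAS": 40, "LHCb": 40, "Bioinformatics": 20}
history = {
    "ATLAS": [900, 800],
    "LHCb": [100, 200],
    "Bioinformatics": [0, 0],
}

order = fair_share_order(targets, history)
print(order)
```

In this made-up example ATLAS has recently used far more than its target share, so its waiting jobs are ranked behind those of the other two groups until the balance is restored.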

The re-build was performed in April 2004 to integrate new system components, doubling the available CPU and quadrupling the available disk at Glasgow. The system is now available for a varied set of applications with individual jobs ranging in length from hours to more than two weeks. During the last year, typically 5 jobs have been processed per hour with an average individual job time of approximately 1 day on each of the processors.

Sharing the facility among a wide range of applications with typically uncorrelated timescales, together with the use of "fair shares", has led to good turnaround times for the research users. To take greater advantage of the gains from sharing, we are employing Grid technologies beyond ScotGrid and have begun full-scale integration of the system with prototype Grids across the UK and Europe. To this end, the ScotGrid (Glasgow) Compute Farm is integrated with the LCG-2 Grid, the basic Grid system for the EU Enabling Grids for E-Science in Europe (EGEE) project, and with the Grid for UK Particle Physics (GridPP) testbed. Work is well underway to integrate the Edinburgh and Durham systems.

The facility has been monitored using a web-based system built upon the Maui scheduler (for CPU) and Unix shell commands (for disk). The current status, polled every 10 minutes, and accumulated statistics from the first two years of deployment are available through this web-based system.

ScotGrid is currently in Phase 2 of the project. In particular, the high-performance disk server at Edinburgh has been significantly upgraded and now has a total capacity of 24 TBytes. Jobs are currently being submitted to test all aspects of the CPU farms, disk server, network and Grid middleware that will be required as ScotGrid enters the Grid Production phase.

ScotGrid

ScotGrid is a three-site Tier-2 centre consisting of an IBM 200-CPU Monte Carlo production facility run by the Glasgow PPE group, and an IBM 24 TByte datastore and associated high-performance server run by the Edinburgh Parallel Computing Centre and PPE group. A Sun 100-CPU farm, based at the Durham University Institute for Particle Physics Phenomenology, is currently being connected.

The ScotGrid project was funded by SHEFC (now SFC) for the analysis of data, primarily from the ATLAS and LHCb experiments at the Large Hadron Collider but also from other experiments. It now provides IBM solutions for Grid computing in particle physics and bioinformatics, as well as for Grid data management, medical imaging and device modelling simulations.