ScotGrid

25 June 2003: ScotGrid's 1st birthday

A summary of the first year of operation

ScotGrid is an £800k prototype two-site Tier 2 centre funded by SFC for the analysis of data, primarily from the ATLAS and LHCb experiments at the Large Hadron Collider but also from other experiments. The centre currently consists of a 128-CPU Monte Carlo production facility run by the Glasgow PPE group and a 5 TB datastore and associated high-performance server run by Edinburgh Parallel Computing Centre.

The ScotGrid project is funded by SFC for approximately 3 years, and is now providing IBM solutions for Grid computing in Particle Physics and Bioinformatics, as well as for Grid Data Management simulations and Information Retrieval processing.

The ScotGrid system was first deployed on June 25th 2002. Today marks the first anniversary of deployment. One year on, the total CPU usage is close to 500,000 CPU hours and the total number of jobs processed is around 75,000.

The utilisation (or duty cycle) climbed from 20% in the first quarter to a peak of around 80-90% during Q1 of 2003, and is currently running at around 70%. Peak demand has been controlled using a Maui-based "fair share" scheduling policy. The user service for 9 groups (ATLAS, BaBar, Bioinformatics, CDF, Grid Data Management, Information Retrieval, LHCb, UKQCD and ZEUS) has been available for more than 95% of the year. Individual jobs range in length from hours to more than two weeks. Typically 10 jobs are processed per hour, with an average individual job time of approximately 10 hours on each processor.
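As a rough consistency check, the headline figures quoted above can be combined into year-averaged rates. The sketch below assumes that all 128 CPUs were available for the full 365 days and that each job occupies a single CPU; neither assumption is stated in the text, so the results are indicative only. Year-averaged values come out lower than the "typical" figures, which is what one would expect given the low utilisation in the first quarter.

```python
# Back-of-envelope check using only figures quoted in the text.
# Assumptions (not from the source): all CPUs available all year,
# and one job uses exactly one CPU.
CPUS = 128                  # Glasgow Monte Carlo production facility
HOURS_PER_YEAR = 365 * 24   # 25 June 2002 to 25 June 2003
CPU_HOURS_USED = 500_000    # "close to 500,000 CPU hours"
JOBS = 75_000               # "around 75,000" jobs processed

available_cpu_hours = CPUS * HOURS_PER_YEAR
mean_utilisation = CPU_HOURS_USED / available_cpu_hours
jobs_per_hour = JOBS / HOURS_PER_YEAR
mean_job_hours = CPU_HOURS_USED / JOBS

print(f"available CPU hours : {available_cpu_hours}")
print(f"mean utilisation    : {mean_utilisation:.0%}")
print(f"jobs per hour       : {jobs_per_hour:.1f}")
print(f"mean CPU hours / job: {mean_job_hours:.1f}")
```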

Sharing the facility among a wide range of applications with typically uncorrelated timescales, together with the use of "fair shares", has led to good turnaround times for more than 50 local users. To take greater advantage of the gains from sharing, we need to employ Grid technologies beyond ScotGrid and encourage integration with prototype Grid systems across the UK and Europe. To this end, the ScotGrid IBM solution system is fully integrated with the EU DataGrid and UK GridPP testbeds, giving Grid users high-priority access to 5% of the total CPU.
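The general idea behind a fair-share policy can be illustrated with a small sketch: groups that have recently used less than their target share are ranked ahead of groups that have used more. This is not Maui's implementation or ScotGrid's actual configuration. The function names, the decay factor, the usage figures and the targets for ATLAS and LHCb are all hypothetical; only the 5% Grid share comes from the text.

```python
def decayed_usage(windows, decay):
    """Weight recent usage more heavily; windows[0] is the most recent."""
    return sum(u * decay ** i for i, u in enumerate(windows))


def fairshare_ranking(usage_by_group, targets, decay=0.5):
    """Rank groups by (target share - recent share), largest first.

    usage_by_group: {group: [cpu_hours per window, most recent first]}
    targets:        {group: target share in percent}
    """
    effective = {g: decayed_usage(w, decay) for g, w in usage_by_group.items()}
    total = sum(effective.values()) or 1.0  # avoid division by zero
    delta = {g: targets[g] - 100.0 * effective[g] / total for g in effective}
    ranking = sorted(delta, key=delta.get, reverse=True)
    return ranking, delta


# Hypothetical usage and targets, for illustration only.
usage = {"atlas": [80, 60], "lhcb": [20, 40], "grid": [0, 0]}
targets = {"atlas": 50, "lhcb": 45, "grid": 5}

ranking, delta = fairshare_ranking(usage, targets)
print(ranking)
```

In this example the group furthest below its target is served first, while a group that has recently consumed more than its share drops to the back of the queue until its decayed usage falls.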

The facility has been monitored using a web-based system built upon the Maui scheduler (for CPU) and Unix shell commands (for disk). The current status, polled every 10 minutes, as well as accumulated statistics from the first year of deployment, can be found here.

ScotGrid has recently entered Phase 2, in which the high-performance disk server has been significantly upgraded to a total capacity of 24 TB. Disk usage has been monitored in terms of TeraByte-Days, a metric that will become increasingly important in the second year of deployment, as Phase 2 emphasises data access.
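A TeraByte-Days figure can be obtained by integrating disk usage over time. The sketch below does this with the trapezoidal rule over timestamped usage samples. The text does not describe how ScotGrid actually computes the metric, so the function name, the sample format and the sample values here are assumptions made for illustration.

```python
from datetime import datetime


def terabyte_days(samples):
    """Integrate disk usage over time using the trapezoidal rule.

    samples: list of (datetime, used_terabytes) pairs, sorted by time.
    Returns the accumulated usage in TeraByte-Days.
    """
    total = 0.0
    for (t0, u0), (t1, u1) in zip(samples, samples[1:]):
        days = (t1 - t0).total_seconds() / 86400.0
        total += 0.5 * (u0 + u1) * days
    return total


# Hypothetical samples: 2 TB held for one day, then growth to 4 TB
# over the following two days.
samples = [
    (datetime(2003, 6, 1), 2.0),
    (datetime(2003, 6, 2), 2.0),
    (datetime(2003, 6, 4), 4.0),
]
print(terabyte_days(samples))
```

With samples taken every 10 minutes, as the monitoring system does for current status, the same function would apply unchanged; only the spacing between timestamps differs.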