Cluster Administration
Revision as of 15:42, 14 Aug 2012: David crooks
Current revision: Stuart Purdie (Security)
GU-ScotGrid Cluster Operations and Management
Overview
Security
- Glasgow Incident Response Guide for responding to security incidents.
- Adding entries to iptables
- SELinux notes
Setup
- Glasgow SRM Setup
- Glasgow New Cluster (includes details of the YAIM Peoples Front)
- Crib notes for Server 31 resurrection
- Glasgow SL4 Upgrade Notes: tracks issues for the site's upgrade to SL4
Task HOWTOs
Scripts
Administration Howtos
- CVMFS @ Glasgow
- Arc at Glasgow for non-vikings
- Glasgow BDII tips
- DPM service start/stop
- Adding a local user
- Requesting a hostcert? Check here
- Renewed a hostcert? Check here
- Power cycling nodes
- Starting up the whole cluster from power failure
- Glasgow nfs mounted gridmapdir information
- Shutting down the whole cluster
- (Re)installing a node
- Adding a new SubCluster
- Adding a new host (sshkeys and databases)
- Glasgow Using nagios and cfengine for self healing
- Glasgow pdsh groups
- Admin Host Access
- Glasgow VM control
- Accessing fstab
- Glasgow Upgrading Ganga
- Glasgow Installing MediaWiki
- Glasgow VOMS server administration
- Glasgow GLite WMS installation
- Glasgow GLite Cream CE installation
- Glasgow GLite gLExec installation and configuration
- Glasgow GLite CE Tips
- Glasgow GLite CREAM CE Yaim tips
- Glasgow GLite CE Publishing Tips
- Glasgow VO share publishing
- Glasgow GLite Torque Tips
- Glasgow GLite WMS Tips
- Glasgow space token maintenance
- Glasgow LVM notes
- Glasgow - Clustervision Support
- Glasgow Enabling the LHCb/ATLAS pilot roles
- Glasgow VOMS Server Upgrade
- SELinux information for SL5 nodes
- a simple qsub
- Glasgow Cream Tips
- Building Lumerical's FDTD
- Building CASTEP
- Building TORQUE
- Building MAUI
- Building OPENMPI
- Building CASTEP for OPENMPI
- Building MPICH
- Running CASTEP with grid-run-castep
- Add :ppn Functionality with CREAM
- Version-locking a software package
- Building and Installing matplotlib
- Arc Install Instructions
- SGE at Glasgow Install Instructions
- Tracing PANDA jobs
- glite-APEL installation notes
- Installing DELL OMSA services
- Resolving Room Cooling issues
- PDU Restart Procedure
- 141 Aircon notes
- Cold Start
- Mac USB-Serial adaptor
- Lustre - care and feeding
- BMS integration
- Virtual Machines
- Gigabit Ethernet Switching
- Checking switch port status
- Glasgow Emergency Network Checks
Operations
- What to do if there appears to be Corrupted ATLAS Data
- Setting a downtime? Which queues to stop for different downtimes
- Investigating Maui blockages
- Mobile certs
GU-ScotGrid Cluster Extension
Logbook
- Node log
- Glasgow Middleware Operations Logbook: tracks problems with middleware components seen at the site.
Change Control
- Glasgow site Change Control
- Glasgow site Change Control Notes
- Glasgow site Change Control Regression
- Glasgow site Change Control Record
Glasgow Activities
- Testing of the DPM FTPD_TCP_NODELAY option.
Actions
Testzone
- Glasgow hosts the ScotGrid Testzone when necessary.
Pre Production Service
- Glasgow used to run the FTS server for the Pre Production Service, but no longer does.
- We have recently added a second DPM to help optimise data access for ATLAS, which is a PPS-like activity.
Strategy
Strategic discussions:
Technical Contacts
