Search results

Page title matches

Batch system status

* [https://twiki.cern.ch/twiki/bin/view/LCG/BatchSystemComparison Batch System Comparison Table] == Sites batch system status ==

11 KB (1,661 words) - 12:47, 21 June 2019
RAL Tier1 Batch System

...ent/uploads/2008/12/batchsystemconfig-nov08.pdf Configuration of the batch system at November 2008] [[Category:Batch Systems]]

325 B (40 words) - 12:23, 18 March 2014

Page text matches

Main Page

* [[New Information System]] * [[:Category:Batch_Systems|Batch Systems]]

8 KB (1,130 words) - 17:31, 17 April 2024
Tier1 Operations Report 2019-06-17

... Limits on concurrent batch system jobs.

14 KB (1,386 words) - 09:37, 19 June 2019
Operations Bulletin Latest

* Technical Meeting last week about the New JSON based Information System: https://indico.cern.ch/event/821105/ ...on Thursday. All SAM tests failed until this was fixed the next morning. Batch farm also did not start any new jobs during this time. We used this accide

41 KB (5,018 words) - 14:09, 30 October 2019
Tier1 Operations Report 2016-03-16

...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when * GDSS620 (GenTape - D0T1) Reported a read-only file system yesterday (Tuesday) morning and was taken out of production. Two T2K files

13 KB (1,356 words) - 09:59, 16 March 2016
Tier1 Operations Report 2015-12-09

...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when ... hours of yesterday morning (8th Dec). This also reported a read-only file system.

13 KB (1,411 words) - 08:55, 10 December 2015
Tier1 Operations Report 2014-05-07

* There was a problem on Thursday with the batch farm caused by a particular (biomed) user running very large jobs. This led | Outage of tape system for update of library controller.

13 KB (1,357 words) - 12:47, 9 May 2014
Past Ticket Bulletins 2014

...n F ticketed the CA concerning a possible problem with the ticket reminder system. JK has responded with a reply, and asked that similar tickets in the futur LHCB having cvmfs trouble at IC, which was likely caused by a batch of naughty CMS jobs ruining it for everyone else. LHCB re-enabled IC to see

184 KB (30,332 words) - 17:18, 16 December 2014
Operations Bulletin 170314

...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...re will be a [https://www.gridpp.ac.uk/wiki/Batch_system_status pre-GDB on batch systems] next Tuesday, and a [https://indico.cern.ch/event/272619/timetable

42 KB (5,176 words) - 11:12, 17 March 2014
Operations Bulletin 280414

* CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi

46 KB (5,930 words) - 18:40, 28 April 2014
Operations Bulletin 150413

...ve released a new [https://ggus.eu/pages/didyouknow.php page on using] the system. * Investigations are ongoing into problems at batch job set-up.

43 KB (5,533 words) - 08:50, 18 August 2014
Tier1 Operations Report 2018-07-09

... Limits on concurrent batch system jobs.

17 KB (1,646 words) - 09:31, 11 July 2018
Tier1 Operations Report 2017-01-04

...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when | Outage of Castor Storage System for patching

14 KB (1,476 words) - 14:02, 4 January 2017
Tier1 Operations Report 2015-05-20

* GDSS649 (LHCbUser - D1T0) failed on Saturday 16th May when the system hung up. Following tests a faulty drive was replaced. It was returned to se ...ew configuration of a batch of new worker nodes was reported. Most of this batch have now been re-set to have the usual worker node configuration.

13 KB (1,442 words) - 11:25, 20 May 2015
Tier1 Operations Report 2014-03-19

...r the hypervisor hosting this virtual machine rebooted and this particular system was not configured to re-start. This was resolved by the primary on-call. ...ed during the change. The batch system was also reconfigured such that new batch jobs would not start during this period. The change was successful. There

14 KB (1,553 words) - 11:36, 19 March 2014
Tier1 Operations Report 2016-01-20

* As reported last week the CMSTape system has been busy - and throughput was compromised by two out of its five disk .... Following the first rebuild another problematic disk was found and the system was returned to service on Monday (18th Jan) once that too had been resolve

13 KB (1,364 words) - 12:54, 20 January 2016
GridPP approved VOs

|MAGIC is a system of two imaging atmospheric Cherenkov telescopes (or IACTs). MAGIC-I started * high priority in the batch system for the atlassgm user;

78 KB (13,056 words) - 13:44, 23 April 2024
Tier1 Operations Report 2017-07-26

... Limits on concurrent batch system jobs.

18 KB (1,971 words) - 14:03, 26 July 2017
Operations Bulletin 310314

* Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard

48 KB (6,293 words) - 07:35, 31 March 2014
Operations Bulletin 240314

* Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard

48 KB (6,293 words) - 07:36, 31 March 2014
Operations Bulletin 070414

* Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...reviewed are capable of supporting multicore jobs however a tuning of each system is required to be able to absorb them (draining/reservation of resources) w

45 KB (5,701 words) - 09:21, 7 April 2014
Operations Bulletin 140414

* Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard * CERN batch capacity migrated to SLC6 was at 65% last week.

52 KB (6,980 words) - 08:19, 15 April 2014
Tier1 Operations Report 2014-04-09

...ed. Multiple disk failures were being reported by the disk controller. The system was returned to production yesterday evening (8th April) and is being drain * The EMI3 Argus server is now in use everywhere in the batch farm.

14 KB (1,599 words) - 11:33, 14 April 2014
Operations Bulletin 210412

* CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi

45 KB (5,796 words) - 22:44, 21 April 2014
Tier1 Operations Report 2014-12-10

... CMS deleting files to make space and a reduction in the number of running batch jobs relieved the strain. ... brought into use. (Currently Atlas 3D/Frontier still uses the OGMA database system, although this was also changed to update from CERN using Oracle Golden Gat

14 KB (1,492 words) - 13:08, 10 December 2014
Tier1 Operations Report 2014-04-30

...bers of jobs (from T2K) submitted to the batch system by the WMSs. A batch system parameter (max number of gridftp connections on ARC CEs) has been increased | System be decommissioned. (Replaced by myproxy.gridpp.rl.ac.uk).

14 KB (1,557 words) - 13:24, 30 April 2014
Operations Bulletin 050514

* CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi

41 KB (5,106 words) - 19:52, 5 May 2014
Tier1 Operations Report 2014-05-14

* Testing CVMFS Client version 2.1.19 ongoing. This is now rolled out to one batch of worker nodes. So far so good. | Outage of tape system for update of tape library controller. (Postponed from 13th May).

13 KB (1,393 words) - 10:46, 14 May 2014
Tier1 Operations Report 2019-02-25

... Limits on concurrent batch system jobs.

17 KB (1,612 words) - 11:29, 27 February 2019
Operations Bulletin 020614

...onday covered some site reports and OS related updates. Tuesday's focus is batch systems. Wednesday covers IPv6, security and benchmarking. Thursday storage ...naged services from Quattor to a new Puppet based Configuration Management system.

41 KB (5,148 words) - 09:38, 2 June 2014
Operations Bulletin 090614

...onday covered some site reports and OS related updates. Tuesday's focus is batch systems. Wednesday covers IPv6, security and benchmarking. Thursday storage ...naged services from Quattor to a new Puppet based Configuration Management system.

41 KB (5,148 words) - 07:10, 9 June 2014
Tier1 Operations Report 2014-06-11

* Today (11th June) a new tape controller system (ACSLS) is being installed. There have been some problems with the new serv | Castor (all SRM endpoints) and batch (all CEs)

15 KB (1,592 words) - 12:26, 11 June 2014
Imperial arc ce for cloud

0) Find and read the "ARC Computing Element System Administrator Guide". <br> 3) Ensure the machine can submit to the batch system & has all of the users. <br>

11 KB (1,578 words) - 15:50, 12 June 2014
Cloud Work at Imperial

...ieve this by using the (Condor) Submit module of a glideinWMS as the batch system and then channeling the jobs via the glideinWMS to the gridpp cloud. <br>

925 B (154 words) - 11:11, 23 August 2019
Operations Bulletin 160614

...r their resources into a ‘pool’ via the [https://e-grant.egi.eu eGrant system]. [https://wiki.egi.eu/wiki/Resource_Allocation_Process More information] i * Castor and batch services currently down for Castor Nameserver Upgrade (to version 2.1.14). I

39 KB (4,952 words) - 19:40, 13 June 2014
RAL Memory Limits

...n there is contention between other processes for physical memory will the system force physical memory into swap and push the physical memory used towards t

1 KB (241 words) - 10:28, 11 February 2015
Operations Bulletin 070714

...] needs updating and a consensus! Could the SEs implement some reservation system internally? Is there merit in the suggestion to make use of [https://www.gr * KeyDocs are going to be reviewed (in next 4 weeks) as the system is not working (or not adding anything) in some areas.

43 KB (5,584 words) - 12:52, 7 July 2014
Staged rollout emi3

'''UKI-NORTHGRID-MAN-HEP''': Multicore and passing parameters to the batch system testing requested by the experiments through the WLCG Task Force Alessandra

8 KB (1,155 words) - 11:09, 13 March 2015
Example Build of an EMI-UMD Cluster

...egi-trustanchors.repo Finally, for historical reasons related to our build system, we also installed these two repos from the glite 3.2 instructions - jpacka ...wever you do it, make a munge key using /usr/sbin/create-munge-key on some system that has munge installed on it (this one?) and use the resulting key on all

15 KB (2,429 words) - 10:18, 31 July 2015
Operations Team Completed Actions

| Email everyone on how to hack the publishing system to avoid publishing incorrect GlueSubClusterWNTmpDir. | Plan out the future of CE/Batch System integration. Torque/maui are not supported by EGI. Layout an agenda with pr

33 KB (5,297 words) - 10:13, 15 November 2017
Example Build of an ARC/Condor Cluster

...lable, called HTCondor (or CONDOR for short). We also decided to front the system with an ARC CE. You'll need a copy of the ARC System Admin Manual.

121 KB (17,569 words) - 08:26, 28 November 2019
Operations Bulletin 010914

...or allocation. It is a brokering service only. There is one request in the system for cloud resources. * News: CERN-IT to terminate the SLC5-based interactive and batch services (lxplus5 and lxbatch5) soon. The current target date is 30 Septemb

42 KB (5,358 words) - 10:48, 1 September 2014
Operations Bulletin 150914

... jobs at CCIN2P3 and of the method of passing job requirement arguments to batch systems via CE. ([https://indico.cern.ch/event/339461/ Agenda]) * OSG following up on how to discover HTCondor CEs in the information system.

46 KB (6,062 words) - 10:07, 15 September 2014
Tier1 Operations Report 2014-10-01

...ring Saturday evening. It was restarted and tested but no fault found. The system was returned to service yesterday (30th Sep). * One batch of worker nodes (64 machines) have had Linux cgroups configured to enforce

13 KB (1,429 words) - 10:06, 8 October 2014
RAL Tier1 Incident 20130626 Failure of RAL CVMFS Stratum1 Triggered Batch Farm Problems

==RAL Tier1 Incident 20130626 Failure of RAL CVMFS Stratum1 Triggered Batch Farm Problems== ===Description:=== ...s over to use other replicas. However this did not happen across the Tier1 batch farm where many nodes were running a version of the CVMFS client in which t

12 KB (1,968 words) - 15:13, 16 September 2014
Monitoring

...ordinating/publicising local site-admin tools (Nagios plugins, local batch system dashboards)

906 B (116 words) - 08:35, 5 June 2018
Tier1 Operations Report 2014-09-17

...of the systems affected was the argus server and this caused a problem for batch job submissions for an hour or so. * The Atlas Frontier service will be switched to use the new database system that updates from CERN using Oracle "GoldenGate" on 24th Sep.

12 KB (1,195 words) - 14:07, 17 September 2014
Operations Bulletin 220914

... jobs at CCIN2P3 and of the method of passing job requirement arguments to batch systems via CE. ([https://indico.cern.ch/event/339461/ Agenda]) * OSG following up on how to discover HTCondor CEs in the information system.

48 KB (6,422 words) - 08:45, 23 September 2014
Operations Bulletin 200317

*** Durham: Batch system upgrade led to one outage and a University-wide internet connection loss le * Tests are ongoing with some batch jobs for the LHC VOs running in SL6 containers on worker nodes running SL7.

42 KB (5,079 words) - 18:37, 19 March 2017
