Search results
Page title matches
- * [https://twiki.cern.ch/twiki/bin/view/LCG/BatchSystemComparison Batch System Comparison Table] == Sites batch system status ==11 KB (1,661 words) - 12:47, 21 June 2019
- ...ent/uploads/2008/12/batchsystemconfig-nov08.pdf Configuration of the batch system at November 2008] [[Category:Batch Systems]]325 B (40 words) - 12:23, 18 March 2014
Page text matches
- * [[New Information System]] * [[:Category:Batch_Systems|Batch Systems]]8 KB (1,130 words) - 17:31, 17 April 2024
- <!-- ******************Start Limits On Batch System Jobs***************** -----> ...; padding-top: 0.1em; padding-bottom: 0.1em;" | Limits on concurrent batch system jobs.14 KB (1,386 words) - 09:37, 19 June 2019
- * Technical Meeting last week about the New JSON based Information System: https://indico.cern.ch/event/821105/ ...on Thursday. All SAM tests failed until this was fixed the next morning. Batch farm also did not start any new jobs during this time. We used this accide41 KB (5,018 words) - 14:09, 30 October 2019
- ...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when * GDSS620 (GenTape - D0T1) Reported a read-only file system yesterday (Tuesday) morning and was taken out of production. Two T2K files13 KB (1,356 words) - 09:59, 16 March 2016
- ...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when ... hours of yesterday morning (8th Dec). This also reported a read-only file system.13 KB (1,411 words) - 08:55, 10 December 2015
- * There was a problem on Thursday with the batch farm caused by a particular (biomed) user running very large jobs. This led | Outage of tape system for update of library controller.13 KB (1,357 words) - 12:47, 9 May 2014
- ...n F ticketed the CA concerning a possible problem with the ticket reminder system. JK has responded with a reply, and asked that similar tickets in the futur LHCB having cvmfs trouble at IC, which was likely caused by a batch of naughty CMS jobs ruining it for everyone else. LHCB re-enabled IC to see184 KB (30,332 words) - 17:18, 16 December 2014
- ...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...re will be a [https://www.gridpp.ac.uk/wiki/Batch_system_status pre-GDB on batch systems] next Tuesday, and a [https://indico.cern.ch/event/272619/timetable42 KB (5,176 words) - 11:12, 17 March 2014
- * CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi46 KB (5,930 words) - 18:40, 28 April 2014
- ...ve released a new [https://ggus.eu/pages/didyouknow.php page on using] the system. * Investigations are ongoing into problems at batch job set-up.43 KB (5,533 words) - 08:50, 18 August 2014
- <!-- ******************Start Limits On Batch System Jobs***************** -----> ...; padding-top: 0.1em; padding-bottom: 0.1em;" | Limits on concurrent batch system jobs.17 KB (1,646 words) - 09:31, 11 July 2018
- ...y LHCb of a low but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when | Outage of Castor Storage System for patching14 KB (1,476 words) - 14:02, 4 January 2017
- * GDSS649 (LHCbUser - D1T0) failed on Saturday 16th May when the system hung up. Following tests a faulty drive was replaced. It was returned to se ...ew configuration of a batch of new worker nodes was reported. Most of this batch have now been re-set to have the usual worker node configuration.13 KB (1,442 words) - 11:25, 20 May 2015
- ...r the hypervisor hosting this virtual machine rebooted and this particular system was not configured to re-start. This was resolved by the primary on-call. ...ed during the change. The batch system was also reconfigured such that new batch jobs would not start during this period. The change was successful. There14 KB (1,553 words) - 11:36, 19 March 2014
- * As reported last week the CMSTape system has been busy - and throughput was compromised by two out of its five disk .... Following the first rebuild another problematic disk was found and the system was returned to service on Monday (18th Jan) once that too had been resolve13 KB (1,364 words) - 12:54, 20 January 2016
- |MAGIC is a system of two imaging atmospheric Cherenkov telescopes (or IACTs). MAGIC-I started * high priority in the batch system for the atlassgm user;78 KB (13,056 words) - 13:44, 23 April 2024
- <!-- ******************Start Limits On Batch System Jobs***************** -----> ...; padding-top: 0.1em; padding-bottom: 0.1em;" | Limits on concurrent batch system jobs.18 KB (1,971 words) - 14:03, 26 July 2017
- * Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard48 KB (6,293 words) - 07:35, 31 March 2014
- * Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...is week there is a [http://indico.cern.ch/event/272785/ pre-GDB meeting on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard48 KB (6,293 words) - 07:36, 31 March 2014
- * Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard ...reviewed are capable of supporting multicore jobs however a tuning of each system is required to be able to absorb them (draining/reservation of resources) w45 KB (5,701 words) - 09:21, 7 April 2014
- * Last week there was a [http://indico.cern.ch/event/272785/ pre-GDB on batch systems] and a [http://indico.cern.ch/event/272619/other-view?view=standard * CERN batch capacity migrated to SLC6 was at 65% last week.52 KB (6,980 words) - 08:19, 15 April 2014
- ...ed. Multiple disk failures were being reported by the disk controller. The system was returned to production yesterday evening (8th April) and is being drain * The EMI3 Argus server is now in use everywhere in the batch farm.14 KB (1,599 words) - 11:33, 14 April 2014
- * CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi45 KB (5,796 words) - 22:44, 21 April 2014
- ... CMS deleting files to make space and a reduction in the number of running batch jobs relieved the strain. ... brought into use. (Currently Atlas 3D/Frontier still uses the OGMA database system, although this was also changed to update from CERN using Oracle Golden Gat14 KB (1,492 words) - 13:08, 10 December 2014
- ...bers of jobs (from T2K) submitted to the batch system by the WMSs. A batch system parameter (max number of gridftp connections on ARC CEs) has been increased | System be decommissioned. (Replaced by myproxy.gridpp.rl.ac.uk).14 KB (1,557 words) - 13:24, 30 April 2014
- * CERN batch capacity migrated to SLC6 was at 65% last week. * The APEL accounting system has been undergoing database maintenance to improve performance and reliabi41 KB (5,106 words) - 19:52, 5 May 2014
- * Testing CVMFS Client version 2.1.19 ongoing. This is now rolled out to one batch of worker nodes. So far so good. | Outage of tape system for update of tape library controller. (Postponed from 13th May).13 KB (1,393 words) - 10:46, 14 May 2014
- <!-- ******************Start Limits On Batch System Jobs***************** -----> ...; padding-top: 0.1em; padding-bottom: 0.1em;" | Limits on concurrent batch system jobs.17 KB (1,612 words) - 11:29, 27 February 2019
- ...onday covered some site reports and OS related updates. Tuesday's focus is batch systems. Wednesday covers IPv6, security and benchmarking. Thursday storage ...naged services from Quattor to a new Puppet based Configuration Management system.41 KB (5,148 words) - 09:38, 2 June 2014
- ...onday covered some site reports and OS related updates. Tuesday's focus is batch systems. Wednesday covers IPv6, security and benchmarking. Thursday storage ...naged services from Quattor to a new Puppet based Configuration Management system.41 KB (5,148 words) - 07:10, 9 June 2014
- * Today (11th June) a new tape controller system (ACSLS) is being installed. There have been some problems with the new serv | Castor (all SRM endpoints) and batch (all CEs)15 KB (1,592 words) - 12:26, 11 June 2014
- 0) Find and read the "ARC Computing Element System Administrator Guide". <br> 3) Ensure the machine can submit to the batch system & has all of the users. <br>11 KB (1,578 words) - 15:50, 12 June 2014
- ...ieve this by using the (Condor) Submit module of a glideinWMS as the batch system and then channeling the jobs via the glideinWMS to the gridpp cloud. <br>925 B (154 words) - 11:11, 23 August 2019
- ...r their resources into a ‘pool’ via the [https://e-grant.egi.eu eGrant system]. [https://wiki.egi.eu/wiki/Resource_Allocation_Process More information] i * Castor and batch services currently down for Castor Nameserver Upgrade (to version 2.1.14). I39 KB (4,952 words) - 19:40, 13 June 2014
- ...n there is contention between other processes for physical memory will the system force physical memory into swap and push the physical memory used towards t1 KB (241 words) - 10:28, 11 February 2015
- ...] needs updating and a consensus! Could the SEs implement some reservation system internally? Is there merit in the suggestion to make use of [https://www.gr * KeyDocs are going to be reviewed (in next 4 weeks) as the system is not working (or not adding anything) in some areas.43 KB (5,584 words) - 12:52, 7 July 2014
- '''UKI-NORTHGRID-MAN-HEP''': Multicore and passing parameters to the batch system testing requested by the experiments through the WLCG Task Force Alessandra8 KB (1,155 words) - 11:09, 13 March 2015
- ...egi-trustanchors.repo Finally, for historical reasons related to our build system, we also installed these two repos from the glite 3.2 instructions - jpacka ...wever you do it, make a munge key using /usr/sbin/create-munge-key on some system that has munge installed on it (this one?) and use the resulting key on all15 KB (2,429 words) - 10:18, 31 July 2015
- | Email everyone on how to hack the publishing system to avoid publishing incorrect GlueSubClusterWNTmpDir. | Plan out the future of CE/Batch System integration. Torque/maui are not supported by EGI. Layout an agenda with pr33 KB (5,297 words) - 10:13, 15 November 2017
- ...lable, called HTCondor (or CONDOR for short). We also decided to front the system with an ARC CE. You'll need a copy of the ARC System Admin Manual.121 KB (17,569 words) - 08:26, 28 November 2019
- ...or allocation. It is a brokering service only. There is one request in the system for cloud resources. * News: CERN-IT to terminate the SLC5-based interactive and batch services (lxplus5 and lxbatch5) soon. The current target date is 30 Septemb42 KB (5,358 words) - 10:48, 1 September 2014
- ... jobs at CCIN2P3 and of the method of passing job requirement arguments to batch systems via CE. ([https://indico.cern.ch/event/339461/ Agenda]) * OSG following up on how to discover HTCondor CEs in the information system.46 KB (6,062 words) - 10:07, 15 September 2014
- ...ring Saturday evening. It was restarted and tested but no fault found. The system was returned to service yesterday (30th Sep). * One batch of worker nodes (64 machines) have had Linux cgroups configured to enforce13 KB (1,429 words) - 10:06, 8 October 2014
- ==RAL Tier1 Incident 20130626 Failure of RAL CVMFS Stratum1 Triggered Batch Farm Problems=====Description:=== ...s over to use other replicas. However this did not happen across the Tier1 batch farm where many nodes were running a version of the CVMFS client in which t12 KB (1,968 words) - 15:13, 16 September 2014
- ...ordinating/publicising local site-admin tools (Nagios plugins, local batch system dashboards)906 B (116 words) - 08:35, 5 June 2018
- ...of the systems affected was the argus server and this caused a problem for batch job submissions for an hour or so. * The Atlas Frontier service will be switched to use the new database system that updates from CERN using Oracle "GoldenGate" on 24th Sep.12 KB (1,195 words) - 14:07, 17 September 2014
- ... jobs at CCIN2P3 and of the method of passing job requirement arguments to batch systems via CE. ([https://indico.cern.ch/event/339461/ Agenda]) * OSG following up on how to discover HTCondor CEs in the information system.48 KB (6,422 words) - 08:45, 23 September 2014
- *** Durham: Batch system upgrade led to one outage and a University wide internet connection loss le * Tests ongoing with some batch jobs for the LHC VOs running in SL6 containers on worker nodes running SL7.42 KB (5,079 words) - 18:37, 19 March 2017