HPC Analyst
Bologna, Italy

Job reference: VN23-02

Location: Bologna, Italy

Deadline for applications: 19/02/2023

Publication date: 23/01/2023

Salary and Grade: Grade A2, EUR 70,794.48 net annual basic salary, plus other benefits

Contract type: STF-C

Department: Computing

Contract Duration: Four years, with the possibility of a further contract.

About ECMWF

ECMWF is the European Centre for Medium-Range Weather Forecasts. It is an intergovernmental organisation created in 1975 by a group of European nations and is today supported by 34 Member and Co-operating States, mostly in Europe. The Centre’s mission is to serve and support its Member and Co-operating States and the wider community by developing and providing world-leading global numerical weather prediction. ECMWF functions as a 24/7 research and operational centre with a focus on medium and long-range predictions and holds one of the largest meteorological archives in the world. The success of its activities relies primarily on the talent of its scientists, strong partnerships with its Member and Co-operating States and the international community, some of the most powerful supercomputers in the world, and the use of innovative technologies such as machine learning across its operations.

Over the years, ECMWF has also developed a strong partnership with the European Union, and for nearly a decade has been an entrusted entity for the implementation and operation of the Copernicus Climate Change and the Atmosphere Monitoring Services of the EU Space Programme, as well as a contributor to the Copernicus Emergency Management Service. The collaboration does not stop there and includes other areas of work, including High Performance Computing and the development of digital tools. It is enabling ECMWF to now provide data and products covering weather, climate, air quality, fire and flood prediction and monitoring.

ECMWF is a major partner in the implementation of the Destination Earth (DestinE) initiative, phase 1 of which started in late 2021, together with ESA and EUMETSAT as partners. The objective of the European Commission’s DestinE initiative is to deploy several highly accurate thematic digital replicas of the Earth, called Digital Twins, to monitor and predict natural and human activities as well as their interactions, to develop and test scenarios that would enable more sustainable developments and support corresponding European policies for the Green Deal.

ECMWF has recently become a multi-site organisation, with its headquarters based since its creation in Reading, UK, a new data centre in Bologna, Italy, and new offices in Bonn, Germany. ECMWF has adopted a hybrid organisation model which allows its staff to mix office working and teleworking. This generous model gives staff considerable flexibility to spend time away from their duty station and to decide how they manage their professional working time at ECMWF. ECMWF values the whole person and recognises the need for flexibility in the way its staff work.

For additional details, see www.ecmwf.int

Summary of the role

ECMWF’s High-Performance Computing Facility (HPCF) is a mission-critical central service. A quarter of the aggregate time on the facility is allocated to 24x365 operational forecasting based on ECMWF’s own Integrated Forecasting System (IFS). This suite repeats four times a day and, in conjunction with the associated time-critical data assimilation suites of IFS, presents critical activity at nearly all times of day and night. The suites run to extremely tight production and dissemination schedules, monitored for delays on the scale of a few minutes, making the reliability and efficiency of the service vitally important.

ECMWF Member States have another quarter of the available time on the system for activities such as running some of their own time-critical operational forecasts or supporting their organisations’ research and other specific project work. 
The remaining system resources are used by the ECMWF Research Department to continually improve ECMWF’s own data assimilation and forecasting suites. 

In total the HPCF runs several hundred thousand jobs each day.

The current operational HPC service is provided by four Atos BullSequana compute clusters with a set of supporting Lustre parallel filesystems based on DDN ExaScaler. Details can be found at https://www.ecmwf.int/en/computing/our-facilities/supercomputer-facility

To ensure that the HPCF meets the demanding requirements for availability, performance and usability, ECMWF has a team of analysts dedicated to looking after the systems in collaboration with support teams provided by the HPC systems supplier.

The ECMWF HPC Systems Team comprises five positions, based in the data centre in Bologna, Italy.

A core element of the role is participation in the proximity on-call rota to provide 24x7 support to resolve urgent issues on ECMWF's mission-critical HPC systems. 

While on-call, the HPC analyst must:
•    Remain available by telephone or via an IT system, as agreed with the line manager, and respond to any call-out message within fifteen minutes.
•    Always have a laptop or workstation available, remain in areas with good network connectivity, and be able to start remote working within fifteen minutes of responding to the initial message.
•    Be able to present themselves at the duty station within one hour of the need for physical on-site support arising, whether from an incoming message or from the ensuing remote analysis.

As restrictions are placed on personal activities when on-call, compensatory leave will be granted. All five members of the HPC Systems Team participate equally in the on-call rota. 


Main duties and key responsibilities

  • Promoting efficient use of ECMWF’s HPC facilities, and to that end, providing ECMWF’s support groups, developers and end users with assistance, tools and training 
  • Working closely with other members of the HPC team, users of the HPCF, ECMWF user support and the HPCF supplier’s engineers to assist in aspects of:
      • Resolving user and operational problems, with a particular focus on operational problems relating to the operating system and HPC software stack, and to software packages maintained by the HPC team;
      • Configuring, testing, tuning and bringing into production new HPC hardware;
      • Installing, maintaining, configuring and tuning the operating system, high-performance interconnects, parallel filesystems, batch scheduling systems, standard utilities, user environment and locally developed tools on the HPC facilities;
      • Integrating the HPC facilities with the workflows of research and time-critical operational applications;
      • Continuously improving resiliency;
      • Planning for and accompanying the installation of new software upgrades, releases and bug-fixes;
      • Providing on-site general 24x7 monitoring staff with the information, procedures and training that they need for the day-to-day running of the HPCF service;
      • Implementing a strong security posture for the HPC systems
  • Participating in a shared rota to provide 24x7 proximity on-call support to resolve urgent issues on ECMWF's mission-critical HPC systems 
  • Contributing to the research and evaluation of successor systems to ECMWF’s current HPC Facilities
  • Representing ECMWF in meetings with supercomputer vendors and at international technical conferences

Personal attributes

  • Excellent analytical and problem-solving skills, working methodically and proactively
  • Ability to work independently under pressure in time-critical situations; working reliably and responsibly, with flexibility to adapt to changing requirements
  • Good written and oral communication skills 
  • Good interpersonal skills to work collaboratively with other members of a small team, and enthusiasm for actively contributing to team technical discussions to jointly identify the best way to proceed
  • Appreciation of, and readiness to seek, opportunities to work with subject matter experts from other fields to understand users’ needs, consider complex technical circumstances, and translate the findings into deliverable requirements

Education, experience, knowledge and skills (including language)

Education

  • A university degree (EQF Level 6), preferably in computer science or in a technical or scientific subject, or equivalent professional qualification and experience

Experience

  • Relevant experience in massively parallel processing computer systems, large-scale Linux clusters and parallel cluster filesystems or other machines of similar architecture.
  • Relevant experience in Unix or Linux systems administration; firm, demonstrated knowledge of Unix shells, Python and/or Perl and software configuration management.
  • Familiarity with programming using C, Fortran, MPI and/or OpenMP is highly desirable.

Language

  • Candidates must be able to work effectively in English; interviews will be conducted in English
  • Fluency in Italian would be advantageous as would knowledge of one of the Centre’s other working languages (French or German)

Other information

Grade remuneration

The successful candidate will be recruited at the A2 grade, according to the scales of the Co-ordinated Organisations, and the annual basic salary will be EUR 70,794.48 net of tax. ECMWF also offers a generous benefits package, including a flexible teleworking policy. The position is assigned to the employment category STF-C as defined in the ECMWF Staff Regulations. Full details of salary scales and allowances are available on the ECMWF website at www.ecmwf.int/en/about/jobs, including the ECMWF Staff Regulations and the terms and conditions of employment.

Starting date: As soon as possible.

Length of contract: Four years, with the possibility of a further contract.

Location: The position will be located at ECMWF's duty station in Bologna, Italy.

As a multi-site organisation, ECMWF has adopted a hybrid organisation model which gives staff the flexibility to mix office working and teleworking.

Successful applicants and members of their family forming part of their households will be exempt from immigration restrictions.

Interviews by videoconference (via Teams) are expected to take place in March 2023.

Who can apply

Applicants are invited to complete the online application form by clicking on the apply button.

At ECMWF, we consider an inclusive environment as key for our success. We are dedicated to ensuring a workplace that embraces diversity and provides equal opportunities for all, without distinction as to race, gender, age, marital status, social status, disability, sexual orientation, religion, personality, ethnicity and culture. We value the benefits derived from a diverse workforce and are committed to having staff that reflect the diversity of the countries that are part of our community, in an environment that nurtures equality and inclusion.

Applications are invited from nationals of ECMWF Member States and Co‑operating States, listed below:
Austria, Belgium, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Morocco, the Netherlands, Norway, North Macedonia, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey and the United Kingdom.

Applications from nationals of other countries may be considered in exceptional cases.

 

The closing date for this job has now passed.
