HPC Systems Integration Engineer 4 - 130890

UC San Diego
United States, California, Oakland
1111 Franklin Street (Show on map)
Nov 16, 2024
Job Description
Extended Deadline: Thu 12/5/2024
UC San Diego values equity, diversity, and inclusion. If you are interested in being part of our team, possess the needed licensure and certifications, and feel that you have most of the qualifications and/or transferable skills for a job opening, we strongly encourage you to apply.

UCSD Layoff from Career Appointment: Apply by 7/8/24 for consideration with preference for rehire. All layoff applicants should contact their Employment Advisor.

Special Selection Applicants: Apply by 7/31/24. Eligible Special Selection clients should contact their Disability Counselor for assistance.

Job posting will remain open until position is filled.

DESCRIPTION

The Mission of the San Diego Supercomputer Center is to translate innovation into practice. SDSC adopts and partners on innovations in industry and academia in the areas of software, hardware, computational and data sciences, and related areas, and translates them into cyberinfrastructure that solves practical problems across any and all scientific domains and societal endeavors. Cyberinfrastructure refers to an accessible, integrated network of high-performance computing, data, and networking resources and expertise, focused on accelerating scientific inquiry and discovery. With more than 250 employees and $30-50M of revenue a year, SDSC is a global leader in the design, development, and operations of cyberinfrastructure.

SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from earth sciences and biology to astrophysics, bioinformatics, and health IT. SDSC presently operates multiple large HPC systems, ranging from a general-purpose system with 120k x86 CPU cores to a system explicitly designed for artificial intelligence and machine learning, and a nationally distributed system open for all of academia to integrate with. SDSC offers research data services across the entire vertical stack, from universally scalable storage to consulting services on FAIR, Big Data, and AI. SDSC offers a rich set of cloud services on premises, in the commercial cloud, and as hybrid services across both.

SDSC has three geographic scopes: a national scope supporting cyberinfrastructure for the entire US research and education community; a California scope with a special focus on convergence research that addresses the three dominant threats to CA (drought, fire, and earthquakes); and a campus scope focusing on advancing the global impact of SDSC by advancing the research objectives of UC San Diego faculty, researchers, and students. SDSC impacts researchers at scales from thousands to millions. SDSC annually trains thousands of researchers in cyberinfrastructure tools and software, and supports thousands of individual researchers via Unix accounts on its large HPC systems. SDSC was a leader in developing the Science Gateway concept and continues to be a global leader in its evolution. SDSC operates several major gateways of this kind, with user communities ranging from the tens of thousands to the millions. SDSC's educational programs include online courses that have been attended by more than a million students.

SDSC is committed to democratizing access to cyberinfrastructure across all of its geographic scopes. SDSC strives towards a culture that supports our employees to be their best, achieve their goals, and enjoy their lives, both professionally and personally.

SDSC's High-Performance Systems Group is responsible for and operates SDSC's high-performance computing clusters and related systems. The group operates large-scale compute and storage systems funded by the National Science Foundation (currently the XSEDE program), the UC San Diego campus (e.g., the Triton Shared Computing Cluster), and other entities; these systems support users from campus, national, and international communities across a broad range of scientific disciplines. The group is part of SDSC's Data-Enabled Scientific Computing (DESC) Division.

POSITION OVERVIEW:

The incumbent will apply advanced systems and software integration concepts, and location or institutional objectives, to resolve highly complex issues where analysis of systems and software requires an in-depth evaluation of variable factors, and to implement medium to large projects of broad scope and complexity. They will regularly resolve highly complex issues involving business processes, system functionality, implementation, and system and software integration, where analysis of situations or data requires an in-depth evaluation of variable factors. They will select tools, methods, techniques, and evaluation criteria to obtain results; give technical presentations to the associated team, other technical units, and management; and evaluate new technologies, including performing moderate to complex cost/benefit analyses. The incumbent may lead a team of systems/infrastructure professionals.

This position has primary responsibility for the Triton Shared Computing Cluster (TSCC), a UC San Diego computing resource operated on behalf of the UC San Diego research community. TSCC comprises approximately 300 general computing and GPU nodes and is designed to grow as more researchers participate. In this capacity, the incumbent interacts with condo owners to provide access to the computational resources and software required for their research, develops policies to ensure reliable and efficient cluster operation, and works with user support staff to provide effective support while balancing competing and sometimes incompatible needs and desires from different laboratories to ensure equitable treatment of different projects.

The incumbent also provides project and system administration support and performs on-call duties for other resources at SDSC, including but not limited to: Expanse, an NSF-funded supercomputer operated on behalf of the national research community; Popeye, an HPC resource managed for the Simons Foundation; Voyager, an AI supercomputer; and Cosmos, which will be deployed in late 2024.

The incumbent works extensively with members of the SDSC HPC systems group to coordinate operations between TSCC and SDSC's other HPC systems and storage. The position involves researching existing cluster operations, monitoring, and reporting tools as well as designing, implementing, and documenting new ones. In this role the incumbent may lead the design and implementation of new high-performance cluster resources, determining the best architecture solutions using state-of-the-art computational, storage, and network technologies. They will oversee multiple vendor proposals, evaluating the relative strengths and weaknesses of each to determine the best solution for continued cluster operations and expansion. TSCC and other cluster operations rely on various cluster management systems, including Rocks and Bright. The position requires detailed knowledge of these or similar cluster management tools to maintain the configuration state of all managed systems. Knowledge of version control systems such as Git is required to track system changes.

For more information, please visit: https://www.sdsc.edu/

QUALIFICATIONS
  1. Bachelor's degree in Computer Science or a related area and/or equivalent experience/training.

  2. Advanced knowledge of HPC and cyberinfrastructure.

  3. Advanced knowledge of the HPC middleware stack, including cluster management tools, job schedulers, and resource managers. Examples include Slurm, PBS, Maui, Rocks, and Bright Cluster Manager.

  4. Demonstrated experience with InfiniBand, RoCE, Ethernet, and IP networking, including VLANs, subnets, and routing.

  5. Demonstrated experience with parallel filesystems such as Lustre or GPFS.

  6. Demonstrated experience programming/scripting with Bash and Python and using version control tools such as Git.

SPECIAL CONDITIONS
  • Job offer is contingent upon satisfactory clearance based on Background Check results.

Pay Transparency Act

Annual Full Pay Range: $104,900 - $198,900 (will be prorated if the appointment percentage is less than 100%)

Hourly Equivalent: $50.24 - $95.26

Factors in determining the appropriate compensation for a role include experience, skills, knowledge, abilities, education, licensure and certifications, and other business and organizational needs. The Hiring Pay Scale referenced in the job posting is the budgeted salary or hourly range that the University reasonably expects to pay for this position. The Annual Full Pay Range may be broader than what the University anticipates paying for this position, based on internal equity, budget, and collective bargaining agreements (when applicable).

If employed by the University of California, you will be required to comply with our Policy on Vaccination Programs, which may be amended or revised from time to time. Federal, state, or local public health directives may impose additional requirements.

To foster the best possible working and learning environment, UC San Diego strives to cultivate a rich and diverse environment, inclusive and supportive of all students, faculty, staff and visitors. For more information, please visit UC San Diego Principles of Community.

UC San Diego is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age or protected veteran status.

For the University of California's Affirmative Action Policy please visit: https://policy.ucop.edu/doc/4010393/PPSM-20
For the University of California's Anti-Discrimination Policy, please visit: https://policy.ucop.edu/doc/1001004/Anti-Discrimination

UC San Diego is a smoke and tobacco free environment. Please visit smokefree.ucsd.edu for more information.

Application Instructions

Please click on the link below to apply for this position. A new window will open and direct you to apply at our corporate careers page. We look forward to hearing from you!

Apply Online
Payroll Title: SYS INTEGRATION ENGR 4
Department: San Diego Supercomputer Center
Hiring Pay Scale: $101,200 - $130,000 / Year
Worksite: Hybrid
Appointment Type: Career
Appointment Percent: 100%
Union: Uncovered
Total Openings: 1
Work Schedule: Days, 8 hrs/day, Mon - Fri

Posted: 11/20/2024

Job Reference #: 130890
