We are seeking a highly skilled and experienced engineer to architect, deploy, and optimize private and public cloud environments. If you’re an OpenStack expert with experience in scalable cloud solutions, this role is a perfect match for you.
You will design and deploy a new OpenStack infrastructure, ensuring high availability, security, and performance, while automating deployment and management for greater efficiency.
Join a team of experts working on cutting-edge cloud projects, driving innovation in AI and high-performance computing.
Customer
Our client is a pioneering leader in AI and high-performance computing, driving innovation in distributed cloud infrastructure. Their focus is on delivering scalable, efficient, and cost-effective solutions tailored for AI/ML workloads. The company collaborates with industry leaders to empower developers, researchers, and enterprises with cutting-edge computing resources.
By maintaining a strong emphasis on reliability, performance, and accessibility, the company is redefining AI infrastructure standards. Engineers joining this project will have the opportunity to work with state-of-the-art cloud technologies and contribute to a mission that is shaping the future of AI and distributed computing.
Project
This project is at the forefront of AI and high-performance computing, leveraging cutting-edge GPU infrastructure to power advanced research, analytics, and AI/ML workloads. The team is composed of experts with decades of experience in high-performance distributed computing, including building some of the earliest distributed clusters at NASA and operating mission-critical systems at companies such as Google, Microsoft, and CoreWeave.
Responsibilities
* Design, deploy, and manage OpenStack-based private and public cloud environments
* Ensure high availability, scalability, and performance in high-performance computing (HPC) environments
* Automate deployment and operations using Ansible, Terraform, and Kubernetes
* Monitor and optimize cloud security, performance, and reliability, implementing proactive improvements
* Collaborate with development, DevOps, and IT teams to deliver scalable infrastructure solutions
* Troubleshoot and resolve complex cloud issues, providing advanced support as needed
* Manage and optimize storage, networking, and compute resources in OpenStack
* Stay up to date with OpenStack releases and contribute to upgrade and patching strategies
* Implement secure configuration best practices and develop CI/CD workflows for automated deployments
* Oversee containerization technologies such as Kubernetes and Docker for efficient workload management
Requirements
* At least 5 years of experience as an OpenStack engineer or cloud infrastructure engineer
* Deep understanding of OpenStack components (Nova, Neutron, Cinder, Keystone, Swift, etc.)
* Hands-on experience with cloud automation tools like Ansible, Terraform, etc.
* Proficiency in scripting languages (Python, Bash, etc.)
* Strong knowledge of networking concepts (VLANs, SDN, VPNs, load balancing)
* Familiarity with monitoring tools (Prometheus, Grafana, Nagios) and logging solutions
* Solid understanding of virtualization technologies (KVM, QEMU)
* Proficiency in containerization using Docker and Kubernetes
* Knowledge of security best practices for cloud infrastructure
* Excellent problem-solving and communication skills
* At least an upper-intermediate level of English
Would be a plus:
* OpenStack certification (COA – Certified OpenStack Administrator)
* NVIDIA certifications (e.g., NVIDIA Certified Systems Engineer, NVIDIA DLI – Deep Learning Institute)
* Experience with Ceph storage
* Proficiency in InfiniBand
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Information Technology
Industries
Software Development