A technology firm in Dublin is seeking an experienced Incident Manager responsible for ensuring service reliability and customer trust. The role involves leading incident response efforts, troubleshooting complex technical issues, and optimizing infrastructure for high-performance computing. Ideal candidates will have deep expertise with Linux and Kubernetes, alongside strong customer-facing leadership experience. Offers include a competitive salary, advanced technical challenges, and a focus on innovation in AI-enabled technology.
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure. About This Role As an Incident Manager at Crusoe, you will be the frontline defender of our service reliability and customer trust. This role is pivotal to our mission, directly impacting the company’s success by minimizing downtime and orchestrating rapid resolutions to critical technical challenges. You will spearhead the management of high-visibility incidents and customer escalations, ensuring that our innovative climate-aligned computing platform remains robust and dependable. In this full-time position, you will lead the charge on transformative projects, including designing self-serve support processes and partnering with our engineering teams to drive product improvements based on real-world incident data. We are looking for a technically fearless professional who thrives in high-pressure situations and possesses the leadership skills to guide both customers and internal teams through complex technical landscapes. What You’ll Be Working On Incident Response Leadership: Lead the end-to-end management of high-visibility technical incidents and customer escalations, ensuring rapid restoration of services and effective communication throughout the lifecycle. Complex Troubleshooting: Diagnose and resolve sophisticated technical issues involving Infiniband, containerization, and distributed training to maintain peak operational efficiency for our customers. 
Infrastructure Optimization: Guide and assist customers in implementing and fine-tuning their HPC infrastructure, directly contributing to their performance goals and technical success. Strategic Collaboration: Act as a critical bridge between customers and internal engineering/product teams, translating frontline feedback into actionable product enhancements and quality improvements. Knowledge Empowerment: Develop and deliver high-impact training materials, internal documentation, and knowledge base articles to empower both teammates and customers to navigate our solutions effectively. Process Innovation: Design and implement robust incident response strategies and self-serve support processes to scale our ability to handle complex technical challenges. Risk Mitigation: Participate in and manage on-call rotations, providing a reliable safety net for our infrastructure and ensuring 24/7 readiness for critical service interruptions. What You’ll Bring to the Team Technical Linux & Virtualization Expertise: Demonstrate deep technical experience with Linux, Virtualization, and Kubernetes to effectively manage and resolve infrastructure incidents. Network Fundamentals: Apply a solid understanding of the TCP/IP stack to troubleshoot connectivity and performance issues across distributed systems. Infrastructure-as-Code (IaC) Knowledge: Utilize your understanding of IaC practices to navigate and support modern automated environments. Proven Customer Leadership: Bring 4-5 years of customer-facing experience, including 3-5+ years in a leadership role acting as a primary liaison between technical teams and stakeholders. Exceptional Communication: Leverage elite written and verbal communication skills to translate complex technical concepts into clear, actionable updates for diverse audiences. Analytic Problem-Solving: Apply a rigorous problem-solving mindset to diagnose, isolate, and resolve multifaceted technical issues under pressure. 
Bonus Points Programming Proficiency: Experience writing or debugging code in one or more programming languages. HPC Familiarity: Prior experience working with High-Performance Computing environments or large-scale distributed systems. Advanced Certifications: Industry-recognized certifications in Linux administration, Kubernetes (CKA), or Incident Management frameworks. Scalability Mindset: Experience scaling support or incident functions within a high-growth technology startup. Benefits Crusoe also offers a competitive benefits package designed to support financial security, health, and overall well-being, including pension contributions, private health and dental insurance, income protection, life assurance and more. Compensation Compensation will be paid as salary or hourly. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data. Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
A leading energy technology company in Dublin is seeking an Incident Manager to take the lead in service reliability and customer trust. The ideal candidate will have extensive experience in managing technical incidents, demonstrating expertise in Linux and Kubernetes, and will act as a liaison between customers and engineering teams. This full-time role offers opportunities for innovation and significant contributions to the company's mission of sustainable technology.
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI, without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure. About This Role: As an Incident Manager at Crusoe, you will be the frontline defender of our service reliability and customer trust. This role is pivotal to our mission, directly impacting the company’s success by minimizing downtime and orchestrating rapid resolutions to critical technical challenges. You will spearhead the management of high-visibility incidents and customer escalations, ensuring that our innovative climate-aligned computing platform remains robust and dependable. In this full-time position, you will lead the charge on transformative projects, including designing self-serve support processes and partnering with our engineering teams to drive product improvements based on real-world incident data. We are looking for a technically fearless professional who thrives in high-pressure situations and possesses the leadership skills to guide both customers and internal teams through complex technical landscapes. What You’ll Be Working On: Incident Response Leadership: Lead the end-to-end management of high-visibility technical incidents and customer escalations, ensuring rapid restoration of services and effective communication throughout the lifecycle. Complex Troubleshooting: Diagnose and resolve sophisticated technical issues involving Infiniband, containerization, and distributed training to maintain peak operational efficiency for our customers.
Infrastructure Optimization: Guide and assist customers in implementing and fine-tuning their HPC infrastructure, directly contributing to their performance goals and technical success. Strategic Collaboration: Act as a critical bridge between customers and internal engineering/product teams, translating frontline feedback into actionable product enhancements and quality improvements. Knowledge Empowerment: Develop and deliver high-impact training materials, internal documentation, and knowledge-base articles to empower both teammates and customers to navigate our solutions efficiently. Process Innovation: Design and implement robust incident response strategies and self-serve support processes to scale our ability to handle complex technical challenges. Risk Mitigation: Participate in and manage on-call rotations, providing a reliable safety net for our infrastructure and ensuring 24/7 readiness for critical service interruptions. What You’ll Bring to the Team: Technical Linux & Virtualization Expertise: Demonstrate deep technical experience with Linux, Virtualization, and Kubernetes to effectively manage and resolve infrastructure incidents. Network Fundamentals: Apply a solid understanding of the TCP/IP stack to troubleshoot connectivity and performance challenges across distributed systems. Infrastructure-as-Code (IaC) Knowledge: Utilize your understanding of IaC practices to navigate and support modern automated environments. Proven Customer Leadership: Bring 4-5 years of customer-facing experience, including 3-5+ years in a leadership role acting as a primary liaison between technical teams and stakeholders. Exceptional Communication: Leverage elite written and verbal communication skills to translate complex technical concepts into clear, actionable updates for diverse audiences. Analytic Problem-Solving: Apply a rigorous problem-solving mindset to diagnose, isolate, and resolve multifaceted technical issues under pressure.
Bonus Points: Programming Proficiency: Experience writing or debugging code in one or more programming languages. HPC Familiarity: Prior experience working with High-Performance Computing environments or large-scale distributed systems. Advanced Certifications: Industry-recognized certifications in Linux administration, Kubernetes (CKA), or Incident Management frameworks. Scalability Mindset: Experience scaling support or incident functions within a high-growth technology startup. Benefits: Crusoe also offers a competitive benefits package designed to support financial security, health, and overall well-being, including pension contributions, private health and dental insurance, income protection, life assurance and more. Compensation: Compensation will be paid as salary or hourly. Compensation to be determined by the applicant’s education, experience, knowledge, skills and abilities, as well as internal equity and alignment with market data. Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, gender/sex, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other protected status under law or regulation.
A technology company in Dublin seeks a Site Reliability Engineer to enhance its AI platform. The role focuses on automating processes, ensuring system reliability, and collaborating with development teams. Ideal candidates have 1-3 years of SRE experience, proficiency in programming, and knowledge of infrastructure tools like Docker and Kubernetes. Candidates should also have a Bachelor's Degree in Computer Science or related fields, showcasing a commitment to innovation and excellence in cloud infrastructure.
A sustainable technology firm is looking for a Director of Engineering for their Dublin office. This role requires over 10 years of experience in engineering leadership, focusing on cloud infrastructure and fostering a high-performing team. You will drive operational excellence and collaborate with US-based leadership to align regional priorities with global strategies. The ideal candidate will bring expertise in managing large-scale cloud infrastructure and building effective teams to create a sustainable future in computing.
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure. About This Role: As the Director of Engineering, Cloud Availability, you will lead our engineering organization in Dublin, serving as a critical bridge between our European operations and our US headquarters. This is a high-impact leadership role where you will oversee Site Reliability Engineering (SRE), Network Engineering, and Data Center Infrastructure Engineering to ensure the global resiliency of Crusoe Cloud. You will be the primary "culture carrier" for our Dublin office, fostering a high-trust, high-performance environment that remains deeply integrated with our global mission to build the future of sustainable computing. The ideal candidate is a strategic organizational leader who thrives in matrixed environments and possesses a deep technical background in large-scale cloud infrastructure. You will be responsible for scaling our EMEA presence, managing critical reliability initiatives, and implementing a seamless "follow-the-sun" operational model. This is a full-time position based in Dublin, offering the opportunity to lead transformative projects—from building SRE functions for AI/ML managed services to driving a culture of operational excellence across multiple continents. What You’ll Be Working On: Organizational Leadership: Partner closely with Data Center, Network, and SRE teams to build and scale a world-class engineering organization in Dublin. 
Site Leadership & Culture: Serve as the primary point of contact and face of Crusoe leadership in Dublin, proactively managing office sentiment and ensuring the team remains focused on high-impact objectives. Global Strategic Alignment: Build high-trust partnerships with US-based leadership to ensure local priorities are perfectly synchronized with the global business roadmap. Operational Excellence: Implement and refine "follow-the-sun" protocols to enable smooth hand-offs between time zones, ensuring zero customer disruption and 24/7 reliability. Unified Team Vision: Foster a "one-team" mindset across geographic boundaries, breaking down silos and promoting deep collaboration between Dublin and US offices. Talent Development: Level up the Dublin engineering team by identifying individual strengths and establishing a culture of mentorship to grow the next generation of Engineering Leads and ICs. Reliability Initiatives: Lead the development of SRE functions for IaaS and managed services, including Inference, SLURM, and automated cluster management. What You’ll Bring to the Team: Extensive Leadership Experience: 10+ years of engineering leadership experience with a proven track record of managing high-performing technical teams. Cloud Infrastructure Expertise: Deep technical knowledge of public cloud infrastructure and experience building or operating large-scale platforms (Public, Private, or Hybrid). Reliability Mastery: Expert-level understanding of availability, observability, SLIs/SLOs, and modern incident management frameworks. Global Collaboration: Proven ability to lead remote teams and successfully collaborate with US-based engineering organizations. Matrix Proficiency: Demonstrated success navigating and leading within a matrix organizational structure. Container Orchestration: Strong familiarity with virtual and managed Kubernetes platforms, such as EKS, GKE, or AKS. 
Strategic Thinking: The ability to balance long-term organizational strategy with the immediate tactical needs of a fast-growing engineering site. Bonus Points: AI/ML Infrastructure: Prior experience working with or building infrastructure platforms specifically tailored for AI and Machine Learning workloads. Startup Scaling: Experience navigating the rapid growth phases of a high-scale startup environment. Large-Scale Infrastructure: A background managing massive-scale infrastructure projects that exceed standard enterprise requirements. Advanced Reliability Architectures: Experience designing automated recovery systems and "self-healing" infrastructure at scale. Benefits: Crusoe also offers a competitive benefits package designed to support financial security, health, and overall well-being, including pension contributions, private health and dental insurance, income protection, life assurance, and more. Compensation: Compensation will be paid as a salary or hourly. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data. Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI, without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure. About This Role: Crusoe Cloud is revolutionizing high-performance computing by offering sustainable, low-cost GPU compute power. As a Cloud Support Engineer, you'll play a crucial role in empowering our customers to leverage this technology for groundbreaking advancements in fields like AI/ML, physics simulations, and computational biology. You will be the primary point of contact for technical support, ensuring our customers can seamlessly utilize Crusoe Cloud to achieve their goals. This role directly impacts Crusoe's mission by enabling our customers to accelerate their research and development, contributing to a more sustainable future. You will be involved in exciting projects, working with cutting-edge technologies and collaborating with a talented team to solve complex challenges. The ideal candidate is a highly motivated and experienced technical professional with a passion for customer success, a deep understanding of cloud technologies, and a commitment to Crusoe’s values. This is a full-time position. What You’ll Be Working On: Customer Support: Provide exceptional technical support to customers via Zendesk, meeting SLAs and maintaining high CSAT (95%+). On-Call Rotation: Participate in a 24/7 on-call rotation to ensure timely resolution of critical issues. Troubleshooting: Diagnose and resolve issues related to VMs, hardware failures, and scaling tests using CLI and internal tools. Alert Triage and Maintenance: Manage alert triage, prepare for maintenance windows, and conduct node delivery testing.
Collaboration: Work closely with SRE, Networking, and Storage teams from initial triage to root cause analysis (RCA) delivery. Global Teamwork: Adhere to global team collaboration and handoff processes for ticketing and on‑call procedures. Knowledge Sharing: Develop onboarding/training materials, knowledge base documentation, and standard operating procedures (SOPs). What You’ll Bring to the Team: Education/Experience: Bachelor's degree in IT, Computer Science, Engineering, or a related field, or 4+ years of equivalent technical experience. Linux Proficiency: Strong command‑line interface (CLI) skills in Linux environments. Version Control: Proficiency with Git for code management and collaboration. Customer Support Experience: 5+ years of experience in a customer support role, ideally within cloud, storage, or networking environments. Cloud Technologies: Experience with container orchestration (e.g., Kubernetes), workload management (e.g., Slurm, Terraform), and monitoring tools (e.g., Grafana). Public Cloud Knowledge: Familiarity with other public cloud platforms (e.g., AWS, Azure, GCP). Communication Skills: Excellent communication and customer service skills, including the ability to prioritize competing escalations. HPC Knowledge: Understanding of HPC technologies such as Infiniband, RDMA, RoCE, and Software Defined Networking (SDN). Bonus Points: Certifications: CKA, CKAD, CKS, KCNA, AWS Machine Learning - Specialty, Data Analytics - Specialty, Solutions Architect - Professional, Developer - Associate, NVIDIA AI Infrastructure and Operations, Generative AI and LLMs, Generative AI Multi-modal, Infiniband, Linux Foundation IT Associate, System Administrator. Cloud Expertise: Deep understanding of specific cloud platforms and services. Automation Skills: Experience with automation tools and scripting languages. Problem‑Solving Abilities: Demonstrated ability to analyze complex technical issues and develop effective solutions. 
Collaboration and Mentorship: Proven ability to mentor, train, and onboard colleagues. Passion for Sustainability: A strong interest in contributing to a more sustainable future through technology. Benefits: Crusoe also offers a competitive benefits package designed to support financial security, health, and overall well-being, including pension contributions, private health and dental insurance, income protection, life assurance and more. Compensation: Compensation will be paid as salary or hourly. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data. Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
A leading sustainable technology company in Dublin is seeking a Cloud Support Engineer to provide exceptional technical support to customers leveraging cutting-edge solutions. The role requires deep knowledge of cloud technologies and Linux proficiency, along with exceptional communication skills. The ideal candidate will ensure customer satisfaction and effectively troubleshoot issues while working closely with cross-functional teams. This full-time position offers competitive compensation and supports a sustainable future through innovative technology.
About This Role: Crusoe Cloud is seeking a Senior+ Solutions Engineer to work closely with our most strategic enterprise customers deploying AI/ML workloads on Crusoe’s high-performance GPU infrastructure. This is a hands-on, customer-facing role requiring deep technical expertise in Kubernetes, MLOps, and cloud infrastructure. You’ll guide customers through end-to-end deployment: owning the PoC process, optimizing workloads post-sale, and serving as a critical technical voice between our customers and engineering teams. Ideal candidates are passionate about AI infrastructure, fluent in containerized environments, and confident translating workloads across cloud platforms. What You'll Be Working On: Customer Enablement: Lead technical onboarding and deployment of complex AI/ML workloads with strategic enterprise customers, owning the PoC through to post-sales optimization. Kubernetes + MLOps Focus: Architect and deploy ML workloads using Kubernetes-based stacks (e.g., Ray, Kubeflow). Design infrastructure that balances performance, scalability, and efficiency. Infrastructure-Centric Thinking: Go beyond abstracted services: deploy and optimize AI/ML workloads directly on Crusoe infrastructure. Ensure performance at the container and hardware level. Cross-Cloud Translation: Help customers migrate and adapt workloads across AWS, Azure, and GCP. Understand and explain the tradeoffs between cloud-native and Crusoe-native approaches. Technical Storytelling: Conduct workshops, live demos, and solution reviews. Contribute to case studies, solution briefs, and blog posts that highlight real-world customer success. Voice of the Customer: Relay feedback to internal engineering and product teams to continuously improve Crusoe’s platform based on real-world implementation experience. What You'll Bring to the Team: Deep Kubernetes Expertise: 3-5 years building and deploying containerized workloads. Experience with Helm, Terraform, Docker, and multi-node orchestration is a must.
MLOps Deployment Experience: Demonstrated success deploying ML frameworks (e.g., Ray, MLflow, Airflow) on Kubernetes, especially for inference and model training workflows. Hands-on Cloud Infrastructure Knowledge: Familiarity with compute, storage, networking, and scaling in AWS, GCP, or Azure. Experience translating workloads across clouds is highly desirable. Customer-Facing Technical Confidence: Able to navigate stakeholder conversations, gather requirements, lead technical engagements, and support customers in both pre- and post-sales environments. Strong Linux and CLI Proficiency: Comfortable operating in Linux environments and troubleshooting infrastructure issues via CLI. Collaborative Energy: Strong communication skills and eagerness to partner cross-functionally with Engineering, Product, and Sales to make customers successful. Bonus Points: Experience with Ray, Kubeflow, or other distributed ML orchestration platforms. Exposure to Slurm, but with a primary focus on containerized MLOps over traditional HPC. Multi-cloud deployment or migration experience (especially AWS → Crusoe transitions). Content contributions (tech talks, blogs, public case studies). Must be able to pass a background check. Embody the Company values. Benefits: Crusoe offers a competitive benefits package designed to support financial security, health, and overall well-being, including pension contributions, private health and dental insurance, income protection, life assurance and more. Compensation: Compensation will be paid as salary or hourly. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data. Crusoe is an Equal Opportunity Employer.
Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.