Site Reliability Engineer

OvationCXM

Remote
  • Job Type: Full-Time
  • Function: IT
  • Industry: Customer Support
  • Post Date: 03/16/2023
  • Website: www.ovationcxm.com
  • Company Address: 1622 Tiburon Boulevard, Belvedere Tiburon, CA, 94920
  • Salary Range: $50,000 - $150,000

About OvationCXM

OvationCXM is the Customer Experience Management (CXM) company helping businesses and their partner ecosystems deliver exceptional customer experiences with complete visibility and precise control so they can own the journey, guide the experience and unleash the benefits.

Job Description

OvationCXM, a dba under Boomtown Network Inc., is the Customer Experience Management company helping businesses and their partner ecosystems deliver exceptional customer experiences with complete visibility and precise control so they can own the journey, guide the experience and unleash the benefits. The OvationCXM Platform (“CXMEngine”) includes pre-built CRM connectors, customer journey orchestration and automation tools, ecosystem aggregation, as well as knowledge delivery and integrated communication solutions in one seamless platform. Customer experience professionals choose OvationCXM because the CXM technology delivers simplicity at a massive scale, streamlining CXM efforts at every customer touchpoint.

We are the emerging market leader in the Customer Experience Management space. As we scale, we are continually looking to bring on new talented and passionate team members who can help us in our bold objective of digitally transforming the Customer Experience from end-to-end. If you’re passionate about working at a dynamic organization that is committed to facing the challenges of the day, this is the place for you. We are seeking candidates who want to be at the forefront of powering entire ecosystems of businesses to work together in ways that had previously been impossible. Sound good? We’d love to hear from you.

About the Role

OvationCXM is looking for a Site Reliability Engineer (SRE) to join our team. This role covers performance, incident response, upgrades, maintenance, and any installations needed for our production platform. The SRE team educates other teams and acts as the first point of contact for support with the OvationCXM platform. The ideal candidate will have strong time-management skills and excellent communication skills, which enable them to work with various employees and leaders within the company. The SRE team owns all production problems and is passionate about managing services to service level objectives.

Responsibilities

  • Configuration management: use Chef and configuration management tools to effectively manage our infrastructure
  • Infrastructure as code: use Terraform and GitLab CI/CD for automation, containerize our environments (Kubernetes), and leverage cloud technologies to meet our goals
  • Systems: manage, configure and troubleshoot operating system issues, storage (block and object), networking (VPCs, proxies and CDNs), and administer high-availability MySQL and Redis clusters
  • Monitoring and instrumentation: implement metrics in InfluxDB and Grafana, log management and related systems, and Slack/PagerDuty integrations
  • Engineering practices: availability, reliability and scalability, as well as disaster recovery
  • Planning: familiarity with agile methodologies; use epics and issues to drive projects
  • Management: able to self-organize and report asynchronously
  • Identify Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives
  • Contribute to documentation: create and update runbooks and general documentation
  • Complete Root Cause Analysis (RCA) investigations and perform readiness reviews
  • Improve team practices through code reviews and handoffs of work and incidents
  • Take part in the hiring process: contribute to and review questionnaires, qualify candidates, and interview
  • Share knowledge and mentor others
  • Develop monitoring and alerting to measure release process velocity
  • Identify process bottlenecks and introduce optimizations

Skills and Experience

  • Bachelor's degree in technology or computer science is required
  • General knowledge of 4 technical expertise areas, with deep knowledge in 1 area
  • Chef (basic syntax, recipes, cookbooks)
  • Terraform basic syntax and GitLab CI/CD configuration, pipelines, jobs
  • Cloud resources provisioning and configuration through CLI/API
  • Kubernetes basic understanding, CLI, service re-provisioning
  • Provisioning and setting up metrics in Checkmk, Graylog, Datadog, and Grafana, including alerts and silences
  • Provisioning and setting up logs and queries to answer general questions
  • Operating system (Linux) configuration, package management, startup and troubleshooting
  • Block and object storage configuration
  • Networking: VPCs, proxies and CDNs

If you’re excited about redefining how B2B companies deliver exceptional customer and product experiences, come join us and help shape the future of our company. OvationCXM is a close-knit group that values hard work, dedication, learning, and creativity. We offer equity participation in an early-stage, high-growth technology company, competitive health/dental/vision benefits, unlimited PTO, and cash and stock bonuses for excellent performance.

 
