Zonal Retail Data Systems Limited
The Zonal group are one of the UK’s largest technology providers to the hospitality industry. Our products are used by over 16,000 pubs, restaurants, and hotels. Customers include national brands like Pizza Express, JD Wetherspoons and All Bar One.

We provide our customers with the solutions they need to make their business a success. These solutions include mobile apps for ordering and web apps for engaging with consumers through loyalty or reservations. By linking these solutions to Zonal’s EPoS (till) system, we help hospitality brands to understand their customers’ behaviour and preferences, enabling them to excel in an increasingly competitive market. If you have booked a table or hotel room, ordered and paid for food and drinks, received loyalty offers, or downloaded your favourite hangout’s app, you will likely have used a Zonal product.

We are a family business with Scottish roots. We operate from our modern head office in Edinburgh, our Marketing Technologies Division in Staffordshire, our Innovation Centre in Abingdon and our hotel management solutions base in Cardiff.

What you’ll do

This role sits within the Zonal Managed Services team and is part of the wider Zonal Technical Services business unit. Our suite of SaaS, distributed systems and product integrations helps our customers run their critical business operations and provide their own customers in turn with industry-leading hospitality technology products.

You’ll play a key role in the formation of a new area within Zonal: our Production Operations (ProdOps) team, which aims to drive operational excellence and customer focus into the operation of our SaaS hosted application suite.

As a Support Technician, you’ll bring your experience in providing level 4 support in distributed and centralised systems, integrations, and deployments to the Production Ops team underpinning our industry-leading SaaS solution.
You will triage and take ownership of incidents and requests from level 3 helpcentre support, working closely with teammates in SRE, Development and Platform Delivery to provide a responsive, stateful incident workflow, identifying opportunities to knowledge-share, enhance documentation, improve tooling and reduce toil.

The makeup of our systems is changing rapidly, and you’ll play a key part in helping us drive this forward. We’re moving towards a modern DevOps landscape with technologies like Docker, IaC and microservices. Initially we are working with our own hosted data centre infrastructure; however, you’ll play a key role in our drive towards a future hybrid public-cloud position.

You and your team will:

- Build strong, collaborative relationships, acting as the glue between in-house customer-facing support and delivery teams, and platform engineering (R&D) teams
- Own, run and continually improve:
  - Incident triage and response, through to resolution
  - Logging, monitoring, and alerting services and infrastructure
  - Dashboards, internal and external status pages
  - Automation and tooling of manual processes
  - Team processes, driving technical debt down
  - Capacity analytics and demand management
  - Disaster Recovery models, planning and testing
- Reduce toil (work that is largely manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as our services grow), maximising engineering capacity
- Bring expertise and a streetwise perspective to problem solving and to the reduction of operational complexity
- Participate in On-Call cover and Incident Response
- Proactively manage delivery of key SLOs covering detection / early warning and self-healing
- Act as key stakeholders in the technical debt reduction of our Products

Who you are

You will have a background in deploying, managing, and operating mission-critical SaaS and distributed systems, having spent at least some of your career as a member of a fast-paced product engineering, web operations and/or platform delivery team. Ideally you will have a demonstrable track record of operating systems in hybrid datacentre/cloud infrastructures.

When things go wrong – and they will – we want to know about the problem before our users do, so we will need your solid understanding of modern monitoring and toolsets to support triage, investigation, and remediation.

- A self-starter with a passion for technology and problem solving and excellent analytical skills, who thrives in a fast-paced autonomous environment
- Solid experience in scripting, tooling, automation, and data access – experience with PowerShell, T-SQL and MySQL would be an advantage
- Excellent understanding of traditional ops in a virtualised Windows/Linux environment
- Knowledge and experience of monitoring frameworks such as Zabbix, and of data retrieval and event correlation from Graylog
- Quick to spot opportunities and new capabilities in technologies
- Familiar with Docker and container ecosystems
- Comfortable in complex provisioning and deployment scenarios
- A strong collaborator, organised, with a safe pair of hands
- A team player who enjoys influencing change and representing the operational and customer impacts in Tech Debt prioritisation
- Comfortable interacting with mixed audiences of Support, Product Delivery, Engineering, and Incident Management
- At least 3 years’ experience operating and supporting production software

What we value

Passion, Teamwork, Innovation and Professionalism are the values we believe make us the company we are. We are looking for someone who understands great culture and will help us shape it as it evolves.