In this role, we are seeking a talented Contractor DevOps Engineer for the eCommerce engineering team with experience designing, implementing, and scaling client technologies, approaches, and platforms to support high-availability (HA) systems. The engineer is responsible for deploying and maintaining deliverables in all phases of the Software Development Life Cycle (SDLC) by collaborating with business, development, and systems engineering teams.
**Can be remote – EST hours (the team is generally not on site, but the candidate may come on site if preferred)
- Responsibilities include scope definition for infrastructure, infrastructure build-out, application environment maintenance, application deployment, application robustness verification through performance testing, and continued application support through an on-call cadence.
- Qualified candidates will bring the latest software development, automation, and deployment practices.
- You will develop systems to accelerate deployments for a variety of software projects.
- Work collaboratively with developers to automate, deploy, operate, maintain, and monitor systems in multiple data centers.
- Project Leadership
- Provide hands-on design leadership and work closely with developers and supporting operations functions within an Agile/Scrum environment
- Take responsibility for application, technology, and data scalability and security as they pertain to eCommerce growth.
- Gather requirements, create a plan, estimate the timeline, and execute the project.
- Lead the high-quality execution of software products against project plans and delivery commitments.
- Engineering Strategy and Design
- Collaborate on milestones for infrastructure buildouts, engineering, development, testing and implementation of application environments
- Responsibility for multi-channel software development lifecycle, enhancements/modifications, system configuration, migrations/upgrades, and support
- Provide support and troubleshooting for all related systems and technologies
- Serve as a technical subject matter expert on cloud-related technologies
- Production and Operational Excellence
- Participate in troubleshooting efforts to find root causes and recommend measures to prevent recurrence
- Provide guidance and standards for website optimization techniques such as CDN, cloud, and web caching
- Resolve difficult technical issues, remove obstacles for teams, and help all projects move forward on schedule and within budget
- Build and operate a high-performance, stable, and resilient eCommerce platform
- Champion the Site Reliability Engineering approach and techniques
- Work closely with cross-functional IT and technology partners to ensure system interactions are top-notch and meet standards
- 5+ years of hands-on experience implementing, operating, and maintaining infrastructure for high volume enterprise web applications.
- 5 years of experience in distributed system development (design and support of systems with scalability and disaster recovery robustness) to support compute use cases for business requirements.
- 3+ years of implementation and operations experience with production systems in public cloud environments (AWS and/or GCP preferred).
- Hands-on experience automating infrastructure operations and with modern best practices such as infrastructure-as-code, cross-region & multi-provider redundancy, and event correlation solutions.
- Proficient with containerization and cluster management technologies like Docker and Kubernetes
- Deep understanding of and hands-on experience with cloud-native deployment and monitoring tools and technologies, with expertise in areas like Kubernetes, Helm charts, container-based deployment, service mesh, Prometheus, Grafana, etc.
- Motivated by a DevOps culture and Site Reliability Engineering concepts.
- 5 years of experience in operating systems (Windows, RedHat, CentOS, Amazon Linux), networking (Akamai, Nginx, Apache, AWS/GCP VPC), and/or software (Terraform, Bash, Sh) packages.
- 5 years of experience integrating monitoring, alerting, and reporting tools (New Relic, Akamai, Grafana, Elasticsearch, Prometheus) with existing and newly developed systems.
- 4 years of cloud engineering and development experience. Must have experience extending and supporting cloud-based systems using Terraform with AWS and/or GCP.
- 3 years of experience implementing and supporting microservices architecture using containers, with tools such as Docker, AWS ECS or GCP Compute/GKE & Rancher.
- 3 years of experience working with database systems such as AWS RDS, Oracle SQL, MongoDB & Elasticsearch.
- Experience designing and building CI/CD pipelines for code deployment using Travis, CodeBuild, Jenkins, and Bamboo.
- Must have solid experience with multiple Apache projects, including Web, HTTP Server, Tomcat, and Ant
- Experience in supporting open-source Web and Application Services (Java, Ruby, PHP, Python, Perl)
- Experience with Bash, Perl, or other shell scripting required.
- Experience with Git fundamentals and Stash required.
- Experience supporting and monitoring customer-facing systems; may participate in after-hours deployment and support activities as needed