Job Description


We are seeking talented DevOps Engineers with a focus on the Elastic Stack (ELK) to join our dynamic DPS team. In this role, you will be responsible for refining and advising on the further development of an existing ELK-based monitoring solution. You will independently handle tasks related to architecture, setup, technical migration, and documentation.
The current application landscape features multiple Java web services running on JEE application servers, primarily hosted on AWS and integrated with various systems such as SAP, other services, and external partners. DPS is committed to delivering the best digital work experience for the customer's employees and customers alike.

Responsibilities

•    Install, set up, and automate rollouts of components such as Elasticsearch, Kibana, Metricbeat, APM Server, and APM agents, including interface configuration, across all stages (Dev, QA, Prod) in the AWS Cloud using Ansible/CloudFormation.
•    Create and develop regular "Default Dashboards" for visualizing metrics from various sources such as Apache web servers, application servers, and databases.
•    Improve and fix bugs in installation and automation routines.
•    Monitor CPU usage, security findings, and AWS alerts.
•    Develop and extend "Default Alerting" for issues such as out-of-memory (OOM) errors, datasource issues, and LDAP errors.
•    Monitor storage space and create concepts for expanding the Elastic landscape in AWS Cloud and Elastic Cloud Enterprise (ECE).
•    Implement machine learning, uptime monitoring including SLA, Jira integration, security analysis, anomaly detection, and other useful Elastic Stack features.
•    Integrate data from AWS CloudWatch.
•    Document all relevant information and train the personnel involved in the technologies used.

Requirements

•    Experience with Elastic Stack (ELK) components and related technologies.
•    Proficiency in automation tools like Ansible and CloudFormation.
•    Strong knowledge of AWS Cloud services.
•    Experience in creating and managing dashboards and alerts.
•    Familiarity with IAM roles and rights management.
•    Ability to document processes and train team members.
•    Excellent problem-solving skills and attention to detail.