Site Reliability Engineer/remote
Iqtalent | Anywhere | 22 days ago | Full-time
Position: Site Reliability Engineer (remote)
Job Description
SUMMARY: Invest Edge is seeking a Site Reliability Engineer. Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other Invest Edge production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our operating environments and the Invest Edge codebase.
SREs specialize in systems (web application stacks, operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with varied interests in algorithms and distributed systems.
SREs work on the Production Support Team. The team’s experience feeds back into other Engineering groups within the company, as well as to Invest Edge resellers running self-managed installations.
SRE Responsibilities
• Be on an on-call rotation to respond to... incidents that impact availability and provide support for Customer Success staff with customer incidents. On-call shifts may include weekend and overnight work.
• Use your on-call shift to prevent incidents from ever happening.
• Investigate incidents using MSSQL, log analysis, RDP, and monitoring tools.
• Build monitoring that alerts on symptoms rather than on outages.
• Document every action so your findings turn into repeatable actions and then into automation.
• Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
• Design, build, and maintain core infrastructure with our Infrastructure, Engineering, and DevOps teams so that Invest Edge can scale to support hundreds of thousands of concurrent users.
• Debug production issues across services and levels of the stack.
• Develop and debug configurations for our ETL tooling.
• Debug and troubleshoot logical issues in database code for a large existing relational data set.
• Debug and troubleshoot performance issues in database code.
• Understand the business domain of the application.
• Work in an Agile environment on a cross-functional team.
Required Skills:
• Ability to work 12am-9am EST, Monday to Friday (e.g. 5am-2pm GMT, 6am-3pm CET, etc.)
• Strong programming skills: Shell and MSSQL.
• Ability to collaborate and communicate asynchronously.
• Desire to document all processes to avoid repetitive learning.
• Enthusiastic, proactive attitude towards fixing issues.
• Ability to deliver quickly and effectively, with a focus on rapid iteration.
• Familiarity with VS Code, SSMS, and other IDEs.
• Previous deployment experience leveraging CI/CD practices and tool chains.
• Ability to use GitLab or other version control tools (Git, SVN, etc.).