Data Engineer (1397) | Augusta, GA 30905, US
Summary:
The Data Engineer is part of a skilled IT team, working independently and collaboratively with other engineers to identify and test potential enhancements and to resolve technical issues arising from the rapid scaling of an intelligence application.
Duties and Responsibilities:
- Data Engineering:
- Collaboratively develop and implement custom ETL scripts, and support software products as needed.
- Ensure data quality, consistency, and integrity throughout the pipeline.
- Create, monitor, and adjust data ingestion flows.
- Process raw data by transforming it into a usable form (ETL).
- Rapidly create data pipelines, often on short notice, to support training exercises by connecting fielded systems to cloud-based infrastructure and to other fielded systems.
- Develop for and work with the ZIngest system, an ingestion system developed by Zapata Technology.
- Systems Administration:
- Perform database and data management, and resolve issues as they arise.
- Perform basic SysAdmin tasks, including the installation of peripheral software and software updates.
- Software Engineering:
- Implement rapid features and API capabilities to meet evolving customer needs for data access and dissemination.
- Interface with the customer(s) to create ingestion feeds for unique systems, implementing solutions even when few details or little documentation about those systems are available.
- Use source-code control to track and protect changes to the code baseline.
- Infrastructure Management:
- Manage containerization efforts (Docker, Podman) and orchestration.
- Upgrade and migrate applications to containers to provide the required scalability, and move legacy monolithic applications to a microservices environment.
- Document infrastructure creation, administration, and teardown.
- Automation and Security:
- Implement Infrastructure as Code (IaC) using Puppet, Ansible, or similar technologies to meet security-mandated change-management guidelines.
- Ensure application and database compliance with DISA and GISA standards and regulations.
- Document API endpoints using OpenAPI documentation standards.
- Work with Cyber-Security personnel to document security vulnerabilities and any mitigations to reduce risks that cannot be removed.
Required Qualifications:
- Knowledge of JavaScript frameworks and API development.
- Knowledge of Agile programming and development concepts, and of tools such as Jira.
- Knowledge of DevSecOps processes including Infrastructure as Code using Puppet, Ansible, etc.
- Knowledge of containerization and container orchestration, including networking, using Docker, Podman, or Kubernetes.
- Ability to see the big picture when evaluating software requests, determining where existing requirements intersect with or can be used in conjunction with new products.
- Ability to interpret specifications, troubleshoot, and define needed software solutions.
- Detail-oriented and organized; able to understand information systems and ensure accuracy of work.
- A solid understanding of and experience with NiFi data flows, from configuration to creating new flows and developing new processors.
- Linux system administration knowledge: command-line proficiency, service management, and software configuration, including within virtual machine constraints.
- Ability to work with the customer(s) to determine data specifications and formats.
- Ability to convert raw data into a human-readable and easily viewable format.
- Experience troubleshooting the NiFi software.
- Knowledge of NiFi administration, with the ability to keep the cluster running.
- Ability to write scripts in Groovy and Python within NiFi and Linux.
- Ability to write new processors in NiFi using Java.
Preferred Qualifications and Skills include:
- Ability to convert data and fix malformed data. This requires an understanding of XML, JSON, and regular expressions.
- Grasp of Software Engineering concepts, especially scripting in Python and Groovy.
- Experience with the full DataOps lifecycle including ingestion and ETL processes into databases, data lakes, or data warehouses.
- Experience with SIGINT and All Source Intelligence report standards, such as USMTF, and their lifecycle requirements.
- An understanding of pub-sub messaging systems such as RabbitMQ, Pulsar, and Kafka.
- Understanding of the full lifecycle of database design and software development: gathering requirements, development, testing, documentation, configuration, supporting technical writers and testers, maintenance, and expansion of deliverables as needs arise.
Education/Experience/Certifications include:
- 12 years of experience
- Bachelor's degree
- Four additional years of experience may be substituted for a degree
Other Requirements:
- TS/SCI security clearance
- IAT Level III certification
Working Conditions:
Prolonged periods sitting at a desk and working on a computer. May require use of standard office equipment such as scanners and printers, phones, etc.
Position Type:
Full time, Exempt
AAP/EEO Statement:
Zapata Technology is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, genetic information, creed, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.
AAP/EEO Employer: Minorities/Females/Disabled/Veterans