Data Scientist: Full-Time | Calgary-Based (hybrid/remote)
About Us
Arolytics is on a mission to improve operational efficiency in oil and gas through the lens of strategic emissions management. Our platform, AroIQ, turns raw SCADA signals (pressures, temperatures, flows, valve states, etc.) into emission event detection, root cause analysis, and volume quantification, bridging the gap between operations teams and environmental teams at some of the largest energy companies in North America. We are at the forefront of software-based emissions monitoring.
We've been in this space since 2018, and we're at an inflection point: our customers are scaling, our models are getting sharper, and we need another scientist who thinks in physics first and code second.
How We Work
We are Rigorous
Our models produce numbers that show up in regulatory filings and board-level emissions reports. Every value needs to be defensible, reproducible, and physically plausible. We don't ship "close enough."
We are Pragmatic
We work with real industrial process data: gapped, noisy, mislabeled, and inconsistent. Elegant theory is nice; working solutions for messy data are better.
We are Curious
The problems we solve sit at the intersection of thermodynamics, signal processing, and machine learning. No single discipline has the full answer. We need people who cross boundaries and ask good questions.
About the Role
We're looking for a Data Scientist with an engineering background to join our data science team. You'll report to our Lead Data Scientist and work alongside emissions engineers, data engineers, and product leadership.
This is a hands-on, mostly-data-science role. The majority of your time (~60-70%) will be spent working directly with oil and gas SCADA data — cleaning it, exploring it, engineering features from it, and validating that model outputs make physical sense. The remaining ~30-40% is supporting and contributing to ML model development for emissions detection, quantification, and causation.
The ideal candidate has an engineering degree (chemical, mechanical, petroleum, environmental, or process) and has built strong data science and programming skills through graduate work, self-study, or early-career experience. You think in terms of mass balance and thermodynamics first, and build complex AI models on top of that foundation.
What You'll Do
Data Science & Analysis
- Clean, transform, and extract signal from messy, gapped, and inconsistently labeled SCADA and sensor data
- Perform exploratory data analysis to identify patterns, anomalies, and relationships across operational variables
- Engineer features from raw SCADA telemetry: temporal features, signal decomposition, cross-sensor correlations
- Validate model outputs against known physical behavior and engineering first principles
- Build analysis pipelines and dashboards to support model development and evaluation
- Translate findings into clear insights for engineers, product, and customer-facing teams
ML & Modeling
- Support development and improvement of models for emissions event detection, duration estimation, volume quantification, and root cause analysis
- Run experiments, evaluate model performance, and contribute to model iteration cycles
- Help implement strategies for handling noisy and missing data — imputation, uncertainty estimation, and signal denoising
- Contribute to feature engineering that encodes domain knowledge (thermodynamic relationships, conservation laws) into model inputs
- Support model deployment, monitoring, and retraining workflows
What We're Looking For
Required
- Engineering background: degree in chemical, mechanical, petroleum, environmental, or process engineering (BSc or MSc)
- 1–3 years of experience in a data science, applied analytics, or research role (graduate research counts)
- Strong Python proficiency and comfort with the scientific computing stack (NumPy, Pandas, SciPy, scikit-learn)
- Solid fundamentals in statistics, time-series analysis, and working with noisy real-world data
- Intuition for physical systems — you understand concepts like mass balance, pressure-volume relationships, and gas behavior, and can spot when a model output doesn't make physical sense
- Experience cleaning and wrangling messy datasets — you've dealt with gaps, sensor drift, mislabeled data, and inconsistent formats
- Strong communication skills — you can explain what the data is saying to both technical and non-technical audiences
- Comfortable and excited to work in a fast-moving startup environment
Preferred
- MSc or PhD with thesis work involving time-series modeling, sensor data, or applied ML in a physical/engineering domain
- Experience with at least one deep learning framework (PyTorch preferred)
- Exposure to probabilistic modeling, uncertainty quantification, or Bayesian methods
- Familiarity with cloud data infrastructure (AWS, Supabase/PostgreSQL, or similar)
- Background in oil and gas, energy, environmental monitoring, or industrial IoT
- Familiarity with emissions regulations (OGMP 2.0, Subpart W, Canadian and US methane regulations)
Growth Opportunities
As you grow in the role, you'll have the opportunity to:
- Design hybrid models that integrate physics-based constraints directly into ML pipelines
- Develop probabilistic forecasting capabilities for emission events and equipment states
- Explore advanced time-series architectures (transformers, neural forecasting)
- Apply NLP and generative AI to extract structured information from regulatory and technical documents
- Take increasing ownership of model design, deployment, and production systems
What You Aren't
- A pure ML engineer who wants to build models without understanding what the data physically represents
- Someone who needs clean, labeled datasets to be productive
- Uncomfortable with ambiguity — we're a startup solving problems that don't have textbook solutions yet
- Only interested in the modeling step — you'll spend more time with the data than with the model
What We Offer
Compensation & Benefits
- Competitive salary and comprehensive benefits
- Direct impact on a product used by major energy companies
Culture & Growth
- Small team, high ownership — your work ships to production and shows up in customer dashboards
- Work directly with experienced emissions engineers and data scientists who will accelerate your growth in the domain
- Mentorship from a Lead Data Scientist with deep expertise in physics-informed modeling for industrial applications
- Flexible work arrangements — Downtown Calgary office or fully remote
Location
- Preference is given to Calgary-area candidates; both in-office and remote arrangements are available.
---
Arolytics is an equal opportunity employer. All qualified applicants will be considered for employment.
If this sounds like you, we'd love to hear from you. Send your resume and a brief introduction explaining why Arolytics interests you to info@arolytics.com, referencing ‘Data Scientist Career’.
Indeed link: https://ca.indeed.com/job/data-scientist-2bcc114b95e66f4f
Close Date: April 20, 2026