Data Scientist: Full-Time | Calgary-Based (hybrid/remote)

About Us

Arolytics is on a mission to improve operational efficiency in oil and gas through the lens of strategic emissions management. Our platform, AroIQ, turns raw SCADA signals (pressures, temperatures, flows, valve states, etc.) into emission event detection, root cause analysis, and volume quantification, bridging the gap between operations teams and environmental teams at some of the largest energy companies in North America. We are at the forefront of software-based emissions monitoring.

We've been in this space since 2018, and we're at an inflection point: our customers are scaling, our models are getting sharper, and we need another scientist who thinks in physics first and code second.

How We Work

We are Rigorous

Our models produce numbers that show up in regulatory filings and board-level emissions reports. Every value needs to be defensible, reproducible, and physically plausible. We don't ship "close enough."

We are Pragmatic

We work with real industrial process data: gapped, noisy, mislabeled, and inconsistent. Elegant theory is nice; working solutions for messy data are better.

We are Curious

The problems we solve sit at the intersection of thermodynamics, signal processing, and machine learning. No single discipline has the full answer. We need people who cross boundaries and ask good questions.

About the Role

We're looking for a Data Scientist with an engineering background to join our data science team. You'll report to our Lead Data Scientist and work alongside emissions engineers, data engineers, and product leadership.

This is a hands-on, mostly-data-science role. The majority of your time (~60-70%) will be spent working directly with oil and gas SCADA data — cleaning it, exploring it, engineering features from it, and validating that model outputs make physical sense. The remaining ~30-40% is supporting and contributing to ML model development for emissions detection, quantification, and causation.

The ideal candidate has an engineering degree (chemical, mechanical, petroleum, environmental) and has built strong data science and programming skills through graduate work, self-study, or early-career experience. You think in terms of mass balance and thermodynamics first, and build complex AI models on top of that foundation.

What You'll Do

Data Science & Analysis

  • Clean, transform, and extract signal from messy, gapped, and inconsistently labeled SCADA and sensor data
  • Perform exploratory data analysis to identify patterns, anomalies, and relationships across operational variables
  • Engineer features from raw SCADA telemetry: temporal features, signal decomposition, cross-sensor correlations
  • Validate model outputs against known physical behavior and engineering first principles
  • Build analysis pipelines and dashboards to support model development and evaluation
  • Translate findings into clear insights for engineers, product, and customer-facing teams

ML & Modeling

  • Support development and improvement of models for emissions event detection, duration estimation, volume quantification, and root cause analysis
  • Run experiments, evaluate model performance, and contribute to model iteration cycles
  • Help implement strategies for handling noisy and missing data — imputation, uncertainty estimation, and signal denoising
  • Contribute to feature engineering that encodes domain knowledge (thermodynamic relationships, conservation laws) into model inputs
  • Support model deployment, monitoring, and retraining workflows

What We're Looking For

Required

  • Engineering background: degree in chemical, mechanical, petroleum, environmental, or process engineering (BSc or MSc)
  • 1–3 years of experience in a data science, applied analytics, or research role (graduate research counts)
  • Strong Python proficiency and comfort with the scientific computing stack (NumPy, Pandas, SciPy, scikit-learn)
  • Solid fundamentals in statistics, time-series analysis, and working with noisy real-world data
  • Intuition for physical systems — you understand concepts like mass balance, pressure-volume relationships, and gas behavior, and can spot when a model output doesn't make physical sense
  • Experience cleaning and wrangling messy datasets — you've dealt with gaps, sensor drift, mislabeled data, and inconsistent formats
  • Strong communication skills — you can explain what the data is saying to both technical and non-technical audiences
  • Comfortable and excited to work in a fast-moving startup environment

Preferred

  • MSc or PhD with thesis work involving time-series modeling, sensor data, or applied ML in a physical/engineering domain
  • Experience with at least one deep learning framework (PyTorch preferred)
  • Exposure to probabilistic modeling, uncertainty quantification, or Bayesian methods
  • Familiarity with cloud data infrastructure (AWS, Supabase/PostgreSQL, or similar)
  • Background in oil and gas, energy, environmental monitoring, or industrial IoT
  • Familiarity with emissions regulations (OGMP 2.0, Subpart W, Canadian and US methane regulations)

Growth Opportunities

As you grow in the role, you'll have the opportunity to:

  • Design hybrid models that integrate physics-based constraints directly into ML pipelines
  • Develop probabilistic forecasting capabilities for emission events and equipment states
  • Explore advanced time-series architectures (transformers, neural forecasting)
  • Apply NLP and generative AI to extract structured information from regulatory and technical documents
  • Take increasing ownership of model design, deployment, and production systems

What You Aren't

  • A pure ML engineer who wants to build models without understanding what the data physically represents
  • Someone who needs clean, labeled datasets to be productive
  • Uncomfortable with ambiguity — we're a startup solving problems that don't have textbook solutions yet
  • Only interested in the modeling step — you'll spend more time with the data than with the model

What We Offer

Compensation & Benefits

  • Competitive salary and comprehensive benefits
  • Direct impact on a product used by major energy companies

Culture & Growth

  • Small team, high ownership — your work ships to production and shows up in customer dashboards
  • Work directly with experienced emissions engineers and data scientists who will accelerate your growth in the domain
  • Mentorship from a Lead Data Scientist with deep expertise in physics-informed modeling for industrial applications
  • Flexible work arrangements — Downtown Calgary office or fully remote

Location

  • Preference is given to Calgary-area candidates; both in-office and remote arrangements are available.

---

Arolytics is an equal opportunity employer. All qualified applicants will be considered for employment.

If this sounds like you, we'd love to hear from you. Send your resume and a brief introduction explaining why Arolytics interests you to info@arolytics.com, referencing ‘Data Scientist Career’.

Indeed link: https://ca.indeed.com/job/data-scientist-2bcc114b95e66f4f

Close Date: April 20, 2026