Deep Render is the world-leading R&D team for AI-based video and image compression technologies, with 50+ years of combined research experience and 20+ patents. Our team stems from leading research institutions such as Imperial, Oxford, UC Berkeley and McGill University, with publications in top-tier journals and conferences such as CVPR, ICCV, ICML, ECCV, and NeurIPS.

We’re looking for Performance Engineers to join a highly talented team bringing the next step-change in compression technology to billions of users, creating enormous global impact and value.

We're looking for engineers who will help us deliver AI-based compression to end-users by porting our codec from GPU-based systems to (primarily) mobile platforms with NPUs and (secondarily) mid-range GPU/CPU systems, thus going from research to production. You’ll enjoy working with low-level code and be comfortable programming across multiple platforms.

The ideal candidate will have a deep understanding of optimisation methodologies to reduce runtime and memory footprint, preferably for neural networks, and/or experience implementing high-performance entropy coding algorithms such as Huffman Coding, Arithmetic Coding, Range Coding, or Asymmetric Numeral Systems. They will also have some experience taking algorithms from research to deployment.



  • Work in a team to port ML research algorithms to edge devices with an initial focus on smartphones (Android, iOS)
  • Profile various algorithms to analyse performance and identify any bottlenecks. Profiling includes data loading, data movement, data caching, operation count, execution chipset, warm-up latency and others
  • Implement solutions to the identified bottlenecks
  • Implement a high-performance entropy coding algorithm, e.g. Range Coding or Asymmetric Numeral Systems, across different hardware architectures
  • Optional: Write custom operations using the low-level API for Android (OpenGL ES) and iOS (Metal) systems
  • Optional: Apply standard neural network runtime optimisation methods such as pruning, low-bit quantisation, architecture tuning, batching and others


  • At a minimum, a Bachelor's degree in computer science or related field (Mathematics, Physics, Engineering)
  • At a minimum, 3-5 years of experience in performance optimisation
  • One of the following: either some formal training in machine learning (including an understanding of PyTorch and/or TensorFlow) or some formal training in entropy coding methods (including an understanding of Range Coding or similar algorithms)
  • Formal training could come through education, work experience and/or extensive private projects.
  • Proficiency in Python (expert level) and C++ (semi-expert level)
  • Some experience with optimisation techniques. Examples include SIMD (SSE, AVX), vectorisation, loop dependencies, multithreading, multi-processor usage, and tensor cores

Preferred skills

  • A strong machine learning background
  • Significant experience with ML programming on either Android or iOS: Android Studio, Xcode, Google ML, Core ML. Knowledge of the development stack for Android and iOS
  • Experience with Android NNAPI and/or other Android-based NPU SDKs (Exynos, Hexagon, HiSilicon)


  • If you do not have the right to work in the UK, we can sponsor your visa
  • The newest and best equipment: Notebooks/MacBooks, 4K dual screens, mechanical keyboards, drawing pads, headphones and standing desks
  • A comprehensive private health insurance plan by AXA
  • An amazing WeWork office overlooking Tower Bridge and the Tower of London
  • Cycle to work scheme
  • Stock options, giving you a share in the upside potential of Deep Render

Join the

We look forward to receiving your application for the position of ML PERFORMANCE ENGINEER