
We’re Hiring

Rust Developer (100% Remote)

Team

Viable Systems is developing a next-generation, high-performance LLM inference engine and supporting infrastructure for the open-source AI community. We are a team of systems experts with deep experience in hardware-software co-design, distributed computing, and performance engineering. Our work sits at the intersection of memory management, concurrent systems, and advanced machine learning architectures.

Our Work 

Currently, we are developing a highly optimized, Rust-based core inference runtime designed to maximize throughput and minimize latency for large-scale transformer models. In addition to the core runtime, we build developer tooling, such as profiling dashboards for tensor operations and memory allocators. We regularly publish deep dives into inference optimization, step-by-step guides on continuous batching, and architectural breakdowns of modern AI workloads.

The Position 

We are looking for an experienced systems engineer ready to immediately tackle the hardest problems in model serving. As our primary codebase is in Rust, fluency in the language is expected, particularly around unsafe Rust, memory management and concurrency, and FFI (Foreign Function Interface) to hardware kernels. We are in the critical phase of a multi-year project, and we need someone who can independently drive complex optimizations from day one.

What we expect of you:


LLM inference is fundamentally a memory-bandwidth-bound and compute-bound problem. Since we are building infrastructure that dictates the economic viability of serving AI, our primary objective is achieving maximum token throughput and minimal Time-to-First-Token (TTFT) while maintaining strict system stability. We expect you to bring a relentless, cycle-counting mindset to this work.
 

You will develop and optimize the core inference runtime. Day-to-day tasks include:

  • Designing and implementing efficient KV cache management systems (e.g., PagedAttention).

  • Optimizing continuous batching and asynchronous request scheduling.

  • Writing and tuning multithreaded operations, custom memory allocators, and hardware-specific abstractions.

  • Working closely with tensor parallelism and distributed inference over high-speed interconnects.

While deep knowledge of model training is not required, a solid structural understanding of transformer architectures and modern quantization techniques is highly beneficial.

Requirements

  • At least 2 years of production experience with Rust, specifically in highly concurrent environments.

  • 4+ years in low-level systems programming (Rust, C, C++).

  • Deep expertise in advanced data structures, custom memory allocation, and algorithmic optimization.

  • Strong debugging, profiling, and performance tuning skills (using tools like perf, VTune, or Nsight).

  • Rigorous knowledge of multithreaded and lock-free programming.

  • Experience with asynchronous I/O and network programming for distributed systems.

  • Attention to clean architecture, code readability, and robust error handling.

  • Passion for hacking on Linux systems and extracting maximum performance from bare metal.

©2021 by Viable Systems.
