Deploying deep learning models for computer vision (CV) often requires a careful trade-off between performance, efficiency, and cost. For edge inference, this trade-off becomes particularly acute: the computing and memory constraints imposed by the hardware severely limit the size and architecture of deployable models. These constraints are exacerbated when a single edge device must concurrently host several vision models running in series or in parallel. In this work, we benchmark model compression strategies for the joint deployment of multiple CV models in two steps. First, we consider the problem of detecting human faces under unfavorable imaging conditions as a prototypical CV task requiring the concurrent deployment of multiple image restoration and detection models. Second, we evaluate the performance of pruning and quantization techniques for model compression in the context of this prototypical restoration-and-detection multi-model system, and propose Joint Multi-Model Compression (JMMC), an adaptation of Quantization Aware Training (QAT) and pruning in which the multi-model system is fine-tuned as a single unit with an adapted loss function.
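To make the ingredients of the proposed approach concrete, the sketch below illustrates the two compression primitives named above (the quantize-dequantize step used in QAT and a magnitude-pruning mask) together with a joint loss coupling both stages of the pipeline. This is a minimal NumPy illustration, not the paper's implementation: the function names, the uniform affine quantization scheme, and the weighted-sum form of the joint loss are assumptions for exposition.

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    # QAT-style quantize-dequantize: round weights to a uniform
    # affine grid, then map back to floats (assumed 8-bit scheme).
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = qmin - np.round(w.min() / scale)
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

def magnitude_prune_mask(w, sparsity=0.5):
    # Unstructured magnitude pruning: zero out the smallest-magnitude
    # fraction `sparsity` of the weights via a binary mask.
    k = int(sparsity * w.size)
    thresh = np.sort(np.abs(w).ravel())[k]
    return (np.abs(w) >= thresh).astype(w.dtype)

def joint_loss(restoration_loss, detection_loss, alpha=0.5):
    # Hypothetical adapted loss for fine-tuning the multi-model
    # system as a single unit: a weighted sum over both stages.
    return alpha * restoration_loss + (1.0 - alpha) * detection_loss
```

In a full JMMC-style training loop, the fake-quantization and masking would be applied inside the forward pass of both models, and gradients of the joint loss would flow end to end through the restoration and detection stages.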