Building demand flexibility is essential for balancing the grid as renewable energy generation increases. However, the limited adoption of flexibility-enabling technologies underscores the need for scalable methods to evaluate flexibility. Although building data and scalable data-driven methods are increasingly available, existing data-driven tools often fail to capture critical what-if scenarios for building operations and market conditions, as well as potential impacts on occupant comfort. To address these challenges, we propose a data-driven framework to assess HVAC-related demand flexibility in commercial buildings. Using only smart metering data, our approach integrates physics-based insights with data-driven techniques to deliver credible what-if scenarios and occupant-centric operation. The framework has three main features: (1) it generates representative load-profile scenarios based on weather conditions and inferred occupancy levels; (2) it captures general indoor temperature behavior to guide comfort-related decisions without precise measurements; and (3) it introduces an occupant-centric flexibility envelope that explicitly quantifies the trade-offs among power level, duration, and occupant satisfaction in flexible operation. We showcase the implementation of this framework on a real-world office building dataset, demonstrating both its accuracy and its adaptability for demand flexibility evaluations. This work is a step toward scalable, efficient, and credible data-driven flexibility assessments. It provides a versatile tool for building operators and market designers, delivering tailored solutions for individual buildings while scaling across large portfolios.