Artificial Intelligence (AI) and accessible computing have tremendous potential to improve the quality of life for users. Despite significant advances in AI, the needs and preferences of people who have low vision (e.g., older adults and those who are blind or visually impaired) are not adequately considered. For example, a multimodal AI model may misinterpret critical directional information in navigation tasks, creating potential safety hazards. When users attempt to provide detailed feedback, the model often fails to follow the request: it may respond with long, inaccurate descriptions, or it may fail to recognize the needs of the user and focus on irrelevant information. This research highlights a critical gap in AI’s ability to quickly recognize and adapt to the specific interaction needs of users with varying abilities. One outcome of this research will be the first publicly available, large-scale multimodal dataset with labeled preferences from users with low vision. This work aligns the data with the goal of AI that works effectively and safely with all users. The research also emphasizes the importance of scalable multimodal systems for goal-oriented language generation and real-world decision-making tasks. Through pre-training and generalized feedback from users of these multimodal large language models, this work will make these AI-based systems more usable. The methods and results will support those with low vision but can be adapted to address other