This project will produce a unified method for the detection, anonymization, and review of Protected Health Information (PHI) in imaging data files, especially whole slide images. To share imaging data for cancer research, data files must be anonymized by removing all PHI without removing or degrading the data more than necessary. An expandable set of file format handlers will be used to read a wide variety of image data along with associated textual information. A set of algorithms will be developed to detect PHI in text and images. Images and text will be de-identified, modifying the original files while tracking provenance. A user interface will be developed to facilitate review of the de-identified results, enabling a researcher to confirm anonymization and to further process data files that were not fully de-identified. In developing the detection, modification, and review process, the quality of the results will be measured, allowing each step to be improved over time. This will reduce the burden of sharing imaging data between researchers, allowing a broader range of information to be used in research and clinical work.
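
As a rough illustration of how an expandable set of format handlers, text-based PHI detection, and provenance tracking might fit together, the following minimal Python sketch may help. It is not the project's implementation. Every name in it (`register_handler`, `read_svs_metadata`, `deidentify`, `process_file`, `PHI_PATTERNS`) is hypothetical, the `.svs` handler returns placeholder metadata instead of parsing a file, and the two regular expressions are stand-ins for what would need to be a far more thorough detector. Image-based PHI detection (for example, label images) is not covered.

```python
import os
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical, deliberately simplistic PHI patterns (illustration only).
PHI_PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,}\b", re.IGNORECASE),
}

# Registry mapping a file extension to a function that reads text metadata.
HANDLERS: Dict[str, Callable[[str], Dict[str, str]]] = {}


def register_handler(ext: str):
    """Decorator that adds a metadata reader for one file extension."""
    def decorator(func):
        HANDLERS[ext.lower()] = func
        return func
    return decorator


@register_handler(".svs")
def read_svs_metadata(path: str) -> Dict[str, str]:
    # Placeholder: a real handler would parse the file at `path`.
    return {
        "ImageDescription": "Scanned 2021-03-04 for MRN: 1234567",
        "Magnification": "40",
    }


@dataclass
class Result:
    metadata: Dict[str, str]
    provenance: List[dict] = field(default_factory=list)


def deidentify(metadata: Dict[str, str]) -> Result:
    """Redact pattern matches and record which rule changed which field."""
    result = Result(metadata=dict(metadata))
    for key, value in metadata.items():
        cleaned = value
        for label, pattern in PHI_PATTERNS.items():
            cleaned, count = pattern.subn("[REDACTED]", cleaned)
            if count:
                result.provenance.append(
                    {"field": key, "rule": label, "matches": count}
                )
        result.metadata[key] = cleaned
    return result


def process_file(path: str) -> Result:
    """Pick a handler by extension, read metadata, and de-identify it."""
    ext = os.path.splitext(path)[1].lower()
    handler = HANDLERS.get(ext)
    if handler is None:
        raise ValueError(f"No handler registered for {ext!r}")
    return deidentify(handler(path))
```

In this sketch, supporting a new format means registering one more function, and the provenance list gives a reviewer a record of which rule changed which field. This is one plausible way to support the review step described above.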