Discovering Substructures in the Chemical Toxicity Domain
A researcher's ability to interpret the data and discover interesting patterns in it is of great importance, because such patterns help in obtaining relevant structure-activity relationships (SARs) for the cause of chemical cancers [Srinivasan et al.]. For example, Progol identified a primary amine group as a relevant SAR [Srinivasan et al. 1997]. One method for discovering such patterns is to identify substructures that are common within the data. To be useful, a substructure should compress the data well and should be conceptually interesting, so that it enhances the interpretation of the data.
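The idea that a common substructure "compresses" the data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: the toy molecule, the function names, the restriction of a substructure to a single labeled bond, and the size measure (number of vertices plus number of edges) used in place of a full description-length calculation. Each non-overlapping instance of the substructure is collapsed into one vertex, and the compression value is the size of the original graph divided by the size of the substructure plus the size of the compressed graph.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]        # (vertex u, vertex v, bond label)
Pattern = Tuple[str, str, str]     # (atom label, bond label, atom label)


def graph_size(num_vertices: int, num_edges: int) -> int:
    """Simplified size measure (an assumption): vertices plus edges."""
    return num_vertices + num_edges


def find_instances(labels: Dict[int, str], edges: List[Edge],
                   pattern: Pattern) -> List[Tuple[int, int]]:
    """Greedily collect non-overlapping edges that match a one-edge pattern."""
    a, bond, b = pattern
    used = set()
    instances = []
    for u, v, e in edges:
        if e != bond or u in used or v in used:
            continue
        if (labels[u], labels[v]) in ((a, b), (b, a)):
            instances.append((u, v))
            used.update((u, v))
    return instances


def compression_value(labels: Dict[int, str], edges: List[Edge],
                      pattern: Pattern) -> float:
    """size(G) / (size(S) + size(G compressed by S)); above 1 means S helps."""
    n = len(find_instances(labels, edges, pattern))
    original = graph_size(len(labels), len(edges))
    # Collapsing one instance merges 2 vertices into 1 and drops its inner edge.
    compressed = graph_size(len(labels) - n, len(edges) - n)
    pattern_size = graph_size(2, 1)
    return original / (pattern_size + compressed)


# Hypothetical toy graph: two carbons, each bonded to an NH2-like group.
labels = {0: "C", 1: "N", 2: "H", 3: "H", 4: "C", 5: "N", 6: "H", 7: "H"}
edges = [(0, 1, "single"), (1, 2, "single"), (1, 3, "single"),
         (0, 4, "single"), (4, 5, "single"), (5, 6, "single"),
         (5, 7, "single")]

n_h = compression_value(labels, edges, ("N", "single", "H"))
o_h = compression_value(labels, edges, ("O", "single", "H"))
print(f"N-H: {n_h:.3f}  O-H: {o_h:.3f}")
```

In this toy case the N-H pattern occurs twice and yields a value above 1, whereas the O-H pattern never occurs and yields a value below 1, because storing an unused substructure only adds cost. A realistic system would search over larger substructures and would need subgraph matching, which this sketch deliberately omits.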