Finding semantically similar images relies on image annotations assigned manually by amateurs or professionals, or computed automatically by algorithms using low-level image features. These annotations define a keyword space in which a dissimilarity function quantifies the semantic relationship among images. In this setting, the objective of this paper is twofold. First, we compare amateur and professional annotations and propose a model of manual annotation errors, specifically an asymmetric binary model. Second, we examine several aspects of search by semantic similarity: the accuracy of manual versus automatic annotations, the influence of annotation errors on manual annotations of varying accuracy, and the influence of the keyword-space dimensionality, which we revisit. To assess these aspects we conducted experiments on a professional image dataset (Corel) and two amateur image datasets (one with 25,000 Flickr images and a second with 269,648 Flickr images), using a large number of keywords, different similarity functions, and both manual and automatic annotation methods. We find that amateur-level manual annotations offer better performance for top-ranked results in all datasets (MP@20). However, for full-rank measures (MAP) on the real-world (Flickr) datasets, retrieval by semantic similarity with automatic annotations performs similarly to or better than with amateur-level manual annotations.
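The two ingredients mentioned above, an asymmetric binary model of annotation errors and a dissimilarity function over the keyword space, can be sketched concretely. The following is a minimal illustration assuming images are represented as binary keyword vectors; the function names and the noise rates `p_miss` and `p_spurious` are illustrative assumptions, not values taken from the paper, and Jaccard dissimilarity is just one possible choice of similarity function.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_annotations(y, p_miss=0.2, p_spurious=0.01, rng=rng):
    """Asymmetric binary noise over a 0/1 keyword vector: a present keyword
    is dropped with probability p_miss (1 -> 0), and an absent keyword is
    spuriously added with probability p_spurious (0 -> 1). The two flip
    rates differ, hence 'asymmetric'. Rates here are illustrative only."""
    flips_1to0 = (y == 1) & (rng.random(y.shape) < p_miss)
    flips_0to1 = (y == 0) & (rng.random(y.shape) < p_spurious)
    return np.where(flips_1to0, 0, np.where(flips_0to1, 1, y))

def jaccard_dissimilarity(a, b):
    """One candidate dissimilarity in the keyword space: 1 minus the
    Jaccard overlap of the two binary keyword sets."""
    inter = np.sum((a == 1) & (b == 1))
    union = np.sum((a == 1) | (b == 1))
    return 1.0 - inter / union if union else 0.0
```

Under such a model, missing keywords (annotator omissions) are typically far more frequent than spurious ones, which is what makes the binary error model asymmetric rather than a single symmetric flip probability.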