This paper proposes a method to recognize scene categories using bags of visual words obtained by hierarchically partitioning into subregion the input images. Specifically, for each subregion the Textons distribution and the extension of the corresponding subregion are taken into account. The bags of visual words computed on the subregions are weighted and used to represent the whole scene. The classification of scenes is carried out by discriminative methods (i.e., SVM, KNN). A similarity measure based on Bhattacharyya coefficient is proposed to establish similarities between images, represented as hierarchy of bags of visual words. Experimental tests, using fifteen different scene categories, show that the proposed approach achieves good performances with respect to the state-of-the-art methods.