This paper proposes a method for recognizing scene categories using bags of visual words obtained by hierarchically partitioning the input images into subregions. Specifically, for each subregion, the distribution of Textons and the extent of the subregion are taken into account. The bags of visual words computed on the subregions are weighted and used to represent the whole scene. Scene classification is carried out by a Support Vector Machine. A k-nearest neighbor algorithm and a similarity measure based on the Bhattacharyya coefficient are used to retrieve from the scene database those scenes whose visual content is similar to that of a given query scene. Experimental tests on fifteen different scene categories show that the proposed approach achieves good performance with respect to state-of-the-art methods.
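
The representation and retrieval steps summarized above can be illustrated with a minimal sketch. It assumes that each image has already been converted into a grid of Texton labels (integers in `[0, n_words)`), that level `l` of the hierarchy splits the image into `2^l x 2^l` subregions, and that each subregion histogram is weighted by its area fraction with equal weight per level. The weighting scheme, the function names, and the parameters are illustrative assumptions and are not taken from the paper; the Support Vector Machine classification stage is omitted.

```python
import math


def region_histogram(labels, r0, r1, c0, c1, n_words):
    """Normalized histogram of Texton labels inside rows [r0, r1), cols [c0, c1)."""
    hist = [0.0] * n_words
    for r in range(r0, r1):
        for c in range(c0, c1):
            hist[labels[r][c]] += 1.0
    total = (r1 - r0) * (c1 - c0)
    return [h / total for h in hist] if total else hist


def hierarchical_bag(labels, n_words, n_levels=2):
    """Concatenate weighted subregion histograms over all hierarchy levels.

    Assumption: weight = (subregion area / image area) / n_levels, so the
    whole descriptor sums to 1 and can be treated as a distribution.
    """
    rows, cols = len(labels), len(labels[0])
    descriptor = []
    for level in range(n_levels):
        cells = 2 ** level  # cells per side at this level
        for i in range(cells):
            for j in range(cells):
                r0, r1 = i * rows // cells, (i + 1) * rows // cells
                c0, c1 = j * cols // cells, (j + 1) * cols // cells
                weight = ((r1 - r0) * (c1 - c0)) / (rows * cols) / n_levels
                hist = region_histogram(labels, r0, r1, c0, c1, n_words)
                descriptor.extend(weight * h for h in hist)
    return descriptor


def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient: sum over bins of sqrt(p_i * q_i)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def retrieve(query, database, k=1):
    """Indices of the k database descriptors most similar to the query."""
    ranked = sorted(
        range(len(database)),
        key=lambda idx: bhattacharyya_coefficient(query, database[idx]),
        reverse=True,  # a higher coefficient means more similar
    )
    return ranked[:k]
```

In this sketch, two identical label maps yield a coefficient of 1, and the value decreases as the weighted subregion distributions diverge, which is what allows the coefficient to rank database scenes for k-nearest neighbor retrieval.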