.. _olivetti_faces_dataset:

The Olivetti faces dataset
--------------------------

`This dataset contains a set of face images`_ taken between April 1992 and
April 1994 at AT&T Laboratories Cambridge. The
:func:`sklearn.datasets.fetch_olivetti_faces` function downloads the data
archive from AT&T and caches it locally.

.. _This dataset contains a set of face images: https://cam-orl.co.uk/facedatabase.html

As described on the original website:

    There are ten different images of each of 40 distinct subjects. For some
    subjects, the images were taken at different times, varying the lighting,
    facial expressions (open / closed eyes, smiling / not smiling) and facial
    details (glasses / no glasses). All the images were taken against a dark
    homogeneous background with the subjects in an upright, frontal position
    (with tolerance for some side movement).

**Data Set Characteristics:**

    =================   =====================
    Classes                                40
    Samples total                         400
    Dimensionality                       4096
    Features            real, between 0 and 1
    =================   =====================

Each image is quantized to 256 grey levels and stored as unsigned 8-bit
integers; the loader converts these to floating point values on the
interval [0, 1], which are easier to work with for many algorithms.

The "target" for this database is an integer from 0 to 39 indicating the
identity of the person pictured; however, with only 10 examples per class, this
relatively small dataset is more interesting from an unsupervised or
semi-supervised perspective.

The original dataset consisted of 92 x 112 pixel images, while the version
available here consists of 64 x 64 pixel images, which is where the
dimensionality of 4096 comes from.

When using these images, please give credit to AT&T Laboratories Cambridge.
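The figures quoted above (256 grey levels mapped to [0, 1], 64 x 64 pixels
giving 4096 features, 40 subjects with 10 images each) can be checked with a
short sketch. It is only an illustration under stated assumptions: the helper
names ``scale_grey_levels``, ``flatten`` and ``unflatten`` are hypothetical, the
image is a synthetic gradient rather than a real face, the conversion is assumed
to be a plain division by 255 (the loader may rescale differently), and the
samples are assumed to be ordered by subject.

```python
SIDE = 64  # image width and height in the version described here


def scale_grey_levels(image):
    """Convert 8-bit grey levels (0..255) to floats in [0, 1].

    Assumption: a plain division by 255; the real loader may rescale differently.
    """
    return [[pixel / 255 for pixel in row] for row in image]


def flatten(image):
    """Concatenate the rows of a 2-D image into one flat feature vector."""
    return [pixel for row in image for pixel in row]


def unflatten(features, side=SIDE):
    """Rebuild a side x side image from a flat feature vector."""
    return [features[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical raw image: a deterministic gradient, not a real face.
raw_image = [[(4 * row + col) % 256 for col in range(SIDE)] for row in range(SIDE)]

# One sample as an estimator would see it: a flat vector of scaled pixels.
features = flatten(scale_grey_levels(raw_image))
print(len(features))  # 4096, i.e. 64 * 64

# Assumed target layout: 40 subjects, 10 consecutive samples per subject.
targets = [index // 10 for index in range(40 * 10)]
print(len(targets), min(targets), max(targets))  # 400 0 39
```

With scikit-learn installed, the object returned by
:func:`sklearn.datasets.fetch_olivetti_faces` exposes both forms directly: its
``data`` attribute holds the flat 4096-feature rows and its ``images`` attribute
the corresponding 64 x 64 arrays.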