The files in this directory provide counts of how often an image,
video, or audio file from upload.wikimedia.org has been transferred to
users.

  1. On-wiki documentation
  2. Contained data
  2.1. Selected requests
  2.1.1. Corner cases
  2.2. Fields in the TSVs
  3. Contact / Bugs


1. On-wiki documentation
==========================

While this README.txt is currently (2015-02-27) up-to-date, this file
cannot easily be updated by the community. Hence, we consider the
on-wiki documentation at

  https://wikitech.wikimedia.org/wiki/Analytics/Data/Mediacounts

the authoritative documentation. This README.txt is just a convenience
for people who have trouble accessing the on-wiki documentation.


2. Contained data
===================

2.1. Selected requests
-------------------------

The stream contains all requests from the upload cache group that have

  * HTTP status code 200 (OK), or
  * HTTP status code 206 (Partial Content) and a Range header that
    starts with "bytes=0-", but is not "bytes=0-0".

The first condition matches plain fetches of image, movie, and audio
files. The second condition matches the beginnings of streamed media.


2.1.1. Corner cases
.....................

  * After some discussion with stakeholders (partly on-wiki, mostly
    via email), requests with HTTP status code 304 (Not Modified) are
    not counted at this point, as there seems to be more interest in
    media transfers than in media requests. Ideally, the counts would
    reflect media consumption or media views, but there is currently
    no easy way to detect that from the logs.

  * Jumping back to the beginning of a streamed file after having
    watched part of it counts as a new transfer.

  * When Media viewer is used to view images, some images are
    prefetched for a better user experience, but may not yet have been
    shown to the user. Currently, those prefetched images are counted,
    as there is as of now no way to detect whether an image was
    actually shown to the user.


2.2. Fields in the TSVs
-------------------------

The TSVs hold the following fields:

+---------+--------------------------------------------------------------------+
| Field # | Description                                                        |
+---------+--------------------------------------------------------------------+
|       1 | The name of the raw, original file without the leading             |
|         | https?://upload.wikimedia.org                                      |
|         |                                                                    |
|         | So, for example, for each of                                       |
|         |                                                                    |
|         |   * https://upload.wikimedia.org/wikipedia/commons/e/ec/Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |   * https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg/161px-Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |   * http://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg/402px-Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |   * http://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg/687px-Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |                                                                    |
|         | the base_name is                                                   |
|         |                                                                    |
|         |   /wikipedia/commons/e/ec/Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |                                                                    |
|         | For images from Commons, you can get the file's page by replacing  |
|         | the first four path segments of the base_name with                 |
|         |   https://commons.wikimedia.org/wiki/File:                         |
|         | So for the above base_name, the file's page on Commons is          |
|         |   https://commons.wikimedia.org/wiki/File:Mona_Lisa%2C_by_Leonardo_da_Vinci%2C_from_C2RMF_retouched.jpg |
|         |                                                                    |
+---------+--------------------------------------------------------------------+
|       2 | Total number of response bytes sent to the users for that file     |
|         | (and its transcodings).                                            |
+---------+--------------------------------------------------------------------+
|       3 | Total number of transfers (counting both transfers of the raw,     |
|         | original and tiny thumbs as 1).                                    |
+---------+--------------------------------------------------------------------+
|       4 | Total number of transfers of the raw, original file (transcodings, |
|         | thumbs and the like are not counted here).                         |
+---------+--------------------------------------------------------------------+
|       5 | Total number of transfers of a file that got transcoded to an      |
|         | audio file. So for example when a FLAC file is requested as OGG    |
|         | file, the request is counted in this column. (Transfers for the    |
|         | raw, original FLAC file would get counted in column #4.)           |
+---------+--------------------------------------------------------------------+
|       6 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|       7 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|       8 | Total number of transfers of a file that got transcoded to an      |
|         | image file. So for example when a WebM file, or a GIF file is      |
|         | requested as JPG file, the request is counted in this column.      |
|         | (Transfers for the raw, original WebM, or the raw, original GIF    |
|         | file would get counted in column #4.)                              |
+---------+--------------------------------------------------------------------+
|       9 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 0 <= width <= 199. (This is a drill-down of      |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      10 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 200 <= width <= 399. (This is a drill-down of    |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      11 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 400 <= width <= 599. (This is a drill-down of    |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      12 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 600 <= width <= 799. (This is a drill-down of    |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      13 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 800 <= width <= 999. (This is a drill-down of    |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      14 | Total number of transfers of a file that got transcoded to an      |
|         | image file, where 1000 <= width. (This is a drill-down of          |
|         | column #8.)                                                        |
+---------+--------------------------------------------------------------------+
|      15 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|      16 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|      17 | Total number of transfers of a file that got transcoded to a movie |
|         | file. So for example when a WebM file is requested as OGV file,    |
|         | the request is counted in this column. (Transfers for the raw,     |
|         | original WebM file would get counted in column #4.)                |
+---------+--------------------------------------------------------------------+
|      18 | Total number of transfers of a file that got transcoded to a movie |
|         | file, where 0 <= height <= 239. (This is a drill-down of           |
|         | column #17.)                                                       |
+---------+--------------------------------------------------------------------+
|      19 | Total number of transfers of a file that got transcoded to a movie |
|         | file, where 240 <= height <= 479. (This is a drill-down of         |
|         | column #17.)                                                       |
+---------+--------------------------------------------------------------------+
|      20 | Total number of transfers of a file that got transcoded to a movie |
|         | file, where 480 <= height. (This is a drill-down of                |
|         | column #17.)                                                       |
+---------+--------------------------------------------------------------------+
|      21 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|      22 | Reserved for future use.                                           |
+---------+--------------------------------------------------------------------+
|      23 | Total number of transfers with a Referer from a WMF domain.        |
+---------+--------------------------------------------------------------------+
|      24 | Total number of transfers with a Referer from a non-WMF domain.    |
+---------+--------------------------------------------------------------------+
|      25 | Total number of transfers with an empty or invalid Referer.        |
+---------+--------------------------------------------------------------------+


3. Contact / Bugs
===================

You can reach the analytics team via email at
analytics@lists.wikimedia.org or via IRC on freenode in
#wikimedia-analytics.
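
As a rough illustration of the request selection rule (2.1) and the
field layout (2.2), the following Python sketch restates the rule,
splits one TSV row into a few of its 25 fields, and derives the Commons
file page from a base_name. The function names and dictionary keys are
made up for this example and are not part of the dataset. The sketch
also assumes that the counted fields are plain decimal integers, and it
leaves the reserved fields untouched.

```python
COMMONS_FILE_PREFIX = "https://commons.wikimedia.org/wiki/File:"

# Hypothetical labels for some of the numbered fields (1-based field
# numbers from the table). Reserved fields are deliberately not listed.
FIELD_LABELS = {
    2: "response_bytes",
    3: "transfers_total",
    4: "transfers_original",
    5: "transfers_audio",
    8: "transfers_image",
    17: "transfers_movie",
    23: "referer_wmf",
    24: "referer_non_wmf",
    25: "referer_empty_or_invalid",
}


def is_counted(status, range_header=None):
    """Restate the selection rule: 200, or 206 starting at byte 0."""
    if status == 200:
        return True
    if status == 206 and range_header is not None:
        return (range_header.startswith("bytes=0-")
                and range_header != "bytes=0-0")
    return False


def commons_file_page(base_name):
    """Replace the first four path segments of a Commons base_name."""
    # segments[0] is "" because base_name starts with "/".
    segments = base_name.split("/")
    if len(segments) < 6 or segments[1:3] != ["wikipedia", "commons"]:
        raise ValueError("not a Commons base_name: " + base_name)
    return COMMONS_FILE_PREFIX + "/".join(segments[5:])


def parse_row(line):
    """Split one TSV row and convert the labelled count fields to int."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 25:
        raise ValueError("expected 25 fields, got %d" % len(fields))
    row = {"base_name": fields[0]}
    for number, label in FIELD_LABELS.items():
        # Assumption: counted fields hold decimal integers.
        row[label] = int(fields[number - 1])
    return row
```

With these definitions, is_counted(206, "bytes=0-") is true, while
is_counted(206, "bytes=0-0") and is_counted(304) are false, matching
the conditions and the 304 corner case described in section 2.1.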