The Internet Movie Database, http://imdb.com/, is a website devoted to collecting movie data supplied by studios and fans. It claims to be the biggest movie database on the web and is run by Amazon. More information about imdb.com can be found online, http://imdb.com/help/show_leaf?about, including information about the data collection process, http://imdb.com/help/show_leaf?infosource.

movies

Format

A data frame with 58788 rows and 24 variables

  • title. Title of the movie.

  • year. Year of release.

  • budget. Total budget (if known) in US dollars.

  • length. Length in minutes.

  • rating. Average IMDB user rating.

  • votes. Number of IMDB users who rated this movie.

  • r1-10. Distribution of user ratings. r1 is the approximate percentage of users (binned to the nearest 10%) who rated this movie a 1, r2 the percentage who rated it a 2, and so on up to r10.

  • mpaa. MPAA rating.

  • Action, Animation, Comedy, Drama, Documentary, Romance, Short. Binary variables indicating whether the movie was classified as belonging to that genre.
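Because the genre columns are 0/1 indicators and budget is missing when unknown, simple sums and means summarise them. The sketch below uses a small made-up data frame (toy, with only three of the genre columns) as a stand-in for movies, so none of the values are real IMDB data; the same calls should work on the full data set, assuming the column names shown in the example output.

```r
# Toy stand-in for `movies`; every value here is made up for illustration.
toy <- data.frame(
  title  = c("A", "B", "C", "D"),
  budget = c(NA, 1000000L, NA, 250000L),
  Action = c(1L, 0L, 0L, 1L),
  Comedy = c(0L, 1L, 1L, 1L),
  Drama  = c(0L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

genres <- c("Action", "Comedy", "Drama")

# The indicators are 0/1, so column sums count the movies in each genre.
genre_counts <- colSums(toy[genres])

# A movie can carry several genre flags; row sums pick those out.
multi_genre <- toy$title[rowSums(toy[genres]) > 1]

# Budget is NA when unknown, so this is the share of movies with a known budget.
known_budget <- mean(!is.na(toy$budget))
```

With the real data, genres would list all seven genre columns.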

Details

Movies were selected for inclusion if they had a known length and had been rated by at least one IMDB user.
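The inclusion rule can be expressed as a row filter. This is only a sketch of the stated criteria applied to a hypothetical unfiltered table (raw, with made-up values), not the code that was actually used to build the data set.

```r
# Hypothetical unfiltered input; titles and values are made up.
raw <- data.frame(
  title  = c("W", "X", "Y", "Z"),
  length = c(90L, NA, 120L, 45L),
  votes  = c(10L, 5L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Keep rows with a known length and at least one user rating.
keep <- !is.na(raw$length) & !is.na(raw$votes) & raw$votes >= 1
selected <- raw[keep, ]
```

Here X is dropped for its missing length and Y for having no votes.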

Examples

dim(movies)
#> [1] 58788    24
head(movies)
#> # A tibble: 6 × 24
#>   title      year length budget rating votes    r1    r2    r3    r4    r5    r6
#>   <chr>     <int>  <int>  <int>  <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 $          1971    121     NA    6.4   348   4.5   4.5   4.5   4.5  14.5  24.5
#> 2 $1000 a …  1939     71     NA    6      20   0    14.5   4.5  24.5  14.5  14.5
#> 3 $21 a Da…  1941      7     NA    8.2     5   0     0     0     0     0    24.5
#> 4 $40,000    1996     70     NA    8.2     6  14.5   0     0     0     0     0  
#> 5 $50,000 …  1975     71     NA    3.4    17  24.5   4.5   0    14.5  14.5   4.5
#> 6 $pent      2000     91     NA    4.3    45   4.5   4.5   4.5  14.5  14.5  14.5
#> # ℹ 12 more variables: r7 <dbl>, r8 <dbl>, r9 <dbl>, r10 <dbl>, mpaa <chr>,
#> #   Action <int>, Animation <int>, Comedy <int>, Drama <int>,
#> #   Documentary <int>, Romance <int>, Short <int>