The data set was created to answer questions about movies, such as what can be said about the success of a movie, whether certain production companies have found a formula for success, and why a film with a very high budget fails at the box office. It contains 20 variables and 4803 observations across 88 different countries, 20 genres, and 86 spoken languages.

Data import code: `movies <- read.csv("tmdb_5000_movies.csv", header = TRUE, stringsAsFactors = FALSE)`. After the import, `class(movies)` returns `"data.frame"`, `dim(movies)` returns `4803 20`, and `colnames(movies)` lists the column names, including `"budget"`, `"genres"`, `"homepage"`, `"original_title"`, `"overview"`, and `"popularity"`.
How is it useful to consumers: The primary objective of producing movies is to make a profit. Some movies generate very high revenue while others make a loss. An analysis of movies helps explain how factors such as runtime, language, and genre influence revenue. The source of the data is Kaggle: the TMDB 5000 Movie data set.

- Data Manipulation - the data set has a few columns in JSON format, from which data was extracted. The resulting data frames were joined to the existing data set. New columns were created to store release year, gross (revenue - budget), and gross_flag (profit or loss).
- Data Visualization - ggplot charts and word clouds are used to visualize the data.
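The manipulation step can be sketched in base R as follows. This is a minimal illustration, not the project's actual code: the sample rows are made up, the column names `release_date` and `revenue` and the new column name `release_year` are assumptions, and the regular expression is a shortcut that assumes names contain no escaped quotes. A real JSON parser such as `jsonlite::fromJSON` would be more robust.

```r
# Illustrative rows only; the values are made up, not taken from the data set.
movies <- data.frame(
  title        = c("Movie A", "Movie B", "Movie C"),
  release_date = c("2009-12-10", "2012-07-16", "2015-03-01"),  # assumed column name
  budget       = c(100, 250, 40),
  revenue      = c(300, 200, 40),                              # assumed column name
  genres       = c(
    '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
    '[{"id": 18, "name": "Drama"}]',
    '[]'
  ),
  stringsAsFactors = FALSE
)

# Pull every "name" value out of one JSON-formatted string.
# Regex shortcut (assumption): names contain no escaped double quotes.
extract_names <- function(json_text) {
  hits <- regmatches(
    json_text,
    gregexpr('"name":[[:space:]]*"[^"]*"', json_text)
  )[[1]]
  hits <- sub('"$', "", hits)                   # drop the closing quote
  sub('^"name":[[:space:]]*"', "", hits)        # drop the key and opening quote
}

# One character vector of genre names per movie.
genre_list <- lapply(movies$genres, extract_names)

# Release year: parse the date string and keep the four-digit year.
movies$release_year <- as.integer(format(as.Date(movies$release_date), "%Y"))

# Gross as defined in the text: revenue minus budget.
movies$gross <- movies$revenue - movies$budget

# Flag each movie as profit or loss. Treating break-even (gross == 0)
# as a loss is an assumption of this sketch.
movies$gross_flag <- ifelse(movies$gross > 0, "profit", "loss")

print(movies[, c("title", "release_year", "gross", "gross_flag")])
```

Keeping the extracted names in a list, one element per movie, leaves the choice of how to join them back to the main data frame open, for example as one row per movie and genre pair.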
Problem statement: To analyze the TMDB 5000 Movie data set obtained from Kaggle in order to provide insights such as:

- Analyze trends over the years in terms of number of movies produced, genres, and runtime.
- Top 10 movie-producing countries and companies, top 10 highest-grossing movies of all time, and top 10 highest-rated movies of all time.
- How runtime varies across genres, original languages, and countries, and with gross and average vote; categorization of movies based on runtime.
- How average vote has varied across years, original languages, and different categories of movies.
- Differences between movies produced by different countries and production companies in terms of number produced, duration, gross revenue, etc.

Preparation steps:

- Preliminary analysis of data - to see the dimensions and structure of the data and the number of missing and duplicate values.
- Cleaning of data - removal of duplicate observations and spurious characters in titles.
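The preliminary analysis and cleaning steps might look like the following base R sketch. The sample rows are made up, and treating "spurious characters" as anything outside printable ASCII is an assumption; that rule would also strip legitimate accented letters, so the pattern should be adapted to the characters actually found in the titles.

```r
# Illustrative rows only; the values are made up, not taken from the data set.
movies <- data.frame(
  title   = c("Movie A", "Movie B\u00c2", "Movie A", "Movie C"),
  runtime = c(120, NA, 120, 95),
  stringsAsFactors = FALSE
)

# Preliminary analysis: dimensions and structure of the data.
print(dim(movies))
str(movies)

# Number of missing values per column and number of fully duplicated rows.
n_missing    <- colSums(is.na(movies))
n_duplicates <- sum(duplicated(movies))
print(n_missing)
print(n_duplicates)

# Cleaning: drop duplicate observations, keeping the first occurrence.
movies_clean <- movies[!duplicated(movies), ]

# Cleaning: remove characters outside printable ASCII from titles
# (assumed definition of "spurious"), then trim surrounding whitespace.
movies_clean$title <- trimws(gsub("[^ -~]", "", movies_clean$title))

print(movies_clean)
```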