Open
Description
The solutions to problems 3, 5, 6 and 7 in Exercise-4 appear to omit data points from their plots for the years that have no actress roles (viz. 1900, 1905, 1907, 1909).
This can be verified by plotting a subset of the data (using head() with plot()).
It can be fixed (see below) by calling fillna(0) when unstacking the series into a DataFrame.
Surprisingly, the area plot (kind='area') used in problem 4 is not affected by the NaNs.
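To make the cause concrete, here is a minimal sketch using a tiny made-up `cast` frame (the rows are illustrative assumptions, not the real exercise data). It shows that unstacking leaves NaN wherever a year/type combination has no rows, and that fillna(0) turns those gaps into explicit zeros. The names `counts`, `wide` and `filled` are hypothetical helpers introduced only for this example.

```python
import pandas as pd

# Toy stand-in for the exercise's `cast` DataFrame (assumed data):
# 1900 and 1902 have actor roles only, 1901 has both types.
cast = pd.DataFrame({
    "year": [1900, 1900, 1901, 1901, 1902],
    "type": ["actor", "actor", "actor", "actress", "actor"],
})

# Count roles per (year, type), as in the exercise solutions.
counts = cast.groupby(["year", "type"]).size()

# Unstacking produces NaN for combinations that never occurred.
wide = counts.unstack("type")

# Replacing NaN with 0 gives every year a value in both columns.
filled = wide.fillna(0)

print(wide)
print(filled)
```

In `wide`, the actress column is NaN for 1900 and 1902, so a line plot has nothing to draw at those years; in `filled`, the same cells hold 0.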
# Plot the number of actor roles each year
# and the number of actress roles each year
# over the history of film.
c = cast
c = c.groupby(['year', 'type']).size()
#c = c.unstack('type') # Causing missing data in plot for NaNs
c = c.unstack('type').fillna(0) # No missing data
c.plot() # Verify by c.head(10).plot()
# Plot the difference between the number of actor roles each year
# and the number of actress roles each year over the history of film.
c = cast
c = c.groupby(['year', 'type']).size()
#c = c.unstack('type') # Missing data
c = c.unstack('type').fillna(0) # No missing data
(c.actor - c.actress).plot()
# Plot the fraction of roles that have been 'actor' roles
# each year in the history of film.
c = cast
c = c.groupby(['year', 'type']).size()
#c = c.unstack('type') # Missing data
c = c.unstack('type').fillna(0) # No missing data
(c.actor / (c.actor + c.actress)).plot(ylim=[0,1])
# Plot the fraction of supporting (n=2) roles
# that have been 'actor' roles
# each year in the history of film.
c = cast
c = c[c.n == 2]
c = c.groupby(['year', 'type']).size()
#c = c.unstack('type') # Missing data
c = c.unstack('type').fillna(0) # No missing data
(c.actor / (c.actor + c.actress)).plot(ylim=[0,1])
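As a possible alternative to chaining fillna(0), `Series.unstack` also accepts a `fill_value` argument that fills the missing combinations during the unstack itself. The sketch below again uses a small made-up `cast` frame (assumed data, not the exercise dataset), and `counts`, `wide` and `fraction` are hypothetical names. One difference worth checking in your own pandas version: this route should keep the counts as integers, whereas fillna(0) on a column containing NaN leaves them as floats.

```python
import pandas as pd

# Toy stand-in for the exercise's `cast` DataFrame (assumed data).
cast = pd.DataFrame({
    "year": [1900, 1900, 1901, 1901, 1902],
    "type": ["actor", "actor", "actor", "actress", "actor"],
})

counts = cast.groupby(["year", "type"]).size()

# Fill missing (year, type) combinations with 0 while unstacking.
wide = counts.unstack("type", fill_value=0)

# Fraction of roles that are actor roles; years with no actress
# roles now evaluate to 1.0 instead of NaN.
fraction = wide["actor"] / (wide["actor"] + wide["actress"])

print(wide)
print(fraction)
```

Either approach gives the plot a value for every year; which one reads better in the solutions is a matter of taste.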