An Investigation of the Frequency of Sculpture Creations by Period

Introduction
In this mini-project, I aimed to visualize the frequency of sculpture creations over time within the Williams College Museum of Art (WCMA) collection. By analyzing the distribution of sculptures created from 1900 onwards, we can identify periods of heightened artistic activity and explore potential historical or cultural factors influencing these trends.
Sources
The dataset utilized for this analysis originates from the WCMA collection, which encompasses approximately 15,600 accessioned artworks. This diverse collection includes pieces across various media, such as ancient artifacts, Indian paintings, African sculptures, photography, American art, and international modern and contemporary works. Notably, WCMA houses the world’s largest repository of works by artists Maurice and Charles Prendergast. The dataset provides essential metadata for each artwork, including accession number, title, creation date, classification, medium, and dimensions. This comprehensive information serves as a valuable resource for research and analysis.
Data cleaning took up most of the project time. The initial dataset presented several challenges, including inconsistent and ambiguous entries such as “no data,” vague timeframes like “15th century,” and uncertain years denoted with question marks. To keep the analysis reliable, I cleaned the data in two steps:
- Exclusion of Ambiguous Data: Entries with non-specific or uncertain creation dates were removed to maintain accuracy.
- Standardization: The remaining entries were brought into a consistent format, keeping only those with a precise creation year.
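The two cleaning steps above can be sketched in base R. This is a minimal illustration, not the exact script used for the project: the sample strings are hypothetical stand-ins for the kinds of entries described, and the rule "keep only values that are exactly a four-digit year" is an assumption about what counts as a precise date.

```r
# Hypothetical raw creation_date strings (not actual WCMA records)
raw_dates <- c("1965", "no data", "15th century", "1920?", "1987", "c. 1950", "2003")

# Assumed rule: a precise date is exactly four digits after trimming whitespace
trimmed <- trimws(raw_dates)
is_precise_year <- grepl("^[0-9]{4}$", trimmed)

# Drop ambiguous entries and convert the rest to integers
years <- as.integer(trimmed[is_precise_year])

# Keep only the period studied (1900 onwards)
years <- years[years >= 1900]

print(years)
```

Entries such as "1920?" and "c. 1950" are dropped here rather than rounded to a year, which matches the choice to exclude uncertain dates instead of guessing at them.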
Process
After the data cleaning, I categorized the “creation_date” variable into 20-year intervals (e.g., 1900–1920, 1920–1940). This grouping was intentional: plotting data by individual years would have resulted in an overly cluttered and extensive x-axis, hindering readability. By aggregating the data into broader intervals, the visualization offers a clearer depiction of trends over time.
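One way to do this grouping in R is with `cut()`. The sketch below uses hypothetical years, and it assumes that each interval includes its lower bound and excludes its upper bound, so that a work dated 1920 falls in 1920-1940 rather than 1900-1920. The text does not say which convention was used, so that boundary choice is an assumption.

```r
# Hypothetical cleaned creation years (not actual WCMA records)
years <- c(1900, 1919, 1920, 1965, 1979, 1980, 1999, 2003)

# Interval edges every 20 years, starting at 1900
breaks <- seq(1900, 2020, by = 20)

# Labels such as "1900-1920", built from consecutive pairs of edges
labels <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# right = FALSE gives bins of the form [1900, 1920), [1920, 1940), ...
period <- cut(years, breaks = breaks, labels = labels, right = FALSE)

# Frequency of sculptures per period (empty periods are kept with a count of 0)
counts <- table(period)
print(counts)
```

Because `period` is a factor with all six levels defined, intervals with no sculptures still appear in the table, which keeps gaps visible when the counts are plotted.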
I then used R's ggplot2 package in RStudio to create a bar chart. The x-axis shows the 20-year creation periods, and the y-axis shows the number of sculptures created in each interval. The chart makes it easy to compare which interval has the most sculptures and which has the fewest, which in turn invites research into the reasons behind those rises and falls.
Presentation
The bar chart at the top of the page shows how sculpture creation dates in the WCMA collection are distributed across time periods. Both 1960–1980 and 1980–2000 contain noticeably more sculptures than the other periods, which might imply a historical movement in the field of art during that time. The following design choices were made to improve clarity and accessibility:
- Angled Labels: The slanted x-axis labels prevent overlap and keep the text legible.
- Clear Interval Labeling: Each bar corresponds to a specific 20-year period, clearly marked on the x-axis.
- Contrasting Colors: Different colors distinguish the periods from one another.
- Centered Title: The centered title tells the viewer what the plot is about at first glance.
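These design choices map onto a few ggplot2 settings. The following is a sketch under stated assumptions rather than the code behind the actual figure: the counts are made-up numbers, and the title, axis labels, 45-degree angle, and `theme_minimal()` are illustrative choices.

```r
library(ggplot2)

# Hypothetical counts per period (illustrative numbers, not the WCMA results)
period_levels <- c("1900-1920", "1920-1940", "1940-1960")
period_counts <- data.frame(
  period = factor(period_levels, levels = period_levels),
  n = c(3, 5, 8)
)

p <- ggplot(period_counts, aes(x = period, y = n, fill = period)) +
  # One bar per period; fill = period gives each bar a contrasting color
  geom_col(show.legend = FALSE) +
  labs(
    title = "Sculptures by Creation Period",  # assumed title
    x = "Creation period",
    y = "Number of sculptures"
  ) +
  theme_minimal() +
  theme(
    # Angled x-axis labels to avoid overlap
    axis.text.x = element_text(angle = 45, hjust = 1),
    # Centered plot title
    plot.title = element_text(hjust = 0.5)
  )

print(p)
```

The legend is hidden because the x-axis labels already name each period, so a color key would only repeat them.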
Significance
By applying a digital approach to the dataset, this project reveals temporal trends in sculpture production, offering insights into the artistic and institutional forces shaping the WCMA collection. By identifying periods of high and low artistic activity, the visualization allows for further exploration of potential historical, economic, or cultural influences that impacted sculpture creation and acquisition.
This method differs from a purely data science approach in that it is not solely focused on statistical modeling or predictive analytics. Unlike traditional data science, which often seeks to optimize algorithms for efficiency or forecast future trends, this analysis provides an interpretive framework for understanding how and why certain periods have more sculptures produced than others.
Additionally, the process of cleaning and structuring the data (removing uncertain dates and grouping years into readable intervals) demonstrates how Digital Humanities work often requires curating and interpreting imperfect datasets rather than assuming they are complete or objective. This aligns with the broader goals of Digital Arts & Humanities, where digital tools are used not just to quantify but to enrich discussions about cultural heritage, artistic evolution, and institutional biases in curating and preserving art.