This week
- I continued my search for licensed sources and found a really good site that provides API/open access to museum metadata.
- After spending two weeks compiling enough sources, I finally started the curation process.
Source 1 - The Web Gallery of Art
The Web Gallery of Art provides access to 60,000 artworks in the form of a dataset. This source has also been used in other art-related datasets like SemArt and ArtDL.
Curation process.
Filtering records.
We had previously discussed moving forward with Renaissance paintings, so I first filtered the dataset to find religious paintings. This left us with approximately 12k records.
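As a rough illustration of this first filter, here is a minimal sketch using only the Python standard library. The column names (`TITLE`, `FORM`, `TYPE`, `TIMEFRAME`) and the sample rows are assumptions made for the example, not a confirmed description of the catalogue file, and `filter_religious_paintings` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample rows; the column names are assumed for illustration.
SAMPLE_CSV = """TITLE,FORM,TYPE,TIMEFRAME
St John the Baptist,painting,religious,1501-1550
Portrait of a Man,painting,portrait,1451-1500
Madonna and Child,sculpture,religious,1451-1500
Adoration of the Magi,painting,religious,1451-1500
"""


def filter_religious_paintings(rows):
    """Keep rows whose FORM is 'painting' and whose TYPE is 'religious'."""
    return [
        row
        for row in rows
        if row["FORM"].strip().lower() == "painting"
        and row["TYPE"].strip().lower() == "religious"
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
religious = filter_religious_paintings(rows)
print([row["TITLE"] for row in religious])
```

With a real catalogue file, the `io.StringIO` wrapper would be replaced by an opened file handle; the filtering logic stays the same.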
The following is an analysis of the timelines associated with these paintings.
We observe that many of the paintings are from the Renaissance era, which is a good sign.
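A timeline analysis of this kind could be sketched roughly as below. The timeframe strings are hypothetical sample values, and the 1400 to 1600 bounds used for "Renaissance" are an assumption for the example, not the definition used in the project.

```python
from collections import Counter

# Hypothetical timeframe labels of the form "START-END" (assumed format).
timeframes = ["1451-1500", "1501-1550", "1451-1500", "1601-1650", "1401-1450"]


def count_by_timeframe(values):
    """Return (timeframe, count) pairs sorted chronologically by label."""
    return sorted(Counter(values).items())


def in_renaissance(timeframe, start=1400, end=1600):
    """True if the whole timeframe lies inside the assumed Renaissance range."""
    first, last = (int(part) for part in timeframe.split("-"))
    return first >= start and last <= end


counts = count_by_timeframe(timeframes)
share = sum(in_renaissance(t) for t in timeframes) / len(timeframes)

for label, n in counts:
    print(label, n)
print(f"share in assumed Renaissance range: {share:.0%}")
```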
Some more filtering..
After filtering the records for religious paintings, the next step was to find paintings that relate to the Christian religion. To accomplish this, I used the Christian icons from the Iconclass dataset. They range from male saints such as Anthony of Padua and John the Baptist to female saints such as Mary Magdalene. The filter was applied to the descriptions and titles of the paintings: if an Iconclass icon was found there, the painting was taken to contain content related to the Christian religion.
After this filtering step, we were left with about 5.5k paintings.
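The title and description matching described above might look something like the following sketch. The three saint names come from the text; the record fields (`title`, `description`), the sample records, and the helper names are hypothetical, and plain case-insensitive name matching is only one possible way to do this step.

```python
import re

# Saint names taken from the text above; a real list would be much longer.
SAINT_NAMES = ["Anthony of Padua", "John the Baptist", "Mary Magdalene"]


def build_pattern(names):
    """Compile one case-insensitive pattern that matches any full name."""
    escaped = (re.escape(name) for name in names)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def matched_saints(record, pattern, names=SAINT_NAMES):
    """Return the canonical saint names found in a record's title or description."""
    canonical = {name.lower(): name for name in names}
    text = f"{record.get('title', '')} {record.get('description', '')}"
    return sorted({canonical[m.group(0).lower()] for m in pattern.finditer(text)})


# Hypothetical records for illustration only.
records = [
    {"title": "St John the Baptist in the Wilderness", "description": "The saint points to a lamb."},
    {"title": "Landscape with a Mill", "description": "A river scene."},
    {"title": "Noli me tangere", "description": "Christ appears to Mary Magdalene after the Resurrection."},
]

pattern = build_pattern(SAINT_NAMES)
kept = [record for record in records if matched_saints(record, pattern)]
print([record["title"] for record in kept])
```

Exact name matching will miss spelling variants (for example "Antony" versus "Anthony"), so a real filter would likely need a list of aliases per saint.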
Results.
Source 2: Iconclass AI Test Set
The Iconclass AI Test Set is a dataset containing 87k records of different icons from the Iconclass dataset with corresponding descriptions. Although the descriptions are not well harvested, it is still a good enough source.
Curation process.
Since the dataset explicitly contains Iconclass codes for each record, I just filtered the records corresponding to the Christian icons and added around 12,000 records containing Christian icons.
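A minimal sketch of filtering by code is shown below. It assumes the records are available as a mapping from image name to a list of Iconclass notations, and that saints are notated under the prefixes `11H(` (male) and `11HH(` (female). Both the data layout and the sample notations are assumptions for illustration, and `has_saint_code` is a hypothetical helper.

```python
# Assumed Iconclass prefixes for male and female saints.
SAINT_PREFIXES = ("11H(", "11HH(")


def has_saint_code(codes):
    """True if any notation in the list starts with one of the saint prefixes."""
    return any(code.startswith(SAINT_PREFIXES) for code in codes)


# Hypothetical records: image name -> list of notations (illustrative values only).
records = {
    "img_001.jpg": ["11H(JOHN THE BAPTIST)", "25F23"],
    "img_002.jpg": ["25H213", "31A235"],
    "img_003.jpg": ["11HH(MARY MAGDALENE)"],
}

christian = {
    name: codes for name, codes in records.items() if has_saint_code(codes)
}
print(sorted(christian))
```

Both prefixes are listed separately because `11HH(` does not start with `11H(`; matching on the opening parenthesis avoids accidentally picking up unrelated notations that merely begin with `11H`.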
Results.
Meetings
Meeting 1
The first meeting was held by Mark and included just me. We talked about what needs to be done in this project and how the problems that Parth and I are solving are part of a bigger pipeline.
I also learned about the FrameNet team’s annotation tool, Lutma. While it is a far-fetched goal for the Emile Male pipeline, it is still something that could be possible in the future.
Meeting 2
The second meeting was again held by Mark and included me, Parth, Saby, and other mentors. It was a general discussion about our projects and the problems we might be facing. Since I had started collecting data, the question on my mind was about the scale of data I might have to collect. After discussing this with Mark, I understood that I should start by trying to solve a simple problem.
For example: given an image of the Virgin Mary, can our model learn what she might be doing?
Based on this discussion, my idea as of now is to restrict the data to saints (male and female) and try to answer what a given saint might be doing, who the saint is, or what their possible attributes are.