Photographs of phonograph records: novel techniques to create digital audio archives

Sergio Canazza

All you sound guys know that sometimes we like to show how brave we are. So, in these days when the whole community is going crazy for the sounds of Tron: Legacy, we are publishing an article about the folk songs of Italian immigrants in the U.S. a century ago, the historical importance of shellac discs and some experimental techniques for creating digital audio documents with a digital camera.

Let’s get straight to the point: a group of Italian researchers has developed Photos of GHOSTS (Photos of Grooves and Holes, Supporting Tracks Separation), a piece of software that can reconstruct the sound of phonograph records from photographs of the disc surface. This new technique could be a great solution to some important problems, such as automating the digitization of audio documents, music restoration, audio extraction from damaged media and so on.

We had a long and visionary interview with Sergio Canazza, team leader of the project and assistant research professor at the Sound and Music Computing Group of the Department of Information Engineering (University of Padova, Italy), who told us the amazing story behind this project and described the current trends of the research in the field of audio restoration.

Hi Sergio, can you tell us how the idea of the research project about the digital archive of shellac discs was born?

In 2005 the European Community funded the research project (of which I was the proponent and coordinator) Preservation and On-line Fruition of the Audio Documents from the European Archives of Ethnic Music, within the EU program Culture 2000.

This program aims to contribute to the development of a cultural area common to European peoples, characterized by a shared heritage and by cultural and artistic diversity.

This project, whose partners included some of the largest sound archives of European ethnic music, was aimed, as the title says, at the conservation and use of sound recordings that are at high risk of being lost.

Since the systems used to record ethnic music are often non-professional, the audio media, most of them obsolete, are at high risk of being lost through degradation. The first objective of the project was therefore the active conservation of the sound recordings (that is, digitizing all the information: audio signals, labels, damage to the medium, etc.) while defining rules for quality control of the analog-to-digital conversion.

In this context I realized that, paradoxically, shellac discs were the most important media from a musicological point of view (even more than tape or wax and celluloid cylinders): many valuable recordings exist only in this format. At the same time I experienced first-hand how complex their active preservation is, a process further complicated by the fact that some owners were unwilling to allow their precious discs to be copied (playing a disc with a stylus is always invasive). In other cases the discs were broken into several fragments.

Then I found a laser player for reading the discs; unfortunately this was a failure, too. The device was unable to play discs that were badly damaged (scratched, heavily warped or even broken).

I knew that automatic text scanning and optical character recognition systems are widely used in libraries, so I thought it might be interesting to apply tools developed in the field of image processing to extract audio data (contextual information and the audio signal) from the photographic document.

This approach makes it possible to: a) keep all the information about the medium (by building a virtual model, in 2D or, as is essential for vertically cut discs, in 3D); b) perform a completely noninvasive scan; c) actively preserve media whose damage makes traditional playback impossible (a broken record, for example); d) start an automatic, large-scale conservation process for record archives that cannot afford traditional digitization systems (which are more expensive and more complex to use than simply taking a photograph).

I saw that some solutions of this kind were already present in the literature (Vitaliy Fadeyev and Carl Haber, Reconstruction of Mechanically Recorded Sound, Lawrence Berkeley National Laboratory Technical Report 51983, 2003; Sylvain Stotzer, Ottar Johnsen, Frédéric Bapst, Christoph Sudan and Rolf Ingold, Phonographic Sound Extraction Using Image and Signal Processing, in Proc. of ICASSP, May, pp. 17-21, 2004). These proposals were not easily applicable to my case study because of the hardware they require.

Thanks to the technological maturity reached by image capture systems (digital cameras and scanners), I developed a system that synthesizes the audio of lateral-cut phonograph records from photos of the disc, and automatically extracts some metadata: the center of the disc, damage to the medium, and the information printed on the label.
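
As a rough illustration of how an audio signal might be derived once the groove has been located in a photograph, here is a minimal Python sketch. It is not the Photos of GHOSTS algorithm, whose details are not given in this interview. It assumes that an earlier image-processing step (not shown) has already traced a lateral-cut groove and produced its radius at uniformly spaced points along the spiral. It then removes the slow inward spiral with a moving average and differentiates the remaining wiggle, on the common assumption that the recorded signal is proportional to the lateral velocity of the stylus. All function names, the synthetic test groove and its parameters are hypothetical, and playback equalization is ignored.

```python
import math


def moving_average(values, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def groove_to_audio(radii, trend_window):
    """Turn a sampled groove radius r[n] into a normalized signal.

    radii        -- groove radius at uniform steps along the spiral
    trend_window -- moving-average length used to remove the slow spiral
    """
    # 1. Remove the slow inward drift of the spiral.
    trend = moving_average(radii, trend_window)
    displacement = [r - t for r, t in zip(radii, trend)]
    # 2. Differentiate: assumed velocity-proportional recording.
    velocity = [b - a for a, b in zip(displacement, displacement[1:])]
    # 3. Normalize so the largest sample has magnitude 1.
    peak = max(abs(v) for v in velocity) or 1.0
    return [v / peak for v in velocity]


def synthetic_groove(n_samples=2000, samples_per_rev=1000, pitch_mm=0.25,
                     amplitude_mm=0.02, period=50):
    """Hypothetical input: an inward spiral plus a sinusoidal wiggle."""
    return [
        60.0 - pitch_mm * n / samples_per_rev
        + amplitude_mm * math.sin(2 * math.pi * n / period)
        for n in range(n_samples)
    ]


audio = groove_to_audio(synthetic_groove(), trend_window=201)
print(len(audio), "samples synthesized from the traced groove")
```

The hard part in practice is the step this sketch skips: finding the groove in the photograph and following it across scratches or breaks in the disc.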

Record archives that have already carried out active preservation usually store a preservation copy made up not only of the digital audio signal but also of photos of the medium (as well as of the cover and any attachments). So it is not unreasonable, in terms of either cost or staff training, to think of acquiring high-quality photographs. The archive copy will then be made of photographs of the phonograph record.

Record archives that have not yet started the A/D conversion process can thus easily make archive copies of their phonograph records: unlike the professional gear needed for A/D conversion of the audio signal, the equipment for taking the photographs is cheap and easy to use.

In addition, it finally becomes possible to scan discs a) suffering from serious damage and b) of different sizes without having to change the scanning gear (as happens with turntables, where each time you need to adjust the rotation speed, change the stylus, change the tracking weight of the arm, compensate for the skating force, etc.), with an obvious reduction in the cost of equipment and staff training.

The algorithm we developed, which runs on low-cost hardware, can be used to good effect to create access copies, extracting the audio and some contextual information directly from the image of the grooves of the disc.

What is the musical content of the shellac discs?

In my research I have focused on 78 rpm shellac discs (both mechanical recordings, made before 1925, and electrical ones) containing recordings of popular music. Other musical genres favored different storage media (tape) or have already been moved into the digital domain (often by means of the original masters) thanks to their commercial interest (as is the case with Western classical music and African-American music). Instead, I realized that much of the wealth of ethnic music of the early 20th century (a) was recorded on shellac discs and (b) had not yet been the object of conservation or restoration, despite its huge cultural value (sociological, historical and ethnomusicological).

During my research I came to understand how important the production of 78 rpm records is for the popular musical heritage, as the only source of documentation from before the beginning of ethnomusicological research. Many of these recordings, even though they were made for purely commercial purposes and with no documentary or scientific aim, remain the only direct witness to the art of these singers and musicians and to their music, and their documentary value is very high.

Among the various international archives where I have worked, I would like to mention the Fugazzotto Fund, which is the largest archive of recordings of Italian immigrants in the United States.

Between 1893 and the beginning of World War II, about 8,000 matrices of folk and popular material for the Italian immigrant market were recorded in the U.S. These are mainly records from Columbia’s ethnic catalog (the Columbia E-series, see figure 1) or from Victor or Okeh (see figure 2), but there are also self-produced records and labels owned by the performers themselves, such as Nofrio Record, founded by C. Giglio in New York (see figure 3) after the success of the comic sketch character by A. Bucci and Giovanni De Rosalia; Etna Record, which appeared on the Italian-American music scene in the 1920s and was produced by Angelo Cimino in New York; and Non Plus Ultra Record, a label founded in New York in 1924 by G. Lo Bianco, which can no longer be found.

Figure 1: frames from video footage of the arm of a turntable during playback. The disc on the platter is warped, and this makes the arm oscillate. The frames highlight (a) the lowest position of the arm during the oscillation and (d) the highest position. In frames (b) and (e) you can see the Lucas-Kanade features detected on the head of the arm. This metadata is stored and then used during the restoration of the audio signal (in particular for correcting the pitch).

Up to the 1930s the repertoire of the Italian-American catalogs consists mostly of instrumental music, including both traditional repertoire and music for popular bands, as well as dance music and virtuoso pieces. Vocal music is essentially monophonic and is represented first of all by Sicilian and Neapolitan songs, but also by folk songs, stories of saints and Christmas novenas. The so-called scene dal vero (scenes from real life) are very important in this repertoire. These are dramas with music, in some way related to the singers’ repertoire, and comic scenes that echo the tradition of dialect theatre and Commedia dell’Arte or, for Sicily, the vastasate and l’Opera dei Pupi.

Figure 2: time evolution of the y coordinate of a Lucas-Kanade feature placed on the arm of the turntable. The evident fluctuations indicate an imperfect medium (a warped disc). This metadata is stored and then used in the restoration of the audio signal (in particular for correcting the pitch).

These audio documents tell of the lives of Italian immigrants in the U.S.: the way they faced the difficulties of integrating into the new world, the conflicts that arose within families with the new generations of Italian-Americans, the political problems in which they found themselves involved, and the different communities in which immigrants had to face up to nostalgic memories of the homeland and the desire to return to their own country.

I firmly believe that these issues raise questions of great interest and should be studied by the new generations. Perhaps many young people are not fully aware of all the problems an immigrant faces. Studying these documents helps this kind of understanding, since they talk about Quando gli albanesi eravamo noi (editor’s note: “when we were the Albanians”), to quote the famous book by Gian Antonio Stella, published by Rizzoli.

For a detailed discussion of the ethnomusicological issues and of the techniques for conserving and restoring this type of sound recording, let me mention the book (co-written by Giuliana Fugazzotto and me) that is coming out these days (with an audio CD attached): Sta terra non fa più mia (editor’s note: “It is no longer a land of mine”). The 78 rpm records and the life of Italian emigrants to America in the early twentieth century. It will be published in Valter Colle’s ethnomusicology series GEOS cd BOOK.

Figure 3: time evolution of the audio signal (analysis interval: from 12.5 s to 23 s) synthesized from the photograph of the grooves of a broken phonograph record (the disc is broken into three parts). The image shows that the waveform has been reconstructed without introducing discontinuities.

What are the next steps of your research?

My research activity in the field of conservation/restoration of audio documents is organized in three areas:

  1. I’ve been studying non-invasive systems (similar to the one we developed for shellac discs) to play back mechanical media other than lateral-cut discs: vertical-cut records, 45/45 stereo records, and wax or Blue Amberol cylinders. I am particularly interested in cylinders, because: (a) they are an important heritage of sound recordings, often never reissued; (b) the quality of the recorded audio is inherently higher than that of shellac discs, because the groove of a cylinder moves under the stylus at a constant linear speed from beginning to end, while a flat disc introduces greater distortion in the inner grooves (near the center) because the linear speed drops as the radius decreases. Among the different kinds of cylinder, a lot of attention must be given to the Blue Amberol, introduced by Edison in November 1908 and consisting of a mixture of celluloid and phenolics. It is important to start active conservation of these media, because: (a) they contain music of great interest, since, thanks to grooves finer than those of the two-minute cylinder, a Blue Amberol could hold up to 4 minutes of music; (b) they are often in good condition (the particular mix used makes them virtually unbreakable); (c) they offer the best sound quality available at that time (more than 3,000 plays were guaranteed, thanks to the adoption of celluloid, which provided a much quieter surface than the shellac used for 78s). In this case it is obviously necessary to start from 3D models of the grooves and not, as with lateral-cut phonograph records, from simple (2D) images of the medium. To this end, I am considering different acquisition systems: chromatic confocal sensors (which return an image whose colors vary with the depth of the groove) and interferometers.
  2. I’m working on a research project aimed at creating a hardware/software system to preserve, restore and archive the sound documents of the Vicentini Archive of the Arena di Verona. The Vicentini archive has the potential to attract the interest of the international scientific and musicological community, thanks to the tens of thousands of sound recordings in its possession, many of which are of great rarity or even unique in the world. The project is called REVIVAL (REstoration of the VIcentini archive In Verona and its accessibility as an Audio e-Library) and involves the University of Verona, the Arena di Verona and me.
  3. I’m starting to develop software systems that can restore the audio signal by acting directly on the (2D or 3D) model of the medium. In practice, the idea is to remove the scratches from the picture of the disc (or from the three-dimensional reconstruction of the cylinder) before the audio signal is synthesized, by suitably adapting tools already known in the computer vision literature. The first results seem very encouraging to me.
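
The point in item 1 about the inner grooves of a flat disc can be made concrete with a little arithmetic. At a fixed rotation speed the groove passes under the stylus at v = 2πr·rpm/60, so the same tone occupies a shorter stretch of groove near the center of a disc. The sketch below computes this for a 78 rpm disc; the two radii are assumed round figures chosen for illustration, not measurements from the project.

```python
import math


def groove_speed_mm_per_s(radius_mm, rpm):
    """Linear speed of the groove under the stylus: v = 2*pi*r*rpm/60."""
    return 2 * math.pi * radius_mm * rpm / 60.0


def wavelength_um(radius_mm, rpm, frequency_hz):
    """Length of groove occupied by one cycle of a tone, in micrometres."""
    return groove_speed_mm_per_s(radius_mm, rpm) / frequency_hz * 1000.0


RPM = 78.0
OUTER_MM, INNER_MM = 120.0, 50.0  # assumed radii, for illustration only

for label, radius in (("outer", OUTER_MM), ("inner", INNER_MM)):
    v = groove_speed_mm_per_s(radius, RPM)
    lam = wavelength_um(radius, RPM, 5000.0)
    print(f"{label}: {v:.0f} mm/s, 5 kHz wavelength {lam:.0f} um")
```

With these assumed radii the groove speed drops from about 980 mm/s at the outer edge to about 408 mm/s near the label, so one cycle of a 5 kHz tone shrinks from roughly 196 to roughly 82 micrometres. On a cylinder the radius, and therefore the speed, stays the same throughout.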

Have you thought about how to manage this massive sound archive? Are you going to make it available online?

This is still an open issue. For royalty-free sound recordings my idea would be to make everything available online: audio (high definition and compressed), metadata and contextual information. But obviously the decision depends on the owners of the archives and, secondarily, on the resources (financial and human) available.

Let me underline a legal issue that has been nagging at me: as everybody knows, it is not permitted to make available online a digital version of the audio content of a phonograph record protected by copyright. But what if the owner of the disc posted ‘only’ a (high-resolution) photo of the disc, and users were able to synthesize the sound from the picture with client-side software?

In your opinion, what are the most interesting experiments in sound today?

Computational models for the expressive gesture.

I think one of the most exciting frontiers of research in Sound and Music Computing is the field widely known as Expressive Information Processing, which can bring interesting contributions to various areas: artistic creation (interactive dance/music systems), musical analysis (models of musical performance), rehabilitation (multimedia techniques for therapy and rehabilitation) and music information retrieval (searching for sound recordings based on their expressive content). I would like to point to some research groups that are particularly active in these areas:

Gianpaolo D'Amico

Editor-in-chief at sounDesign
Gianpaolo D'Amico is an independent creative technologist for digital media. He is the founder of sounDesign and has been music-obsessed since he was 0 years old.


  1. I was very surprised, and somewhat disappointed, to read this article, as the organization I represent, together with one of the finest engineering schools in my country, has been working for more than 10 years now on taking pictures of all kinds of records (coarse- and microgroove; lateral-, vertical- and stereo-cut; in good and in bad shape) and then transforming the images into sound. Mr. Canazza got in touch with us multiple times in the past; it is a pity there’s no mention of it in the interview. Everything we have developed so far, as well as the work of our overseas partners (Dr. Haber and Dr. Fadeyev), has been in the public domain from the very beginning, and is thus accessible to everyone. I kindly ask anyone interested in this topic to have a look at this address, then we’d continue the discussion.

    Stefano S. Cavaglieri

  2. First of all, I would like to thank Stefano Cavaglieri for his comment. It is absolutely true: the work by the Swiss National Sound Archive (Stefano Cavaglieri in particular) together with, among others, Profs. Haber and Fadeyev, was (and is) impressive. In this sense it is my fault for citing only one paper by Haber and Fadeyev (see above). But this is only an interview, so I preferred not to write a real bibliography. I assumed that from the paper I cited anyone would be able to get back to the web address of visualaudio; anyway, I deeply apologize for omitting the address.
    I would like to highlight that several research groups (in Europe and in North America) working on audio preservation and restoration started their work by studying the results obtained by the Swiss National Sound Archive (and Stefano Cavaglieri in particular). In this sense, I would like to suggest that anyone interested in this field study (at least) these papers:
    Ottar Johnsen, Sylvain Stotzer, Frédéric Bapst, Rolf Ingold: Detection of the Groove Position in Phonographic Images, IEEE International Conference on Image Processing, September 2007
    Ottar Johnsen, Frédéric Bapst, Christoph Sudan, Sylvain Stotzer, Stefano S. Cavaglieri, Pio Pellizzari: VisualAudio: an Optical Technique to Save the Sound of Phonographic Records, Joint Technical Symposium 2004, Toronto, Canada, June 24-26, 2004

  3. I had the opportunity to see the IRENE technology, cited in the work by Fadeyev and Haber, at the Library of Congress restoration and archive facility a couple of years ago. It’s pretty amazing what can be done with this system, and the LoC has developed proprietary software for optimizing the results obtained from the process. That whole facility, located about an hour from my home in Virginia, is pretty amazing: it was originally built to provide shelter and a headquarters for the President and his cabinet in the event of a nuclear attack, but has been repurposed for the LoC, and now houses an incredible collection of cultural treasures, and restoration facilities. The studio I toured cost over $250,00 to build, and has some wonderfully esoteric gear for the engineers to work with.

    Here are a couple of links, one to a story about the IRENE technology, and another which contains links to various examples of IRENE at work:

    – Paul

  4. I have a collection of Nofrio records and have had 20 titles compiled on CD. The original 78 records (which were close to 100 years old at the time) were sent to a sound engineer in Hollywood, California. All of the background sounds such as clicks, hisses, crackles, etc. were removed and the result was excellent. The critical aspect was to know at what speed to play the 78 rpm records to achieve the best sound quality. My goal was to preserve the spoken word as Sicilian was spoken at the turn of the 20th century, to share the entertainment value of the comedic genius that was Giovanni De Rosalia (author, playwright and creator of “Nofrio”), and to cover my expenses plus a small profit. The CDs are available for sale at $14.95 plus shipping for those who may be interested.