Hyperspectral Reflectance Imaging hypercubes pack remarkable information into a large amount of data. Therefore, the demand for automated protocols to manage and study the datasets is widely justified. The combination of Spectral Angle Mapper, data manipulation, and a user-adjustable analysis method constitutes a turning point for exploring the experimental results.
Reflectance Spectroscopy (RS) and Fiber Optics Reflectance Spectroscopy (FORS) are well-established techniques for the investigation of works of art with particular attention to paintings. Most modern museums put at the disposal of their research groups portable equipment that, together with the intrinsic non-invasiveness of RS and FORS, makes possible the in situ collection of reflectance spectra from the surface of artefacts. The comparison, performed by experts in pigments and painting materials, of the experimental data with databases of reference spectra drives the characterization of the palettes and of the techniques used by the artists. However, this approach requires specific skills and it is time consuming especially if the number of the spectra to be investigated becomes large as is the case of Hyperspectral Reflectance Imaging (HRI) datasets. The HRI experimental setups are multi-dimensional cameras that associate the spectral information, given by the reflectance spectra, with the spatial localization of the spectra over the painted surface. The resulting datasets are 3D-cubes (called hypercubes or data-cubes) where the first two dimensions locate the spectrum over the painting and the third is the spectrum itself (i.e., the reflectance of that point of the painted surface versus the wavelength in the operative range of the detector). The capability of the detector to simultaneously collect a great number of spectra (typically much more than 10,000 for each hypercube) makes the HRI datasets large reservoirs of information and justifies the need for the development of robust and, possibly, automated protocols to analyze the data. After the description of the procedure designed for the data acquisition, we present an analysis method that systematically exploits the potential of the hypercubes. 
Based on Spectral Angle Mapper (SAM) and on the manipulation of the collected spectra, the algorithm handles and analyzes thousands of spectra while at the same time supporting the user in unveiling the features of the samples under investigation. The power of the approach is illustrated by applying it to Quarto Stato, the iconic masterpiece by Giuseppe Pellizza da Volpedo, held in the Museo del Novecento in Milan (Italy).
Reflectance Spectroscopy (RS) and Fiber Optics Reflectance Spectroscopy (FORS) are based on the detection of the light reflected by surfaces once illuminated by a light source, typically a tungsten-halogen lamp. The output of the acquisition system consists of spectra where the reflectance is monitored as a function of the wavelength in a range that depends on the characteristics of the employed experimental setup1,2,3. Introduced during the last four decades4,5, RS and FORS are typically used in combination with X-ray fluorescence and other spectroscopies to describe the materials and the techniques used by artists to realize their masterpieces6,7,8,9. The study of the reflectance spectra is usually performed by comparing the data from the sample with a group of reference spectra selected by the user in personal or public databases. Once the reference spectra that comply with the realization period of the sample and with the modus operandi of the artist have been identified, the user recognizes the main features of the reflectance spectra (i.e., transition, absorption, and reflection bands1,2,10,11) and then, with the help of other techniques6,7,8, distinguishes the pigments that have been used in the paintings. Finally, the user discusses the slight differences that exist between the references and the experimental spectra7,9.
In most cases, the experimental datasets are composed of a few spectra, collected from areas chosen by art experts and assumed to be significant for the characterization of the painting6, 12,13. Despite the skills and the experience of the user, a few spectra cannot fully exhaust the characteristics of the whole painted surface. Moreover, the result of the analysis will always be strongly dependent on the expertise of the performer. In this scenario, Hyperspectral Reflectance Imaging (HRI3,14,15) could be a useful resource. Instead of a few isolated spectra, the experimental setups return the reflectance properties of extended portions or even of the whole artefact under investigation16. The two main advantages with respect to the acquisition of the isolated spectra are evident. On one hand, the availability of the spatial distribution of the reflectance properties allows the identification of areas that hide interesting features, even though they may not seem peculiar17. On the other hand, the hypercubes guarantee a number of spectra high enough to enable the statistical analysis of the data. These facts support the comprehension of the distribution of pigments within the painted surface18,19.
With HRI, the comparison of the experimental data with the references could be hard to handle15. A typical detector returns hypercubes of at least 256 x 256 spectra. This would require the user to evaluate more than 65,000 reflectance spectra against each reference, a task almost impossible to be carried out manually in a reasonable time. Therefore, the request for robust and, possibly, automated protocols to manage and analyze HRI datasets is more than justified15,17. The proposed method answers this need by handling the whole analytic procedure with the minimum involvement and the maximum flexibility.
An algorithm comprising a set of home-made codes (Table of Materials) reads, manages, and organizes the files returned by the experimental setup. It allows the fine selection of the portions of the Fields of View (FOVs, one field of view is the area of the painting monitored by a single hypercube) to be studied and performs the analysis of the data based on the Spectral Angle Mapper (SAM) method20,21 and on the manipulation of the original spectra. SAM returns false color gray-scale images called similarity maps. The values of the pixels of these maps correspond to the spectral angles, that is, the angles between the spectra stored in the hypercubes and the so-called End Members (EMs, a group of reference spectra that should describe the features of the surface monitored by the hypercubes)22. In the case of RS applied to paintings, the EMs are the reflectance spectra of pigments that should match the palette of the Master. They are chosen based on the available information about the artist, the realization period of the painting, and the expertise of the user. Therefore, the output of the SAM is a set of maps that describes the spatial distributions of these pigments over the painting surface and that helps the user infer the materials used by the artist and their organization in the artefact. The algorithm offers the possibility of employing all kinds of references regardless of their origin. The references can be specific spectra selected within the hypercubes, come from databases, be acquired by a different instrument on a different surface (such as samples of pigments or the palette of the artist, for instance), or be obtained employing any kind of reflectance spectroscopy, FORS included.
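The spectral angle described above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the authors' home-made code; the function names `spectral_angle` and `sam_map`, the toy 2 x 2 hypercube, and the convention adopted for an all-zero spectrum are assumptions made only for this example.

```python
import math

def spectral_angle(spectrum, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(s * r for s, r in zip(spectrum, reference))
    norm_s = math.sqrt(sum(s * s for s in spectrum))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_s == 0.0 or norm_r == 0.0:
        # Assumption of this sketch: an all-zero spectrum gets the angle pi/2.
        return math.pi / 2
    # Clamp to [-1, 1] to protect acos from floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_s * norm_r)))
    return math.acos(cos_angle)

def sam_map(hypercube, end_member):
    """Similarity map: one spectral angle per pixel of hypercube[row][col]."""
    return [[spectral_angle(pixel, end_member) for pixel in row]
            for row in hypercube]

# Toy hypercube: 2 x 2 pixels, 3 acquisition channels each (hypothetical values).
cube = [
    [[0.2, 0.4, 0.6], [0.1, 0.2, 0.3]],
    [[0.6, 0.4, 0.2], [0.0, 0.0, 0.0]],
]
end_member = [0.1, 0.2, 0.3]
similarity = sam_map(cube, end_member)
```

In this toy case the two pixels of the first row have the same spectral shape as the end member and therefore an angle close to zero, regardless of their overall intensity, while the pixel with the reversed spectrum gives a clearly larger angle.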
SAM has been preferred among the available classification methods because it has been demonstrated to be effective for characterizing pigments (refer to the book by Richard23 to have an overview of the main available classification methods). Instead, the idea of developing a home-made protocol rather than adopting one of the many tools freely available on the net24,25 relies on a practical consideration. Despite the effectiveness and scientific foundation of the existing GUIs and software, a single tool hardly satisfies all the needs of the user. There could be an Input/Output (I/O) issue because a tool does not manage the file containing the raw data. There could be an issue regarding the analysis of the data because another tool does not provide the desired approach. There could be a limitation in the handling of the data because the simultaneous analysis of multiple datasets is not supported. In any case, a perfect tool does not exist. Each method must be adjusted to the data or vice versa. Therefore, the development of a home-made protocol has been preferred.
The presented approach offers neither a complete set of analytical methods (see, for comparison, the tool proposed by Mobaraki and Amigo24) nor an easy-to-manage user-interface (see, for comparison, the software employed by Zhu and co-workers25), but, in exchange, it focuses on a still underrated aspect of hyperspectral data analysis: the opportunity to manipulate the detected spectra. The power of the approach is illustrated by applying it to the painting Quarto Stato by Giuseppe Pellizza da Volpedo (Figure 1), an iconic oil on canvas held in the Museo del Novecento in Milan, Italy. Note that, since the approach requires running home-made codes, the developer arbitrarily chose the names of the codes and of both the input and output variables used in the description of the protocol. The names of the variables can be changed by the user, but they must be provided as follows: the input variables must be written within round brackets and, if more than one, separated by commas; the output variables must be written within square brackets and, if more than one, separated by a white space. The names of the codes, by contrast, cannot be altered.
1. Set the spatial resolution of the hypercubes
2. Adjust the experimental parameters to the painting
3. Hypercubes and the reference spectra management
4. SAM analysis
The proposed protocol offers a set of interesting features for the management and the analysis of HRI data. The I/O (step 3.1) of the raw data is always the first problem that must be solved before applying any analysis method and it can become a critical issue when dealing with large amounts of data. In the present case, the only task regarding the raw data is to store the experimental results into a dedicated folder and select it by browsing the hard disk when running the reading code (step 3.1.1). Thereafter, the cropping and the RGB rebuilding codes allow the user to refine the selection of the data to be analyzed (step 3.1.2) and to check that the experimental conditions were properly set when the hypercubes were acquired (step 3.1.3, see Figure 4 and the Discussion section for further details).
Once verified that the data-cubes have been correctly acquired, the algorithm offers different possibilities to select the end members for the SAM analysis20,21 (step 3.2). The first two options (steps 3.2.1 and 3.2.2) retrieve the references among the hypercubes by manually selecting some isolated measuring points (Figure 5A) or by automatically sampling the surface of the painting providing a reticular selection of measuring points within one or more FOVs (Figure 5B). The analysis based on isolated measuring points is faster than the reticular based one, but it implies a careful and, possibly, informed observation of the FOV(s) to identify the significant spectra; this means good experience dealing with pigments and painted surfaces. The reticular selection makes the algorithm time-consuming and forces the user to observe a lot of output images to retrieve a handful of useful similarity maps. However, the reticular selection provides a complete screening of the hypercubes and, mostly, it can be carried out without experience of the experimental context. In principle, once the sampling distance, n_pixel, is decided, the user can neglect the observation of the FOV(s) with a very low probability of losing details.
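The reticular selection can be sketched as a regular grid of measuring points spaced n_pixel apart. The snippet below is only an illustration in plain Python, not the authors' code; the function names and the choice of starting the grid at pixel (0, 0) are assumptions.

```python
def reticular_points(n_rows, n_cols, n_pixel):
    """(row, col) positions of a regular grid with spacing n_pixel.

    Assumption of this sketch: the grid starts at pixel (0, 0).
    """
    if n_pixel < 1:
        raise ValueError("n_pixel must be a positive integer")
    return [(row, col)
            for row in range(0, n_rows, n_pixel)
            for col in range(0, n_cols, n_pixel)]

def pick_references(hypercube, points):
    """Extract the spectra stored at the selected measuring points."""
    return [hypercube[row][col] for row, col in points]

# A 10 x 10 FOV sampled every 5 pixels gives four measuring points.
points = reticular_points(10, 10, 5)

# Number of references (one similarity map each) for a 256 x 256 FOV.
n_references = len(reticular_points(256, 256, 5))
```

With a five-pixel sampling interval, a 256 x 256 FOV already yields 2,704 references, which gives a sense of why the reticular mode is slower and produces many more output images than the isolated-point mode.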
In addition to the selection of the reference spectra within the hypercubes, the algorithm offers the opportunity to compare the data from the sample under investigation with references belonging to other sources (step 3.2.4). The external reference spectra importer code manages the I/O of references that do not belong to the surface of the painting. The matrix converter code equalizes the wavelength ranges and the spectral resolution of both the hypercubes and the external references (step 3.2.4). This possibility extends the capabilities of the user regarding the characterization of the sample. Indeed, the user can exploit every kind of available resource in terms of reflectance data. The hypercubes can be compared with public databases, with the spectral archives of the user, with new data collected on ad hoc prepared samples or even on other objects (paintings, palettes, hues, or whatever) belonging to the author or to other artists. Moreover, the external references can be obtained exploiting any kind of reflectance techniques so much so that the references shown in Figure 6 and Figure 7 have been acquired by a portable FORS miniature spectrometer (Table of Materials) and not by the camera used for the HRI data.
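One possible way to equalize wavelength range and spectral resolution is to interpolate the external reference onto the camera wavelengths that fall inside the range covered by both instruments. The sketch below uses linear interpolation in plain Python; this choice, the function name `resample_reference`, and the toy numbers are assumptions, since the internals of the matrix converter code are not described here.

```python
def resample_reference(ref_wl, ref_values, target_wl):
    """Linearly interpolate a reference onto the target wavelengths.

    Only target wavelengths inside the range of the reference are kept.
    Assumes ref_wl is strictly increasing and has at least two entries.
    """
    low, high = ref_wl[0], ref_wl[-1]
    common = [w for w in target_wl if low <= w <= high]
    resampled = []
    j = 0
    for w in common:
        # Advance to the reference interval [ref_wl[j], ref_wl[j + 1]] containing w.
        while j < len(ref_wl) - 2 and ref_wl[j + 1] < w:
            j += 1
        w0, w1 = ref_wl[j], ref_wl[j + 1]
        t = (w - w0) / (w1 - w0)
        resampled.append(ref_values[j] + t * (ref_values[j + 1] - ref_values[j]))
    return common, resampled

# Hypothetical spectrometer reference (nm, reflectance) and camera channels (nm).
common_wl, values = resample_reference(
    [400, 500, 600], [0.1, 0.3, 0.5], [350, 450, 550, 650]
)
```

In this toy case only the 450 nm and 550 nm channels lie in the shared range; the spectra of the hypercube would then be restricted to the same channels before evaluating the spectral angles.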
Beyond the data management, the algorithm also offers an original approach to the data analysis. It allows manipulation of the spectra before evaluating the SAM maps (steps 4.1-4.5). This possibility finds its rationale in the choice of the SAM method to investigate the distributions of the pigments. In fact, SAM treats the reflectance spectra as if they were vectors in a multi-dimensional space (i.e., hyper-vectors with a number of components equal to that of the acquisition channels). Therefore, if the principal aim of the analysis is to compare different but similar references to distinguish which one best matches the pigments used by the artist, then the almost identical components of the reference spectra (i.e., the wavelengths that correspond to almost identical values in the hyper-vectors) are unlikely to be particularly useful, and the algorithm allows the user to exclude these components from the analysis.
The protocol supports two options for manipulating the data (step 4.5): the user can define the wavelength portion(s) of the reflectance data to be analyzed manually (Figure 6) or automatically (Figure 7). The manual selection is straightforward. The pre-processed reference spectra or their first derivatives, depending on the selected pre-processing option (step 4.2), appear in an interactive window (Figure 6A), and the user selects one or more wavelength interval(s) (Figure 6B) by clicking on the graph surface. The automatic selection is based on the mathematical criterion of the maximum variance applied to the pre-processed reference spectra or their first derivatives, depending on the selected pre-processing option (step 4.2). The algorithm computes the variance (normalized and displayed as a Dashed Line in Figure 7A) within the selected references and orders all the spectra (both the references and the hypercubes) according to this criterion (the Dashed Line in Figure 7B represents the normalized and ordered variance). In other words, if the maximum variance corresponds to the nth wavelength, the content of the nth component of each pre-processed spectrum (references and hypercubes) is moved to the first position of a re-arranged hyper-vector, and so on (the colored portions of the background in Figure 7A and Figure 7B graphically explain the re-arrangement of the data). In practice, the components of the pre-processed spectra are ordered in a way similar to principal component analysis30.
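The re-arrangement by decreasing variance can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the authors' implementation: the variance is taken here as the population variance of each acquisition channel across the references, normalized to its maximum, and the function names and toy references are hypothetical.

```python
def channel_variance(references):
    """Population variance across the references, one value per channel."""
    n_refs = len(references)
    variances = []
    for k in range(len(references[0])):
        values = [ref[k] for ref in references]
        mean = sum(values) / n_refs
        variances.append(sum((v - mean) ** 2 for v in values) / n_refs)
    return variances

def variance_order(references):
    """Channel indices sorted by decreasing normalized variance."""
    variances = channel_variance(references)
    v_max = max(variances)
    normalized = [v / v_max for v in variances] if v_max > 0 else variances
    order = sorted(range(len(normalized)),
                   key=lambda k: normalized[k], reverse=True)
    return order, [normalized[k] for k in order]

def rearrange(spectrum, order):
    """Apply the same re-arrangement to a reference or to a hypercube spectrum."""
    return [spectrum[k] for k in order]

# Three hypothetical pre-processed references with four channels each.
refs = [
    [0.5, 0.25, 0.50, 0.25],
    [0.5, 0.75, 0.00, 0.25],
    [0.5, 0.50, 1.00, 0.25],
]
order, sorted_variance = variance_order(refs)
rearranged_refs = [rearrange(ref, order) for ref in refs]
```

In this toy case the third channel varies most among the references and is moved to the first position, followed by the second channel; the two channels where the references coincide end up last.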
Once the spectra have been manipulated, the algorithm evaluates the SAM maps. Following the manual manipulation (Figure 6), the protocol returns three sets of maps: two corresponding to the groups of selected and rejected wavelengths and one obtained employing the whole spectra. Following the automatic manipulation (Figure 7), instead, the algorithm applies a floating threshold to the variance values and, as the threshold increases, evaluates the SAM maps for the re-arranged hyper-vector components with both over-threshold (i.e., automatically selected) and under-threshold (i.e., automatically rejected) values of the variance. These sets of maps, together with that obtained from the whole spectra (always returned by the algorithm), result in a total of (2N + 1) sets of maps, where N is the number of values assumed by the threshold. The sets of similarity maps obtained as the threshold increases (Figure 8) illustrate that data manipulation does not alter the content but rather provides new insights into the details of the mapped area(s) and, consequently, can help to distinguish similarities and differences between the samples and the references.
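The floating threshold and the resulting (2N + 1) sets of maps can be sketched as below. The default step (0.5% of the normalized maximum variance) and the lower limit of 20 components are the values quoted for the example of Figure 8; the function names and the stopping rule as coded here are assumptions of this plain-Python illustration.

```python
def threshold_sweep(sorted_variance, step=0.005, min_components=20):
    """List of (threshold, n_selected) pairs for increasing thresholds.

    sorted_variance holds normalized variances (values in [0, 1]) in
    decreasing order, so the first n_selected components of a re-arranged
    hyper-vector are the over-threshold ones. The sweep stops before the
    selection drops below min_components.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    sweep = []
    i = 1
    while True:
        threshold = i * step
        n_selected = sum(1 for v in sorted_variance if v > threshold)
        if n_selected < max(min_components, 1):
            break
        sweep.append((threshold, n_selected))
        i += 1
    return sweep

def split_components(rearranged_spectrum, n_selected):
    """Over-threshold (selected) and under-threshold (rejected) fractions."""
    return rearranged_spectrum[:n_selected], rearranged_spectrum[n_selected:]

# Toy example with four channels, a coarse step, and a lower limit of two.
sweep = threshold_sweep([1.0, 0.6, 0.3, 0.1], step=0.25, min_components=2)
n_map_sets = 2 * len(sweep) + 1  # selected + rejected per threshold, plus whole spectra
selected, rejected = split_components([0.9, 0.7, 0.4, 0.2], sweep[-1][1])
```

Here the threshold takes N = 2 values (0.25 and 0.5) before the selection would fall below the lower limit, so the sketch counts five sets of maps in total.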
Figure 1: Quarto Stato. A picture of the painting, 1899-1901, 293 x 545 cm, oil on canvas, Giuseppe Pellizza da Volpedo, Museo del Novecento, Milan, Italy.
Figure 2: Definition of the experimental conditions. (A) The ad hoc prepared test samples; the White Circles and Numbers identify the measuring points corresponding to the spectra selected as references. (B) The SAM maps evaluated with respect to reference spectrum number 1, (C) number 2, (D) number 3, and (E) number 4. The Gray Color Bar indicates the range of values of the spectral angles.
Figure 3: The application of the defined experimental conditions to Quarto Stato. (A) The ROIs selected for the experimental campaign (Red Rectangles); in each rectangle, one of the FOVs needed to cover the ROI has been highlighted (Unshaded Areas). (B) The RGB pictures of the four Unshaded Areas of panel A. (C) The SAM maps evaluated with respect to a reference spectrum selected within each FOV (Green Circles). The Gray Color Bar indicates the range of values of the spectral angles.
Figure 4: Proper versus improper illumination of the surface of the sample. (A) A portion of the FOV where a small fraction of the painted surface (Red Circle) is affected by altered reflectance properties due to improper illumination. (B) The same small fraction of the painting (Blue Circle) as it appears when the FOV is properly illuminated. (C) The reflectance spectra of the measuring point at the center of the circles when the FOV is improperly and properly illuminated (Red and Blue Line, respectively). (D) The SAM map of the FOV obtained using the spectrum of the improperly illuminated measuring point as reference. (E) The SAM map of the FOV obtained using the spectrum of the properly illuminated measuring point as reference. The Gray Color Bar refers to (D) and (E) and indicates the range of values of the spectral angles obtained comparing the first derivatives of the spectra of the hypercube of the selected FOV and the first derivative of the spectrum of the measuring point at the center of the colored circles in (A) and (B).
Figure 5: References selection within the hypercubes. (A) The isolated measuring points selection mode; the Green Circles indicate the location of the reference spectra manually selected on the FOV shown. (B) The reticular selection mode; the Green Circles indicate the location of the reference spectra selected by applying a reticulum with the sampling interval (n_pixel) set to five pixels to the FOV shown. The image reported in both (A) and (B) is the grayscale conversion of the RGB image of the FOV retrieved from the reflectance spectra applying the D65 illuminant and 1931 observer from CIE standards to the hypercube; the Gray Color Bar refers to the normalized intensity of this image.
Figure 6: The manual data manipulation mode. (A) The aspect of the interactive window that allows the user to divide the reference spectra into the selected and rejected fractions of wavelengths. (B) The same references of (A) where the portions of data selected for evaluating the SAM maps have been highlighted by a pink background. (A) and (B) display the pre-processed spectra of the references.
Figure 7: The automatic data manipulation mode. (A) The first derivatives of the four normalized references reported in Figure 6 (Coloured Lines) and their normalized maximum variance (Black Dashed Line). (B) The same derivatives of (A) sorted following the criterion of the maximum variance; the sorted values of the normalized maximum variance have been reported too (Black Dashed Line). Some portions of the background have been colored with different hues in the attempt to visually illustrate the re-arrangement of the hyper-vectors.
Figure 8: The SAM maps obtained by the automatic data manipulation mode. (A–C) The sorted values of the normalized maximum variance evaluated within the first derivatives of the reference spectra of Figure 7; the Green and Red Sections of the curve indicate, respectively, the selected (over threshold values) and rejected fraction of the data (under threshold values). The panels show what happens as the threshold (Black Dotted Segment) increases; each panel reports the SAM maps for both the groups of values obtained for the four derivatives of the spectra of Figure 7; the Green Edged Maps refer to the over threshold fractions while the Red Edged Maps refer to the under threshold ones. The Gray Color Bars indicate the range of values of the spectral angles. In this example, the step that determines the increase of the threshold is equal to 0.5% of the normalized maximum variance; the threshold value reported in (C) is 0.09 and it is the last considered threshold value because a further increase would reduce the number of selected components of the hyper-vectors below the arbitrarily fixed lower limit of 20 values, i.e., 10% of the total number of the acquisition channels of the hyperspectral camera.
Hyperspectral reflectance imaging datasets are large reservoirs of information; therefore, the development of robust and, possibly, automated protocols to analyze the data is key to exploiting their potential15,17. The proposed algorithm answers this need in the field of cultural heritage with particular attention to the characterization of the pigments of paintings. Based on SAM20,21, the algorithm supports the user during the whole analysis process from the setting of the experimental conditions to the evaluation of the distribution of pigments. Although the algorithm still lacks a complete graphical interface and does not provide a tool for viewing the results (for this purpose, open-source software has been used31 and is recommended, see Table of Materials), the set of possibilities implemented to modulate the approach to the data analysis amply compensates for these drawbacks.
The protocol sets the acquisition system according to the characteristics of both the sample and the detector. On one hand, the Divisionist technique employed by Pellizza da Volpedo to create Quarto Stato requires that the hypercubes distinguish between small brush strokes of different pigments placed side by side. On the other hand, the hyperspectral camera has a focus range between 150 mm and infinity with a manual adjustment system and, at a distance of 1 m from the target, detects an area of 0.55 x 0.55 m with a spatial resolution of 1.07 mm26. The application of the algorithm to a few hypercubes acquired on the test samples (Figure 2) helps establish a suitable working distance for the data acquisition. Inspection of these measurements led to a working distance of 30 cm for the experimental campaign, corresponding to a resolution of 0.31 mm at the target. This working distance was also successfully adopted during the experimental sessions conducted on Quarto Stato (Figure 3). Once the working distance is defined, the illumination of the surface of the sample remains a critical issue3,15. When a portion of a FOV shows uneven (Red Circle in Figure 4A) instead of uniform illumination (Blue Circle in Figure 4B), the reflectance properties change dramatically (Figure 4C) and the whole procedure is compromised (Figure 4D vs. Figure 4E). The protocol guards against uneven illumination (and, more generally, against artefacts in the monitored areas) both during the acquisition of the data (by returning RGB images, step 1.4.1, and SAM maps, step 1.4.9, that can be checked by the user) and a posteriori, by excluding the compromised portions of the FOVs from the analysis by means of the cropping code (steps 1.4.2 and 3.1.2).
The protocol allows the user to select the references (i.e., the end members used for the evaluation of the SAM maps) with the maximum freedom. On one hand, the EMs can be chosen within the edges of the hypercubes in two ways: isolated measuring points selection (Figure 5A in steps 1.4.5 and 3.1.2) or reticular measuring points selection (Figure 5B in step 3.1.3). The first can be defined as informed selection because it requires some expertise from the user to manually identify the significant measuring points. The latter can be defined as blind selection because the reticular sampling of the FOVs requires only the value of the sampling interval to be performed. On the other hand, the EMs can be retrieved from outside the painting under investigation (step 3.1.4). During the experimental campaign conducted on Quarto Stato, a portable miniature FORS spectrometer (Table of Materials) was used to collect spectra from draft samples belonging to the artist and currently kept in the Studio Museum located in Volpedo (Pellizza da Volpedo Studio Museo, Volpedo (AL), Italy). These reflectance data have been used for the evaluation of the SAM maps and some of them are reported in Figure 6 and Figure 7. Since it limits the importance of the absolute intensity and of the baseline of the spectra, the pre-processing is mandatory for both the hypercubes and the EMs, especially if they have been obtained from slightly different setups or operative conditions32.
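As an illustration of why pre-processing limits the role of the absolute intensity and of the baseline, the sketch below normalizes a spectrum to its maximum and takes a finite-difference first derivative. These two operations are assumptions chosen for the example (the pre-processing options of step 4.2 are not detailed here), and the function names and values are hypothetical.

```python
def normalize(spectrum):
    """Scale a spectrum so that its largest absolute value is 1."""
    peak = max(abs(v) for v in spectrum)
    return [v / peak for v in spectrum] if peak > 0 else list(spectrum)

def first_derivative(spectrum, wavelengths):
    """Finite-difference derivative of the reflectance versus wavelength."""
    return [
        (spectrum[k + 1] - spectrum[k]) / (wavelengths[k + 1] - wavelengths[k])
        for k in range(len(spectrum) - 1)
    ]

wavelengths = [400, 410, 420, 430]            # nm, hypothetical channels
spectrum = [0.25, 0.5, 0.5, 0.75]             # hypothetical reflectance values
shifted = [v + 0.125 for v in spectrum]       # same spectrum on a raised baseline

# A constant baseline offset disappears in the first derivative.
same_derivative = (first_derivative(spectrum, wavelengths)
                   == first_derivative(shifted, wavelengths))

# A global intensity factor disappears after normalization.
same_shape = normalize(spectrum) == normalize([2 * v for v in spectrum])
```

Both comparisons evaluate to true for these values, which is the behavior one would want when the hypercubes and the EMs come from slightly different setups.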
The last main feature of the protocol is the chance to manipulate the experimental data. Manipulation here means the identification of the most significant components of the EMs (i.e., those portions of the spectra of the end members that should help characterize the materials used by the artist). This task can be accomplished manually (Figure 6) or automatically (Figure 7). In the first case, the algorithm takes advantage of the expertise of the performer while, in the second case, it is a statistical criterion that determines the components that, at each step, will be used to evaluate the SAM maps. In both cases, the manipulation increases the number of the resulting similarity maps and consequently extends the capability to disclose the information carried by the hypercubes. In particular, the criterion-based selection generates a great number of insights into the painted surface (Figure 8).
Taken individually, the enumerated features could appear as mere technical benefits, but together they imply at least two key points. The algorithm can be successfully applied by any kind of user and it can significantly broaden the scope of the analysis. In fact, the main steps of the protocol (i.e., the selection of the references and the manipulation of the data) can be performed automatically, regardless of the skills and the experience of the user. Moreover, with the possibility of driving the analysis with spectra from outside the hypercubes, all the reflectance data at the disposal of the researchers can be exploited for the characterization of the sample under investigation.
In summary, the protocol can be an extremely flexible tool. With some improvements regarding the graphical interface and the number of supported analysis methods, it can be a step beyond the state of the art regarding the handling and the analysis of data obtained from painted surfaces by means of hyperspectral reflectance imaging.
The authors have nothing to disclose.
This research was funded by Regione Lombardia in the framework of the Project MOBARTECH: una piattaforma mobile tecnologica, interattiva e partecipata per lo studio, la conservazione e la valorizzazione di beni storico-artistici – Call Accordi per la Ricerca e l'Innovazione.
The authors are grateful to the staff at Museo del Novecento for the support during the in situ experimental sessions and to the Associazione Pellizza da Volpedo for the access to Studio Museo.
ImageJ/Fiji | National Institutes of Health (Bethesda, Maryland, USA) | N/A | Open source Java image processing program |
MATLAB 2019b | MathWorks (Natick, Massachusetts, USA) | N/A | Programming language and numerical computing environment |
Specim IQ Hyperspectral Camera | Specim (Oulu, Finland) | N/A | Portable reflectance hyperspectral camera used to acquire the hypercubes |
StellarNet BLUE-wave Miniature Spectrometer | StellarNet Inc (Tampa, Florida, USA) | N/A | Portable reflectance spectrometer used to acquire independent reflectance spectra |