1 Introduction

Precipitation is one of the most complex climatic variables, with extensive impacts on agricultural production, water resources, hydroelectric power generation, and the environment at local and global scales. Because precipitation varies strongly in space and time and is influenced by complex physical processes (e.g., clouds), it is challenging to predict it and to mitigate its effects on society. In particular, extreme precipitation and temperature events have severe socio-economic impacts in terms of their frequency and intensity (Almazroui, 2020a, b). Some studies (e.g., Abiodun et al. 2016; Ajibola et al. 2020; Almazroui et al. 2020; James et al. 2018; Klutse et al. 2016a) have used either global climate models (GCMs) or regional climate models (RCMs) to investigate changes in different attributes of precipitation, such as the frequency, intensity, spatial extent, duration, and timing of these weather and climate events. With growing fluctuations in precipitation regimes, the increasing demand for water due to population growth would be in severe contrast to the low water supply (Smith and Katz, 2013; Trenberth et al. 2014). Understanding daily precipitation features is thus key to accurate assessments of climate change projections over West Africa for a wide range of decision-makers. For better preparedness and mitigation measures against floods and droughts, there is a need to provide robust information on the variability, occurrence, and distribution of precipitation events. Detailed precipitation monitoring based on various precipitation indices will ensure that quantitative information on precipitation characteristics can be provided to decision and policymakers. However, only a handful of studies (e.g., Akinsanola et al. 2015; Klutse et al. 2016b; Sylla et al. 2013; Sylla et al. 2016) have focused on daily precipitation behavior over West Africa.

Given the growing use of GCMs beyond the scientific community for decision-making and impact applications, it is essential to evaluate their performance (Baumberger et al. 2017; James et al. 2015). With the introduction of the new state-of-the-art Coupled Model Intercomparison Project, Phase 6 (CMIP6; Eyring et al. 2016), which is made up of models with higher spatial resolution and additional physical complexity relative to the Phase 5 (CMIP5; Taylor et al. 2012) models, it is a useful exercise to assess the performance of CMIP6 models in representing daily summer monsoon rainfall characteristics over West Africa. Other studies (e.g., Akinsanola and Zhou, 2019a, b; Barlow et al. 2019) have argued that these improvements may not necessarily improve model representation of the current climate and precipitation characteristics on regional scales. Model performance may well vary across regions and across CMIP6 models, since the models differ in how they represent physical processes and in their numerical resolution. Almazroui et al. (2020) projected higher median warming in the CMIP6 model ensemble than in CMIP5 over most of Africa, but a mixed spatial pattern for precipitation. Earlier studies have also shown evidence of biases in precipitation intensity and frequency in GCMs that have been attributed to limitations in convective parameterization and its subsequent control over precipitation intensity (e.g., Berg et al. 2013; Trenberth, 2011).

Several studies have examined the performance of CMIP3/CMIP5 models in simulating precipitation characteristics and extremes globally (e.g., Crétat et al. 2014; Nguyen et al. 2017; Nikiema et al. 2017; Pendergrass & Hartmann, 2014; Torma et al. 2011; Almazroui and Islam, 2019) and over some specific regions (e.g., Akinsanola et al. 2020; Gaetani et al. 2017). However, few studies (e.g., Diallo et al. 2013; Ibrahim et al. 2012; Klutse et al. 2016b; Sylla et al. 2013; Sylla et al. 2016) have focused on examining daily precipitation characteristics over West Africa with other models. This paper seeks to evaluate the capability of CMIP6 models in their representation of the present-day summer monsoon precipitation characteristics and extremes over West Africa. The study uses a set of standard extreme precipitation indices as defined by the Expert Team on Climate Change Detection and Indices (ETCCDI; Tank et al. 2009; Zhang et al. 2011), constructed using CMIP6 models and observational datasets. The following sections describe the data and methods used in our analysis, present and discuss the results, and conclude the study.

2 Data and Methodology

In this study, we analyze daily precipitation simulations from 21 CMIP6 models (see Table 1), obtained from the Earth System Grid data portal, over the West African region bounded by latitudes 4–20° N and longitudes 17° W–17° E (Fig. 1).

Table 1 Details of the 21 CMIP6 models used in this study
Fig. 1 West African region and topography (in meters), with dark brown areas showing highlands

To assess the ability of the different CMIP6 models to simulate the daily characteristics of precipitation, multiple gridded daily precipitation datasets that are frequently used as reference datasets in climate research were used for the period 1997–2014. This period is common to the observational datasets and the historical run of the CMIP6 experiment. Three gridded observational datasets are used: the Global Precipitation Climatology Project One-Degree Daily product (GPCP 1DD Version 1.2; Huffman and Bolvin, 2013); the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS; Funk et al. 2015), which is available from 1981 to present at a 0.5° × 0.5° resolution; and the Tropical Applications of Meteorology using SATellite and ground-based observation (TAMSAT; Maidment et al. 2017) at a 0.035° × 0.035° resolution. To compare the extreme precipitation metrics directly across the observations and the models, all datasets were regridded to a common 2.8° × 2.8° (lat × lon) grid using the bilinear remapping algorithm from the Climate Data Operators (https://code.zmaw.de/projects/cdo). This relatively coarse resolution roughly matches the grid of the lowest-resolution simulations.
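As an illustration of what bilinear remapping does, the sketch below interpolates a field from one regular latitude-longitude grid to another using NumPy. It is a simplified, flat-grid stand-in and not the CDO implementation used in the study; the function name `bilinear_regrid` and its arguments are hypothetical, and the source coordinates are assumed to be strictly increasing.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D (lat, lon) field onto a new regular grid.

    ASSUMPTION: src_lat and src_lon are strictly increasing; target points
    outside the source grid are clamped to its edge. Spherical geometry is
    ignored, unlike in a full remapping tool.
    """
    field = np.asarray(field, dtype=float)
    src_lat = np.asarray(src_lat, dtype=float)
    src_lon = np.asarray(src_lon, dtype=float)

    def bracket(src, dst):
        # Index of the lower neighbour and fractional distance to the upper one.
        dst = np.clip(np.asarray(dst, dtype=float), src[0], src[-1])
        lo = np.clip(np.searchsorted(src, dst, side="right") - 1, 0, len(src) - 2)
        weight = (dst - src[lo]) / (src[lo + 1] - src[lo])
        return lo, weight

    i, wy = bracket(src_lat, dst_lat)
    j, wx = bracket(src_lon, dst_lon)

    # Values at the four source corners surrounding every target point.
    f00 = field[np.ix_(i, j)]
    f01 = field[np.ix_(i, j + 1)]
    f10 = field[np.ix_(i + 1, j)]
    f11 = field[np.ix_(i + 1, j + 1)]

    wy = wy[:, None]
    wx = wx[None, :]
    return (f00 * (1 - wy) * (1 - wx) + f01 * (1 - wy) * wx
            + f10 * wy * (1 - wx) + f11 * wy * wx)
```

Because bilinear interpolation smooths sharp gradients, moving from a fine grid to a 2.8° grid can reduce local maxima, which is worth keeping in mind when extremes are compared.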

Our analysis considers the West African Monsoon (WAM) summer season (June–September: JJAS). The mean precipitation over West Africa is examined briefly; thereafter, we focus on different hydroclimatic indices (Table 2; Tank et al. 2009; Zhang et al. 2011) as proxies for the daily rainfall characteristics: the simple daily intensity index (SDII), the frequency of wet days and of heavy rainfall events, extreme precipitation events represented by the 95th percentile, and the maximum wet spell and dry spell lengths. These indices have been used in many studies (e.g., Klutse et al. 2016a, b; Akinsanola et al. 2019b) and are useful for understanding daily rainfall variability over any given region. In this study, a rainfall event is defined as a day with a precipitation amount exceeding 1 mm (e.g., Akinsanola et al. 2020). The performance of the CMIP6 simulations is further quantified with the root mean squared error (RMSE) and the relative standard deviation (RSD). The RMSE measures the typical magnitude of the differences between simulated and observed values, so a lower RMSE indicates closer agreement with the observations. The RSD expresses the standard deviation of a dataset relative to its mean: the higher the RSD, the more widely the values are spread around the mean. Both measures are computed using only the land grid points, for all the indices defined in Table 2. In discussing the spatial distributions, GPCP is used as the primary reference to give an immediate idea of the spatial patterns; for the quantitative measures of model performance (Tables 3, 4, 5, and 6), however, all observational datasets (GPCP, CHIRPS, and TAMSAT) are employed to account for uncertainties in the observed daily precipitation products.
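The intensity, frequency, and percentile indices described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the processing chain used in the study: the function name `daily_indices` is hypothetical, the 10 mm heavy-rainfall threshold is an assumption in the style of the ETCCDI R10mm index (the definition in Table 2 may differ), and the 95th percentile is assumed to be taken over wet days only.

```python
import numpy as np

WET_DAY_MM = 1.0      # wet-day threshold stated in the text
HEAVY_RAIN_MM = 10.0  # ASSUMPTION: R10mm-style threshold; Table 2 may differ

def daily_indices(pr, wet=WET_DAY_MM, heavy=HEAVY_RAIN_MM):
    """Compute simple indices from daily precipitation (mm/day).

    pr has shape (time, ...); the remaining axes are grid points.
    Returns a dict of arrays with shape pr.shape[1:].
    """
    pr = np.asarray(pr, dtype=float)
    # ETCCDI convention (>= threshold); use > if strict exceedance is intended.
    is_wet = pr >= wet
    n_wet = is_wet.sum(axis=0)
    wet_total = np.where(is_wet, pr, 0.0).sum(axis=0)

    # SDII: mean precipitation on wet days; NaN where no wet day occurs.
    sdii = np.where(n_wet > 0, wet_total / np.maximum(n_wet, 1), np.nan)
    # Wet-day frequency as a percentage of all days in the record.
    wet_day_freq_pct = 100.0 * n_wet / pr.shape[0]
    # Number of days at or above the heavy-rainfall threshold.
    heavy_days = (pr >= heavy).sum(axis=0)
    # ASSUMPTION: 95th percentile taken over wet-day amounts only.
    p95_wet = np.nanpercentile(np.where(is_wet, pr, np.nan), 95, axis=0)

    return {
        "sdii": sdii,
        "wet_day_freq_pct": wet_day_freq_pct,
        "heavy_days": heavy_days,
        "p95_wet": p95_wet,
    }
```

Passing only the JJAS days of a single year gives per-season values; averaging those over 1997–2014 would then give the seasonal means of the kind mapped in the figures.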

Table 2 Selected indices used in this study and their definition
Table 3 Root Mean Square Error (RMSE) between simulated precipitation intensity and that of the observations (GPCP, CHIRPS, and TAMSAT)
Table 4 Root Mean Square Error (RMSE) between simulated frequency of heavy rainfall and that of the observations (GPCP, CHIRPS, and TAMSAT)
Table 5 Root Mean Square Error (RMSE) between simulated extreme precipitation at 95th percentile and that of the observations (GPCP, CHIRPS, and TAMSAT)
Table 6 Relative standard deviation (RSD) of observational datasets and CMIP6 datasets for SDII, frequency of heavy rainfall (freq of HR) and extreme precipitation at 95%
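The two summary statistics reported in Tables 3 to 6 can be written compactly as follows. This is a sketch under stated assumptions: the function names and the boolean `land_mask` argument are hypothetical, the RSD is expressed in percent, and the population standard deviation is used. The text does not state whether the RSD is computed across grid points or across years, so the function simply operates on whatever values are passed in.

```python
import numpy as np

def rmse(model, obs, land_mask=None):
    """Root mean squared error between two fields on the same grid."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = np.isfinite(model) & np.isfinite(obs)
    if land_mask is not None:
        # Keep land grid points only (True = land).
        valid &= np.asarray(land_mask, dtype=bool)
    diff = model[valid] - obs[valid]
    return float(np.sqrt(np.mean(diff ** 2)))

def rsd(values, land_mask=None):
    """Relative standard deviation in percent: 100 * std / mean.

    ASSUMPTION: population standard deviation (ddof=0) and percent units.
    """
    values = np.asarray(values, dtype=float)
    valid = np.isfinite(values)
    if land_mask is not None:
        valid &= np.asarray(land_mask, dtype=bool)
    selected = values[valid]
    return float(100.0 * np.std(selected) / np.mean(selected))
```

Both fields must already share a grid, which is why the regridding step precedes these calculations.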

3 Results and Discussion

3.1 Rainfall Climatology

To better understand the discrepancies between GCMs and observed records of rainfall over West Africa, we assess the spatial pattern of the climatology of the summer monsoon (JJAS) season. The spatial distribution of the mean JJAS rainfall climatology (Fig. 2) for the observations (GPCP, CHIRPS, and TAMSAT) and the 21 CMIP6 models shows that all CMIP6 GCMs (except FGOALS-g3) capture the general rainfall pattern of the region and, further, the position of the ITCZ, which is located at approximately 10° N. The observational datasets present approximately the same spatial pattern of rainfall climatology, but the simulations show some discrepancies relative to the observations and among themselves. For example, ACCESS-CM2, KACE-1-0-G, INM-CM4-8, and FGOALS-g3 underestimate the amount of daily rainfall over West Africa, with KACE-1-0-G specifically showing lower mean monsoon rainfall values over the Guinea Coast. GFDL-ESM4, CanESM5, and MIROC6 overestimate mean monsoon rainfall, with CanESM5 overestimating over the highlands of the Cameroon mountains, the Sierra Leone coast, and the Guinea mountains. MIROC6 overestimates the mean monsoon rainfall from central Nigeria through to the Sierra Leonean coast, and GFDL-ESM4 overestimates across the whole region even though it represents the expected spatial pattern of the rainfall climatology.

Fig. 2 Mean JJAS rainfall climatology (mm/day) from the GPCP, CHIRPS, and TAMSAT observational datasets and each of the CMIP6 models over West Africa for the period 1997–2014
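A seasonal climatology of this kind amounts to averaging all June to September days in the record at each grid point. The sketch below shows one way to do that with NumPy; the function name `jjas_climatology` and the array layout (time, lat, lon) are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def jjas_climatology(pr, dates):
    """Mean daily precipitation over all June-September days.

    pr: array of shape (time, lat, lon) in mm/day.
    dates: array of numpy datetime64 values, one per time step.
    """
    dates = np.asarray(dates, dtype="datetime64[D]")
    # Months since 1970-01, folded into calendar months 1..12.
    months = dates.astype("datetime64[M]").astype(int) % 12 + 1
    in_jjas = (months >= 6) & (months <= 9)
    return np.asarray(pr, dtype=float)[in_jjas].mean(axis=0)
```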

3.2 Rainfall Intensity

The mean intensity of daily rainfall events is calculated and presented in Fig. 3 for the three observational datasets and the CMIP6 GCMs. Table 3 summarizes the root mean squared error (RMSE) between the modeled and observed rainfall intensities from GPCP, CHIRPS, and TAMSAT over the West African region. The underestimation by some CMIP6 models and the overestimation by others lead to variations in RMSE over the region. The RMSEs tend to be higher when the models are compared with TAMSAT, since this observational dataset tends to produce more intense rainfall.

Fig. 3 Mean intensity of daily rainfall events (mm/day) from the GPCP, CHIRPS, and TAMSAT observational datasets and each of the CMIP6 models over West Africa for JJAS 1997–2014

GPCP shows higher mean daily rainfall intensities in the Guinea–Sierra Leone region, extending toward Nigeria with decreasing intensity. In the Sahel, mean intensities of daily rainfall are lower than in the coastal regions. Relative to GPCP, CanESM5, BCC-ESM1, NESM3, MIROC6, and MPI-ESM-1-2-HAM capture the pattern of rainfall intensities over southern West Africa well, although at different intensity levels. The CanESM5, MIROC6, and MPI-ESM-1-2-HAM simulations show lower rainfall intensity than GPCP in the Sahel region. INM-CM5-0, KACE-1-0-G, TaiESM1, MRI-ESM2-0, ACCESS-CM2, FGOALS-g3, and INM-CM4-8 show lower intensities over the region and thus underestimate the pattern observed in GPCP. This is evident in the high RMSEs recorded by these models when compared to GPCP (Table 3). FGOALS-f3-L and GFDL-ESM4, which also record relatively high RMSEs, are marked by their overestimation relative to the observational dataset. Similar RMSE behavior is observed against CHIRPS, indicating similar errors in the CMIP6 datasets against both GPCP and CHIRPS. For instance, MPI-ESM1-2-h gives the best representation, with RMSEs of 0.50 mm/day and 0.45 mm/day against GPCP and CHIRPS, respectively, while FGOALS-f3-L shows the largest errors, 3.79 mm/day and 4.02 mm/day. The picture is different for TAMSAT: GFDL-ESM4 is the best model for rainfall intensity, while FGOALS-g3 shows the largest error. A common feature of the GCMs is the underestimation of daily rainfall intensities in the Sahel regions. Despite the disparities in intensities, the majority of the GCMs depict the general pattern represented by the observations. The difficulty in accurately simulating monsoon rainfall intensities is consistent with previous studies using CMIP Phase 5 models (e.g., Akinsanola and Zhou 2019a) and seems to persist in the new state-of-the-art CMIP6 models.

3.3 Frequency of Wet Days

The mean frequency of wet days, expressed as a percentage of the total seasonal days, is calculated and shown in Fig. 4. The observations show broadly similar patterns, with wet-day frequencies typically between 30 and 70%, except over orographic regions (e.g., the Cameroon mountains and Guinea highlands), which experience about 80% wet days. This is consistent across all the observations. Among the ensemble members, the majority (e.g., MPI-ESM1-2-LR, MPI-ESM1-2-h, CanESM5, INM-CM5-0, IPSL-CM6A-LR, TaiESM1, MRI-ESM2-0, MIROC6, INM-CM4-8, SAM0-UNICON, NESM3, ACCESS-ESM1-5, MPI-ESM-1-2-HAM, and GFDL-ESM4) overestimate the frequencies, with values ranging between 80 and 100%. In contrast to the other models, FGOALS-f3-L shows a frequency distribution similar to that of GPCP but does not extend wet days as far north as seen in GPCP. In general, all ensemble members capture the pattern of wet days but with large discrepancies in their frequencies. Despite the weak performance of the models in capturing some daily precipitation features over West Africa, there seems to be fair agreement between the CMIP6 simulations and the observations in rainfall frequency, unlike in rainfall intensity.

Fig. 4 Same as Fig. 3 but for mean frequency of wet days (expressed in percent of total seasonal days)

In Fig. 5, the mean frequencies of heavy rainfall events are expressed as the number of days within the monsoon season. All three observational datasets (GPCP, CHIRPS, and TAMSAT) record approximately the same frequencies of heavy rainfall events along the Guinea highlands, with some disparities along the Cameroon mountains. CanESM5, MRI-ESM2-0, NESM3, and ACCESS-ESM1-5 show some similarity to the pattern of frequencies in the observational datasets, but mostly with different frequencies of events. MPI-ESM1-2-h, INM-CM5-0, KACE-1-0-G, BCC-CSM2-MR, IPSL-CM6A-LR, INM-CM4-8, ACCESS-CM2, and MPI-ESM-1-2-HAM present lower frequencies of heavy rainfall events. BCC-CSM2-MR, INM-CM4-8, and MPI-ESM1-2-h show frequencies ranging between 3 and 13 days along the Guinea Coast–Savanna regions, which are underestimations relative to the observational datasets. In MIROC6, NESM3, BCC-ESM1, FGOALS-g3, and GFDL-ESM4, the frequencies over Sierra Leone and Guinea, at 10 to 25 days, are overestimated relative to the observations. The majority of CMIP6 ensemble members capture the pattern, with the highest frequencies along the Guinea–Sierra Leone stretch and the Cameroon highlands. The models can be ranked by their RMSE values (Table 4). IPSL-CM6A-LR has the best representation of the frequency of heavy rainfall, with 0.86 mm/day and 0.83 mm/day when compared with CHIRPS and TAMSAT, respectively, while TaiESM1 gives the best performance, with 0.85 mm/day, when compared with GPCP. It is interesting to note that MIROC6 and GFDL-ESM4 record the largest errors against all three observational datasets.

Fig. 5 Same as Fig. 3 but for mean frequency of heavy rainfall events (expressed in days within the season)

3.4 Daily Extreme Precipitation Events (95th Percentile)

Extreme precipitation, represented by the 95th percentile of daily rainfall, is presented in Fig. 6. The model simulations vary widely among themselves and relative to GPCP. Relative to GPCP, CanESM5, BCC-ESM1, FGOALS-f3-L, MIROC6, and GFDL-ESM4 simulate higher extreme precipitation, with spatial variations, while MPI-ESM1-2-LR, MPI-ESM1-2-h, KACE-1-0-G, INM-CM5-0, NESM3, ACCESS-CM2, TaiESM1, MRI-ESM2-0, NorESM2-MM, ACCESS-ESM1-5, FGOALS-g3, and MPI-ESM-1-2-HAM depict similar patterns. It is important to note that MIROC6, BCC-ESM1, CanESM5, and GFDL-ESM4 show higher estimates than all observational datasets over the coast. The overestimation is spatially uneven and cannot be generalized. For instance, TaiESM1, MRI-ESM2-0, FGOALS-g3, and IPSL-CM6A-LR mostly overestimate rainfall extremes over the Guinea highlands and along the Cameroon mountains. The southern parts of the region (Guinea Coast–Savanna) show high extremes, whereas the northern parts (Sahel–Sahara) show lower values; that is, the southern regions experience more extreme rainfall events than the northern regions. INM-CM4-8, on the other hand, tends to produce a lower 95th percentile over the entire domain due to lower intensities and a lower number of heavy precipitation events, as mentioned earlier. The performance of the models in terms of RMSE is presented in Table 5. As for the frequency of heavy rainfall events, MIROC6 and GFDL-ESM4 show the largest errors in representing rainfall extremes when compared with all observational datasets. On the other hand, INM-CM4-8 and INM-CM5-0 seem to perform well in representing extreme events as seen in the observational datasets. In the case of TAMSAT, ACCESS-CM2 does well in representing extreme events, with a minimum error of 1.26 mm/day.

Fig. 6 Same as Fig. 3 but for mean 95th percentile of daily rainfall events (mm/day)

The intercomparison of the relative standard deviation (RSD) of observational datasets and CMIP6 datasets during the monsoon season is displayed in Table 6. INM-CM4-8 and INM-CM5-0 display less deviation from their long-term average rainfall characteristics, with FGOALS-f3-L recording the highest deviation in rainfall intensity and extreme precipitation at the 95th percentile, while MPI-ESM1-2-LR shows the highest deviation in the frequency of heavy rainfall.

3.5 Mean Maximum Wet Spell Length

Comparisons between CMIP6 simulations and observations are shown in Fig. 7 for the mean maximum wet spell length. The main features of the wet spell length patterns are captured by the models, but with errors in their amplitude and exact location. Of the 21 models used in the study, FGOALS-f3-L and BCC-ESM1 show results consistent with observations. The remaining models present a wetter climate than observed, with KACE-1-0-G and BCC-CSM2-MR showing a wetter climate than observed only over Liberia, Sierra Leone, north of Nigeria, and Cameroon. In all the other models, wetness tends to dominate in south-western Africa. Some regions in the midlatitudes experience slight wetness but these features are robustly simulated by the models. Overall, the majority of the models considered are inconsistent with observations.

Fig. 7 Same as Fig. 3 but for mean maximum wet spell length (expressed in percent of total seasonal days)

3.6 Mean Maximum Dry Spell Length

The mean maximum dry spell length is calculated and presented spatially in Fig. 8. The models simulate different magnitudes of dry spell length with high spatial variability. For instance, MPI-ESM1-2-LR, MPI-ESM1-2-h, FGOALS-f3-L, ACCESS-CM2, MRI-ESM2-0, NESM3, ACCESS-ESM1-5, FGOALS-g3, MPI-ESM-1-2-HAM, and GFDL-ESM4 show longer spell lengths, ranging between 70 and 100 days, contrary to the observations. In addition, these models show dry spells over a wider area than the observations. In the observational datasets, the longest spell lengths are seen around northern Niger, but these are not represented by the models. Most of the models show longer dry spells around 15° N. In addition, the dry spell length observed around the Cameroonian highlands is not captured in any of the CMIP6 simulations.

Fig. 8 Same as Fig. 3 but for mean maximum dry spell length (expressed in percent of total seasonal days)
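The wet and dry spell lengths discussed in Sects. 3.5 and 3.6 are the longest runs of consecutive wet or dry days. A minimal sketch of that calculation is given below; the function names are hypothetical, the 1 mm wet-day threshold follows the definition in the methodology, and missing values would be counted as dry days in this simplified form.

```python
import numpy as np

def max_run_length(flags):
    """Longest run of consecutive True values along the time axis (axis 0)."""
    flags = np.asarray(flags, dtype=bool)
    run = np.zeros(flags.shape[1:], dtype=int)
    longest = np.zeros(flags.shape[1:], dtype=int)
    for day in flags:                    # step through the time axis
        run = np.where(day, run + 1, 0)  # extend the current run or reset it
        longest = np.maximum(longest, run)
    return longest

def spell_lengths(pr, wet=1.0):
    """Return (max wet spell, max dry spell) in days for each grid point.

    pr has shape (time, ...) in mm/day. NaN values compare as not wet,
    so they are treated as dry days here.
    """
    pr = np.asarray(pr, dtype=float)
    is_wet = pr >= wet
    return max_run_length(is_wet), max_run_length(~is_wet)
```

Applying `spell_lengths` to each JJAS season separately and averaging the results over the years would give a mean maximum spell length per grid point.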

4 Summary and Conclusions

The ability of GCMs to simulate robust precipitation change at the daily timescale is important to society and to decision and policymakers managing water resources, drought and flood monitoring, and agriculture. However, identifying and quantifying daily precipitation events as a proxy to understand and assess climate change over Africa is a difficult task. Thus, it is important to assess the performance of the new state-of-the-art CMIP6 models in representing daily precipitation characteristics over West Africa.

In this study, we present an evaluation and intercomparison of the daily precipitation characteristics and extremes of 21 GCMs from the CMIP6 ensemble. Three observational datasets (GPCP, CHIRPS, and TAMSAT) were used for the validation of the model simulations. The results show that the observational datasets present nearly the same spatial pattern and magnitude of rainfall characteristics, but the models show substantial discrepancies in comparison. The focus is on daily rainfall characteristics, namely the mean rainfall climatology and extreme indices such as the intensity of rainy days, the frequency of heavy rainfall, and extreme events, as well as the mean maximum length of dry and wet spells, within the JJAS monsoon season over West Africa from 1997 through 2014. The models were further subjected to statistical assessments to quantify their performance relative to the observations, using the root mean squared error (RMSE) and the relative standard deviation (RSD). The RMSE quantifies the error of a model relative to the observations, and the RSD indicates whether the standard deviation of a dataset is small or large compared with its mean.

The observational datasets, regardless of their differences, agree in most of the statistics and spatial representations. For precipitation intensity, the observations present a similar pattern and magnitude. Similarly, the simulations reproduce the general pattern of the monsoon season seen in the observations. However, considerable discrepancies exist among the simulations and relative to the observations. For example, substantial differences exist in terms of mean precipitation climatology, the intensity of rainy days, and the frequency, extremes, and duration of rainfall events during the WAM period. FGOALS-f3-L and GFDL-ESM4 present more intense daily rainfall events. The models depict a frequency of wet days similar to that of the observations but present much higher intensity in the Guinea Highlands and Cameroon mountains, resulting from the simulation of a larger number of heavy precipitation events, indicating more intense extremes. Specifically, more intense extremes are seen in the 95th percentiles of CanESM5, BCC-ESM1, FGOALS-f3-L, BCC-CSM2-MR, MIROC6, and GFDL-ESM4. All the models overestimate the mean maximum wet spell length except FGOALS-f3-L, which presents a magnitude similar to that of the observations. The spatial patterns of the mean maximum dry spell length, expressed as a percent of total seasonal days, show good general agreement between the models and the observations, except for NESM3, FGOALS-f3-L, MPI-ESM1-2-h, and MPI-ESM1-2-LR, which overestimate the dry spell length in the Sahara subregion. The computed RMSE values show the weakness of MIROC6 and GFDL-ESM4 in representing extreme indices such as the frequency of heavy rainfall and 95th percentile extremes. For the frequency of heavy rainfall, TaiESM1 performs best when compared with GPCP, and IPSL-CM6A-LR performs best when compared with CHIRPS and TAMSAT. This is different for the 95th percentile extremes, where the INM-CM5-0 and INM-CM4-8 models are closer to all observational datasets than the rest of the models.

The investigation of the individual CMIP6 models suggests that skill generally varies from model to model and spatially. The difficulty in attributing discrepancies in individual model datasets may stem from systematic differences in how the models represent hydrological processes, which are vital in precipitation formation. It can be inferred from the present study that no single model exhibits all features of the observational datasets. The discrepancies in the simulations leave room for improvement of the models in the next version of CMIP. The results offer useful information about precipitation in CMIP6 over West Africa and can serve as a reference for the new generation of climate models over the continent.