Improving the climate data management in the meteorological service of Angola: experience from SASSCAL

Knowledge of climate variability in parts of Southern Africa is limited because of the low availability of historic and present-day ground-based observations (Niang et al., 2014). However, there is an increasing need for climate information for research, climate adaptation measures and climate services. To respond to the challenges of climate change and related issues, Angola, Botswana, Germany, Namibia, South Africa and Zambia have initiated the interdisciplinary regional competence centre SASSCAL, the “Southern African Science Service Centre for Climate Change and Adaptive Land Management”. As part of the initiative, Germany’s national meteorological service (Deutscher Wetterdienst, DWD) cooperates with the meteorological services of Angola, Botswana and Zambia in order to improve the management and availability of historical and present-day climate data in these countries. The first results of the cooperation between the German and the Angolan Meteorological Services are presented here. International assessments have shown that improvements of the data management concepts are needed in several countries. The experience of this cooperation can therefore provide guidance for comparable activities in other regions.


Introduction
The global climate change issue is stretching the requirements for climate data and data management systems far beyond those conceived when the original observational networks were established (WMO, 2011). However, the lack of adequate data and observation systems in Africa seriously hinders the ability of scientists to assess the past and current state of the climate in the region (ACC, 2013). This also applies to parts of Southern Africa, where knowledge of climate variability is limited because of the low availability of historic and present-day ground-based observations (Niang et al., 2014). To help reduce this lack of data, the Southern African Science Service Centre for Climate Change and Adaptive Land Management (SASSCAL) initiative supports, among others, activities focused on the installation and repair of Automatic Weather Stations (AWS) within the SASSCAL region: Angola, Botswana, Namibia, South Africa and Zambia (Kaspar et al., 2015a). The current AWS network of SASSCAL comprises 132 AWSs.
To meet the expanding need for climate information for research, climate adaptation measures and climate services, it is very important that climate data, both current and historical, are managed in a systematic and comprehensive manner (WMO, 2011).
Climatological data are most useful if they are edited, quality-controlled and stored in a national archive or climate centre and made readily accessible in easy-to-use forms. Although technological innovations are occurring at a rapid pace, many climatological records held by National Meteorological and Hydrological Services (NMHSs) are still in non-digital form. These records must be managed along with the increasing quantity of digital records. A climate data management system (CDMS) is a set of tools and procedures that allows all data relevant to climate studies to be properly stored and managed (WMO, 2011). A well-constructed CDMS facilitates all the key processes associated with data collection, quality assurance and archival, and is central to the development of all interactive data and information services (GFCS, 2014).
In this context, the SASSCAL initiative supports an activity focused on improving the management of historical and ongoing climate data at the NMHSs of Angola, Botswana, Namibia and Zambia in cooperation with Germany's national meteorological service (Deutscher Wetterdienst, DWD). However, the NMHSs that actually showed interest in cooperating with the DWD to improve their data management were the Instituto Nacional de Meteorología e Geofísica of Angola (INAMET), the Department of Meteorological Services of Botswana (DMS) and the Zambia Meteorological Department (ZMD).
An overview of the status and first results of the cooperation between the DWD and INAMET is presented here.

Management of historical and current climate data
The cooperation between the NMHSs is focused on improving the management of climate data in each country in order to make them available for scientific use and for decision makers and diverse stakeholders. More specifically, the cooperation focuses on: implementing a Climate Data Management System (CDMS); developing operational concepts for the CDMS; archiving current meteorological observations; collecting and archiving already digitized historic climate data; digitizing and archiving historic climate data; and capacity building on climate data management.
The core of the cooperation is to implement a reliable CDMS, defined as "an integrated computer-based system that facilitates the effective archival, management, analysis, delivery and utilization of a wide range of integrated climate data" (WMO, 2014).
During the last decades different CDMSs with diverse development approaches have been in use in developing countries (see Stuber et al., 2011), and the delegates of the NMHSs of Angola, Botswana, Zambia and Germany had the opportunity to discuss the different options available during a SASSCAL Workshop held in Namibia in April 2014. At the end of the Workshop all participants agreed that CLIMSOFT ("CLIMatic SOFTware") was the preferred option, considering that all countries had used this software at some point (Hänsler, 2014).
CLIMSOFT was first developed by an African team of three developers located in Zimbabwe (namely Albert Mhanda), Kenya (Samuel Machua) and Guinea (Barry Aziz) to provide a free and easy-to-use CDMS for developing countries (Stuber et al., 2011). As described in Kaspar et al. (2015a), it has an intuitive Graphical User Interface with a key-entry module, quality control procedures and data import options which allow the import of data from various sources, including data from automatic weather stations. The software is currently based largely on the database management system (DBMS) Microsoft® Access© and Microsoft Visual Basic 6®, but modifications to the software and a switch to open-source database systems are currently in preparation. One of the most important features of the CLIMSOFT software is its free availability and the fact that it is increasingly supported by a large community of developers. The next version of CLIMSOFT is being developed based on the "Climate Data Management System Specifications" of the WMO (WMO, 2014), so that most of the required components are included in the software.

Development of a new data flow at INAMET
The contact between the DWD and INAMET began in 2013 under the SASSCAL framework, but it was not until April 2014 that an evaluation of the facilities and resources concerning climate data management was carried out. At that time, the Angolan NMHS did not have an operational CDMS and the records of manual weather stations were key-entered in Microsoft® Excel© (hereafter MS-Excel) tables. These tables were saved in files with inconsistent naming on isolated PCs, which made it difficult to identify which data were already digitized and where they were located. Furthermore, an up-to-date inventory of the weather station network was not available, and no unique identifier was assigned to each station. Data recorded by AWS were automatically sent to a server located at INAMET in ASCII format (text files). The need for improved documentation of data management procedures was also acknowledged.
After recognizing the needs at INAMET, a data flow structure was designed to avoid inconsistencies in the storage of climate data (see Fig. 1). As a first step, a unique local identifier was assigned to each weather station operated by INAMET. For manual weather stations, the identifier was assigned based on an on-paper inventory which was available at the headquarters, whereas for AWS a unique number was newly assigned to each station. According to the agreement reached during the SASSCAL Workshop in April 2014, CLIMSOFT v3.2 was installed as the CDMS in which meteorological data from both manual and automatic weather stations are to be stored. The software includes features to key-enter historical data directly into the database. However, it was agreed with the Technical Directorate of INAMET to take advantage of the experience of the personnel in MS-Excel and keep this software as the key-entry system for historical and current data recorded at manual weather stations. Table templates have been created based on the original structure of the MS-Excel tables used previously at INAMET. These templates ensure that the entered data are saved in MS-Excel files with an identical structure. The MS-Excel files with data entered before the new data management strategy was implemented have been re-formatted to match the format of the table templates.
These files are then saved on a server, instead of on isolated computers, using a standardized nomenclature. The technicians responsible for the import of data and maintenance of the CDMS can access the data directly from the server and proceed with the import into CLIMSOFT. For this to be done, a format conversion of the MS-Excel files is needed, since CLIMSOFT requires the data to be saved in comma-separated values (".csv") files. An easy-to-use routine called "conversao e importaçao" has been developed specifically for INAMET to facilitate the conversion and later import of data. This application, programmed in R (R Core Team, 2015), guides the user through the conversion and import of data in three steps: (1) import of station metadata; (2) conversion of MS-Excel files into ".csv" files; and (3) import of the data into CLIMSOFT v3.2 (see Fig. 2). The routine is run once a week to update the database with new key-entered data. The key-entered data available in the database are listed in Table 1.
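The conversion step (2) can be illustrated with a minimal sketch. It is written in Python rather than R and is not the routine used at INAMET. It assumes that the rows of a key-entry template have already been read from the spreadsheet (for example with a spreadsheet library) into one dictionary per day. The column names, the element codes, the station identifier "AO0001" and the sample values are all hypothetical, since neither the template layout nor the CSV layout expected by CLIMSOFT is specified here.

```python
import csv
import io

# Hypothetical mapping from template columns to element codes.
# The real template columns and CLIMSOFT codes are not given in the text.
ELEMENT_COLUMNS = {"tmax": "TMAX", "tmin": "TMIN", "rain": "PRECIP"}


def template_rows_to_csv(station_id, rows):
    """Convert wide template rows (one row per day) into long CSV text
    with one line per station, element and date."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["station_id", "element", "date", "value"])
    for row in rows:
        for column, element in ELEMENT_COLUMNS.items():
            value = row.get(column)
            if value in (None, ""):
                continue  # skip cells left empty in the key-entry sheet
            writer.writerow([station_id, element, row["date"], value])
    return buffer.getvalue()


# Made-up sample rows, for illustration only.
rows = [
    {"date": "2014-06-06", "tmax": 27.4, "tmin": 15.1, "rain": 0.0},
    {"date": "2014-06-07", "tmax": 26.9, "tmin": None, "rain": 1.2},
]
csv_text = template_rows_to_csv("AO0001", rows)
print(csv_text)
```

Writing one record per station, element and date keeps missing cells out of the file instead of encoding them as placeholder values, which is one possible design choice among several.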
Concerning the data recorded by AWS, CLIMSOFT provides a tool to import them automatically into the database. This tool is being used to import the data of the AWSs installed by SASSCAL and operated by INAMET. The tool checks every two hours whether there are new records from the AWS and, if so, they are entered automatically into the database.
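The logic of such a periodic import can be sketched as follows. This is a Python illustration of the general idea and not the CLIMSOFT tool itself. The record format "station_id;YYYY-MM-DD HH:MM;value", the station names and the values are assumptions made for the example; the actual ASCII format of the AWS files is not described here.

```python
from datetime import datetime


def parse_aws_line(line):
    """Parse one hypothetical ASCII record: 'station_id;YYYY-MM-DD HH:MM;value'."""
    station_id, stamp, value = line.strip().split(";")
    return station_id, datetime.strptime(stamp, "%Y-%m-%d %H:%M"), float(value)


def select_new_records(lines, last_imported):
    """Return the records that are newer than the last imported timestamp
    of their station, and update `last_imported` in place."""
    new_records = []
    for record in sorted(parse_aws_line(line) for line in lines if line.strip()):
        station_id, stamp, _ = record
        if station_id not in last_imported or stamp > last_imported[station_id]:
            new_records.append(record)
            last_imported[station_id] = stamp
    return new_records


# Made-up state and input, for illustration only.
last_imported = {"AWS01": datetime(2014, 12, 22, 23, 50)}
lines = [
    "AWS01;2014-12-22 23:50;21.3",  # already imported, ignored
    "AWS01;2014-12-23 00:00;21.1",  # new record
    "AWS02;2014-12-23 00:00;19.8",  # first record of a new station
]
new_records = select_new_records(lines, last_imported)
print(len(new_records))
```

A scheduler would call `select_new_records` every two hours; keeping the last imported timestamp per station means that re-reading the same file does not create duplicate entries.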

Quality control
CLIMSOFT provides a number of quality control (QC) checks that are being used at INAMET to control the quality of the data stored in the main database. They include:
- Limits check (based on standard deviation). It checks which observed values of a given element are above (below) a specific upper (lower) threshold. The thresholds are calculated as the monthly mean of that element plus (minus) "n" times the standard deviation.
- Absolute limits check. It checks which observed values of a given element are above (below) a specific upper (lower) threshold. The thresholds are defined by the user in CLIMSOFT and can be modified according to the climate pattern of the country.
- Inter-element comparison. It compares (a) maximum temperature against minimum temperature; (b) dry bulb temperature against wet bulb temperature; (c) maximum temperature against dry bulb temperature; and (d) dry bulb temperature against dew point temperature.
- Daily range. It checks whether the daily differences between maximum and minimum temperature are above (below) the mean long-term difference plus (minus) "n" times the standard deviation.
- Consecutive days consistency check. It checks whether the difference of observed values between consecutive days is above (below) the monthly mean difference plus (minus) "n" times the standard deviation.
- Consecutive hours consistency check. It checks whether the difference of observed values in consecutive hours is above (below) the monthly mean plus (minus) "n" times the standard deviation.
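Three of the checks listed above can be illustrated with a short Python sketch. It is not the CLIMSOFT implementation. In particular, it assumes that the mean and standard deviation are computed from the same sample that is being checked, whereas CLIMSOFT may derive them from a longer record. The temperature values are made up, with 72.6 standing for a plausible key-entry error.

```python
from statistics import mean, stdev


def std_limits_check(values, n=3.0):
    """Return indices of values outside mean +/- n * standard deviation.
    `values` is assumed to hold one element for one calendar month."""
    centre, spread = mean(values), stdev(values)
    lower, upper = centre - n * spread, centre + n * spread
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def absolute_limits_check(values, lower, upper):
    """Return indices of values outside user-defined absolute limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def tmax_tmin_check(tmax, tmin):
    """Return indices of days where maximum temperature is below minimum temperature."""
    return [i for i, (high, low) in enumerate(zip(tmax, tmin)) if high < low]


# Made-up daily maximum temperatures in degrees Celsius.
tmax = [27.0, 26.5, 28.1, 27.3, 26.8, 27.9, 72.6, 27.2, 26.9, 27.5]

flagged_std = std_limits_check(tmax, n=2.0)
flagged_abs = absolute_limits_check(tmax, lower=-5.0, upper=50.0)
flagged_pair = tmax_tmin_check([27.0, 14.2], [15.1, 16.0])
print(flagged_std, flagged_abs, flagged_pair)
```

With this short sample, "n" = 3 would not flag the suspicious value, because a single outlier inflates the standard deviation computed from the same data. This illustrates why the absolute limits check is a useful complement to the standard-deviation check.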
Besides the QC checks provided by CLIMSOFT, an easy-to-use tool has been developed with R to carry out a visual QC of the data stored in CLIMSOFT. This tool allows the user to create a variety of plots (time series, histograms and wind roses) to facilitate the identification of inconsistencies within the datasets. For instance, time series plots led to the identification of rapid changes exceeding a specified threshold and to the detection of gaps within a dataset, as shown in Fig. 3. The interactivity of these graphics is intended to let the user obtain information from the dataset by navigating through the graphic. These graphical products are being developed as part of a collaborative development project called ClimateObject, an R package that aims to create a large number of graphical products based on climate data. More information about this project can be found on GitHub, a code-hosting repository based on the Git version control system (Dabbish et al., 2012): https://github.com/StatisticalServicesCentre/ClimateObject. These new features have been operational at INAMET since January 2016.
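The two kinds of inconsistency mentioned above, gaps and rapid changes, can also be detected numerically. The following Python sketch is a non-graphical counterpart to the visual inspection and is not part of the R tool or of ClimateObject. It assumes a regular 10 min sampling interval, as for the SASSCAL AWS in Fig. 3; the timestamps, values and jump threshold are made up for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(minutes=10)):
    """Return (start, end) pairs where consecutive timestamps are
    more than `step` apart. `timestamps` must be sorted."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > step]


def find_jumps(values, threshold):
    """Return indices i where the absolute change from values[i - 1]
    to values[i] exceeds `threshold`."""
    return [
        i for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > threshold
    ]


# Made-up 10 min series with a multi-day interruption.
timestamps = [
    datetime(2014, 12, 22, 23, 40),
    datetime(2014, 12, 22, 23, 50),
    datetime(2014, 12, 27, 0, 0),
    datetime(2014, 12, 27, 0, 10),
]
temperatures = [21.3, 21.1, 29.4, 29.2]

gaps = find_gaps(timestamps)
jumps = find_jumps(temperatures, threshold=5.0)
print(gaps, jumps)
```

In this example the jump at index 2 coincides with the end of the gap, so a change across an interruption should be interpreted differently from a change between two adjacent 10 min readings.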

Collection of historical data
Another task carried out as part of the cooperation is the collection of historical and current data from Angola that are stored in international archives. To date, a total of seven different datasets with digitized meteorological data from Angola have been identified (see Table 2).
The data were either downloaded or directly requested from these sources. These data are being stored in a separate Microsoft® Access© database until they are checked against the data stored in the INAMET database. This ensures that no inconsistencies are introduced into the main database of the NMHS. This second database is also accessible with CLIMSOFT.
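A check of the externally collected data against the main database could be sketched as follows. This Python example only illustrates the idea and does not describe a procedure implemented at INAMET. It assumes that both sources can be represented as dictionaries keyed by station, element and date; the keys, values and tolerance are hypothetical.

```python
def cross_check(main_db, external, tolerance=0.1):
    """Compare two dictionaries keyed by (station_id, element, date).

    Returns the keys that exist only in `external` and the keys whose
    values differ by more than `tolerance` between the two sources."""
    missing_in_main = sorted(key for key in external if key not in main_db)
    conflicts = sorted(
        key for key in external
        if key in main_db and abs(external[key] - main_db[key]) > tolerance
    )
    return missing_in_main, conflicts


# Made-up records, for illustration only.
main_db = {
    ("AO0001", "TMAX", "1961-01-01"): 30.2,
    ("AO0001", "TMAX", "1961-01-02"): 29.8,
}
external = {
    ("AO0001", "TMAX", "1961-01-01"): 30.2,
    ("AO0001", "TMAX", "1961-01-02"): 28.9,
    ("AO0001", "TMAX", "1961-01-03"): 31.0,
}

missing_in_main, conflicts = cross_check(main_db, external)
print(missing_in_main, conflicts)
```

Records that are missing in the main database are candidates for import, whereas conflicting records need to be resolved, for example by consulting the original paper record, before anything is merged.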
More meteorological data from Angola might still exist in non-digital form in the archives of other meteorological services, such as the DWD (Kaspar et al., 2015b), which could complement the dataset already available at INAMET.
Besides the collection of data, INAMET and the Global Precipitation Climatology Centre (GPCC) have reached an agreement under which INAMET will supply additional daily precipitation data from its weather stations to contribute to the global gridded precipitation analyses carried out by the GPCC (Becker et al., 2013). This will substantially improve the data coverage and consequently the reliability of the GPCC precipitation analyses across Angola.

Capacity building
An important aspect of SASSCAL is its educational nature. Having the capacity building activities as one of the main objectives of the initiative, the cooperation between DWD

Conclusions
Efforts have been made within SASSCAL to improve the availability of climate data in Southern Africa. Besides the installation and repair of AWSs, activities focused on improving the management of historical and current climate data are being carried out in the NMHSs of the region in cooperation with the DWD. The experience described here shows the steps taken at INAMET in this matter. A new data flow scheme has been developed and implemented, and CLIMSOFT v3.2 has been installed to serve as the core of a proper CDMS, allowing the management of historical and ongoing climate data at INAMET. Furthermore, the R programming language has been used to facilitate visual quality controls and navigation through the data stored in the CLIMSOFT database, as well as to allow the migration of data from MS-Excel sheets to the database. The R routines developed can therefore be seen as part of the CDMS, since they extend the functionality of CLIMSOFT. The Climate Data Management System Specifications of the WMO (WMO, 2014) themselves state that a CDMS is not expected to contain all of its functionality within a single software package.
These actions are intended to provide a long-term and sustainable solution for the management of climate data at the NMHS. Although the cooperation between the DWD and INAMET is still ongoing, the experience so far has shown that:
- An evaluation of the human and technical resources at the NMHS is essential to recognize the actual local capacities.
- The actions aimed at improving the management of climate data should fit the local capacities to ensure their implementation.
- A complete overview of the work flow at the NMHS is required prior to taking any action concerning climate data management.
- The directorate of the NMHS has to agree with the actions proposed and should encourage the personnel to implement them.
- It is important that the personnel see the advantages of implementing a new system.
- The CDMS should take advantage of the local expertise when new tools are developed. For example, at INAMET, it was decided to keep MS-Excel as the key-entry system because the staff were very familiar with this software.
- Capacity building activities are essential to make people aware of the importance of having a comprehensive CDMS. At INAMET, basic concepts such as metadata or homogenization were introduced to encourage the employees in charge of the management of climate data to treat the data carefully.
- The implementation process should be monitored so that assistance can be given as soon as any problem arises.
The joint interest of DWD and INAMET is to maintain the cooperation until the end of 2017 to monitor the improvements made so far and also to take a closer look at other aspects related to climate data management, such as (a) cross-checking Angolan data obtained from international datasets with those stored at INAMET, (b) carrying out quality control of the digitized data stored in the CDMS and (c) locating and organizing on-paper historical records and, if required, supporting data rescue activities for data not yet digitized.
By the end of the cooperation, the improved accessibility of the observational data will allow these data to be used for verification of the national weather forecast run at INAMET. This is of high priority for the NMHS as it will support the provision of more accurate products to end-users and stakeholders, including the scientific community. Understanding predictability and predictive skills of numerical weather prediction models is an important area of research on the weather and seasonal scale (Bauer et al., 2015).

Data availability
Some of the international datasets described in Table 2 provide on-line access to Angolan data. This is the case of the Global Historical Climatology Network (GHCN), the River Basin Information System for SASSCAL (RBIS), the International Surface Temperature Initiative (ISTI) and the Carbon Dioxide Information Analysis Center (CDIAC). The links and references to these datasets are provided in the table. The other sources mentioned in Table 2 (i.e. the German Meteorological Service, DWD; the Global Precipitation Climatology Centre, GPCC; and the Instituto Dom Luiz, IDL) provided the data to INAMET on request.

Figure 1. Current dataflow at INAMET: from the key entry of data in MS-Excel tables at the Data Centre to the import of the data into CLIMSOFT through R.

Figure 3. Time series of air temperature in Dambo, Uige (Angola). 10 min measurements made by a SASSCAL AWS (top panel) between 6 June 2014 and 17 March 2015. The bottom panel shows a zoom of the data gap between 23 and 27 December 2014.

Figure 4. Photo of the training on Climate Data Management at INAMET in March 2015.

Table 1. Overview of Angolan data imported into the CLIMSOFT database from the MS-Excel sheets.

Table 2. Overview of Angolan data availability in international datasets.