Last updated: 5 November 2024.
Highlights
This final chapter is designed to provide data providers in national statistical authorities with some background as to how city statistics should be transmitted to Eurostat. It also includes information about a range of tools that have been made available to assist data providers in validating their data files prior to transmission.
This chapter forms part of Eurostat’s City statistics manual.
Preparing data for transmission
Data must be transmitted in *.csv (comma-separated values) format. Data files should be structured according to the example *.csv file presented below (Table 21), respecting the order of the columns.
Data transmission rules
Please note that
- data for different years or data for different variables can be combined into a single file
- data should be provided in a single worksheet in MS Excel
- there should be no blank rows between records
- extra columns or extra codes won’t be recognised
- the individual data elements within each record shouldn’t have any empty spaces (blanks) between alphanumeric characters or between numeric values, nor should there be any spaces at the end of each element
- a data element (individual element between the comma separators) must be included in each record for the following fields – city code, variable code, reference period (year) and value; if data aren’t available for 1 or more cities, for example, in a specific reference year, then those records should be excluded from the file (there should be no records with empty elements for the first 4 fields)
- in the case of the final 2 fields – flags and footnotes – if a data element doesn’t contain information, the transmission will still take place and the data will be published without flags and footnotes.
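Several of the rules above can be checked locally before a file is uploaded. The following Python sketch is one possible way to do so, not an official Eurostat tool. It assumes six comma-separated fields in the order city code, variable code, reference period, value, flags, footnotes; the function name `check_records` and the codes `XX001C` and `VAR001` are hypothetical placeholders.

```python
import csv

N_REQUIRED = 4  # city code, variable code, reference period, value
N_FIELDS = 6    # the four above plus flags and footnotes (assumed order)


def check_records(text, has_header=False):
    """Return (line_number, message) pairs for records that break the rules."""
    problems = []
    lines = text.splitlines()
    start = 2 if has_header else 1  # skip the column names if present
    for line_no, raw in enumerate(lines[start - 1:], start=start):
        if not raw.strip():
            problems.append((line_no, "blank row"))
            continue
        fields = next(csv.reader([raw]))
        if len(fields) != N_FIELDS:
            problems.append(
                (line_no, f"expected {N_FIELDS} fields, found {len(fields)}")
            )
            continue
        # Only the first four fields are mandatory.
        for position, field in enumerate(fields[:N_REQUIRED], start=1):
            if field == "":
                problems.append((line_no, f"field {position} is empty"))
            elif " " in field:
                problems.append((line_no, f"field {position} contains a space"))
    return problems


# Placeholder records: the second has a stray space, the third is blank,
# the fourth has no reference period.
sample = (
    "XX001C,VAR001,2023,52916,,\n"
    "XX001C,VAR001,2022, 52916,P,\n"
    "\n"
    "XX002C,VAR001,,100,,\n"
)

for line_no, message in check_records(sample):
    print(line_no, message)
```

The sketch treats every empty value as an error, so it would also report records that deliberately leave the value field empty in order to delete previously sent data.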
City code and variable code
The data transmission is the result of continuous collaboration between Eurostat and the national statistical offices. The city codelist and variable codelist are an integral part of the documentation when grant agreements are reached between Eurostat and the national statistical offices for the data collection of city and subnational statistics. Complete codelists used within the city statistics data collection exercise are available on Eurostat’s website.
Reference period
In general, the reference period is a calendar year; the data in the reference period field should be encoded as a 4-digit value (for example, 2022, 2023, and so on). There are some specific variables that use a different reference period, for example, population counts generally refer to 1 January of each year and enrolment data for school/academic years are classified to the calendar year in which the school/academic year finishes. Any deviations from the prescribed reference period must be reported in the metadata.
Values
The ‘value’ field should only contain numerical characters. Characters such as ‘:’, ‘Z’ or ‘..’ shouldn’t be used when preparing files for data transmission. If a particular value has decimal places, then the decimal separator should be set as a decimal point ‘.’ and not a decimal comma ‘,’. For values that are greater than a thousand, there is no need to use a thousand separator (for example, a value should read ‘52916’ rather than ‘52 916’ or ‘52,916’).
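The encoding rules for the reference period and value fields lend themselves to simple pattern checks. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that values are non-negative, since the text only mentions numerical characters and a decimal point.

```python
import re

# A reference period is exactly four digits, for example 2023.
YEAR_PATTERN = re.compile(r"[0-9]{4}")

# A value is digits, optionally followed by a decimal point and more digits.
# Assumption: no sign, no thousand separator, no exponent.
VALUE_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?")


def is_valid_year(text):
    """True if the text is a 4-digit reference period."""
    return YEAR_PATTERN.fullmatch(text) is not None


def is_valid_value(text):
    """True if the text is a plain number using a decimal point."""
    return VALUE_PATTERN.fullmatch(text) is not None


for candidate in ["52916", "52 916", "52,916", "12.5", ":"]:
    print(repr(candidate), is_valid_value(candidate))
```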
Flags
Flags are subject to change and any changes in their meaning will be communicated to data providers (and users) in a timely manner. When preparing files for data transmission, please ensure that all flags use capital letters (such as P or E rather than p or e). A full list of flags is presented in Table 22.
Data transmission rules
Please note that
- the ‘R’ flag (to denote a revision) has been removed and is no longer used
- if the ‘C’ flag (to denote confidential data) or the ‘U’ flag (to denote data of low reliability) are used when transmitting data then the associated values will be suppressed from the final datasets that are published by Eurostat
- it’s possible to combine various flags by simply joining them together, although some combinations of flags aren’t permitted (see Table 23 for more details).
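The flag rules stated above can be sketched in Python as follows. This is a partial illustration under stated assumptions: each flag is taken to be a single letter and combined flags a plain concatenation of letters. Neither the full list of valid flags nor the list of forbidden combinations is reproduced, so neither is checked. The function names are hypothetical.

```python
def check_flags(flags):
    """Return a list of problems found in a flags field (may be empty)."""
    problems = []
    if flags != flags.upper():
        problems.append("flags must use capital letters")
    if "R" in flags.upper():
        problems.append("the 'R' flag is no longer used")
    return problems


def will_be_suppressed(flags):
    """True if the value would be suppressed in the published datasets."""
    # Assumption: single-letter flags, so a substring test is sufficient.
    return "C" in flags or "U" in flags


for flags in ["P", "e", "CE", "R"]:
    print(repr(flags), check_flags(flags), will_be_suppressed(flags))
```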
Footnotes
When transmitting data to Eurostat there is a field in the standard file structure for footnotes. Within this context, footnotes
- aren’t mandatory and should only be used when absolutely necessary
- have a free format
- should contain meaningful text (to provide additional information)
- shouldn’t be too long (no more than 255 characters, otherwise they will be truncated)
- shouldn’t contain diacritics/accents (for example, ‘é’, ‘¨’, ‘¸’ or ‘~’).
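The length and character constraints on footnotes can be tested before transmission. The Python sketch below uses a deliberately strict proxy for the diacritics rule: it reports any non-ASCII character as well as the tilde. This is an assumption and not a rule stated by Eurostat, and the function name `check_footnote` is hypothetical.

```python
MAX_FOOTNOTE_LENGTH = 255


def check_footnote(text, max_length=MAX_FOOTNOTE_LENGTH):
    """Return a list of problems found in a footnote (may be empty)."""
    problems = []
    if len(text) > max_length:
        problems.append(
            f"{len(text)} characters; text beyond {max_length} would be truncated"
        )
    # Assumption: anything outside ASCII, plus '~', counts as a diacritic.
    suspicious = sorted({ch for ch in text if ord(ch) > 127 or ch == "~"})
    if suspicious:
        problems.append("characters to replace: " + " ".join(suspicious))
    return problems


print(check_footnote("Provisional estimate based on register data"))
print(check_footnote("Donnée estimée"))
```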
How to delete previously sent data
In order to delete previously sent data, it’s necessary to resend records with an empty value field for the city code/variable code/reference period combinations that are concerned; this process will delete the existing values in the database. This action should only be used if its purpose is to delete existing data records. In such a case, it’s necessary to add the word ‘erase’ in the footnote column.
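A deletion record as described above could be generated as in the following Python sketch. It assumes the six-field order city code, variable code, reference period, value, flags, footnotes. The helper names and the codes `XX001C` and `VAR001` are hypothetical placeholders.

```python
import csv
import io


def deletion_record(city_code, variable_code, year):
    """Build a record with an empty value and 'erase' as the footnote."""
    return [city_code, variable_code, str(year), "", "", "erase"]


def to_csv(records):
    """Serialise records as comma-separated lines."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(records)
    return buffer.getvalue()


# Placeholder codes, for illustration only.
records = [
    deletion_record("XX001C", "VAR001", 2022),
    deletion_record("XX001C", "VAR001", 2023),
]
print(to_csv(records), end="")
```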
Transmission based on EDAMIS
There is a specific tool for transmitting data, including city statistics, between national statistical offices and Eurostat, called EDAMIS. This is Eurostat’s data transmission program. The EDAMIS Web Application (eWA) is already installed in all national statistical offices and there is a local coordinator in each country who can provide access to the eWA and offer any assistance that might be necessary.
To access EDAMIS, it’s necessary to have an active EU Login account. Once an account has been established, it should be possible to access the EDAMIS dashboard.
In the upper left corner of the dashboard, there is a (data) ‘transmission’ button where it’s possible to send data files or to receive feedback on files that have been sent.
Once the data files have been uploaded, the following steps are necessary
- select URBANREG_AN_A (City statistics annual data collection) as the dataset destination
- select the country that is sending the data
- select the reference year.
Pre-validation process
In order to facilitate the data transmission, Eurostat has made a pre-validation tool (struval V-flows) available within EDAMIS. Through this, files can be tested and feedback on their compliance with the validation rules can be provided, concerning both their structure and content. After uploading a *.csv file on EDAMIS, it can be submitted for pre-validation using the dedicated button ‘Pre-validation only’ – this action isn’t considered as a production transmission. After clicking on this button, the sender will receive an automated notification from EDAMIS containing the EDIT validation report. If there are errors, they will be reported to the sender so that they can be corrected before the data file is resent. If there are no errors when resubmitting the data file for pre-validation, then the data file can be sent using the ‘Official transmission’ button.
Automated validation reports after sending the data
Within EDAMIS, Eurostat has implemented an automated validation tool for data providers. Validation reports are generated after data transmission. Reports are never sent directly (for example, by e-mail) to data providers, due to data confidentiality and security constraints. However, the EDAMIS service may be configured to send an e-mail to inform data providers about the availability of a report; reports can be provided in machine-readable and HTML formats. National statistical offices should check the reports and correct the data, if necessary, or provide explanations (for any anomalies) if the data are deemed appropriate for publication.
Validation rules
Consistency checks
Within EDAMIS, there are a number of consistency checks on the internal coherence of each dataset. Some examples include simple checks between the values of related variables (for example, the total number of deaths should be greater than or equal to the number of infant deaths) or checks to ensure that the sum of various subcategories adds up to the total for all categories (for example, the number of males and the number of females sum to the total for both sexes). When these validation rules fail, EDAMIS generates a report with a set of error messages (an example of some validation rules is provided in the table below).
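The two kinds of check mentioned above can be illustrated with a short Python sketch. It is not the EDAMIS implementation. The variable codes (`DEATHS_TOTAL`, `DEATHS_INFANT`, `POP_MALE`, `POP_FEMALE`, `POP_TOTAL`) and the city code `XX001C` are hypothetical placeholders and don't come from the official codelists.

```python
def check_consistency(values):
    """values maps (city_code, variable_code, year) to a number.

    Returns a list of error messages, one per failed rule.
    """
    # Group the variables reported for each city and year.
    groups = {}
    for (city, variable, year), value in values.items():
        groups.setdefault((city, year), {})[variable] = value

    errors = []
    for (city, year), g in sorted(groups.items()):
        # Rule 1: total deaths must be at least the number of infant deaths.
        if "DEATHS_TOTAL" in g and "DEATHS_INFANT" in g:
            if g["DEATHS_TOTAL"] < g["DEATHS_INFANT"]:
                errors.append(f"{city} {year}: total deaths below infant deaths")
        # Rule 2: males plus females must equal the total population.
        if all(k in g for k in ("POP_MALE", "POP_FEMALE", "POP_TOTAL")):
            if g["POP_MALE"] + g["POP_FEMALE"] != g["POP_TOTAL"]:
                errors.append(f"{city} {year}: males + females differ from total")
    return errors


data = {
    ("XX001C", "DEATHS_TOTAL", 2023): 120,
    ("XX001C", "DEATHS_INFANT", 2023): 3,
    ("XX001C", "POP_MALE", 2023): 500,
    ("XX001C", "POP_FEMALE", 2023): 520,
    ("XX001C", "POP_TOTAL", 2023): 1000,  # inconsistent: 500 + 520 = 1020
}
for message in check_consistency(data):
    print(message)
```

A rule is only evaluated when all of the variables it needs are present for a given city and year, so partially reported data pass without error.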
File structure checks
As well as checking the internal coherence of each dataset, EDAMIS also checks the structural coherency of *.csv files. If there is a problem, then an error message will be generated when transmitting a file. The message typically reads ‘Error reading file’ or ‘Error reading file – structure not recognised. Please check field separator / column names / order of the fields / value format / code lengths etc. (see the technical specifications)’.
In the case that a file fails the test, it’s necessary to edit the structure of the data file so that it complies with the required standards.
Metadata
Reference metadata in the Euro SDMX metadata structure (ESMS) can be accessed
- via Eurostat’s online database, by clicking on the ‘M’ icon that is displayed next to the ‘City statistics (urb)’ folder that is found under Detailed datasets/General and regional statistics
- from the dedicated section on city statistics by selecting ‘methodology’ from the menu on the left of the page and then following the links to metadata
- directly on Eurostat’s website.
Alongside the city statistics metadata report that gives an overview of the data collection, there are also separate metadata reports covering national data collection exercises. These national reports are produced by national statistical offices and are published by Eurostat.
The ESS metadata handler (ESS MH) facilitates the collection, validation and dissemination of national and European metadata reports according to the metadata standards of the European statistical system (ESS).
In order to use the ESS MH, it’s necessary to have an active EU Login account; the login must also be registered in EDAMIS to ensure that the necessary rights are granted to make use of the ESS MH.
Once access has been granted, the ESS MH can be used by data providers in the national statistical offices to
- prepare metadata reports
- send metadata reports to Eurostat
- consult the status of metadata reports
- modify their metadata.
When making updates, it’s possible to copy the latest version of the metadata and subsequently to make the necessary changes/edits to the file. It’s also possible to preview the information before its publication. National metadata files are structured to provide the following information
- Contact
- Metadata update
- Statistical presentation
- Unit of measure
- Reference period
- Institutional mandate
- Confidentiality
- Release policy
- Frequency of dissemination
- Accessibility and clarity
- Quality management
- Relevance
- Accuracy
- Timeliness and punctuality
- Coherence and comparability
- Cost and burden
- Data revision
- Statistical processing
- Comment
National statistical offices can also add ‘Related metadata’ and ‘Annexes’ if they decide to share any additional documentation that may be beneficial to users for interpreting their data. The space in the reports for annexes is particularly useful for providing information about specific variables that have been transmitted to Eurostat (such as information about data sources, spatial units available, deviations from harmonised definitions, and so on).
It’s important to note that national metadata reports mustn’t contain references to grant agreements or other non-public documents.
Explore further
Database
- City statistics (urb), see:
- Cities and greater cities (urb_cgc)
- Functional urban areas (urb_luz)
- Perception survey results (urb_percep)