20.12.2002 / 006 uk – CoRD and OSS

Collection of Raw Data

Task Force

Meeting N° 7

12 MARCH 2003

Doc. CoRD 097

Draft Discussion paper on Open Source Software

For discussion

Status of this document

Following discussions at the CoRD Task Force and STNE Working Group meetings in October 2002, this document has been produced to outline the concept of Open Source Software within the European Statistical System.

Table of contents

1.Context

2.What is OSS?

3.Case studies

3.1.CIRCA

3.2.IDEP/CN8

3.3.STIPES

4.Conclusions

References

Glossary

Annex 1: The Open Source Definition

Annex 2: GNU General Public License

Annex 3: The BSD License

Annex 4: The CIRCA License

1.Context

At the last meeting of the CoRD Task Force on 7 October 2002, a discussion on open source was launched. The discussion was continued at the STNE Working Group meeting on 8-9 October 2002. It was decided that Eurostat would prepare a discussion paper for the next CoRD and STNE meetings.

A concrete proposal to be discussed was “the establishment of a central OSS group within the ESS and the creation of a central repository of statistical OSS” (Doc CoRD 087, minutes of the CoRD meeting on 7 October 2002).

This paper briefly introduces the open source concept (chapter 2) and presents three cases of interest (chapter 3) before drawing some conclusions (chapter 4).

2.What is OSS?

There are many successful OSS projects. Three prominent examples are:

  • Apache, which runs over 50% of the world’s web servers.
  • BIND, the software that provides the domain name service for the entire Internet.
  • Linux, the first practical free operating system.

One of the instigators of OSS was Richard Stallman, who started the GNU (GNU’s Not Unix) project in 1984 and founded the Free Software Foundation (FSF) in 1985. Some consider the development of the Internet (or parts of it, e.g. BIND) and Unix (or certain flavours of it) to be open source developments that took place even before FSF and GNU. Today’s most quoted definition of OSS, however, was written in 1997 by Bruce Perens, who went on to co-found the Open Source Initiative (OSI). This definition is known as the Open Source Definition (OSD).

OSS is not public-domain software and is not freeware. Public domain means that the author has surrendered the copyright. Freeware does not give modification or redistribution rights to the user. OSS, however, is copyrighted and covered by a license which gives the licensee considerable freedom for further development (modifications, enhancements, localisation, peripherals, integration, bug fixes and re-distribution).

The OSD (see Annex 1) is not, in itself, a software license. The most popular examples of OSS licenses are the GNU General Public License (GPL) and the Berkeley Software Distribution (BSD) license (see Annexes 2 and 3), but there are more. An OSS license protects the copyright of the software author, but gives the users more rights than they get with non-OSS products. These rights include, for example, free re-distribution and the right to modify the source code.

The benefits of OSS include:

  • Software of common interest is made available free of charge to others – expanding the area of development and reducing overall development costs.
  • Source code adaptations (e.g. localisation or migration to other platforms) and improvements (e.g. bug fixes or additional functionality) can be made by every user – and reported back to the source code owner who may integrate them into the original code.
  • The source code owner can act as the focal point of a group with a common interest – this is of interest, for example, in the case of the EU and the ESS, where the European Community / European Commission / Eurostat could play this role.

These are examples only; there are further benefits.

3.Case studies

3.1.CIRCA

CIRCA, the Communication and Information Resource Centre Administrator, is an Internet-based groupware tool developed for and owned by the European Community. The European Commission acts as licensor on behalf of the European Community. The CIRCA source code is available to European agencies and national administrations for their own purposes, but CIRCA is not fully OSS. The CIRCA license (see Annex 4) includes a number of restrictions which are not compliant with the OSD. Some examples:

  • The license is restricted to certain European authorities at national and international level.
  • The license is granted explicitly and personally and has to be signed.
  • The license is granted for a period of 3 years.
  • Commercial use of CIRCA is excluded.
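The comparison between the restrictions listed above and the OSD can be written down as a small checklist. The following is a minimal sketch only: the pairing of each restriction with a numbered OSD criterion is one reading of Annex 1, made for illustration, and not a legal assessment. The names `LicenseTerms` and `osd_conflicts` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LicenseTerms:
    """Simplified, hypothetical description of a license's terms."""
    name: str
    restricted_to_groups: bool     # only certain licensees are eligible
    requires_signed_grant: bool    # each licensee must sign an individual license
    term_years: Optional[int]      # None means the grant has no time limit
    excludes_commercial_use: bool


def osd_conflicts(terms: LicenseTerms) -> List[str]:
    """List the points on which the terms appear to conflict with the OSD.

    The criterion assigned to each restriction is an assumption made
    for this sketch, based on the wording of Annex 1.
    """
    conflicts = []
    if terms.restricted_to_groups:
        conflicts.append("5. No Discrimination Against Persons or Groups")
    if terms.excludes_commercial_use:
        conflicts.append("6. No Discrimination Against Fields of Endeavor")
    if terms.requires_signed_grant:
        conflicts.append("7. Distribution of License")
    if terms.term_years is not None:
        # Not tied to a single numbered criterion in this sketch.
        conflicts.append(f"Time-limited grant ({terms.term_years} years)")
    return conflicts


# Terms as summarised in the list above.
circa = LicenseTerms("CIRCA", True, True, 3, True)
for item in osd_conflicts(circa):
    print(item)
```

A license with none of these restrictions yields an empty list, which in this sketch means only that no conflict was found among the four points checked, not that the license is OSD compliant.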

Summary

CIRCA is operational and in use, with hundreds of thousands of end users. The system is maintained and further developed by the European Commission on behalf of the European Community, which owns the software. A number of European agencies and administrations are using CIRCA for their own purposes on the basis of a specific CIRCA license. CIRCA is not fully OSD compliant.

3.2.IDEP/CN8

IDEP/CN8, the Intrastat Data Entry Package with the Combined Nomenclature at 8 digit level, is an electronic form for Intrastat declaration, developed and owned by the European Community. Eurostat, a Directorate General of the European Commission, is responsible for the development and maintenance of IDEP/CN8. EDICOM funds are used to finance the project.

IDEP/CN8 is used by about 60,000 enterprises in most EU Member States. Translation, distribution and user support at national level are the task of the respective Competent National Administration (CNA), while Eurostat maintains the source code at EU level.

It was decided to stop the central maintenance of IDEP/CN8 at the end of 2003. CNAs may take over the maintenance at national level or in groups of Member States. The question of future ownership, however, is still open. OSS is one of several options:

  • Public domain – the source code will be freely available to everybody without license. There is no copyright holder.
  • Open source according to the OSD – in that case a licensor is required (e.g. the European Community) and an OSS license has to be selected (e.g. GPL).
  • CIRCA strategy – a specific IDEP/CN8 license is set up. The European Community holds the copyright.

The pros and cons of each of these approaches have to be analysed.

Summary

IDEP/CN8 is operational and in use; tens of thousands of enterprises use it for their monthly Intrastat declaration. The software is maintained by the European Commission on behalf of the European Community, which owns it, but central maintenance will end in December 2003. A number of national administrations will continue to distribute IDEP/CN8 in their respective countries after 2003, but the question of ownership and maintenance is still open. Open source could be a solution.

3.3.STIPES

STIPES (Statistical Inquiries from Popular European Software) is an IDA-funded Eurostat project in the framework of SERT (Statistiques d’Entreprises et Réseaux Télématiques – Business Statistics and Telematic Networks). The objective of STIPES is to create software that will convert most of the data formats generated by popular business software packages (like the SAP Business Warehouse) into the XML formats required by Statistical Offices (like the format for e-Quest).
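To illustrate the kind of conversion described here, the following is a minimal sketch of turning tabular export data into XML. It does not show the STIPES design: this paper does not describe the actual input or target formats, so CSV input, the element names `questionnaire` and `record`, the sample column names and the function `rows_to_xml` are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml(csv_text: str, root_tag: str = "questionnaire") -> str:
    """Convert CSV text with a header row into a flat XML document.

    Assumes every column name is also a valid XML element name.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, "record")
        for field, value in row.items():
            # Each CSV column becomes a child element named after the column.
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical export from a business software package.
sample = "enterprise_id,turnover\nA001,125000\n"
xml_out = rows_to_xml(sample)
print(xml_out)
```

A real converter would additionally need one input adapter per business package and validation of the output against the schema of the receiving Statistical Office.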

There is a proposal that the converter will be Open Source, but the details will have to be clarified during the final phase of the project. STIPES will end in December 2003. The options are essentially the same as those described for IDEP/CN8 above.

Summary

The STIPES converter is still under development, but there is a proposal that it will be open source. This means that the technical development will run in parallel with the establishment of a complete open source strategy for the final product. The experiences with CIRCA and IDEP/CN8 can serve as input for this process.

4.Conclusions

  • OSS is feasible.
  • OSS would be beneficial to Eurostat and national administrations.
  • OSS is also of interest to the public service.
  • STIPES should be OSS and could be a test case for the ESS. IDEP/CN8 could be a further test case.
  • A task force should be established to examine concrete proposals for the use of Open Source Software within the European Statistical System.

References

  • CIRCA license: Licensing CIRCA to Other Administrations – Version 1.2, 15 March 2002
  • GNU and FSF website:
  • GNU General Public License:
  • IDA (2002). Pooling Open Source Software (POSS). An IDA Feasibility Study. European Commission, DG Enterprise, June 2002
    Available online in different languages: – Horizontal Actions and Measures – Open Source Software
  • Open Source Definition (OSD):
  • Open Source Initiative (OSI) website:
  • Raymond, Eric S. (1999). The Cathedral and the Bazaar. O’Reilly & Associates, Inc. 1999
    Available online:
  • Stone, Mark, Sam Ockman, Chris DiBona (1999). Open Sources: Voices from the Open Source revolution. O’Reilly & Associates, Inc. 1999
    Available online:

Glossary

BIND / Berkeley Internet Name Domain
BSD / Berkeley Software Distribution
CIRCA / Communication and Information Resource Centre Administrator
CNA / Competent National Administration
DNS / Domain Name System
e-Quest / Austrian electronic questionnaire management system
EDI / Electronic Data Interchange
EDICOM / EDI for Commerce
FSF / Free Software Foundation
GNU / GNU’s Not Unix
GPL / General Public License
IDA / Interchange of Data between Administrations
IDEP/CN8 / Intrastat Data Entry Package with the Combined Nomenclature at 8 digit level
OSD / Open Source Definition
OSI / Open Source Initiative
OSS / Open Source Software
POSS / Pooling Open Source Software
STIPES / Statistical Inquiries from Popular European Software
XML / eXtensible Markup Language

Annex 1: The Open Source Definition

Version 1.9

Copyright © 2002 by the Open Source Initiative.

Open source doesn’t just mean access to the source code. The distribution terms of open-source software must comply with the following criteria:

1. Free Redistribution

The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

2. Source Code

The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost – preferably, downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.

3. Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

4. Integrity of The Author’s Source Code

The license may restrict source-code from being distributed in modified form only if the license allows the distribution of “patch files” with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.

5. No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons.

6. No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

7. Distribution of License

The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

8. License Must Not Be Specific to a Product

The rights attached to the program must not depend on the program’s being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program’s license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.

9. The License Must Not Restrict Other Software

The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

10. The License must be technology-neutral

No provision of the license may be predicated on any individual technology or style of interface.

Annex 2: GNU General Public License

Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and modification follow.

TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.