References to the documentation environment
We start with a brief overview of the tools the project team used to collaborate. Our main platform for working together on the prototype has been GitHub, a web-based service for software development projects that uses the Git revision control system. Occasionally (for example, for writing this report) we made use of the hbz's Confluence wiki. We used Huboard, a tool based on the GitHub API, to keep an overview of the different tasks and their status. In the course of this report, several references will therefore point to GitHub issues and comments where certain aspects are covered in more detail. The OER World Map itself can be found at http://www.oerworldmap.org.
Features of the hbz prototype
Due to the short development time for the prototype, the project concentrated on realising an operational service that allows users to:
The prototype is mostly based on data from two different sources:
Along with this data collected from pre-existing sources, there is also some manually added data.
Data model
We already noted in our proposal that it wasn't clearly defined what kind of resources an OER world map should cover:
Thus, a first and important task in building the prototype, as with any software that creates data, was defining the data model for the OER world map data. On the one hand, this data model should be designed with the use cases for the OER world map in mind; on the other hand, it should take into account the actual information provided in the source data from OCWC and WSIS.
Here is what we came up with:
As we are working with linked data, it was clear that internally and for providing the data we would use the Resource Description Framework (RDF). To represent data based on this data model we could rely almost entirely on the schema.org vocabulary, which has the advantage that the OER data will also be indexed by search engines like Google and Bing. Only three RDF properties and one class had to be taken from other vocabularies. RDF is very flexible and can be extended very easily. For example, we decided at an early stage of the project to include information on services (e.g. OER repositories, search interfaces) in the data. This seemed to make sense because services (e.g. repositories) play a vital role for the OER community and because the world map should help in discovering open educational resources. Further extensions and other adjustments will probably turn out to be necessary in the next phase of the project.
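To illustrate what a schema.org-based description could look like, here is a minimal sketch in Python that builds JSON-LD documents for an organization and for a service it provides. The selection of properties, the function names and the example.org identifiers are assumptions made for this illustration; they are not taken from the prototype's actual data model.

```python
import json

SCHEMA_ORG = "http://schema.org"


def describe_organization(identifier, name, country_code):
    """Build a JSON-LD description of an organization using schema.org terms."""
    return {
        "@context": SCHEMA_ORG,
        "@id": identifier,  # hypothetical identifier, not a real OER World Map URI
        "@type": "Organization",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "addressCountry": country_code,
        },
    }


def describe_service(identifier, name, provider_id):
    """Build a JSON-LD description of a service and link it to its provider."""
    return {
        "@context": SCHEMA_ORG,
        "@id": identifier,
        "@type": "Service",
        "name": name,
        # The link is expressed by referencing the organization's identifier.
        "provider": {"@id": provider_id},
    }


def to_json(document):
    """Serialize a description as indented JSON text."""
    return json.dumps(document, indent=2, ensure_ascii=False)
```

Calling `to_json(describe_organization("http://example.org/org/1", "Universidad de Granada", "ES"))` yields a document that can be embedded in a web page or served by an API. Because the service only references the organization by identifier, both descriptions can be maintained independently.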
At the end of the project, the data set the prototype is based on includes information on
Application profile
We put some time into developing an application profile (AP) using RDF for the OER world map. This application profile expresses the application's data model and configures how the data will be viewed in the Drupal editing and presentation environment (see below). In the future, it should also be used as a basis for validating the data input.
The concept of an application profile comes from the Dublin Core Metadata Initiative (DCMI). In short, an application profile is a set of metadata elements, policies, and guidelines defined for a particular application. The elements may be from one or more vocabularies, thus allowing a given application to meet its functional requirements by using metadata from several vocabularies. An application profile is not complete without documentation that defines the policies and best practices appropriate to the application.
The application profile allows us to keep the configuration of the Drupal editing and presentation environment and of future API validation in one central place. To reconfigure the API validation or the web site, changes only have to be made in the application profile; all connected forms and presentation pages then change accordingly. The AP is maintained on GitHub and enables relatively easy maintenance of the data presentation and validation without having to interact directly with the front-end or API developer. (The application profile, in other words, is the means of unambiguous communication between a metadata expert and the developers.) This feature speeds up further development of the OER world map and reduces its cost.
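The idea of one central configuration can be sketched as follows. The profile here is a plain Python dictionary standing in for the RDF-based application profile; its structure, the property names and the labels are hypothetical and only serve to show how a form layout could be derived from a single source.

```python
# Hypothetical, simplified stand-in for the RDF application profile.
PROFILE = {
    "Organization": [
        {"property": "schema:name", "label": "Name", "order": 1, "required": True},
        {"property": "schema:email", "label": "E-mail", "order": 3, "required": False},
        {"property": "schema:url", "label": "Website", "order": 2, "required": False},
    ],
}


def form_fields(profile, resource_type):
    """Return (property, label) pairs for a resource type in display order."""
    fields = sorted(profile[resource_type], key=lambda f: f["order"])
    return [(f["property"], f["label"]) for f in fields]


def required_properties(profile, resource_type):
    """Return the set of properties that must be filled in."""
    return {f["property"] for f in profile[resource_type] if f["required"]}
```

Changing the `order` or `label` of an entry in `PROFILE` changes the result of `form_fields` without touching any form code, which is the effect described above.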
Drupal view and editing environment
The content management system (CMS) Drupal is used to implement views and editing capabilities for the data provided by the API. Thus, we do not use the relational database that comes with Drupal. Instead, a so-called Entity Type was implemented to read from and write to the API. Additionally, a custom Entity Field Query was implemented to query the API. To demonstrate the use case of linking to external data, the GeoNames Search Webservice is also available via this component.
In order to load the RDF data provided by our API and GeoNames into Drupal, the built-in RDF capabilities of Drupal were extended so that they not only output but also read data in RDF. The mappings of RDF properties and classes to Drupal fields and bundles are parsed from the application profile.
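The reading direction can be pictured with a small sketch: given a mapping from RDF properties to field names, the triples describing one resource are collected into field values. This is written in Python purely for illustration and is not the Drupal module's code; the mapping, the field names and the triple representation are assumptions.

```python
from collections import defaultdict

# Hypothetical mapping of RDF property URIs to field names; in the prototype
# such mappings are parsed from the application profile.
FIELD_MAPPING = {
    "http://schema.org/name": "field_name",
    "http://schema.org/url": "field_url",
}


def triples_to_fields(subject, triples, mapping):
    """Collect the values of all mapped properties for one subject.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Properties without a mapping are ignored.
    """
    fields = defaultdict(list)
    for s, p, o in triples:
        if s == subject and p in mapping:
            fields[mapping[p]].append(o)
    return dict(fields)
```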
As mentioned above, we link to GeoNames for countries and cities to demonstrate the approach of adding additional context to our data. These links contain data such as the population which could be used for further visualizations. Furthermore, although we do not use Drupal's database for our actual data, all other capabilities of the CMS can be used, e.g. to define users and roles in an editing workflow.
Out of scope
Out of the scope of the project were:
Course of the project
Our initial planning of the project within our proposal turned out to be quite resilient, although its strictly linear character is misleading: in fact, we worked in several iterations, which are difficult to display visually. The following table gives an overview of the course of the project including links to more detailed documentation on GitHub.
The project differed in many aspects from usual hbz projects:
Considering these circumstances, we learned a lot during the project.
Recommendations for further development of an OER World Map
Refinement of the hbz prototype
If the production OER world map service is developed on the basis of the prototype created in this project, several refinements will be needed to achieve a reliable production environment.
Refine data model and application profile
One point is the data model and application profile. A wider discussion of what it should look like, so that enough people are able to add and maintain the OER world map data, seems necessary. We have identified some points that should be discussed:
Even for the existing, quite simple data model, some information is missing in the data because it didn't exist in the core data and/or we didn't have the time to extract it. For example, though we spent some time on the transformation of the WSIS data on OER initiatives, we only pulled out geographic information for organizations and decided not to add the same information to the associated persons. In the future, it would make sense to add the city and country a person is based in, either semi-automatically or manually.
Automatic data validation based on application profile
In the prototype, data input is validated neither on the client nor on the API level. To ensure a consistent and, thus, maximally useful data set, it will be important to add validation of incoming data on the API level as well as when indexing transformed data from other data sources (like OCWC or WSIS). It is highly desirable to add an automatic method of validating data based on the application profile, so that the application profile serves as a central standard for deciding whether data can be added to the data set or must be adapted before indexing. As already noted above, the transformation work would benefit greatly from such a process, because right now transformation is checked against test files which have to be adjusted separately when the application profile is changed.
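What such profile-driven validation might look like is sketched below. The rule format (a `required` flag and a regular expression per property) and the deliberately loose e-mail pattern are assumptions for this example; the prototype does not contain a validator, and a production service might well rely on an established schema language instead.

```python
import re

# Hypothetical validation rules, as they might be derived from an application profile.
VALIDATION_PROFILE = {
    "schema:name": {"required": True, "pattern": r".+"},
    "schema:email": {"required": False, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def validate(record, profile):
    """Check a record against the profile.

    `record` maps property names to lists of string values.
    Returns a list of problems; an empty list means the record is valid.
    """
    problems = []
    for prop, rule in profile.items():
        values = record.get(prop, [])
        if rule["required"] and not values:
            problems.append(f"{prop}: missing required value")
        for value in values:
            if not re.fullmatch(rule["pattern"], value):
                problems.append(f"{prop}: invalid value {value!r}")
    # Properties unknown to the profile are reported as well.
    for prop in record:
        if prop not in profile:
            problems.append(f"{prop}: not defined in the profile")
    return problems
```

The same function could be called by the API for incoming data and by the transformation step before indexing, so that both paths are checked against one standard.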
Separate application profiles for validation and presentation
Currently, the information that could in the future be used for validating the data input (e.g. "What kind of strings are allowed in the 'email' field?") is stored in one application profile alongside the information for presenting the data (e.g. "In which order should fields be shown and with which labels?"). It is highly desirable to have separate documents for these use cases.
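As a rough sketch of this separation, the following Python function splits a combined profile into a validation document and a presentation document. The combined structure and the key names are hypothetical; the actual application profile is expressed in RDF and may be organised differently.

```python
# Hypothetical combined profile mixing both kinds of settings.
COMBINED_PROFILE = {
    "schema:name": {"label": "Name", "order": 1, "required": True},
    "schema:email": {"label": "E-mail", "order": 2, "required": False},
}

VALIDATION_KEYS = {"required", "pattern"}
PRESENTATION_KEYS = {"label", "order"}


def split_profile(combined):
    """Split a combined profile into (validation, presentation) documents."""
    validation, presentation = {}, {}
    for prop, settings in combined.items():
        validation[prop] = {k: v for k, v in settings.items() if k in VALIDATION_KEYS}
        presentation[prop] = {k: v for k, v in settings.items() if k in PRESENTATION_KEYS}
    return validation, presentation
```

With two documents, a change to labels or field order cannot accidentally alter what counts as valid data, and vice versa.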
Add provenance, administrative metadata and versioning
Currently, the API only holds the actual data about the different resources. There is no information about where the data comes from, how it was transformed and who did this. Also, there is no information available about when a resource description was added or when it was last modified. In other words, there is no provenance data or other administrative metadata available. As this data is important for assessing whether a data entry is valid and up to date, it should be added if a production service is developed on the basis of the prototype. It would also be very useful to be able to roll back changes made to resource descriptions. Thus, versioning of the records would be desirable.
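One possible shape for such a feature is an append-only history per resource, where each version carries its own provenance. The following in-memory Python sketch is an assumption about how this could work, not a description of the prototype's API; the class and field names are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    data: dict          # the resource description at this point in time
    source: str         # where the data came from, e.g. a data source or "manual"
    editor: str         # who made the change
    modified: datetime  # when this version was stored


class VersionedStore:
    """Keeps every version of every resource; nothing is overwritten."""

    def __init__(self):
        self._history = {}

    def save(self, resource_id, data, source, editor):
        version = Version(dict(data), source, editor, datetime.now(timezone.utc))
        self._history.setdefault(resource_id, []).append(version)
        return version

    def history(self, resource_id):
        return list(self._history[resource_id])

    def current(self, resource_id):
        return self._history[resource_id][-1]

    def created(self, resource_id):
        """Timestamp of the first stored version."""
        return self._history[resource_id][0].modified

    def roll_back(self, resource_id, editor):
        """Restore the previous version by saving it again as a new version."""
        versions = self._history[resource_id]
        if len(versions) < 2:
            raise ValueError("no earlier version to roll back to")
        previous = versions[-2]
        return self.save(resource_id, previous.data, previous.source, editor)
```

Rolling back by appending a new version, rather than deleting the latest one, keeps the record of who reverted what and when.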
Improvements in resource presentation & web form
This paragraph deals with the presentation of information on a resource as well as with the web form for editing this information.
Here is a screenshot of an example of how the information about a resource (here: the organisation "Universidad de Granada") is displayed in the prototype:
We are already quite content with the HTML representations of the information about the different resources (organisation, person, service, project). There are some things that should be adjusted, though, in a production service. And it makes sense in general to experiment with different approaches of presenting the data.
One action item would be to reduce the extensive information about a linked resource that is shown when clicking on it (in the example, for instance, the information about "UNIVERSIA", of which the organisation is a member) and to offer the possibility of getting more on the respective page of the linked resource. The current approach is especially problematic when the linked resource holds more information than the primary resource.
A desirable and easy-to-implement feature is to show an organization on a small map instead of listing its geo coordinates (see current view below).
Currently, the web form quite clearly reflects the data model as defined in the application profile, which results in nested boxes that might be difficult for editors to understand. A good example is the box to add or edit an address:
If the production system is developed based on this prototype, one will have to experiment with other ways of presenting the web forms. For a small group of committed editors, the current web form might be sufficient, though.
Improve world map presentation and interaction possibilities
This paragraph deals with the presentation of the actual map and the possibilities for interaction it provides. Right now we see several options to improve the design of the integrated world map:
General recommendations for collaborative advancement of the project
Discuss editorial process
One important question, which was not included in the scope of our project, is the design of the editorial processes for the OER world map. Generally, we would argue that as much effort as possible should be carried by the community in order to save costs. Nevertheless, it has to be understood that more detailed and sophisticated data models inevitably require more time and understanding on the part of the responsible editor. Without experience and the necessary understanding, it can be difficult to distinguish between an organisation, its services and its projects. For example, we found that WSIS classified some entries as "communities". We transformed them into services which are run by an organisation. Sometimes these distinctions can be hard to make.
Counteractions to keep community participation high could be:
But even if the motivation of the community to participate is high, we would recommend integrating some kind of editorial quality control in order to avoid data inconsistencies and to make sure that data is updated regularly. Since we consider it rather unlikely that there will be one organization which commits itself to editing the complete world map data (unless it is paid for this), a decentralized solution will probably be preferable. It would be very helpful to convince organizations which use parts of the data for their own purposes (like the UNESCO/WSIS Knowledge Community or the OCWC) to use the OER World Map platform to collect and edit their data. Additionally, it might be suitable to assign national responsibilities, so that a group of volunteers takes over the responsibility for editing the data of a country.
In each case the platform should be developed to support these processes by defining different editorial rights for different users. It could also be helpful to include an alert service which makes it possible for users to report outdated data to the editorial team with one click.
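A minimal sketch of how editorial rights and a one-click report could be modelled is given below. The role names, the country-based restriction and the report queue are assumptions made for illustration; in the prototype's setting, Drupal's own users and roles would be the natural place to implement this.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    role: str  # hypothetical roles: "visitor", "national_editor", "admin"
    countries: set = field(default_factory=set)  # countries a national editor covers


def can_edit(user, record_country):
    """Decide whether a user may edit a record located in the given country."""
    if user.role == "admin":
        return True
    if user.role == "national_editor":
        return record_country in user.countries
    return False


@dataclass
class ReportQueue:
    """Collects reports of outdated data for the editorial team."""
    reports: list = field(default_factory=list)

    def report_outdated(self, resource_id, reporter):
        """Record a report and return the number of open reports."""
        self.reports.append((resource_id, reporter))
        return len(self.reports)
```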
Maximize usage and ensure sustainability
How to attain sustainability seems to be one of the most important questions of the upcoming phase II of the World Map project. Since it will be difficult to define a business model, especially in the short run, it would be very helpful if the funding of phase II were extended, so that the system can be run and refined for 2-3 years after its initial development.
Avoiding redundant collection and editing of the data in different projects would also be very helpful in order to reduce overall costs for the OER community. Therefore, existing possibilities for cooperation should be investigated carefully. Ideally, initiatives like the OCWC, the UNESCO/WSIS Knowledge Community, or the OER Research Hub should pool their resources and use the OER World Map and the OER data hub as a common platform. However, our experience shows that such cooperation is difficult to achieve, since there are often slight differences between the data models needed. Since the problems that result from different data models could be largely compensated for by the LOD approach, and since the OER community emphasises cooperation, it might nevertheless be possible in this case.
Apart from these provisions, we would recommend maximising the additional value of the data to the OER community. Once there is value, it is easier to generate revenue. One simple way to do so would be to link the data to other relevant data sets, as has been advocated by Tim Berners-Lee.
The delivered prototype demonstrates that hbz offers a platform for developing a scalable production system of the OER World Map which provides maximum data connectivity and reusability. In doing so, it combines elements of open source, open data and open educational resources. It is important to distinguish between the OER World Map as the front end of the system and the OER data hub as its back end. Although the current focus lies on the front end of the system, we expect that in the future many other applications could be developed using the data of the OER data hub. The OER World Map data, especially the data on institutions, models an important backbone of the OER ecosystem. Extending this model by linking it to other resources will maximize the added value of the data, which increases the chance that an OER World Map production system will be sustainable.