Semantic Turkey Developers Manual

This section provides detailed information for people interested in developing extensions for Semantic Turkey or even in modifying the core system. Currently, it contains generic information about the architecture of the system. We will add more details soon.

Introduction

Here we provide more insight into the behind-the-scenes of Semantic Turkey.

You may then get more information about the structure of the Semantic Turkey project by learning about each of its constituent modules and going through their apidocs.

Building Semantic Turkey

Detailed instructions for building Semantic Turkey are available here.

Semantic Turkey Architecture

The architecture of Semantic Turkey (figure below) offers an extensible application layer, with services organized through Spring Dynamic Modules and published as OSGi services.

[Figure: Semantic Turkey architecture]

Service Layer

Services are implemented as Java methods, automatically converted - through a series of conventions - into REST-like web services. The application is hosted on the Apache Karaf OSGi container. The set of services can be extended, thus augmenting the functionalities of the platform, through the development of dedicated extensions.
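As a rough illustration of this convention-based mapping, a service class might look like the sketch below. The annotation and base-class names (@STService, @STServiceOperation, STServiceAdapter) and the omitted imports are assumptions used purely for illustration; refer to the st-core-services module for the actual conventions.

// Hypothetical sketch of an ST service: a plain Java method that the framework
// exposes as a REST-like endpoint. Annotation and class names are assumptions.
@STService
public class Greetings extends STServiceAdapter {

    @STServiceOperation
    public String sayHello(String name) {
        // The return value is serialized into the service response.
        return "Hello, " + name + "!";
    }
}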

Data

Semantic Turkey relies on the RDF middleware RDF4J for managing connections to triple stores and for querying/manipulating data.
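For reference, plain RDF4J usage (outside ST) looks like the minimal sketch below, which queries an in-memory repository through the standard RDF4J API (RDF4J 2.5 or later assumed):

import org.eclipse.rdf4j.query.TupleQueryResult;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.sail.memory.MemoryStore;

public class RDF4JBasics {
    public static void main(String[] args) {
        // An in-memory RDF4J repository; ST projects typically use the embedded
        // native store or a remote repository instead.
        Repository repo = new SailRepository(new MemoryStore());
        repo.init();
        try (RepositoryConnection conn = repo.getConnection()) {
            // Evaluate a simple SPARQL query through the connection.
            try (TupleQueryResult result = conn.prepareTupleQuery(
                    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10").evaluate()) {
                while (result.hasNext()) {
                    System.out.println(result.next());
                }
            }
        } finally {
            repo.shutDown();
        }
    }
}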

On top of RDF4J, Semantic Turkey provides a number of additional layers.

The rationale for these additional layers is to provide user-driven perspectives on the connection, limiting operations (e.g. read operations get access to a repository connection proxy that forbids all write operations), simplifying query development (e.g. the developer never has to resolve the lexicalization of resources: this is normally managed by ST through a dedicated extension point, with the connected plugin handing out to ST the query fragment to be inserted into the service query) and benefiting from other facilities provided by the above components.

Connecting to other triple stores

Since version 1.0, Semantic Turkey allows for the creation of repositories by using the embedded RDF4J store or by connecting to an RDF4J-compliant triple store through a remote connection. Tested triplestores for remote connections are RDF4J itself and GraphDB (GDB). The historical rationale for this choice is explained here (question on external triple stores).
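At the plain RDF4J level, such a remote connection looks roughly like the sketch below (the server URL and repository name are placeholders):

import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.http.HTTPRepository;

public class RemoteConnectionExample {
    public static void main(String[] args) {
        // GraphDB and the RDF4J Server expose repositories over the RDF4J REST API;
        // "http://localhost:7200/repositories/myRepo" is a placeholder URL.
        Repository repo = new HTTPRepository("http://localhost:7200/repositories/myRepo");
        try (RepositoryConnection conn = repo.getConnection()) {
            System.out.println("Repository size: " + conn.size());
        } finally {
            repo.shutDown();
        }
    }
}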

The compatibility with other triple stores should be analyzed on the basis of the following aspects: compatibility with the RDF4J (remote) API, compliance with the RDF4J Sail framework, and SPARQL compatibility.

The first aspect is necessary in order to connect to the store. Some triplestores provide RDF4J-compatible client libraries, exploiting proprietary protocols but wrapping them with the RDF4J API. These repositories could possibly be included in the range of storage solutions for ST.

The second aspect is twofold: mainly, it is important for the activation of features such as History and Validation. In order to enable them, we have developed a change-tracking component (implemented as a Sail) that must be deployed into the triplestore, provided that the triplestore is compliant with the Sail framework. Secondly, Sail-compliant triple stores can easily be accessed by ST for configuring and creating repositories in them (the creation of a project in ST involves handing over a Sail configuration), while other stores require the repository to be created directly through the store and later connected to the project in ST.
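The Sail framework allows components to be stacked on top of a base store, and ST's change tracker is deployed in exactly this way. The sketch below stacks a standard RDF4J inferencing Sail on a native store purely to illustrate the stacking mechanism; it does not use the actual ST change-tracking Sail, and the data directory is a placeholder.

import java.io.File;

import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.sail.inferencer.fc.SchemaCachingRDFSInferencer;
import org.eclipse.rdf4j.sail.nativerdf.NativeStore;

public class SailStackingExample {
    public static void main(String[] args) {
        // A base store wrapped by another Sail: the wrapper intercepts every
        // update before it reaches the underlying store, which is the kind of
        // hook point a change-tracking Sail relies on.
        NativeStore baseStore = new NativeStore(new File("/tmp/st-data"));
        SailRepository repo = new SailRepository(new SchemaCachingRDFSInferencer(baseStore));
        repo.init();
        // ... use the repository ...
        repo.shutDown();
    }
}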

The third aspect, while not technologically binding the store, is important as well: SPARQL compatibility is not limited to implementing the range of SPARQL 1.1 features that ST expects to find, but also concerns the way queries are resolved. Despite the standardization efforts performed by the W3C to provide uniform languages and protocols for the web, triplestores are not always fully interchangeable even when considering SPARQL compliance alone. The reason lies in performance and query resolution policies. Often, different stores respond differently to queries because of the different resolution strategies adopted by their query processors. Striving for interoperability, we have tried to organize queries in such a way that their resolution follows our expected order (e.g. by using nested queries), yet it cannot be guaranteed that all stores will perform the same on the set of queries developed for ST services. If a store has a very different resolution strategy and this results in many services performing badly, then there is not much that can be done (with reasonable effort).
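As a sketch of the kind of nesting meant here (the query below is illustrative and not taken from an actual ST service), a nested subquery suggests an evaluation order to the query processor: the inner selection is meant to be resolved first, limiting the candidate resources before their labels are retrieved.

// Illustrative only: a nested subquery used to suggest an evaluation order.
String query =
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n" +
    "SELECT ?concept ?label WHERE {\n" +
    "  { SELECT ?concept WHERE { ?concept a skos:Concept } LIMIT 100 }\n" +
    "  OPTIONAL { ?concept skos:prefLabel ?label }\n" +
    "}";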

So, to recap:

In all cases above, SPARQL compatibility must be verified for the candidate triplestore. We suggest first confirming compliance with SPARQL 1.1 against the store specifications; the store should then in any case be tested for its performance on ST queries.

For the above reasons, we have recommended using RDF4J or GraphDB as external triplestores. GraphDB natively accepts RDF4J remote connections with no need for any dedicated library, and it implements and supports the Sail stack mechanism, so that the change-tracking component can be used with it as well. The only thing we developed specifically for GraphDB is a Sail configurator, making it possible to create GDB repositories remotely by completely specifying their configuration.
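At the plain RDF4J level, creating a repository remotely by handing over a configuration looks roughly like the sketch below. It uses a simple memory-store configuration rather than the GraphDB-specific configuration produced by the GDB Sail configurator, and the server URL and repository id are placeholders.

import org.eclipse.rdf4j.repository.config.RepositoryConfig;
import org.eclipse.rdf4j.repository.manager.RemoteRepositoryManager;
import org.eclipse.rdf4j.repository.sail.config.SailRepositoryConfig;
import org.eclipse.rdf4j.sail.memory.config.MemoryStoreConfig;

public class RemoteRepositoryCreation {
    public static void main(String[] args) {
        // Connect to a remote RDF4J-compatible server (placeholder URL).
        RemoteRepositoryManager manager = new RemoteRepositoryManager("http://localhost:7200");
        manager.init();
        // Hand over a repository configuration; a store-specific Sail
        // configuration would replace the memory store in a real setup.
        SailRepositoryConfig sailConfig = new SailRepositoryConfig(new MemoryStoreConfig());
        manager.addRepositoryConfig(new RepositoryConfig("myRepo", sailConfig));
        manager.shutDown();
    }
}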

If there are other triplestores that you would like to see connected to Semantic Turkey, you can investigate their architecture and we can support you in developing (if necessary) the required components for connecting to them. These components might include:

Project Management

A dedicated Project Management section of the development manual details the structure of projects and how these are handled by the ProjectManager of Semantic Turkey.

Logging

Logging is managed through the Simple Logging Facade for Java (SLF4J) framework, with a log4j implementation provided by Pax Logging. The log4j logging configuration file is located in:

<serverfolder>/etc/org.ops4j.pax.logging.cfg

See the log4j manual if you don't know how to configure this logger.

A system restart is needed for changes to the logging configuration to take effect.

The logs will be stored in:

<serverfolder>/data/log/karaf.log

In order to log the SPARQL queries submitted by Semantic Turkey, the core logger of Semantic Turkey needs to be set to DEBUG:

log4j2.logger.semanticturkey.level = DEBUG
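If the semanticturkey logger is not already declared in that file, Pax Logging's log4j2 syntax also requires a companion name property pointing at the logger's package; the package name below is an assumption and should be checked against the actual ST core package:

log4j2.logger.semanticturkey.name = it.uniroma2.art.semanticturkey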

Extension Point development

Extension points are available for implementing existing functionalities of the system in different ways, thus enabling developers to customize them according to virtually any user need. Currently, the available extension points include:

The Extension Point Development Guide explains how the extension point architecture works, and which extension points have been activated and are recognized by the system.

Developing your first extension

We have a rather outdated tutorial for writing extensions (it was written when there was still a default Semantic Turkey UI extension for Firefox, and much has changed in the meantime). The tutorial can however still be useful for the server side.

The general rule, however, is to take inspiration from the st-core-services module of Semantic Turkey when developing service extensions. This module, though part of the Semantic Turkey project, can be managed as an independent project and is in fact deployed as a separate OSGi bundle, which is loaded by the main system in the same way as a third-party extension.