Author: João Rocha

Data Citation roadmap for Scholarly Data Repositories

A Data Citation Roadmap for Scholarly Data Repositories

This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles (Data Citation Synthesis Group, 2014), a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Early Adopters Expert Group, part of the Data Citation Implementation Pilot (DCIP) project (FORCE11, 2015), an initiative of FORCE11 and the NIH BioCADDIE (2016) program. The roadmap makes 11 specific recommendations, grouped into three phases of implementation: a) required steps needed to support the Joint Declaration of Data Citation Principles, b) recommended steps that facilitate article/data publication workflows, and c) optional steps that further improve data citation support provided by data repositories.


A really interesting article about data citation. The idea of adding JSON-LD metadata to the page headers could be useful for SEO in Dendro.
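
As a rough illustration of the idea, here is a minimal sketch of how a dataset landing page could expose JSON-LD in its header. It assumes the schema.org Dataset vocabulary. The dataset_jsonld function, the sample values and the example.org URL are hypothetical and are not part of Dendro. It also assumes the values contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a schema.org Dataset description wrapped in a
# JSON-LD <script> element, ready to be placed inside a page's <head>.
# Assumption: the arguments contain no quotes or backslashes, so no JSON
# escaping is performed here.
dataset_jsonld() {
  local name="$1" description="$2" url="$3"
  cat <<EOF
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "${name}",
  "description": "${description}",
  "url": "${url}"
}
</script>
EOF
}

# Placeholder values, for illustration only.
dataset_jsonld "Example dataset" \
  "Placeholder description of a deposited dataset." \
  "https://example.org/datasets/1"
```

A real implementation would more likely build the JSON with a proper serializer in the application itself, so that arbitrary titles and descriptions are escaped correctly.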



FactForge – A potential source of interesting facts for LOD applications

“FactForge is an open-access platform that you can use as a convenient entry point to the web of interconnected data. It is Ontotext’s Web application based on GraphDB, that allows everyone to explore some of the most central Linked Open Data datasets alongside and in connection to news and newly published data such as the Panama papers and Trump Data World (coming soon).

Offering access to more than a billion of facts, FactForge includes datasets from DBPedia (the structured version of the Wikipedia encyclopedia), Geonames (a worldwide geographical database containing over 10 million geographical names), Wordnet (a semantic dictionary for English where words “are grouped into sets of cognitive synonyms”), WorldFacts (a dataset about countries, languages, currencies and other related information) and more. You can check the entire list of datasets in FactForge – Open Data and News about People, Organizations and Locations.”


Exploring Linked Open Data with FactForge

InfoLab posters at the RDA 9th Plenary in Barcelona

Check out our posters on DataPublication@UPorto and also on Dendro, a “Dropbox + Semantic Wiki”-like solution for researchers to describe their data as they create it.


With Dendro, you can upload files and describe them easily, reducing duplicated effort. Afterwards, you can package those files and send them to a repository, such as EUDAT’s B2Share, CKAN, Zenodo, DSpace or FigShare; Dendro takes care of the submission. It can also be installed on your own laptop or server in a few easy steps. A Demo Instance is also available here.


We also highlight LabTablet, our very own Android electronic laboratory notebook, which can use the onboard sensors of your mobile phone to help you describe your datasets automatically. It even supports voice recognition.

All our code is free and open-source here. Drop us a message if you want to collaborate. We are a group of researchers passionate about research data sharing, and we love to build stuff as well.

InfoLab is at the EUDAT Semantic Working Group Workshop, in Barcelona

EUDAT Semantic Working Group Workshop 2017 Barcelona


Our research group is present at the EUDAT Semantic Working Group Workshop at the 9th RDA Plenary, which is taking place in Barcelona on 3-4 April 2017. Our presentation about Dendro, our research data description platform, and its integration in the research workflow, is here (CLICK to download):


thumbnail of EUDATWorkshop2017Barcelona.


In the first session, we also had presentations about:

  • BioPortal, a portal for the discovery and description of biomedical ontologies
  • The EBI Ontology Lookup Service (OLS), which is “a repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions”
  • The Linked Open Vocabularies platform, a platform for sharing ontologies online which includes all their past versions
  • BioSharing, a “curated, informative and educational resource on inter-related data standards, databases, and policies in the life, environmental and biomedical sciences”.
  • AgroPortal, an ontology deposit, discovery and description portal for agronomic ontologies.

In the second session, we had:

  • Europeana, which presented its entity discovery capabilities. These improve search, for example by allowing semantic auto-completion of entities.
  • DTL’s FAIRifier and other open-source tools for making data more FAIR (Findable, Accessible, Interoperable, Reusable). FAIRifier transforms data into RDF, FAIR Data Point holds it after it is transformed so that it can be found, ORKA allows it to be described, and the FAIR Search Engine allows users to retrieve the annotated datasets. Currently, these solutions support OAI-PMH through B2FIND and FAIR Data Point.
  • DataONE presented its approach based on EML, which allows for the annotation of the headers of a table to convey their exact meaning, for example. An ontology-based drill-down of search was particularly interesting. When querying for ‘tree’, the interface would allow the user to refine results by tree-related properties.
  • LifeWatch, which presented its infrastructure for registering, annotating and publishing biodiversity records, in the context of the ENVRI Plus project.


Photo credits: @EUDAT

Do you write a lot of Bash scripts? Please try this



While writing the deployment scripts for Dendro, we recently found shellcheck, an open-source checker for Bash scripts. You can try it out online here.

For the Atom editor, there is also linter-shellcheck, a linter for Bash scripts that helps you spot those minor mistakes that end up costing a huge amount of time when writing scripts for Vagrant deployment.

First, install shellcheck. On the Mac, I used the following:

brew install shellcheck

shellcheck -V

Then, install Atom and the linter-shellcheck plugin.
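
To give an idea of the kind of minor mistake this catches, here is a small sketch with an unquoted variable, one of the most common Bash pitfalls. The count_args function and the path are made up for illustration and are not taken from the Dendro scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

# A made-up path containing a space.
target="my project/data"

# Unquoted expansion: the shell splits the value on the space, so the
# function receives two arguments instead of one.
unquoted_count=$(count_args $target)

# Quoted expansion: the value is passed as a single argument.
quoted_count=$(count_args "$target")

echo "unquoted: $unquoted_count, quoted: $quoted_count"
# prints "unquoted: 2, quoted: 1"
```

Running shellcheck on a file like this should report SC2086 (double quote to prevent globbing and word splitting) on the unquoted $target, and the linter shows the same message in the editor as you type.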

Welcome to 21st-century Bash programming.

The 7th ConfOA (Luso-Brazilian conference on Open Access)



The FEUP Infolab RDM group was present at the 7th Portuguese and Brazilian Open Access Conference, in Viseu, Portugal. This conference covered some of the latest developments in repositories and libraries from these two countries.

The proper management of both research publications and datasets was a prominent topic, and the FEUP Infolab RDM group was present with 6 contributions:

Modelos de Metadados para a Gestão de Dados de Investigação – Abordagens de Desenvolvimento de Acordo com a Dimensão do Projeto Científico
João Aguiar Castro, Cristina Ribeiro


TAIL—Gestão de dados de investigação da produção ao depósito e à partilha
Cristina Ribeiro, João Rocha da Silva, João Aguiar Castro, Ricardo Carvalho Amorim, João Correia Lopes (Presentation available in Portuguese)

Construção de um Repositório de Dados Oceanográficos
Ricardo Amorim, João Castro, Inês Garganta, Artur Rocha, Gabriel David (Presentation available in Portuguese)

Social Dendro – Aplicação de Conceitos de Redes Sociais à Gestão de Dados de Investigação
João Rocha Silva, Nélson Pereira (Presentation available in Portuguese)

Vocabulários Controlados na Descrição de Dados de Investigação no DENDRO
Yulia Karimova (Presentation available in Portuguese)