European Data Forum

06.06.2012 Copenhagen

Leveraging the data potential in Europe from EC

More data brings work on:

increased BI in the public and private sector
world-class applications???
data policy and data reuse
multilingual data infrastructure
implementation of PSI policy across Europe, ensuring compliance
Why data matters:
EUR 140 bn in revenue for the EU27 from the open public data business
better governance and citizen empowerment
accelerating scientific progress

Kinds of data: statistics, geodata, meteorological data, business information, legal information

Revised directive eases reuse of PSI: it creates a genuine right to reuse public data; all public data not covered by an exception is to be reusable; data has to be in a machine-readable format

New ICP calls for open data in 2014
CEF digital infrastructure > generic services LTP, aggregators, repositories

ICP FP7 call gabo pictures,

priority area for RDI programs in 2014-2020

administratively lighter and simpler

The EU Framework Programme for Research and Innovation (need to study this)

openphacts.org: publication of scientific data for replication and reuse

Registration of Experts for Research Activities

Technologies for Information Management

FP7 projects presentations

Project descriptions: http://www.data-forum.eu/program/sme-call-projects

EUROFIT: integration, homogenization and extension of the scope of anthropometric data stored in large EU pools

exploit the human 3D database for consumer goods industries: cars, socks, chairs

EUROFIT platform: development of new apps, single point of access to data, web services, solid business model, iSize portal
Gapfiller: web portal for the GNSS community, for universities, for GPS developers, simulations
sopcawind.eu project - SOPCAWIND - SOFTWARE FOR THE OPTIMAL PLACE CALCULATION FOR WIND-FARMS
Plan4Business.eu - A service platform for aggregation, processing and analysis of urban and regional planning data
SimpleFleet - Democratizing Fleet Management
Vista-TV - Video Stream Analytics for Viewers in the TV Industry
DOPA - Data Supply Chains for Pools, Services and Analytics in Economics and Finance

Keynotes

Jimmy Kevin Pedersen, Agency for Digitization, Ministry of Finance
"New trends in public service and Citizens Communication"

Self-service costs less than handling matters by email or in person.

Jasper Hedegaard Bojsen, Technical Director, Microsoft Denmark
"The cloud, Big Data and Great Opportunities"
Windows Azure (to do???)
Bing

Hadoop (Apache)

Simon Riggs, 2ndQuadrant (platinum sponsor, leading contributors, OSB, HQ in the EU)
"Open Data, Open Database: PostgreSQL"

Open data example, Newton and Darwin: reuse of open data, but the research was not shared
PostgreSQL extensions:
PostGIS
PL/R (analytics)

AXLE: advanced analytics for big EDU data
They are looking for partners with big open data, clear use cases and high benefits, from November 2012 onwards: simon@2ndquadrant.com

Florian Bauer, Renewable Energy and Energy Efficiency Partnership (REEEP)
"Using LOD to share clean energy data and knowledge"

reegle.info: links definitions from different sources; free datasets ready for reuse for every country

LOD: splits responsibilities for datasets, reduces redundancy

* pdf
** excel
*** csv
**** uri
***** LOD (linked datasets bring in the context)
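
A minimal sketch of the star ladder above as a lookup, assuming that a dataset published in several formats is rated by its best one. The function names and the format keys are hypothetical and only mirror the five lines in the notes.

```python
# Hypothetical mapping from publication format to star level, following the
# ladder in the notes (PDF < Excel < CSV < URI < linked datasets).
STAR_LEVELS = {
    "pdf": 1,
    "excel": 2,
    "csv": 3,
    "uri": 4,
    "lod": 5,
}


def star_rating(fmt):
    """Return the star level for a format name, or 0 if it is not on the ladder."""
    return STAR_LEVELS.get(fmt.strip().lower(), 0)


def best_rating(formats):
    """Assumption: a dataset offered in several formats is rated by its best one."""
    return max((star_rating(f) for f in formats), default=0)


print(best_rating(["PDF", "CSV"]))
```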

W3C
Joao Rodrigues Frade, PwC Belgium (XHTML % standard)
"Enabling open data interoperability - The case for the Core Business Vocabulary"

Joinup platform: vocabularies, standards, taxonomies
ADMS: describing metadata

national register > business core vocabulary

company type
company status
company activity

Organization ontology

Why use the BCV: clear identifiers, semantics, aids interoperability, links a legal entity with its registered address, provided in an INSPIRE-conformant way
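
A rough sketch of what such a record could look like: a legal entity with a clear identifier, the three company fields listed above, and a link to its registered address, written out as N-Triples-style lines. The namespace, the property names and the example URIs are hypothetical placeholders, not the vocabulary's official terms.

```python
from dataclasses import dataclass

# Hypothetical namespace used only for this sketch; it is NOT the official
# vocabulary namespace.
EX = "http://example.org/bcv-sketch#"


@dataclass
class LegalEntity:
    uri: str                     # clear, stable identifier for the company
    legal_name: str
    company_type: str
    company_status: str
    company_activity: str
    registered_address_uri: str  # link to a separately described address


def to_ntriples(entity):
    """Serialize the entity as N-Triples-style lines, one statement per line."""
    def lit(text):
        # Escape backslashes and quotes so the literal stays well-formed.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    rows = [
        (entity.uri, EX + "legalName", lit(entity.legal_name)),
        (entity.uri, EX + "companyType", lit(entity.company_type)),
        (entity.uri, EX + "companyStatus", lit(entity.company_status)),
        (entity.uri, EX + "companyActivity", lit(entity.company_activity)),
        (entity.uri, EX + "registeredAddress",
         "<" + entity.registered_address_uri + ">"),
    ]
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in rows)


example = LegalEntity(
    uri="http://example.org/company/1",
    legal_name="Example ApS",
    company_type="private limited company",
    company_status="active",
    company_activity="software publishing",
    registered_address_uri="http://example.org/address/1",
)
print(to_ntriples(example))
```

Because the address is a URI rather than a text field, another dataset can describe it (for example with coordinates) and the two descriptions link up without copying data.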

www.w3.org/ns/adms#

www.w3.org/2012/Talks/0606_phila_edf

I could say a lot more, but it's time to stop

"Transparency and Open Data. Why bother?"

http://www.data-forum.eu/program/keynotes

Presentation 4: (Copy)right information in the digital age

Speaker: Andrew Farrow
Affiliation:
The objective of the Linked Content Coalition is to lay the foundations for a more coherent organisation of metadata and rights information through the adoption of cross-media rights communication standards.
The purpose of the Linked Content Coalition is to provide answers to the following questions:

Presentation 5: Open Bank Project

Speaker: Simon Redfern
TESOBE builds web applications, APIs and mobile applications using technologies such as Python/Django, Scala/Lift, Node.js, Postgres and MongoDB, and is the founder of the Open Bank Project (http://openbankproject.com). The Open Bank Project is an open source API for banks that seeks to "raise the bar of financial transparency" and facilitate application and data innovation in the banking domain.

Presentation 6: How the Biggest Open Database of Companies was Built

Speaker: Chris Taggart
OpenCorporates.com is the largest open database of companies in the world, with over 40 million companies and over 50 jurisdictions. Not only does this increase corporate transparency and understanding, it is also an important tool for anyone dealing with cross-border corporate information, from journalists to regulators to campaigners to other companies. We have also worked with governments and official bodies to improve access and quality of data, including helping UK Companies House with its linked data URIs, the EU/W3C ISA programme with the Business Vocabulary, and the G20 Financial Stability Board on its global LEI programme. All this from a micro-startup that's been going less than 18 months.
OpenCorporates is the largest open database of companies in the world that has grown in little over a year to over 40 million companies and over 50 jurisdictions. This presentation will explain why open company data is important for all groups, from companies to citizens, governments to journalists, how we grew so quickly with the help of the open data community, and why we think our innovative business model is a way to make open data sustainable.

Presentation 7: FactForge: Data Service and the Value of Inferred Knowledge over LOD

Speaker:
Mariana Damova
Ontotext

http://www.ontotext.com/owlim

The Linked Open Data movement is maturing. Not only does the LOD cloud grow by billions of triples yearly, but technologies and guidelines about how to produce LOD fast, how to assure their quality, and how to provide vertical-oriented data services are also being developed (LOD2, LATC, baseKB). Little is said, however, about how to include reasoning in the LOD framework, and about how to cope with its diversity. In this talk we will present FactForge, a reason-able view on the web of data, which comprises a segment of the LOD cloud, e.g. DBpedia, Freebase, Geonames, Wordnet, NY Times, Musicbrainz, Lingvoj, Lexvo, CIA Factbook, loaded in a single repository (OWLIM) and forming a compound dataset on which inference is performed. This results in a 40% increase of the knowledge available for querying, to about 15 billion statements.
The diversity of LOD makes their use and querying extremely challenging, as one has to be intimately familiar with the schemata underlying each dataset. Initiatives and research projects like schema.org, UMBEL, BLOOMS+ and ALOCUS, which try to involve the notion of a golden standard at schema level to allow better interoperability of LOD and the WWW in general, are indicative of the search for a solution along these lines. The new version of FactForge, which will be shown in this talk and has been in the making for several years now, aligns with these views. It is supplied with a reference layer of the upper-level ontology PROTON, which is mapped to the ontologies of the LOD datasets in FactForge, making their instances accessible via PROTON concepts and properties. This reference layer makes loading of the LOD ontologies unnecessary, optimizing the reasoning processes, and allows for quick and seamless data integration of new datasets with the entire LOD segment of FactForge.
It also ensures better interfacing with other components via SPARQL, as the queries are more compact and easier to formulate; faster response times, because fewer joins are employed; and a wealth of inferred knowledge across the datasets, which allows for a real journey of knowledge discovery and navigation from different standpoints. FactForge is the largest body of general knowledge and LOD on which inference is performed. We will present applications which make use of FactForge and emphasize the role of inferred knowledge in them produced by the reason-able views, and will argue for a new paradigm of data services, based not only on linked data verticals but also on inferred knowledge.
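
To make "inferred knowledge" concrete, here is a toy forward-chaining sketch over a handful of triples. It is not how OWLIM or FactForge work internally; the two rules (subclass transitivity and type propagation), the predicate names and the example data are assumptions chosen for illustration.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(triples):
    """Forward-chain two RDFS-style rules until no new triples appear."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                # Rule 1: subClassOf is transitive.
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # Rule 2: an instance of a subclass is an instance of the superclass.
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if new <= closed:
            return closed
        closed |= new


explicit = {
    ("Copenhagen", TYPE, "City"),
    ("City", SUBCLASS, "PopulatedPlace"),
    ("PopulatedPlace", SUBCLASS, "Location"),
}
closed = infer(explicit)

# Querying through an upper-level class now finds the instance, even though
# no dataset stated ("Copenhagen", rdf:type, "Location") explicitly.
locations = sorted(s for (s, p, o) in closed if p == TYPE and o == "Location")
print(len(explicit), len(closed), locations)
```

The same idea is what lets a query phrased against an upper-level concept reach instances that were only typed with a dataset-specific class.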

Presentation 8: Linking and Analyzing Big Data

Speaker: Kostas Tzoumas
Affiliation: Technical University of Berlin
Summary of the presentation: In this talk, I will provide an overview of two projects at TU Berlin, and the research and innovation challenges in their intersection. Stratosphere (www.stratosphere.eu, funded by the German Research Foundation) is an open platform for Big Data Analytics. It features a cloud-enabled execution engine with flexible fault tolerance schemes, a novel programming model centered around second-order functions that extends MapReduce, and a cost-based query optimizer. Stratosphere is validated by several use-case scenarios, including climate data analysis, text mining in bioinformatics, and data cleansing on Linked Open Data. DOPA (an FP7 STREP project) focuses on linking large Data Pools of both structured and unstructured data using data supply chains. The goal is to multiply the utility of each individual service while simultaneously sharing the costs between them. This way DOPA lowers the barrier of entry for SMEs that need to perform advanced analytics across multiple data pools, since the required input data as well as the processing environment do not have to be provided by the SME itself.
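
A small sketch of what "second-order functions that extend MapReduce" can mean: each operator takes a user function plus data and decides how records are grouped before the user function is called. This is plain single-machine Python, not Stratosphere's API; the operator names, signatures and sample data are hypothetical.

```python
from collections import defaultdict


def map_op(udf, records):
    """Call udf on every record independently; udf returns a list of outputs."""
    return [out for rec in records for out in udf(rec)]


def reduce_op(udf, records):
    """Group (key, value) records by key and call udf once per group."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return [udf(key, values) for key, values in groups.items()]


def match_op(udf, left, right):
    """Two-input operator: call udf for each pair of records sharing a key."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [udf(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


# Hypothetical inputs: raw readings per country code, and a code-to-name table.
raw = ["DK;12.0", "DK;14.0", "DE;10.0"]
names = [("DK", "Denmark"), ("DE", "Germany")]

readings = map_op(lambda line: [(line.split(";")[0], float(line.split(";")[1]))], raw)
averages = reduce_op(lambda key, vals: (key, sum(vals) / len(vals)), readings)
labelled = match_op(lambda key, avg, name: (name, avg), averages, names)
print(labelled)
```

The two-input operator is the part plain map and reduce do not offer directly, and it is what linking two data pools needs.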

"Open Data: Where We Are and Where We're Going"

Open Knowledge Foundation
Dr. Rufus Pollock is co-Founder and Director of the Open Knowledge Foundation, a Shuttleworth Foundation Fellow, and an Associate of the Centre for Intellectual Property and Information Law at the University of Cambridge. He has worked extensively as a scholar and developer on the social, legal and technological issues related to the creation and sharing of knowledge.

Over the past few years, there has been explosive growth in open data, with significant uptake in government, research and elsewhere. Open data has the potential to transform society, government and the economy, from how we travel to work to how we decide to vote. But we have only just begun down this road, and the going, even so far, has not always been easy.

This talk will introduce the idea of open data, explain how, and why, we are where we are today, and, finally, look to the future of the rapidly evolving open data ecosystem.

http://openspending.org/ project: how much BA spends on what

Mapping the money.
Our aim is to track every government financial transaction across the world and present it in useful and engaging forms for everyone from a school-child to a data geek.

Simple Data Format (SDF)
This document defines a simple data publishing format (Simple Data Format) for publishing and sharing data.
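
The notes do not record the format's details. As a hedged illustration only, the sketch below assumes a plain CSV data file accompanied by a small JSON descriptor that lists the fields and their types; the descriptor keys, the file name and the sample row are hypothetical.

```python
import csv
import io
import json


def publish(rows, fields):
    """Return (csv_text, descriptor_json) for a list of dict rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=[f["name"] for f in fields], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    # Hypothetical descriptor layout: dataset name, data file path, field list.
    descriptor = {"name": "example-dataset", "path": "data.csv", "fields": fields}
    return buf.getvalue(), json.dumps(descriptor, indent=2)


def load(csv_text, descriptor_json):
    """Read the CSV back, casting each column using the descriptor's field types."""
    casts = {"string": str, "integer": int, "number": float}
    fields = json.loads(descriptor_json)["fields"]
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {f["name"]: casts[f["type"]](row[f["name"]]) for f in fields}
        for row in reader
    ]


fields = [
    {"name": "country", "type": "string"},
    {"name": "year", "type": "integer"},
    {"name": "amount", "type": "number"},
]
rows = [{"country": "DK", "year": 2012, "amount": 1.5}]

csv_text, descriptor = publish(rows, fields)
print(csv_text)
print(load(csv_text, descriptor) == rows)
```

Keeping the types in a sidecar file leaves the CSV readable by any tool while still letting a consumer restore numbers as numbers.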

http://data.fao.org - Uniting the data of the Food and Agriculture Organization of the United Nations (UNFAO)

Food and Agriculture Organization of the United Nations (UNFAO)
Karl Morteo is working at the Food and Agriculture Organization in the Information Technology Division, managing data centric, mobile and specialized knowledge information systems solutions and services in support of the struggle to free the world of hunger & malnutrition. Presently he is managing a strategic project to unite the Organization's data including statistics, maps, pictures and documents http://data.fao.org.

http://data.fao.org is a one-stop shop that aggregates, integrates, and catalogues data from multiple sources within the Food and Agriculture Organization of the United Nations (UNFAO). These entries cover topics related to nutrition, food and agriculture and include data such as statistics, maps, pictures, documents and more.
The overall objective is to strengthen the Organization’s capacity to collect, analyze, interpret, and disseminate information relating to nutrition, food and agriculture. We anticipate increased data utilisation, consistency and quality and efficiency gains through consolidation, de-duplication and improved ease of access.
The approach we are taking is to unify fragmented and “linear” systems to establish an organisational corporate data repository. This corporate data repository is to be a reliable, robust, secure and scalable facility to store, organise, integrate, locate and retrieve interdisciplinary substantive and scientific knowledge, information and data.
Mantra driving the project:

Uniting our data
Serve data in the most convenient formats.
Engage not just disseminate
Mobile First
Linked and Open Data
Eat your own dog food
The project is expected to be complete at the end of 2015, with a first publicly visible website being delivered at the end of 2012.
Find out more on http://data.fao.org

Data standards used:
Statistics> SDMX, DDI
Documents> Dublin Core, MODS, FRBR
Maps > OGC, ISO19115
Pictures> IPTC, XMP

Presentation 9: SRBench - A Benchmark For Streaming RDF Storage Engines

Speaker: Zhang Ying

Centrum Wiskunde & Informatica (CWI)
In this talk, we present SRBench, the first benchmark for Streaming RDF Storage Engines, which is completely based on real-world datasets. With the increasing problem of too much streaming data but not enough tools to gain and even derive knowledge from those data, researchers have set out for solutions in which Semantic Web technologies are adapted and extended for the publishing, sharing, analysing and understanding of such data. Various approaches are emerging, e.g., C-SPARQL, SPARQLStream, StreamSPARQL and CQELS. To help researchers and users to compare streaming RDF engines in a standardised application scenario, we propose SRBench, with which one can assess the abilities of a streaming RDF engine to cope with a broad range of use cases typically encountered in real-world scenarios. The design of SRBench is based on an extensive study of the state-of-the-art techniques in both the data stream management systems and the streaming RDF processing engines, and the existing RDF/SPARQL benchmarks. This ensures that we capture all important aspects of streaming RDF processing in the benchmark.
The first goal of SRBench is to evaluate the functional completeness of a streaming RDF engine. The benchmark contains a concise, yet comprehensive set of queries which covers the major aspects of streaming SPARQL query processing, ranging from simple pattern matching queries to queries with complex reasoning tasks. The main advantages of applying Semantic Web technologies on streaming data include providing better search facilities by adding semantics to the data, reasoning through ontologies, and integration with other data sets. The ability of a streaming RDF engine to process these distinctive features is assessed by the benchmark with queries that apply reasoning not only over the streaming sensor data, but also over the metadata and even other data sets in the Linked Open Data (LOD) cloud.
To give a first baseline and illustrate the state of the art, we show results obtained from implementing SRBench using the SPARQLStream engine developed at the Universidad Politécnica de Madrid (UPM). The engine supports the streaming RDF query language, also called SPARQLStream. The evaluation shows that the functionality supported by SPARQLStream is fairly complete. At the language level, it is able to express all benchmark queries easily and concisely. At the query processing level, some missing features have been discovered, for all of which preliminary code has been added for further development.
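
For orientation, a toy sketch of the kind of operation such engines add on top of ordinary RDF querying: a time window over timestamped triples, with an aggregate computed over whatever is currently inside the window. It is not taken from SRBench or from any of the engines named above; the class, the predicate name and the sensor values are hypothetical, and timestamps are assumed to arrive in order.

```python
from collections import deque


class SlidingWindow:
    """Keep only the timestamped triples from the last `width` seconds."""

    def __init__(self, width):
        self.width = width
        self.items = deque()

    def push(self, timestamp, triple):
        """Add a triple and evict everything that has fallen out of the window."""
        self.items.append((timestamp, triple))
        # Assumes timestamps arrive in non-decreasing order.
        while self.items and self.items[0][0] <= timestamp - self.width:
            self.items.popleft()

    def average(self, predicate):
        """Average the numeric objects of triples with the given predicate."""
        values = [o for _, (s, p, o) in self.items if p == predicate]
        return sum(values) / len(values) if values else None


window = SlidingWindow(width=60)
window.push(0, ("sensor1", "hasTemperature", 10.0))
window.push(30, ("sensor1", "hasTemperature", 20.0))
before = window.average("hasTemperature")   # both readings are in the window
window.push(70, ("sensor1", "hasTemperature", 30.0))
after = window.average("hasTemperature")    # the reading at t=0 was evicted
print(before, after)
```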

Big Data Public Private Forum (BIG) initiative

Speaker: Nuria de Lama Sanchez
Affiliation: ATOS Research
Big Data Public Private Forum (BIG) is an initiative that aims to create an industrial community around Big Data in Europe. We will present the strategy proposed by the selected consortium, a balanced set of partners representing academia and especially industry.

Future European activities and funding perspectives for SMEs

Book your agenda for the ICT Proposers' Day 2012!
26-27 September, Warsaw

Networking for European ICT Research & Development

Some take-away material from the conference:
