Intern Joint Meta Data Cache
|Days per week||5|
|Internship duration||4 - 6 months|
|Salary note||To be agreed|
<p> <strong>Project:</strong><br /> Development of a joint meta data service that harvests meta data from different communities to stimulate cross-disciplinary scientific research.</p> <p> <strong>Short description:</strong><br /> Communities have developed, or are developing, their own meta data services, each with community-specific schemas and information. All these services hold a wealth of information. Because each community has built its own service with its own schemas, it is hard for an individual scientist to search across different meta data services and to correlate the different information streams. The joint meta data service should make this as easy as searching the web, and should correlate information the way modern social networking sites such as LinkedIn do.</p> <p> Scientific communities like CLARIN, LifeWatch, ENES, EPQS and many others have developed their own meta data services, which hold community-specific descriptions of their data sets and publications. Meta data descriptions can be compared to the summary descriptions of books in a library, except that the items described are digital objects (e.g. video, audio, ASCII or binary objects). It is common for different scientific communities or domains to research similar subjects from domain-specific perspectives. 
Enabling search across community-specific meta data services stimulates cross-disciplinary research.</p> <p> <strong>There are different approaches to enable this functionality:</strong></p> <ol> <li> A top-level search engine that searches all meta data services simultaneously.</li> <li> A single structured joint meta data database into which all the different meta data databases are harvested and which is queried with structured SQL statements.</li> <li> A joint meta data service in which meta data is crawled and cached, much as the Google search engine does, and data is correlated with intelligent search algorithms (the “Big Data” approach).</li> </ol>
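To make the crawl, cache and index idea behind the third approach more concrete, here is a minimal Python sketch. It is an illustration only, not the project's design: the sample records, their field names, and the TEXT_FIELDS mapping are all invented for this example and do not reflect the real CLARIN, LifeWatch or ENES catalogues. The sketch assumes records have already been harvested, maps each community's own text fields onto a single inverted index, and answers a query by intersecting the postings of the query terms.

```python
import re
from collections import defaultdict

# Hypothetical harvested records; each community uses its own field names.
RECORDS = [
    {"community": "CLARIN", "id": "c1",
     "title": "Spoken Dutch corpus",
     "abstract": "Audio recordings of Dutch speech"},
    {"community": "LifeWatch", "id": "l1",
     "name": "Bird migration tracks",
     "summary": "GPS tracks of migrating birds over the Dutch coast"},
    {"community": "ENES", "id": "e1",
     "dataset": "Regional climate run",
     "description": "Simulated wind over the North Sea coast"},
]

# Assumed mapping from community to the fields that carry free text.
TEXT_FIELDS = {
    "CLARIN": ("title", "abstract"),
    "LifeWatch": ("name", "summary"),
    "ENES": ("dataset", "description"),
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Build an inverted index: token -> set of (community, id) keys."""
    index = defaultdict(set)
    for rec in records:
        key = (rec["community"], rec["id"])
        for field in TEXT_FIELDS[rec["community"]]:
            for token in tokenize(rec[field]):
                index[token].add(key)
    return index


def search(index, query):
    """Return the records that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


if __name__ == "__main__":
    idx = build_index(RECORDS)
    # One query reaches records from several communities at once.
    print(sorted(search(idx, "coast")))
```

A real service would replace the in-memory list with a harvester and a persistent, ranked index; the point here is only that a per-community field mapping is enough to let one query span several schemas.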
- Literature research and detailed evaluation of the “Big Data” approach to a joint meta data service.
- A proof of concept in which meta data from a number of communities' meta data services is crawled, indexed and made searchable.
- Design of a web GUI for a basic search engine in which search results are presented in a structured overview.
- Basic experiments with some complex search algorithms.
- Final report
- Java programming and shell or Perl/Python scripting.
- Good knowledge of Unix/Linux, experience working on the command line, and affinity with Big Data and Information Retrieval research.