Bioinformatics student wanted for internship in text mining


Position: Bioinformatics and text mining
Education level: HBO (higher professional education)
Location: Leiden
Days per week: 5
Internship duration: 5 - 7 months
Internship type: Work experience
Internship allowance: To be agreed
Current year of study: 3


Conference schedule generator

Background: Every year, many conferences are organized in any given research field. In biology and bioinformatics, examples include ISMB, ECCB, ESHG, and ASHG. A major task for the organizing committee is to build a conference schedule from the abstracts submitted for oral and poster presentations. This is a manual exercise that can be very time-consuming. Which talks go into one session depends on the topic of each abstract. There could, for instance, be a session on Next Generation Sequencing (NGS) or a session on Bayesian networks; abstracts on the same topic should end up in the same session.

Goal: The aim is to automate the scheduling process using text-mining techniques. A prototype is built that automatically clusters abstracts that share the same topic. This seems like a straightforward job, but there are many considerations to take into account:

  • Removing stop-words such as “the” and “a” from the text.
  • Removing special characters such as “#”, “%”, and “&”.
  • Choosing the keywords that are important. Abstracts might need to be clustered on words like “NGS”, but not on a verb like “to model”.
  • Choosing an association measure on abstracts (Jaccard, mutual information, etc.) and a clustering technique.
  • Making a good test set to validate the performance of the conference scheduler.

Type of student: This project is for a student (HBO or academic) who has programming experience (preferably in Java) and a basic background in math and statistics.

Learning goals:

  • Gaining programming experience.
  • Gaining experience in text mining and the many applications and tricks that come with it.
  • Learning to work in a group with colleagues.
  • Learning how to report research results (in the form of a scientific article and oral presentations).
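To make the considerations listed under Goal more concrete, here is a minimal sketch of stop-word removal, special-character removal, Jaccard similarity, and a simple grouping step. It is written in Python for brevity, although the posting prefers Java. Everything in it is an assumption made for illustration: the tiny stop-word list, the function names, the threshold of 0.2, the single-linkage grouping, and the four example titles. It is not the prototype described above, and it does not address keyword selection (for example, filtering out verbs).

```python
import re

# Tiny illustrative stop-word list (an assumption; a real list would be much longer).
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "in", "on", "to", "with"}


def tokenize(text):
    """Lowercase the text, replace special characters with spaces, drop stop-words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return {word for word in cleaned.split() if word not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two keyword sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(abstracts, threshold=0.2):
    """Group abstracts whose pairwise Jaccard similarity reaches the threshold.

    Linked abstracts are merged with a union-find structure, so each group is
    a connected component (single linkage). Returns lists of abstract indices.
    """
    keywords = [tokenize(text) for text in abstracts]
    parent = list(range(len(abstracts)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(abstracts)):
        for j in range(i + 1, len(abstracts)):
            if jaccard(keywords[i], keywords[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(abstracts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical abstract titles, invented for this example.
ABSTRACTS = [
    "NGS read alignment for exome sequencing",
    "Exome sequencing & NGS variant calling",
    "Bayesian networks for gene regulation",
    "Learning Bayesian networks: a model of gene expression",
]

print(cluster(ABSTRACTS))  # [[0, 1], [2, 3]]
```

With these titles, the two NGS abstracts share three of seven distinct keywords (similarity about 0.43), as do the two Bayesian-network abstracts, while the cross-topic pairs share none. A different association measure, such as mutual information, could replace `jaccard` without changing the grouping step.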


  • Programming, statistics

Desired profile

  • Bioinformatics student, bachelor's level

What we offer

  • Internship position, duration ~5 months

