magistraleinformatica:ir:ir13:ir_project_2013

Differences

This page shows the differences between the selected revision and the current version of the page.

magistraleinformatica:ir:ir13:ir_project_2013 [28/11/2013 at 15:48] – created, Paolo Ferragina
magistraleinformatica:ir:ir13:ir_project_2013 [20/01/2014 at 10:40] (current version) – [Submitting your project] Marco Cornolti
Line 78:
   - Before testing your relatedness function, let's have a look at the [[http://ferrax-2.itc.unipi.it|scoreboard page]]. This page shows the achievements of the other groups. It also shows the baseline given by TagMe.
   - We are ready to launch. Enter: <code>
-java -cp $IRLIB:bin Main
+java -cp $IRLIB:./bin Main
 </code>On the first launch, the program will have to query Wikipedia and retrieve some data. Don't worry: this data gets cached, so if you run the program again, the output will be much shorter. Running the program again will generate the following output:<code>
 Results for the Evaluation of TagMe:
Line 103:
 In other words, they compute the distance between your relatedness and theirs, and take the average. Quadratic distance penalizes bigger mistakes more than absolute distance.
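As a rough illustration of the two measures, the sketch below averages the absolute and the squared differences between human-assigned and computed relatedness values. The exact formulas used by the evaluation are not shown on this page, so the class name ''DistanceSketch'', both method names, and the sample values are assumptions made only for this example.

```java
// Hypothetical sketch: not part of the project library.
final class DistanceSketch {

    /** Mean of |expected - predicted| over all entity pairs. */
    static double meanAbsolute(double[] expected, double[] predicted) {
        checkSameLength(expected, predicted);
        double sum = 0.0;
        for (int i = 0; i < expected.length; i++) {
            sum += Math.abs(expected[i] - predicted[i]);
        }
        return sum / expected.length;
    }

    /** Mean of (expected - predicted)^2 over all entity pairs. */
    static double meanQuadratic(double[] expected, double[] predicted) {
        checkSameLength(expected, predicted);
        double sum = 0.0;
        for (int i = 0; i < expected.length; i++) {
            double diff = expected[i] - predicted[i];
            sum += diff * diff;
        }
        return sum / expected.length;
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }

    public static void main(String[] args) {
        // Made-up values: two pairs, both judged 0.2 by humans.
        double[] expected = {0.2, 0.2};
        double[] twoSmallErrors = {0.5, 0.5}; // off by 0.3 on each pair
        double[] oneBigError = {0.2, 0.8};    // exact on one pair, off by 0.6 on the other

        System.out.println("absolute:  " + meanAbsolute(expected, twoSmallErrors)
                + " vs " + meanAbsolute(expected, oneBigError));
        System.out.println("quadratic: " + meanQuadratic(expected, twoSmallErrors)
                + " vs " + meanQuadratic(expected, oneBigError));
    }
}
```

With these made-up values both predictions have a mean absolute distance of about 0.3, while the mean quadratic distance is about 0.09 for the two small errors and about 0.18 for the single large one, which is the sense in which the quadratic measure penalizes bigger mistakes more.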
  
-<code>
-public static int[] getInlinks(int page_id);
-public static int[] getOutlinks(int page_id);
-public static int TitleToId(String title);
-public static String getCategoryTitle(int catId);
-public static IntSet getAllWids();
-public static boolean isDisambiguation(int pageId);
-public static boolean isNormalPage(int page_id)
-public static boolean isPerson(int pageId);
-public static int[] getCategories(int pageId);
-public static int dereference(int pageId);
-public static float linkProbability(string anchor);
-public static float commonness(string anchor, int pageId);
+==== Having a closer look at the results ====
+Add to the end of your ''main'' method a line like:
+<code java>
+IRProjectExperiments.dumpRelatednessExperiment(groupName, rel);
 </code>
 +This will dump, for each pair of entities, the expected relatedness (the one given by humans) and that returned by your function.
 +
 +===== Developing your own function =====
 +Before starting to implement your function, we suggest having a look at a few articles:
 +  * 
 +
 +==== Using ''IRProjectHelper'' ====
 +You can use ''irproject.IRProjectHelper'' to access some pre-computed data that you may find useful for developing your function. You do not have to limit yourself to these methods: if you need more, ask Marco. You may need to implement them yourself!
 +
 +To use ''IRProjectHelper'', please refer to the [[http://ferrax-2.itc.unipi.it/static/javadoc/index.html|javadoc]]
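As a starting point only, here is a minimal sketch of a link-based relatedness: the Jaccard overlap between the in-link sets of two pages. It is neither the TagMe baseline nor a recommended solution. The class ''InlinkJaccardSketch'' and its method are hypothetical, and the arrays are assumed to be page-id lists such as those an in-link lookup would return (an earlier revision of this page listed ''getInlinks(int page_id)'' returning ''int[]''; check the javadoc for what the helper currently offers).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: not part of the project library.
final class InlinkJaccardSketch {

    /**
     * Jaccard overlap |A ∩ B| / |A ∪ B| between two in-link id lists.
     * Returns 0.0 when both lists are empty (an arbitrary choice for this sketch).
     */
    static double jaccard(int[] inlinksA, int[] inlinksB) {
        Set<Integer> a = new HashSet<>();
        for (int id : inlinksA) {
            a.add(id);
        }
        Set<Integer> b = new HashSet<>();
        for (int id : inlinksB) {
            b.add(id);
        }
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        int common = 0;
        for (int id : a) {
            if (b.contains(id)) {
                common++;
            }
        }
        int union = a.size() + b.size() - common;
        return (double) common / union;
    }

    public static void main(String[] args) {
        // Made-up page ids standing in for real in-link lists.
        int[] inlinksOfFirstPage = {1, 2, 3, 4};
        int[] inlinksOfSecondPage = {3, 4, 5};
        // Two shared ids out of five distinct ids overall.
        System.out.println(jaccard(inlinksOfFirstPage, inlinksOfSecondPage));
    }
}
```

Building two hash sets per call is linear in the number of in-links; for very popular pages that can still be large, so caching or working on sorted arrays may be worth considering.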
 +
 +===== Submitting your project =====
 +The submission is due on Feb 9, 12:00 am. Leave in your home directory a ''Main.java'' that runs the experiments with your relatedness function and prints the results. We need to understand from the code how your function works. Please remove all unnecessary data and code from your home directory. If needed, please leave a short ''README'' explaining how to reproduce your results.
 +
 +You will give a pitch (a 5-minute presentation) on Feb 11 at 9:30 in Aula Seminari Ovest, briefly explaining your idea and results.
 +
 +
 +===== Final remarks =====
 +  * Before starting to implement, do some brainstorming and think about smart solutions.
 +  * We suggest subscribing to this page to stay updated with the latest news.
 +  * While implementing your function, you may want to focus on testing the relatedness function rather than the whole TagMe (which is much slower!). In the example ''Main.java'', simply comment out the call to ''launchTagMeExperiment''.
 +  * You may ask for feedback at any time by e-mail.
 +  * We encourage the development of good ideas rather than good results (but we like good results!)
 +  * The numbers are big: there is no need for heavy engineering, but be careful with complexity.
 +  * Tools like [[http://linux.die.net/man/1/scp|scp]] and [[http://linux.die.net/man/1/sshfs|sshfs]] may make your life easier.
 +  * There could be bugs: contact Marco in case something is not working as you expect.
 +  * You are responsible for what happens with your account: keep it secret, keep it safe, and don't misuse it.
magistraleinformatica/ir/ir13/ir_project_2013.1385653704.txt.gz · Last modified: 28/11/2013 at 15:48 by Paolo Ferragina
