Changes between Initial Version and Version 1 of DesignKeywords

Timestamp: Jul 13, 2017, 3:47:48 PM
Author: davea
Comment: --

+= Keywords =
+This document describes a framework for assigning keywords,
+such as science area and location, to jobs and projects.
+This can be used for several purposes:
+ * Client GUIs can show volunteers what kinds of jobs they're running.
+ * As part of an account manager that lets volunteers sign up
+   for science areas rather than specific projects
+   (I'm currently working on one of these).
+ * To show project attributes in the project list on the BOINC web site
+   (we currently show attributes in an ad-hoc way).
+There are lots of other potential uses.
+To make this work, the BOINC community needs to agree on
+ * A structure for the set of keywords.
+ * An authoritative set of keywords. I propose that the BOINC PMC be in charge of this,
+  possibly creating a committee for this purpose.
+== Goals ==
+ * Keep things as simple as possible.
+   We don't need to create the ultimate taxonomy of science.
+ * Make it possible to have a very simple UI for volunteer keyword preferences,
+   e.g. a few high-level keywords with yes/no/maybe buttons.
+ * Make it possible to have a higher-resolution UI,
+   e.g. research for a particular type of cancer.
+== Structure ==
+I propose structuring keywords as follows:
+'''Category''': what property the keyword refers to. I suggest:
+ * '''Science Area''': what kind of research is being done.
+ * '''Location''': where (continent, country, institution) the researcher is located.
+ * Another orthogonal attribute is ownership and accessibility of results.
+   Some volunteers don't want to support for-profit research.
+   But this is tricky; there are gray areas such as academic research for which
+   a corporation has right of first refusal for licensing the results.
+'''Level''': 0, 1, 2.
+Level 0 is most general (e.g. 'Physics' or 'Europe').
+'''Hierarchy''': the relationship between level n and n+1 keywords.
+I propose a strict hierarchy:
+each level n+1 keyword is the child of a single level n keyword.
+ * Advantage: this simplifies the conceptual model and the user interface.
+ * Disadvantage: it can't represent, for example, that a level 1 keyword like "Gravitational waves"
+   is associated with both "Physics" and "Astronomy".
+   But I don't think this matters.
+   If a volunteer wants to support GW research and doesn't find it in one place,
+   they'll look in the other.
+Each keyword has
+ * an integer ID, which never changes, and is used to identify the keyword
+   in job, project, and preferences lists.
+ * short and long textual descriptions; these can change over time.
+   We'll figure out a way to make them translatable.
+ * create time, mod time, and delete time.
+The list of keywords and all their properties will be exported
+by the BOINC web site as an XML file.
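The keyword record and the strict hierarchy described in this section could be sketched in Python as follows. This is only an illustration: the field names, the `parent_id` link, the use of 0 for "not deleted", and the XML element names are all assumptions, since the proposal does not define a schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class Keyword:
    # All field names are hypothetical; the proposal only lists the properties.
    id: int                    # permanent integer ID
    category: str              # e.g. "Science Area" or "Location"
    level: int                 # 0 is the most general level
    parent_id: Optional[int]   # None at level 0; otherwise exactly one parent
    short_desc: str
    long_desc: str = ""
    create_time: int = 0
    mod_time: int = 0
    delete_time: int = 0       # assumption: 0 means "not deleted"


def check_hierarchy(keywords: Dict[int, Keyword]) -> None:
    """Raise ValueError unless every level n+1 keyword has one level n parent."""
    for kw in keywords.values():
        if kw.level == 0:
            if kw.parent_id is not None:
                raise ValueError(f"level 0 keyword {kw.id} must not have a parent")
            continue
        parent = keywords.get(kw.parent_id)
        if parent is None or parent.level != kw.level - 1:
            raise ValueError(f"keyword {kw.id} lacks a level {kw.level - 1} parent")
        if parent.category != kw.category:
            raise ValueError(f"keyword {kw.id} and its parent differ in category")


def to_xml(keywords: Dict[int, Keyword]) -> str:
    """Serialize the keyword list; element names mirror the field names."""
    root = ET.Element("keywords")
    for kw in sorted(keywords.values(), key=lambda k: k.id):
        el = ET.SubElement(root, "keyword")
        for f in fields(kw):
            value = getattr(kw, f.name)
            if value is not None:
                ET.SubElement(el, f.name).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Storing a single `parent_id` per keyword makes the strict hierarchy a property of the data model itself: a keyword cannot be filed under both "Physics" and "Astronomy".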
+== Keyword example ==
+(not complete: just to show the idea; indentation shows level)
+{{{
+Science Area
+   Astronomy
+      SETI
+      Pulsars
+      Gravitational waves
+      Cosmology
+   Physics
+      Particle physics
+      Nanoscience
+   Biology and medicine
+      Drug design
+      Protein research
+      Genetics and phylogeny
+      Disease research
+         Diabetes
+         Cancer
+            Prostate cancer
+            Breast cancer
+   Mathematics and Computer Science
+   Artificial Intelligence and Cognitive Science
+Location
+   Europe
+      Germany
+         AEI
+   Asia
+   Australia
+   The Americas
+      United States
+         UC Berkeley
+         Purdue
+}}}
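An indented listing like the one above can be turned into parent/child pairs mechanically. The sketch below is one possible way to do that, assuming three spaces per indentation step as in the example; `parse_outline` is a hypothetical helper, not part of BOINC. Depth 0 is the category, so a keyword's level is its depth minus one.

```python
from typing import List, Optional, Tuple

INDENT = 3  # assumption: three spaces per step, as in the listing above


def parse_outline(text: str) -> List[Tuple[int, str, Optional[str]]]:
    """Return (depth, name, parent name) for each non-blank line."""
    rows: List[Tuple[int, str, Optional[str]]] = []
    stack: List[str] = []  # stack[d] is the most recent name seen at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // INDENT
        if depth > len(stack):
            raise ValueError(f"line {line!r} skips an indentation level")
        del stack[depth:]  # drop entries at this depth and deeper
        parent = stack[-1] if stack else None
        name = stripped.strip()
        rows.append((depth, name, parent))
        stack.append(name)
    return rows
```

Because each line gets exactly one parent (the nearest preceding line one step shallower), the outline format can only express a strict hierarchy.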
+== Project and job attributes ==
+Each project can have a set of keywords.
+For each keyword there is an associated "work fraction":
+an estimate of the fraction of the project's work that has that keyword.
+Each job can have an associated set of keywords.
+Note: keywords need to be at the job level, not app, because VM-based projects
+can use a single BOINC app for all their jobs.
+If a project has a keyword with work fraction 1,
+that keyword is implicitly associated with all the project's jobs.
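The implicit-association rule can be sketched as a small function. The representation here (keyword IDs as integers, work fractions as a dict from ID to a float in [0, 1]) and the function name are assumptions made for illustration.

```python
from typing import Dict, Set


def effective_job_keywords(job_keywords: Set[int],
                           project_fractions: Dict[int, float]) -> Set[int]:
    """Return the job's own keywords plus project keywords with work fraction 1."""
    for kw, frac in project_fractions.items():
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"work fraction for keyword {kw} is outside [0, 1]")
    # A project keyword with work fraction 1 applies to every job.
    implicit = {kw for kw, frac in project_fractions.items() if frac >= 1.0}
    return set(job_keywords) | implicit
```

For example, a project with fractions `{10: 1.0, 11: 0.3}` and a job tagged `{20}` gives `{10, 20}`: keyword 11 applies to only part of the project's work, so it is not inherited.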
+== Volunteer preferences ==
+A volunteer can specify (e.g. via an account manager) a set of "preferences",
+which is a map from keywords to [yes, no, maybe].
+"no" means don't send jobs with that keyword.
+"yes" means preferentially send jobs with that keyword.
+"maybe" means no preference either way.
+A "no" for a level n keyword overrides a "yes" for any of its descendant keywords.
+If a project has a keyword with work fraction 1,
+and the volunteer has "no" for that keyword,
+the volunteer should not be attached to that project.
+Note: instead of ternary yes/no/maybe, we could have some sort of "research share" per keyword.
+This would greatly complicate things; I don't think it's worth it.
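One way to read the rules in this section is sketched below. The names are hypothetical, and two points are interpretations rather than part of the proposal: a keyword with no stored preference is treated as "maybe", and an ancestor's "no" is applied to every descendant, not only to those marked "yes".

```python
from typing import Dict, Optional

YES, NO, MAYBE = "yes", "no", "maybe"


def effective_pref(kw_id: int,
                   prefs: Dict[int, str],
                   parents: Dict[int, Optional[int]]) -> str:
    """Resolve a keyword's preference, letting any ancestor's "no" win."""
    node: Optional[int] = kw_id
    while node is not None:
        if prefs.get(node, MAYBE) == NO:
            return NO
        node = parents.get(node)  # None once we pass a level 0 keyword
    return prefs.get(kw_id, MAYBE)


def should_attach(project_fractions: Dict[int, float],
                  prefs: Dict[int, str],
                  parents: Dict[int, Optional[int]]) -> bool:
    """False if a keyword covering all of the project's work resolves to "no"."""
    return not any(
        frac >= 1.0 and effective_pref(kw, prefs, parents) == NO
        for kw, frac in project_fractions.items()
    )
```

Here `parents` maps each keyword ID to its parent's ID (or `None` at level 0); the loop terminates because the hierarchy is strict and therefore acyclic.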
+== Information flow ==
+ * An account manager reply can return a set of volunteer preferences,
+   and sets of project keywords,
+   both of which are stored by the client.
+   They are deleted if the user detaches from the AM.
+ * The client includes volunteer preferences in scheduler requests.
+ * The job submission interfaces will be expanded to include job keywords;
+   these will be stored in the DB result table.
+ * Projects can export their keywords in get_project_config.php.
+ * Project and job keywords will be included in GUI RPC replies,
+   so that GUIs can show them.
+== Keywords and scheduling ==
+The BOINC scheduler's score-based algorithm will be augmented with a keyword component:
+ * If a job has a keyword for which the volunteer has a "no" preference,
+   the score is -1 (don't send).
+ * For each job keyword for which the volunteer has a "yes" preference,
+   increment the score.
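The two rules above amount to the following keyword component, shown here as a sketch. It assumes preferences are a dict from keyword ID to "yes", "no", or "maybe", that missing entries count as "maybe", and that any ancestor "no" has already been folded into the dict. How this component is combined with the rest of the scheduler's score is not specified here.

```python
from typing import Dict, Iterable


def keyword_score(job_keywords: Iterable[int], prefs: Dict[int, str]) -> int:
    """Return -1 (don't send) on any "no"; otherwise the count of "yes" keywords."""
    score = 0
    for kw in job_keywords:
        pref = prefs.get(kw, "maybe")  # assumption: unset means "maybe"
        if pref == "no":
            return -1
        if pref == "yes":
            score += 1
    return score
```

A single "no" vetoes the job regardless of how many "yes" keywords it carries, which matches the "don't send" semantics in the preferences section.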
+== Changes over time ==
+Keywords may be added, removed, or changed over time.
+In terms of volunteer preferences, what should the semantics be?
+E.g., suppose a new science area is added.
+Should prefs default to "maybe" or "no"?
+I propose:
+ * Prefs default to "maybe";
+ * Volunteers are informed ASAP that keywords have changed,
+   and given a link to update their prefs accordingly.
+For example: AMs that support keyword prefs can keep a timestamp
+of when each user updated their prefs.
+If the mod time of the keyword set is later than this:
+ * When the user visits the AM web site, they're shown a message of the form
+   "keywords have changed - please update your prefs".
+ * A similar message is sent to the client as a notice.
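The timestamp comparison and the "maybe" default can be sketched as follows. Function names and the use of numeric timestamps are assumptions; an account manager would supply its own storage and messaging.

```python
from typing import Dict, Iterable


def needs_pref_update(keyword_set_mod_time: float, user_prefs_time: float) -> bool:
    """True if the keyword set changed after the user last updated their prefs."""
    return keyword_set_mod_time > user_prefs_time


def with_defaults(prefs: Dict[int, str], all_keyword_ids: Iterable[int]) -> Dict[int, str]:
    """Give every current keyword a preference, defaulting new ones to "maybe"."""
    return {kw: prefs.get(kw, "maybe") for kw in all_keyword_ids}
```

With this, a volunteer whose stored prefs are `{1: "yes"}` sees a newly added keyword 2 as "maybe" until they respond to the notice, and preferences for keywords no longer in the set are dropped.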