
[Research Area] Determine Toolhub database population strategy
Closed, Declined · Public

Description

Research Area: Database Population Workflow

The usefulness of Toolhub hinges on having thorough, current data. Beyond voluntary contributions from tool developers and tool users, it is worth figuring out what we can do to proactively fill Toolhub with information.

This may not necessarily be "clean" information -- it would probably be riddled with duplication and the kind of glitches you get when you impose structure on unstructured data. My current idea is that we would funnel data from various sources into a curation workflow where people can accept entries as-is, merge with existing entries, or reject them.
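A minimal sketch of that curation step might look like the following. It is only an illustration: the `Candidate` and `Decision` types, the `curate` function, and the "existing values win on merge" rule are all hypothetical and do not reflect any actual Toolhub code or schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"  # take the entry as-is
    MERGE = "merge"    # fold it into an existing entry
    REJECT = "reject"  # discard it


@dataclass
class Candidate:
    """A proposed entry from some data source (hypothetical shape)."""
    name: str
    source: str
    fields: dict = field(default_factory=dict)


def curate(catalog, candidate, decision, merge_into=None):
    """Apply one curator decision to the catalog (a dict of name -> fields)."""
    if decision is Decision.ACCEPT:
        catalog[candidate.name] = dict(candidate.fields)
    elif decision is Decision.MERGE:
        if merge_into not in catalog:
            raise KeyError(f"no existing entry named {merge_into!r}")
        # Assumed policy: curated values win; the candidate only fills gaps.
        for key, value in candidate.fields.items():
            catalog[merge_into].setdefault(key, value)
    # Decision.REJECT leaves the catalog untouched.
    return catalog
```

For example, accepting a candidate named `pageviews` and then merging a second candidate into it with `merge_into="pageviews"` would keep the first entry's values and add only the fields it was missing.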

This problem is resolved when:

  • There are proposed sources of data (both clean and dirty)
  • There is a plan for cleaning up the dirty data (de-duplication, error checks, etc.)
  • The social and technical processes involved are understood
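As a rough illustration of the cleanup item in the list above, de-duplication and basic error checks could start as simply as the sketch below. The record shape (dicts with `name` and `url` keys), the normalization rule, and the list of required fields are assumptions made for the example, not a proposed Toolhub design.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase and drop non-alphanumerics so near-identical names collide."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())


def group_duplicates(records):
    """Return groups of records whose normalized names match."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize(record["name"])].append(record)
    return [group for group in groups.values() if len(group) > 1]


def missing_fields(record, required=("name", "url")):
    """List required fields that are absent or empty in a record."""
    return [key for key in required if not record.get(key)]
```

Groups returned by `group_duplicates` are only candidates for merging; a human curator would still decide whether two similarly named records describe the same tool.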

Event Timeline

Harej renamed this task from Determine tool catalog database population strategy to [Research Area] Determine tool catalog database population strategy. (Feb 7 2018, 2:23 AM)
Harej updated the task description.
Harej renamed this task from [Research Area] Determine tool catalog database population strategy to [Research Area] Determine Toolhub database population strategy. (Mar 29 2018, 12:29 AM)
Harej updated the task description.
bd808 triaged this task as Low priority. (Oct 6 2020, 10:50 PM)