==== **User Story**
> ==== As a platform engineer, I need to design a database schema that allows storage of data output by the Image Recs process
==== Success Criteria
[ ] Schema stores all fields from output
[ ] Supports retrieval of data set records by page id
[ ] Supports indexing of //matching records// by a sequence (necessary for retrieval of pseudorandom results)
[ ] Schema allows existing records to be overwritten with new data, and stale data to expire via TTL
Out of scope (for now):
- Any additional fields that may be required for interacting with the data
- Version history
----
==== Cassandra Schema ====
IMPORTANT: Work in progress!
{P17599}
**NOTES:**
Since we are after write semantics that will allow us to replace all of an article's recommendations at once (atomically and in isolation), this schema models the one-to-many relationship between an article and its recommended images using a map. Overwriting the `images` attribute replaces all previous recommendations with the new set.
This schema omits a separate timestamp attribute in favor of using a type 1 UUID for `dataset_id`, which already encodes its creation time. This does not prevent us from returning a separate timestamp in queries (for example: `SELECT dataset_id, cast(dataset_id as timestamp) as insertion_ts, ... FROM ...`).
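
To show why a type 1 UUID can stand in for a timestamp column, here is a minimal client-side sketch in Python using only the standard library. It is not part of the schema or the Image Recs pipeline; the helper name `uuid1_to_datetime` is hypothetical. It relies on the fact that a type 1 UUID carries a 60-bit count of 100-nanosecond intervals since 1582-10-15.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid1_to_datetime(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a type 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("expected a type 1 (time-based) UUID")
    unix_100ns = u.time - UUID_EPOCH_OFFSET
    # Truncate to microseconds, the finest resolution datetime supports.
    return UNIX_EPOCH + timedelta(microseconds=unix_100ns // 10)


# Stand-in for a dataset_id value; in practice this would come from the table.
dataset_id = uuid.uuid1()
insertion_ts = uuid1_to_datetime(dataset_id)
print(dataset_id, insertion_ts.isoformat())
```

The same value could be derived server-side with the `cast` shown in the query above; the sketch only illustrates that no information is lost by dropping the dedicated timestamp attribute.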